Top Document: [FAQ] FileMaker Pro - database for Macintosh and Windows Previous Document: 4 Field Formats and Index Values Next Document: 6 Find See reader questions & answers on this topic! - Help others by sharing your knowledge 5.1 Convert repeating fields to non repeating fields * Import all data in a clone of the database, splitting repeating fields to single records. If repeating fields don't seem to split properly, check that they are real repeated fields. Contents that are separated by <returns> may look like separate repetitions, but are in fact within the same field. (by Steve Rothman <steve_rothman@rdm.scitex.com>) * Create a script to copy from repeating field entries to separate fields. * Create the single fields as calculation fields from the repeating fields, change the format from calculation to the format desired. <gopher://spinaltap.micro.umn.edu/00/computer/Claris/FileMaker/Splitting the Contents of a Repeating Field into Separate Fields> 5.2 Convert nonrepeating fields to repeating field Create a script: <gopher://spinaltap.micro.umn.edu/00/computer/Claris/FileMaker/Moving Data to Repeating Fields Using a Script> * Export as text, convert record delimiter to repeating field delimiter (ASCII-29) and reimport. The fields of comma-separated text will be stored in a way like: repeatingfield 1<ASCII-29> 2<ASCII-29>3 To move separate fields A, B, C to a single repeating field, - export as text: A, B, C - use a standard text editor or a conversion utility to replace the comma with <ASCII-29> - reimport to a repeating field This works with merge format and tab separated text as well. * Use commercial software (MakeRepeats) 5.2.1 How to create ASCII-29 The character ASCII-29 is not on the keyboard and is invisible most often. However, some editors will display it, e.g. as a small box. Editors like MS Word or Alpha allow for input of decimal code to create this character. The easiest solution is to open any exported database with a suitable editor, copy the ASCII-29 character and paste it where necessary. For reusage it's appropriate to put it in the scrapbook. The very easiest solution is to select this "" - if you are lucky, it's still there. 5.3 Merge separate records to a single repeating field * Create a script to copy from several records to a repeating field. * Export as text, convert delimiter to ASCII-29, reimport The procedure is similar to merging separate fields, but requires some additional effort. Example: Consider a database with the records spearated by Carriage Return <CR>, exported as comma delimited, with fields: Name, Text "A","1"<CR> "B","1"<CR> "A","2" It should be merged to a database "A","1"<ASCII-29>"2"<CR> "B","1" This can be done by sorting the records and replacing the <CR> with <ASCII-29>. To separate between the records A and B some additional criterion is required. Unfortunately FMPro doesn't allow the comparison of a record with the previous one. So one chance is to make this comparison externally, e.g. via a spreadsheet application. Another solution is: create the fields: serial number SN (number): Auto-enter serial number serial number lookup SNL(number): Copy SN if Name matches mark record MR (calculation): If( SN = SNL, Name, "") The sorted database now looks like this: fields: Name, Number, SN, SNL, MR "A","1",1,1,"A"<CR> "A","2",3,1,<CR> "B","1",2,2,"B" Export the fields MR and Number, replace <CR>," with <ASCII-29>" and import. Further more you can find all non-blank MR, sort them similar and import all other fields. For further improvements think about "replace all with new serial number", "relookup" and "sort, import to clone". 5.4 Removing duplicates or Sychronizing databases First, you should make sure what a duplicate is. FMP2.1 may treat "firstname lastname" and "lastname, firstname" as duplicate since both entries contain the same words. Even a later record that only contains "lastname" may get recognized as a duplicate of the first ones. Check the sections for indexing to understand why. FMP3 introduced another type of relation that is based on lines (total max. 60 characters) instead of words (multiple, up to 20 characters each). A find for duplicates is still based on the indexed words. In order to remove duplicates, you have to identify them as a duplicate. You can do so on a single field that should be unique (such as the Social Security Number), a combination of multiple or all fields (e.g. FirstName & LastName) or a combination of all (e.g. LastName & Left(FirstName, 3) & Length(comments)) This is a "duplicate key" suitable for comparisons. You then may identify all records by a find for "!", a lookup or a FMP3 relation. A further step may be the comparison whether the other fields are really identical, e.g. by comparing all fields by calculation or script. Here is a simple example to mark every record that has a matching duplicate record with the value "1". * use a key field "serial number" (number): auto-enter serial number This field holds a unique identifier for every record * define a lookup field "unique lookup" (lookup, number): if "duplicate key" fields are equal, then copy the serial number This will copy the serial number of the matching record. If there is an already existing record, it will get another serial number than the own one. Due to the indexing mechanism it will return for newly created, not yet matching entries a zero value (nothing, 0 or a dummy value when defined so). * define a calculation field "unique calculation" (calculation, number) = if(serial number = unique lookup, 0, 1) Thus every record gets flagged by a number or any other specified text that there is another, duplicate record. You may use this flag to find and remove duplicates or to display a warning. Original and duplicate may be within the same or different files. Lookups only show proper values at the point of creation. They get updated if the actual record is modified or a relookup is performed, but not if the first, original record is removed. For FMP3 one may use relations instead. 5.4.1 Removing duplicates - another approach Here's a more detailed description by Jason Hirschhorn <jason@hirschhorn.com> that uses the "!" as find duplicate command instead. =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= FileMaker Pro 2.x Removing duplicate information from a database is frequently necessary, especially after updating a central database from another database. Here is a method to remove duplicate records from a FileMaker Pro 2.0 file that takes advantage of the new ScriptMaker features of FileMaker Pro 2.0. Before you begin, save a copy of your database as a backup and make a clone as a working copy of the database. Fill the working copy with several duplicate records. Read the User's Guide and, using the working copy, practice making scripts until you understand and are comfortable with using ScriptMaker. You will also need a field with a unique value that will uniquely identify an individual or item and its duplicates but will not find non-duplicate records. Some examples of this are: * a field with a social security number (unique to an individual and will find all duplicate records of the person) * a field with a unique code number (unique to an item and that will find all duplicates of a part, customer or other item number; do not use record number as this is a changing value) * a field that combines the contents of several fields to create a unique value. (this unique value must identify duplicate records without finding any other records. A word about the unique field If you do not have a field that already contains a value unique to the individual person, company, or item, you will have to create one. For example, if you have an address book database you would not want to remove all the people that share the last name "Jones," just duplicates of each individual. And, if the database contains the fields FirstName, LastName, Address, City, State, Zip, and PhoneNumber, you can create a calculation field that concatenates (combines) two or more of these fields to create a unique field. You can call your new unique field DuplicateCalc: DuplicateCalc (calculation, text result) = FirstName & LastName & Zip The result will be a single text string that looks this: FirstNameLastNameZip. Some examples might look like this: JohnSmith10825 MadelineKwan60610 PierreMyers33976 This approach works well for most databases, but it's still possible to have more than one "John Smith" in the same zip code. In this case you may want to add another field, such as a phone number, to be sure that this DuplicateCalc field will always be unique und er any circumstance. Just make sure the maximum length of the unique field is less than 20 characters; FileMaker Pro 2.x does not index any single text string beyond that limit. This limit of 20 characters is for every single word. You may use multiple words, but both "word1 word2" and "word2 word1" get indexed the same, and "word1" alone may relate to both. FileMaker Pro 3.0 takes instead a line of 60 characters as index for relations and lookups. Setting up the scripts Now that you have a field you can depend on to match duplicate records, you are ready to set up the scripts that will delete duplicates. In your working copy of the database create a new layout (call it "Find Duplicates") that will show only the unique field and the fields used to define it. Advantages & Disadvantages The advantage of this process is that it automates the task of removing duplicates by using the scripting capability of FileMaker Pro. The drawback is that it slows down as the database increases in size and the first occurrence of a record is retained and all others are deleted, not necessarily the one with the latest entered information. Duplicate records with spelling errors will NOT be found using this method. Creating the scripts The scripts must be created from the last to the first. The script that finds and deletes the duplicates must be created before the cleanup script. If you must edit the scripts after creating them do not replace the "Find Requests" or repeat step 1 before editing the script. 1. Choose Find from the Select menu, and type an exclamation point (!) into the "DuplicateCalc" field created above, and click Find. Review the records to confirm that only duplicates have been found. To fully review the records go to a layout that shows all the information in each record. Be sure to return to the "Find Duplicates" layout before continuing. 2. Open ScriptMaker from the Scripts menu and create a new script called "Delete Loop". This script should have the following steps: Perform Find [Restore] Go to Next Record/Request [Exit script after last] Go to Next Record/Request [Exit script after last] Go to Record/Request [No dialog, 1] Copy [Select, "DuplicateCalc"] Enter Find Mode [ ] Paste [Select, "DuplicateCalc"] Perform Find [ ] Omit Delete Found Set [No dialog] Perform Script [Sub-scripts, "Delete Loop"] With that script completed, now create the second script Remove Duplicates. This script will perform some setup work for the "Delete Loop" and do some clean up work afterwards. This script is the one that you will select from the Scripts menu to perform the remove duplicate process. [Note: For FMP3 you may use the newer loop functions instead] 3. Open ScriptMaker from the Scripts menu and create a new script called "Remove Duplicates". Include this script in the menu. For convenience wehave used the name "First Field", which is the first field that shows in the Find Duplicates layout, but it cannot be the DuplicateCalc field. This script should have the following steps: Enter Browse Mode [ ] Go to Layout [Find Duplicates] New Record/Request Go to Field ["First Field"] Paste Literal ["ABC123"] Duplicate Record/Request Perform Script [Sub-scripts, "Delete Loop"] Enter Find Mode [ ] Go to Field ["First Field"] Paste Literal ["ABC123"] Perform Find [ ] Delete Found Set [No dialog] You have now successfully created a set of scripts that will delete all the duplicate records in your database. Note: using the value "ABC123" will work even if the "First Field" is defined as a number. [Note: For FMP3 you do not have to create a dummy duplicate record "ABC123" but may check whether 0 records were found] Before testing the script, make sure you have a backup copy in case something goes wrong, then choose "Remove Duplicates" from the Scripts menu. FileMaker will begin identifying and removing duplicate records from your file. You are now ready to work on the data from the original database. 1) In your working copy find all records. 2) Delete all records (use the Delete Found Set option from the Edit menu). 3) Import records (it does not matter if you replace or add to an empty database) from the original database. 4) Choose "Remove Duplicates" from the Scripts menu. Review the data to confirm that no valuable information has been lost, i.e., that the deleted records did not have more information than the saved record. If you have lost information do steps 1 through 3 listed above and verify that the information was there in the first place. Then manually find the duplicate(s) of the full information record and delete them manually. You can now choose "Remove Duplicates" from the Scripts menu. User Contributions:Top Document: [FAQ] FileMaker Pro - database for Macintosh and Windows Previous Document: 4 Field Formats and Index Values Next Document: 6 Find Single Page [ Usenet FAQs | Web FAQs | Documents | RFC Index ] Send corrections/additions to the FAQ Maintainer: traut@th-darmstadt.de (Martin Trautmann)
Last Update March 27 2014 @ 02:11 PM
|
Comment about this article, ask questions, or add new information about this topic: