Sharing Your Data with Others
We’re all about being helpful when it comes to genealogical research, but it often takes extra effort to share our most important work.
Nothing is more characteristic of genealogists than their enthusiasm for helping others. Their help may be as little as a casual suggestion to the person at the next microfilm reader, or as much as compiling a huge collection of data that is useful to other researchers. These projects usually start with information they have collected from unindexed or unpublished sources, after they recognize that some of the “good stuff” is about the friends, neighbors, and associates of their own folks, and that it might be helpful to other people looking for them. Often that leads to abstracting all the names and other significant data from an entire volume or series of records, or compiling the names from it into an alphabetical index.
Now that we have computers and the Internet, sharing genealogical information is easier than ever. The casual suggestion today might as easily be in response to a bulletin board or mailing list posting from halfway around the world, as to a comment from the person at the next reader. Large compilations of data can be shared over the Internet at little cost to the author.
Computers have also simplified the task of extracting information from records. No longer does making an index mean copying all the names and page numbers on index cards and arranging them in alphabetical order. All we need to do is enter them once in a computer program, and the program can place them in alphabetical order or search for a wanted item without rearranging them.
Now anyone can easily extract, compile, and share large amounts of data. Thousands of people are doing so, not only individually through their own e-mails and websites, but in large organized projects such as USGenWeb and local and special-interest genealogical groups.
The purpose is to be helpful, and furthest from genealogists’ minds is shortchanging other users, but it can easily happen when we lose some of the helpful information before we share it with others.
The information content of a piece of genealogical data is in three separate parts and each has its own risks for information loss:
1. Data Elements
Usually this includes a personal name or names, together with a place, date, number, characteristic or relationship, often in the form of a statement.
2. Source of the Information
This covers both the record or publication where the information was found and its origin (the original record, or the informant who first provided the information).
3. The Position
The position of the information in relation to other similar items.
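For those who keep their extracts in a program of their own, the three parts above might be held together in one record along these lines. This is only a sketch in Python: the class names, field names, and sample values are invented for illustration and do not come from any particular genealogy program or real record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Source:
    found_in: str                       # record or publication actually consulted
    derived_from: Optional[str] = None  # the original, if found_in is a copy or abstract
    informant: Optional[str] = None     # who supplied the facts, when known or presumed


@dataclass(frozen=True)
class Entry:
    data: dict      # data elements, copied exactly as written
    source: Source  # where the item was found and where it originated
    position: int   # order of the item within the original record


# Invented sample values, for illustration only.
entry = Entry(
    data={"name": "Mary Smyth", "event": "baptism", "date": "3 May 1851"},
    source=Source(
        found_in="microfilm of parish register",
        derived_from="original parish register",
        informant="godparents (presumed)",
    ),
    position=17,
)
```

Keeping all three parts in the same record means none of them can be dropped by accident when the entries are later sorted, searched, or shared.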
Remember that an index, abstract, or extract is primarily a finding aid. When prepared with care, it serves two purposes: 1) it allows a researcher to decide whether the item has anything to do with a particular question, and 2) if it does, then it allows a researcher to find the reference in the original record without a tedious page-by-page search.
We know that many researchers will take the compiled data as it is presented without ever checking the original record, so our accuracy will help guard against starting a new chain of errors like those that are now prevalent on many websites.
Data Elements
The two most serious risks to an element of data are omitting it and failing to copy it accurately. Resist the urge to correct what appear to be misspellings or other errors. Part of the information content is the fact that the earlier writer entered the mistake. If your format allows you to enter an explanatory note about the error, make sure it’s plainly set off from the actual data by using square brackets, for example, or by placing it in a column or field clearly labeled “Editorial Notes.”
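One way to keep an apparent error and the transcriber's comment apart is to store them separately and join them only for display. The short Python sketch below assumes a hypothetical helper named render and an invented sample name; it illustrates the square-bracket convention and is not a standard of any kind.

```python
def render(as_written: str, editorial_note: str = "") -> str:
    """Return the text as written, with any note set off in square brackets."""
    if editorial_note:
        return f"{as_written} [{editorial_note}]"
    return as_written


# The misspelling is kept exactly as found; the comment stays visibly separate.
line = render("Jhon Smith", "sic; probably John")
print(line)  # Jhon Smith [sic; probably John]
```

Because the note lives in its own field, a later user can always recover the text exactly as the earlier writer entered it.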
Generally, useful genealogical information will consist of several associated elements—a name and surname, for instance, with a date and relationship to another name, or perhaps to an event and place. Each is equally important and contributes to the item’s usefulness when it is considered with other related information about the same name or place.
The Source of the Information
Anyone who’s serious about getting his or her family history right soon learns the importance of citing sources. Books have been written about how best to do this to meet the needs of others who use our information. The current standard for American genealogy is Evidence! Citation and Analysis for the Family Historian by Elizabeth Shown Mills (Baltimore: Genealogical Publishing Co., 1997), but almost a quarter century ago the late Richard S. Lackey pioneered in this area with Cite Your Sources (New Orleans: Polyanthos, 1980).
The rule of thumb in identifying sources is to name the record where it was found, and if that was a derivative from some other record, such as a photocopy, microfilm, carbon copy, transcription, or abstract, to also name the original.
The idea is to let those who see the citation assess both the reliability of the original record and the potential for loss or change along its route of transmission.
It’s even better when we can identify the informant who provided the information for the original record. Standard birth and death certificates for some periods actually name the informant. In other cases, presumptions are possible.
For example, the information for a Roman Catholic baptismal record was generally provided by godparents or the father; it wasn’t usual for the mother to be present. Similarly, on a census record where the head of household was employed and his wife was listed as “keeping house,” she was in all likelihood the informant. But a teenage child listed “at home” rather than in school or employed can’t be ruled out as the source of the information. Once we know who may have given the information, we can estimate how likely they were to have accurate knowledge of the facts.
Position
Position is all too often neglected as an information item, but look at what it can tell us. Position of households on a census schedule shows which families were neighbors and which were more distant. Position on the page of a baptismal register usually shows the order in which entries were made. An entry not in chronological sequence may have been made when knowledge of the event was less recent, and perhaps less reliable.
These are simple one-dimensional positions—the order in a single sequence of similar information. A record of that sequence is an important element of information to preserve, so we can tell the relationship of an entry to other entries. Fortunately, many records already assign sequence numbers we can use (e.g., the census by dwelling and household numbers, many church registers by sequence numbers for the page, month, or year).
If the original record has no sequence numbers, the transcriber should assign them so the position information is not lost if the information is randomly or alphabetically stored in a database. Similarly, the order in which loose papers are arranged in a file folder should be noted, for it often indicates the order in which the information was recorded, received, acted upon, or placed in the file and is not always indicated on the document itself.
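As a rough illustration of assigning sequence numbers, the Python sketch below numbers entries in the order they appear before any sorting is done. The field names seq and seq_assigned_by_transcriber and the sample surnames are assumptions made for this example only.

```python
def assign_sequence(entries):
    """Number entries in original order, flagging the numbers as added."""
    return [
        {"seq": i, "seq_assigned_by_transcriber": True, **entry}
        for i, entry in enumerate(entries, start=1)
    ]


# Invented surnames, listed in the order they appear in the record.
in_record_order = [{"name": "Walsh"}, {"name": "Adams"}, {"name": "Murphy"}]

numbered = assign_sequence(in_record_order)
index = sorted(numbered, key=lambda e: e["name"])  # alphabetical finding aid
restored = sorted(index, key=lambda e: e["seq"])   # original order recovered

print([(e["name"], e["seq"]) for e in index])
# [('Adams', 2), ('Murphy', 3), ('Walsh', 1)]
```

The flag records that the numbers were supplied by the transcriber rather than found in the record, so a user of the alphabetical index can still see where each name stood without mistaking the numbers for part of the original.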
If position numbers in a single one-dimensional sequence are significant, position information is even more important when two dimensions are involved, as with locations of gravestones in a cemetery, or houses on a city block. Two houses on an assessor’s list two pages apart may seem unrelated, but a tax map may show that they are back-to-back on the same block, though facing different streets. Similarly, if gravestones are read row-by-row, without information about how the rows relate to each other, it would be easy to overlook people buried in adjacent rows of the same lot who may be members of the same family.
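The gravestone case can be sketched the same way. In the Python example below, each stone carries a row number and a place along the row, so stones in adjacent rows can be found together. The names are invented, and the sketch assumes rows are counted from a fixed edge of the cemetery and line up well enough that equal place numbers in neighboring rows are side by side; a real cemetery may need a more careful plan.

```python
from typing import NamedTuple


class Stone(NamedTuple):
    name: str
    row: int    # row number, counted from a fixed reference edge
    place: int  # position along the row


def neighbors(target, stones):
    """Stones within one row and one place of the target, excluding itself."""
    return [
        s for s in stones
        if s != target
        and abs(s.row - target.row) <= 1
        and abs(s.place - target.place) <= 1
    ]


# Invented sample data, for illustration only.
stones = [
    Stone("A. Kelly", 1, 4),
    Stone("B. Kelly", 2, 4),
    Stone("C. Doyle", 1, 9),
]

near = neighbors(stones[0], stones)
print([s.name for s in near])  # ['B. Kelly']
```

A row-by-row reading with no row numbers would list the two Kelly stones far apart; with both coordinates kept, the adjacent stone in the next row is found immediately.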
Occasionally, locational information in three dimensions will need to be preserved, as with apartments in multi-story buildings, or tiers of penitentiary cells or warehouse bins, but most significant genealogical information doesn’t need all three dimensions of length, width, and height to be recorded.
Where position or locational information is part of a record, the transcriber should always preserve it as an essential part of the information content. If it is missing, it should be added, so far as it can be determined, but clearly indicating that it was not part of the original record.
Omissions
Omissions have occurred so often in printed data collections that we have all become a little hardened to them. It is not uncommon to find gravestone inscriptions alphabetized, or chronological abstracts of vital records from newspapers rearranged by type of event or alphabetized, so they’ll be easier to use. In each case the original order, and the information it carried, is lost.
With the capabilities of computer search engines, physical rearrangement is no longer necessary to make records readily accessible by name or by any other element. However, data stored digitally is subject to random rearrangement. To preserve position information it may be necessary to add a data element showing where it was located in relation to other items in the original source, unless that information is already part of the record.
We can avoid shortchanging the users of our compiled genealogical information by making sure we preserve all the data elements present in each separate item, as well as its position in relation to other items, the source where we found it, and when known, where it originated.
Donn Devine, CG℠, CGI℠, a genealogical consultant from Wilmington, Delaware, is an attorney for the city and archivist of the Catholic Diocese of Wilmington. He is a former National Genealogical Society board member, currently chairs its Standards Committee, and is a trustee of the Board for Certification of Genealogists®.
Return to May/June 2004 issue of Ancestry Magazine.