Easy Ways to Spot Bad Data Online
Save yourself valuable research time by avoiding the “bad stuff,” both online and off.
We learned the hard way when we started our personal family research some twenty years ago that research time is at a premium. One of the basics of research is to get the most out of your time. And nothing can waste valuable research time like bad data. We have followed bad leads many times and have spent an inordinate amount of time poring over someone else’s weak research.
Even in this new world of genealogical research, there is a great deal of “bad” genealogical data, including incorrect and undocumented information and even poorly constructed formats. In fact, because of the Internet and the ease of sharing data, there seems to be even more bad data now than twenty years ago. Bad data can lead even the best researcher down the wrong path, thus “burning” the few hours he or she may have available.
Our goal in this article is to share some tips that will enable you to spot potentially bad information and thus save you time, effort, and frustration.
Absence of documentation. Whenever you discover new information, a fundamental question should come to mind. Is this information documented? The documentation should be visible. It should be obvious. It should be understandable. If there is no source material, a red flag should go up immediately, and the information should be treated skeptically. Use any such information only as a clue to what you are seeking.
Lack of organization. Another clue to spotting bad data is the absence of standard organization. Once you locate data that appears to have potential, look immediately for a table of contents, an explanatory cover page, or a home page if the data comes from a website. Then try to locate some clue as to the organization of the material. If the potential source is a book, there should be an index. If the potential source is an online database, it should be “searchable.” Look for some type of descriptive paragraph explaining how the material has been organized. If this information is absent, a red flag should go up again. Write the location of the potential information in your research log but indicate the apparent weakness. Also, mark the item as something you might return to in the future.
Data format is difficult to decipher. In the world of genealogy, there are standard forms that are used to present information, and we recognize and even rely on these forms. Whether it is a pedigree chart, a family group sheet, a drop chart, or an article presented in the format of the National Genealogical Society, we soon learn to recognize and appreciate these forms. They are easy to use and they contain well-ordered data. Unfortunately, there are non-standard forms as well. It is these that can consume an inordinate amount of time with no guarantee of accuracy. While these forms can prove useful, you should apply skepticism and approach them with caution.
Data is presented with cumbersome charts. We marvel at how many of our students have received printed material from a family member. They bring this material to class because they need help in understanding it. In spite of having seen dozens of examples of such material, we marvel at how difficult it is to decipher such research. There are often multiple-layer foldouts with elaborately drawn lines connecting the names of individuals. Quite often, the information is continued on the reverse side or even on another page. We had one such case where a student brought in one of these bizarre data forms that had been given to him by an uncle. We commented that it looked like the work of a rocket scientist applying his mind-set to genealogy. We asked our student what his uncle had done for a living and we learned that he had in fact been a rocket scientist! Go figure. While the complexity alone does not mean it necessarily contains bad data, it is a warning sign.
Data presented uses an obscure numbering system. Closely related to the cumbersome charts are those sources that use numbering systems that challenge our understanding. Many times in our book research we have located a published family history that contains what appears to be good data but we have difficulty tracing backward or forward through the generations. The overall format is fine (it could be recognized) but the numbering system is clunky. The reason this type of data should send up a red flag is that the person who created the reporting system probably didn’t know a great deal about standard genealogy or he or she would have used a more universal numbering system. Approach this information with caution and use it with discretion.
Information contains conflicting or incorrect data. There are different types of conflicting data, any one of which should tip you off that there is potentially bad data. The following examples apply equally to Web research and conventional book research.
• You have proven, accurate data from your own research and the source you are looking at has undocumented, contradictory information.
• You turn to the index of a book, find the name of the ancestor you are researching, then turn to the appropriate page only to find no reference to that person anywhere on that page. A bad or poorly indexed source is a sure sign of potentially bad data.
• The preface of the source claims that the author is descended from a famous individual or historical character and the work is an attempt to prove that connection. Quite often, these books approach genealogy backward. They work from the past to the present, the opposite of how genealogy should be done. Be skeptical!
• In previewing the source, you discover entries that contain glaring errors or repeated typos. We have encountered books and websites with marriage dates that preceded birth dates; death dates that would make the deceased 187 years old; and even reference to “the following five children” when only two were listed. One must wonder about the accuracy of such research.
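For readers who keep their findings in a spreadsheet or database, the glaring errors described in the last example can also be checked automatically. The following is only a minimal sketch in Python, not a standard genealogy tool: the Person record, its field names, and the 120-year age cutoff are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed cutoff for a believable lifespan; adjust to taste.
MAX_PLAUSIBLE_AGE = 120


@dataclass
class Person:
    """Hypothetical record for one entry copied from a source."""
    name: str
    birth_year: Optional[int] = None
    marriage_year: Optional[int] = None
    death_year: Optional[int] = None
    children_claimed: Optional[int] = None  # e.g. "the following five children"
    children_listed: Optional[int] = None   # how many are actually named


def red_flags(p: Person) -> List[str]:
    """Return a list of warning messages for one entry (empty if none)."""
    flags = []

    # A marriage date that precedes the birth date.
    if p.birth_year is not None and p.marriage_year is not None:
        if p.marriage_year < p.birth_year:
            flags.append("marriage precedes birth")

    # A death date that precedes birth or implies an impossible age.
    if p.birth_year is not None and p.death_year is not None:
        age = p.death_year - p.birth_year
        if age < 0:
            flags.append("death precedes birth")
        elif age > MAX_PLAUSIBLE_AGE:
            flags.append(f"implausible age at death: {age}")

    # The text promises more (or fewer) children than it lists.
    if p.children_claimed is not None and p.children_listed is not None:
        if p.children_claimed != p.children_listed:
            flags.append(
                f"claims {p.children_claimed} children "
                f"but lists {p.children_listed}"
            )

    return flags


# Usage: an invented entry that combines all three problems.
suspect = Person(
    name="Example Ancestor",
    birth_year=1700,
    marriage_year=1690,
    death_year=1887,
    children_claimed=5,
    children_listed=2,
)
for flag in red_flags(suspect):
    print(flag)
```

An entry that raises flags is not necessarily wrong in every detail, but it deserves the same skepticism as any other undocumented source.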
The work is self-published. There is nothing wrong with self-publishing your work. But it seems that a great deal of the bad research we have bumped into has been the result of a vanity press. It is almost as if the authors are more intent on completing the book and getting it into print than they are on documenting their sources and corroborating the information. A quick scan of the self-published material, whether online or in print, should help determine its usefulness.
The research is limited in scope. The sources that cover only a very limited topic might not be viable. For example, when researching the published genealogy section of a library or searching the Internet for the same family, you might encounter an entry entitled “Bert Jones, My Grandfather.” While the name is similar to what you are researching, this reference might be too limited in any usable detail. Again, note this in your research log.
You receive an offer in the mail. Of course, the absolute best hint of potentially bad genealogy information is that offer you receive in your mailbox. For only a nominal fee, some company will send you a list of all your relatives and provide a detailed history of your family name. We have all received such an offer. The best advice is to file it away–in the circular file.
With these bad-data tips in mind, you should be able to spot potentially bad information and steer clear of it. Then you can be assured that you are well on your way to getting the most out of your research time.
Terry and Jim Willard hosted the ten-part PBS “Ancestors” series. They have researched their family history fifteen generations back on both sides.