Saturday, September 08, 2007

Did Ancestry Violate the Copyright Law? . . . Prologue

Part I of A Legal Analysis of the Late Controversy

By now, the brouhaha over Ancestry's "Internet Biographical Collection" has largely blown over. Ancestry has said that they will permanently remove the database and the genealogical community is ready to move on.

The passage of a little time and the cooling of passion on the issue permit some calm and reasoned examination of the legal issues raised by the episode. This examination is necessary, I think, to inform the community and to provide a background against which we may judge the next episode. This examination also, I hope, will provide some ideas about how to protect, yet share, our efforts in the interests of genealogical research.

Full Disclosure: GeneaBlogie was one of the blogs the content of which appeared in the "Internet Biographical Collection" (which I will hereafter refer to as "IBC"--lawyers love abbreviations and acronyms). My initial reaction was like that of others whose content was thus appropriated--anger.

The Facts: Every legal analysis must begin with the facts of the particular situation. The application of the law depends on specific factual circumstances. Unfortunately, in this case, some of the relevant facts are known only to Ancestry. We do not have the benefit of discovering all of those facts as we would in litigation. So we present here the facts as we know them:

In the last week of August 2007, Ancestry added the IBC to its site. Ancestry described the IBC as "a compilation of genealogy information across the web." In fact, on the day I first viewed the IBC, having been alerted by another member of the community, the IBC appeared to be copies of websites. Ancestry said, "The site is also displaying a live link back to the source site where the information was extracted." Ancestry described the IBC as a "search engine" and stated, "We cached individual Web pages in an effort to preserve history – if a Web page featuring important family history information were taken down in the future, a cached version would still be available." For several days, the information was in the paid-subscriber section of Ancestry's site. Later, Ancestry announced, "Based on community response to the addition of the Internet Biographical Collection, [Ancestry] has decided to make the database free." The company said that "the goal behind the collection is to help surface genealogical information that many people would not be able to find easily because it is often scattered among numerous websites across the Internet."

When Ancestry announced the removal of the IBC, they said, "We had hoped to provide a way for you to be able to search the entire web easily for genealogically-relevant pages and provide for preservation of sources for future generations." The announcement added that the company "hope[d] that someday we’ll be able to provide a free web search engine that links directly back to the live web pages, and can become a useful tool to the genealogical community."

Each of these actions and statements by Ancestry has legal significance.

Below are some concepts necessary to our legal analysis.

The Basic Copyright Law: Title 17 of the United States Code is the copyright law in the United States. It protects "original works of authorship," both published and unpublished. Among other things, the law protects the author's right to reproduce the work in copies; to prepare derivative works based upon the work; to distribute copies of the work to the public by sale or other transfer of ownership, or by rental, lease, or lending; and to display the work publicly.

No publication or registration is required to secure copyright. Copyright protection subsists from the time the work is created in fixed form. The copyright in the work of authorship immediately becomes the property of the author who created the work. Only the author or those deriving their rights through the author can rightfully claim copyright.

Violation of any of the rights provided by the copyright law to the owner of copyright is contrary to law and may subject the violator to civil or criminal penalties.

A significant limitation on the exclusive rights of a copyright holder is the doctrine of fair use. We discussed this concept several months ago in this post.

The Digital Millennium Copyright Act: In 1998, the Congress enacted the Digital Millennium Copyright Act. There are a number of aspects to this statute, but here, the relevant matter is in Title II of the Act, which is known as the “Online Copyright Infringement Liability Limitation Act.” This part of the law creates limitations on the liability of online service providers for copyright infringement when engaging in certain types of activities.

A Computer Science Concept--Caching: A "cache" is a collection of data duplicating original values stored elsewhere or computed earlier. In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, future use can be made by accessing the cached copy rather than re-fetching the original data. On-line service providers make local copies of Web pages so that the pages don't have to be fetched again and again. Another purpose of caching Web pages is to preserve copies for retrieval in the event the original cannot be accessed.
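For readers who like to see the idea in concrete form, here is a minimal sketch of caching in Python. It is only an illustration of the general concept described above, not a description of how Ancestry, Google, or any other service actually built its system. The class name PageCache, the function fetch_from_origin, and the example.com address are all hypothetical, and a simple dictionary stands in for the live Web.

```python
from typing import Callable, Dict


class PageCache:
    """Toy cache: keeps a local copy of every page it has fetched."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                 # hypothetical function that retrieves the original
        self._copies: Dict[str, str] = {}   # url -> locally stored copy
        self.origin_requests = 0            # how many times the original was fetched

    def get(self, url: str) -> str:
        # Serve the stored copy when there is one, so the original
        # does not have to be fetched again and again.
        if url in self._copies:
            return self._copies[url]
        # Otherwise fetch the original once and keep a copy.
        self.origin_requests += 1
        page = self._fetch(url)
        self._copies[url] = page
        return page


# A dictionary standing in for the live Web (an assumption for this sketch).
origin = {"http://example.com/smith-family": "Smith family history"}


def fetch_from_origin(url: str) -> str:
    return origin[url]  # raises KeyError once the page has been taken down


cache = PageCache(fetch_from_origin)
url = "http://example.com/smith-family"

first = cache.get(url)    # fetched from the original and copied
second = cache.get(url)   # served from the cached copy
del origin[url]           # the site owner takes the page down
third = cache.get(url)    # the cached copy is still served
```

Note the last step: once a copy is in the cache, it remains available even after the owner removes the original. That is the "preservation" purpose mentioned above, and it is also the feature that troubles authors who wish to withdraw their work.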

Prologue to The Legal Analysis

A number of bloggers and commenters opined that what Ancestry had done was a form of "caching" no different than that which Google or any other search engine does. However, there appear to be factual differences between Ancestry's IBC as Ancestry itself described it and Google's search engine operations. The issue is whether these factual differences have any legal significance.

In several commentaries on the Ancestry IBC, I saw the case of Field v. Google cited. Preliminarily, I would observe that there are some unique facts in that case that may have affected the outcome. Second, I would note that as a decision of a federal district court, a trial court, it has no precedential value; that is, it need not be followed by any other court. [It was, however, cited by the federal district court in Pennsylvania which decided a case called Parker v. Google].

Having said that, I do think that both Field and Parker provide convenient frameworks for analyzing this issue. We'll post on that tomorrow. And I'll try to keep it understandable.

TOMORROW: The Legal Analysis--Part I

Notice: The information in this writing is intended for educational use only and is not intended nor should it be construed as legal advice. If you have a legal problem, consult a lawyer admitted to practice in your state of residence. I am an active member of the bar of the State of California and am admitted to practice before the United States Supreme Court and various other federal courts. I am not licensed to practice in any other state. I am not presently soliciting or accepting new clients in the matters discussed above.


Moultrie Creek said...


Thank you for doing this. The Prologue has me hooked and I'm looking forward to the entire series.

Janice said...


You've posted a wonderfully succinct summation of the IBC issue.

I do have two comments. I'm not sure it changes things in a legal sense, but Ancestry also provided an option (to subscribers only, and even after IBC became "free") to click and save the cached page to their "Shoebox"--a holding area of documents that subscribers are interested in.

Also, the initial source description calls the IBC a "database-online," not a search engine (I have a screen shot of that if you need it).

Also, there were several people who argued in commentary on various blogs and message boards that we, as bloggers and web site owners, should have known that Ancestry would be doing this, due to various announcements and press releases they made, and that the burden was on each of us to place a robots.txt file or some sort of HTML coding to prevent Ancestry from caching our sites. Is the burden truly on the blogger or web site owner, even if they are not commercial (i.e., the "mom and pop" web sites and blogs)?

No need to reply here if you intend to cover those areas in your future posts.

Thanks Craig for this post, and I wait anxiously to see part II.


Becky Wiseman said...

Thank you Craig. I'm looking forward to this series and your analysis of the issues involved.

Jeff Scism said...

Ancestry, through a spokesperson, clearly stated what the intent was initially: "the websites have VALUE. And even if the site owners were to remove the contents, the pages would remain available through [Ancestry]."

That indicates to my simple mind that, since only paying customers had original access, Ancestry had premeditated their intent to take and sell the content, despite what the site owners decided to do with their creations. Their obvious attempt to actually hide the source pages at first and only provide a sanitized copy of the data, removing source website info, identifying graphics, and copyright notices, shows that the intent was to steal and sell the content.

Another issue not addressed is that Family Tree Maker, a genealogy program they sell, still has this search built in, and provides the data directly for merging into your family file, sourcing it as Ancestry's collection, with no direct reference to the authors.

Craig Manson said...

Thanks, Jeff, for the additional info. If they're using it through FTM, that raises serious new issues that we'll cover in our "fair use" post coming up.

Erian Phelyn said...

Hi Craig,

Nice to meet a fellow genealogyNut. Your family history is indeed very interesting. Nice to see a legal angle on the buzz. I've had challenges with their practices for some time. Good to see someone holding them accountable.

Brian Phelps