Ancestry explains its new collection thusly:
This database contains a sampling of biographical sketches found on English language web pages throughout the entire World Wide Web. Web pages can vary greatly in the amount of information they contain about a given person, and in the number of related and unrelated people mentioned on the same page. The information source and the central topic of each page will also vary greatly. Given facts should be verified using other sources. One unique and valuable feature of this web-based collection is the number of hyperlinks leading from each page in the collection to other web pages of possible interest on related topics.
There are, it seems to me, several problems with this. First, what Ancestry has done is more than "sample" the "biographical sketches"; they've cached whole pages and sites. And they admit implicitly that the data is not theirs. This is just as if they had reproduced, without permission, pages from a printed book. In the case of GeneaBlogie, they've reproduced numerous pages relating to almost every one of the surnames I've researched.
Ancestry.com's actions are made all the more galling by the attitude they've taken with respect to what is arguably fair use of their intellectual property. And where in marketing school do they teach that it's a good idea to rip off and piss off some of your best customers?
The marketing missteps of The Generations Network are unbelievable. Do they have any lawyers? You can bet I'm going to find out.
The genea-blogosphere is in an uproar over this. See Miriam at AnceStories; Amy Crooks of Untangled Family Roots; Chris Dunham of The Genealogue; Kimberly Powell of About.com Genealogy; Becky Wiseman of kinnexions; Randy Seaver of Genea-musings; and Susan Kitchens (who's got a great parody!).
Susan Kitchens has isolated the bot that is being used to scrape the sites.