In this article, Rebecca Blakely, a government documents librarian at McNeese State University, describes the process by which she used Google Books and the Internet Archive to supplement the McNeese Library government documents collection. The collection fared badly in Hurricane Rita, suffering water damage and mold. Blakely eventually stumbled upon some full-text government documents in Google Books while helping a patron, and it occurred to her that digitized materials could compensate somewhat for the library's loss. She describes the search methods she used to find government documents in both Google Books and the Internet Archive, and compares the strengths and weaknesses of the two.
For Blakely, the best feature of Google Books is the "my library" option, which can be used to compile items and share them with other users. She started compiling the full-text government documents she found using that option; her collection is available here. (Thanks to extensive tagging, her small digital library is in some ways much more easily browsable than a physical government documents collection.) Blakely notes that it's also possible to create RSS feeds announcing new items added to a collection. She mentions that the quality of scanning and metadata varies, but praises the range of viewing options: zooming, single- or two-page display, plain-text display, thumbnails, and full screen. Her biggest complaint is that Google Books provides only limited viewing of many government documents, even though the great majority of them are in the public domain. (Google responded to an email about this by explaining that rather than taking the time to determine rights status up front, it adds materials in limited view until the status can be confirmed. Hopefully this means that more government documents will become available in full view later on.) It is for this reason that Blakely prefers the Internet Archive.
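As an aside (this is my own illustration, not something from Blakely's article): the public Google Books API exposes the same full-view distinction she ran into. Its volumes endpoint accepts a `filter=full` parameter that restricts results to items whose entire text is viewable, which is roughly the subset she could usefully add to her collection. A minimal sketch of building such a query:

```python
from urllib.parse import urlencode

# Public Google Books API volumes endpoint.
GOOGLE_BOOKS_API = "https://www.googleapis.com/books/v1/volumes"

def full_view_query(keywords):
    """Build a Google Books API search URL restricted to full-view items.

    `filter=full` asks only for volumes whose entire text is viewable --
    limited-view and snippet-only items are excluded.
    """
    params = {"q": keywords, "filter": "full"}
    return f"{GOOGLE_BOOKS_API}?{urlencode(params)}"

# Example: the search phrase here is just an illustrative guess at the
# kind of keyword search Blakely describes, not her actual query.
print(full_view_query("government printing office annual report"))
```

Fetching that URL returns JSON whose `items` can then be tagged and organized, much as Blakely did by hand in "my library."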
This fits in well with the comparisons drawn by Kalev Leetaru in an article I wrote about previously. The Internet Archive doesn't post books until it has determined that a work is in the public domain or secured permission from the rightsholder. It also takes much more time to produce high-quality scans. As a result, there is significantly less material there, but what is there isn't as messy as Google Books. Additionally, the Internet Archive allows users to upload materials to the collection; Blakely notes that some items were uploaded by users who originally downloaded them from Google Books. Users can also bookmark items and share those bookmarks via RSS feeds, and a "bookmark explorer" lets users view items bookmarked by others.
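The Internet Archive's holdings can likewise be searched by script through its public advanced-search endpoint. The sketch below is my own illustration of the kind of search Blakely describes doing by hand, not her actual method; the endpoint and parameters are the Archive's advanced-search API:

```python
from urllib.parse import urlencode

# Internet Archive public advanced-search endpoint.
IA_SEARCH = "https://archive.org/advancedsearch.php"

def ia_text_query(keywords, rows=25):
    """Build an Internet Archive search URL for scanned texts.

    `mediatype:texts` limits results to books and documents rather than
    audio or video; `output=json` requests machine-readable results.
    """
    params = {
        "q": f"mediatype:texts AND ({keywords})",
        "fl[]": "identifier",   # return just the item identifiers
        "rows": rows,
        "output": "json",
    }
    return f"{IA_SEARCH}?{urlencode(params)}"

# Illustrative keyword search, not a query from the article.
print(ia_text_query("united states government report"))
```

Fetching the resulting URL returns JSON whose `response.docs` list holds one identifier per matching item, each of which maps to a details page at archive.org.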
This article illustrates a neat use of these two large digital repositories, and provides good examples of the differences between them in features and underlying philosophy. The Internet Archive, though still growing, feels like a finished product, whereas Google Books is very much a work in progress. I recently came across an interesting blog post by Ed Felten, discussing another blog post about the metadata problems at Google Books; it addresses this point:
"What's most interesting to me is a seeming difference in mindset between critics like Nunberg on the one hand, and Google on the other. Nunberg thinks of Google's metadata catalog as a fixed product that has some (unfortunately large) number of errors, whereas Google sees the catalog as a work in progress, subject to continual improvement. Even calling Google's metadata a "catalog" seems to connote a level of completion and immutability that Google might not assert. An electronic "card catalog" can change every day -- a good thing if the changes are strict improvements such as error fixes -- in a way that a traditional card catalog wouldn't."

I think one of the biggest reasons for the backlash against Google Books among librarians stems from overlooking this. They feel they have turned over a large share of their best materials to be digitized, only to have it done sloppily in scanning or metadata, with no one knowing what the final shape of Google Books will be once the settlement is (or isn't) finalized. I think it's a good point, but I'm also skeptical about the plausibility of fixing all these errors. Is Google planning to rescan everything that's blurry, or every page with a visible scanning hand? The "beta" label is a good explanation for some problems, but isn't Google digging itself a rather deep hole by doing so much so quickly and imprecisely?
Either way, Blakely's article serves as a great example of the flexibility that digitization allows. We've read a great deal this semester about the complicated nature of digital preservation, but in cases like this, digitized documents are certainly preferable to moldy ones.
Thanks for a great blog post! Glad you found good use of my article. :-)
~Rebecca Blakeley