Wednesday, October 28, 2009

Resource review #2: Open Content Alliance and Google Books

Leetaru, K. (2008, October 6). Mass book digitization: the deeper story of Google Books and the Open Content Alliance. First Monday, 13(10).

In this article, Kalev Leetaru offers a nuanced perspective on the similarities and differences between Google Books and the Open Content Alliance (OCA). He focuses primarily on the technical aspects of their work, their willingness to reveal information about their technical processes, their approaches to copyright and user access, and their use of metadata. Although OCA formed as a reaction to the commercial and secretive nature of Google Books, Leetaru points out that it has not quite delivered on its promise of transparency. While Google has released technical reports about its innovations and revealed information about its processing in speeches, very little is known about OCA's technical process. Based on information gathered about both organizations, Leetaru suggests that the two projects are conducted using similar methods. However, OCA spends more time on quality control, while Google Books focuses on increasing output and efficiency. Additionally, Google's PDFs are bitonal, which makes them easier to view (even with limited bandwidth) than the full-color scans provided by OCA.

Another difference is found in the search options -- Google offers full-text searching, while OCA allows searching only in title and description fields. The two organizations also differ in their approaches to copyright. Google scans copyrighted material, but only allows users to view limited portions in search results. OCA focuses on scanning out-of-copyright materials, but scans in-copyright materials if given permission by the publisher. Possibly the most striking difference described by Leetaru is the approach to restrictions on use of the materials. Public domain materials on Google Books can be downloaded in full. Members of OCA can set their own restrictions on use of the materials they contribute to the project, which means that restrictions vary from item to item. This can apparently get pretty complicated. Google provides metadata explaining the rights policy of each item; OCA does not.

This article provides useful information comparing Google Books to a similar mass digitization project. It's interesting to evaluate OCA's attempt to provide an alternative approach to digitization. Leetaru offers a convincing argument that OCA hasn't been very successful in meeting its stated goals of transparency and open access. The article also includes thorough descriptions of the digitization process. Leetaru makes a point of differentiating between the goals of preservation digitization and access digitization. The latter is focused primarily on providing user access to materials, rather than gathering and preserving that material. He argues that both OCA and Google Books are attempts at access digitization, which negates much of the criticism directed at Google's quality control standards. If large-scale access is the goal, Leetaru suggests that some level of attention to detail will be lost in order to provide access to more materials. This is an interesting perspective that I hadn't come across before.
