Monday, April 14, 2008

Preservation Microfilming versus Digital

One of the popular methods of preserving brittle books, that is, books in a state of deterioration, is to transfer the content to microfilm. However, with the onset of digital preservation this method has fallen out of favor with the National Endowment for the Humanities (NEH). NEH has been the driving force behind a 20-year program of transferring three million brittle books to microfilm. However, with significant budget cuts the NEH has pulled funding from this project, known as the Brittle Books Program, reducing it from $12 million in 1989 to less than $3 million in 2001 (Marcum, 2002). Research into the root causes of deterioration has yielded methods of delaying that deterioration; many libraries have adopted these methods, lessening librarians' concern over preservation of the materials. This reduction in concern, coupled with the reduction in funding, means the goal of the Brittle Books Program may not be reached by 2009. It is human nature to procrastinate when there is a long road ahead. However, we are hiding our heads in the sand if we assume de-acidification and improved storage techniques will fix the problem. Books are fragile and their content must be preserved.

Now on to an interesting twist in Marcum's article (Marcum, 2002). As just stated, books are fragile. However, if you look at the dependency of digital media on specific technology, you find digital media may be even more fragile than printed sources. Marcum argues that microfilming is still the best method of preserving information because microfilm readers are simple, easily manufactured, and currently in use. Digital media, by contrast, may depend on specific software or hardware that is already in line for obsolescence. Although this issue has been analyzed as part of the preparation and implementation of the NDIIPP program, it is still an interesting argument.

Marcum then switches gears by saying that even though microfilming should be the preferred method of preservation, access to the preserved material is critical, and digital forms are the best in that category. This can be a little confusing until you get to the heart of the article, which seems to be a call for duplicate preservation: microfilming for long-term preservation and digital for access. In addition, there are programs such as the Digital Library of Freedom that are working to provide a centralized location for the digital materials. This would prevent several libraries from all preserving the same materials.

I can see the benefits of preserving material in multiple forms - something like saving your research paper on a thumb drive and printing out a hard copy. However, in times when funding for these types of projects has dried up significantly, I do not see this as a logical approach. The most cost-effective, efficient, and durable method must be decided upon, along with standards for preservation, so materials are preserved in a consistent manner. Libraries must continue to prolong the life of books through improved handling and storage. Microfilming of volumes specific to the individual library would also be prudent. But for the multitudes of volumes that are common to many libraries, digitization should be tackled by the programs that have the funding and the project planning to successfully accomplish the task.

Reference

Marcum, D., & Kenney, A. (2002, March 8). The preservation of our brittle books must also preserve access. The Chronicle of Higher Education, 48(26), B20.

Sunday, April 13, 2008

Models for Information Preservation

The widely accepted model for digital preservation is the OAIS Reference Model, shown below (Hitchcock, 2007). This model provides general guidance for archiving digital information.







A detailed explanation of the reference model may be found in the CCSDS publication (CCSDS, 2002). However, in keeping with my most recent posting, there is also a model that represents my concept of where the act of preservation resides in the library process, be it newly arrived materials or re-evaluation of materials. This model is shown below:





References


Consultative Committee for Space Data Systems. (January 2002). Recommendation for space data system standards: Reference model for an open archival information system (OAIS). CCSDS Blue Book (650.0-B-1). Retrieved March 2, 2008 from http://public.ccsds.org/publications/archive/650x0b1.pdf


Hitchcock, S., Brody, T., Hey, J., & Carr, L. (2007, May/June). Digital service provider models for institutional repositories. D-Lib Magazine, 13(5/6). Retrieved March 2, 2008 from http://www.dlib.org/dlib/may07/hitchcock/05hitchcock.html#OAIS

All Is Not Digital

Information preservation is not limited to digital formats. The main focus today is to identify and preserve the electronic information that is created in enormous quantities minute to minute throughout the world. This focus also includes the preservation of printed materials through transference to digital format. However, there are other methods of preservation that should be enacted every time a print source is handled whether that be in a large corporation, small public library, or someone’s home. This posting focuses on those methods that involve handling, storage, and evaluation for possible conservation.

Handling

Some never think about it, but whatever is on your hands is transferred to the pages of any book, journal, picture, or other print source. In addition, the methods used to rack books and retrieve them from the racks, the use of book drops, and the handling of books when photocopying are all potential sources of damage. There are many articles that list the DON’Ts of handling books. Here are a few tips as stated by Palmer (Palmer, 2004):
▪ Never shelve a book on its fore edge. This also applies to how books are handled when preparing them for reshelving, such as on a book cart.
▪ For books taller than shelf space allows, shelve them spine down. A flag may be used to identify the call number.
▪ Large books should be shelved flat. If there is a significant difference in book sizes, the larger books should be placed at the bottom of the stack to prevent warping of the cover.
▪ Do not pull on the headcap, or top portion of the spine, of the book when removing it from the shelf. This may damage the spine.
▪ Avoid flattening a book when you are making a photocopy. Flattening the book may damage the spine. This method of handling is difficult to avoid since most libraries are not outfitted with copying machines designed specifically to copy books.

Part of the library cataloging and organizing process is to label books, periodicals, and other publications with their call numbers and other means of identification. Some see these actions as damaging, particularly to rare first editions. The author of one article stated that he incited an uproar when he ventured to accuse librarians of book abuse. Many responded by saying that the purpose of a book is to be used and that readers do not care about the condition of the book as long as they can get the information they need (Cox, 2000).

Environmental Conditions

Another source of material deterioration may be traced back to the environmental conditions under which the materials are stored. What level of UV light are they exposed to, and for what lengths of time? What are the temperature and relative humidity of the storage area? Are there fluctuations in either or both? These conditions are often scrutinized when dealing with a rare collection; however, they can have a damaging effect on any book or periodical, rare or not. Fluctuations in humidity may result in the growth of mold spores or, in the case of too little humidity, in the paper drying out and crumbling. High exposure to UV light may also result in deterioration of the paper as well as discoloration of the paper and cover (Basset, 2007).

Evaluation

Many of the root causes of deterioration may be transported into the facility along with the resource, depending on where the materials came from. Therefore, it is important to quarantine the materials until their condition is assessed and, if necessary, the corrective action determined (Basset, 2007). These corrective actions may include a simple cleaning to remove mold spores or dust, repairing damaged covers or torn end sheets, or sending the material to a professional conservator. As preventive maintenance, materials should be evaluated on a regular basis to ensure handling and storage conditions are not negatively affecting them. It is important to train staff in the proper handling methods; however, unless patrons’ use of the materials is monitored, it is impossible to prevent all damage. A regularly scheduled evaluation of materials will lengthen their usable life by repairing damage while it is cost-effective to do so (Leverette, 2005).

Conclusion

The act of preservation should be, as Basset stated, day-to-day work. While the worldwide projects are necessary to ensure the longevity of digital media, they should not be the only focus. Simple actions taken at the individual library level can significantly lengthen the useful life of print sources.



References

Basset, T. (July 2007). Preventive preservation, a day-to-day work. International Preservation News, 41, 9-12.

Cox, S. (March 2000). Do librarians treat books as second-class citizens? American Libraries, 31(3), 54-55.

Leverette, A., McGough, S., & Starmer, M. (Fall 2005). Rare condition: Preservation assessment for rare book collections. RBM, 6(2), 91-106.

Palmer, S. (Spring, 2004). Preservation perspectives: Book handling. Kentucky Libraries, 68(2), 20-21.

Saturday, February 16, 2008

Digital Preservation

Advances in technology have brought about an area of great concern for librarians around the globe. How do we preserve digital information in order to ensure it is around for future generations? Why is this information at risk? What information needs to be preserved? Who needs to be involved in the preservation? How are we going to accomplish this task? In December 2000 the United States Congress, in recognition of the need to address the issue of preserving our digital heritage, approved $100 million for a national strategy initiative, the National Digital Information Infrastructure and Preservation Program. The Congress tasked the Library of Congress with the development and execution of this preservation plan in partnership with other organizations and institutions, both public and private. The Library of Congress’ mission is “to make its resources available and useful to Congress and the American people to sustain and preserve a universal collection of knowledge and creativity for future generations” (“Importance of digital preservation”, n.d.). However, this is not just an issue for North America. Archival institutions throughout the world are working on this same issue. The challenge is to develop rationales, protocols, and methods that may be applied and utilized by institutions now and in the future. In addition, the methods must be robust and yet flexible enough to allow progressive preservation or migration of material to new preservation formats as new technologies are discovered. This posting addresses the four questions stated earlier: why is some material at risk, what material should be preserved, who is involved with this preservation, and how will it be accomplished.

Why is Digital Preservation Needed

Much of the information available on the Internet has been created solely in a digital format. Websites containing information on an upcoming election, maps used for political redistricting or utilities management, surveys on health care, or the war in Iraq are all considered at-risk objects. Studies have shown that “13% of Internet sources cited in three prestigious journals were not retrievable from the original hyperlink 27 months after publication” (Fenton, 2006). Add to this the increasing number of electronic sources that are more conveniently accessed and easily searched than their print counterparts. Because electronic resources are so easy to search and access, many publishers are moving to electronic publication of media and forgoing paper copies. If this information is not successfully preserved by migrating it into multiple storage formats, it may not be around for future generations to use. The same fate may be in store for other types of media such as videotapes, phonograph records, and cassette tapes, to name a few. Because they rely on specific equipment or software for access, once the equipment or software is obsolete the data is lost. In addition to the loss of information, there is a cost involved with not standardizing a method of preserving digital information. A 2003-2004 survey of libraries completed by the Association of Research Libraries revealed that on average 31% of library expenditures were for licensing of electronic resources. However, because of concerns about the authenticity of electronic preservation, many librarians pay not only for the electronic resource but also for subscriptions to print copies, where they are available (Fenton, 2006).

What Needs to Be Preserved

Even though there is agreement that digital information must be preserved, there is not necessarily agreement on what that information should be. Article 7 of the United Nations Educational, Scientific and Cultural Organization (UNESCO) charter on the Preservation of Digital Heritage, adopted in 2003, emphasizes the need for selection criteria on the basis of “significance and lasting cultural, scientific, evidential or other value” (Lusenet, 2007). The term cultural heritage may be found throughout the charter. However, cultural heritage may apply to movies and television shows as well as state and local government statistics. On the other side of the world, in 2004 partners of the National Digital Information Infrastructure and Preservation Program agreed to collect and preserve specific types of information which do not exist in print, namely social science data sets, public television programming, political websites, geospatial data, and the history of the dot-com era of the 1990s. In effect this leads to two approaches: everything goes or selective preservation. The everything-goes approach is referred to as a harvesting method, where everything in the national domain is gathered. The mindset is to save everything and then let future generations determine whether they want to keep it or not. This is criticized as an act of storage rather than preservation. Selective preservation, on the other hand, involves collecting only those pieces of information that are foreseen to be of interest to future generations. This method involves judgment and selection by professionals familiar with the information they are reviewing. There is a third group with a much narrower focus on archiving objects that are digital variations of print documents such as journals. Surprisingly enough, the technical aspect of how to go about preserving large amounts of information has not proven to be the most difficult part of any of the projects. The most difficult part is deciding what to preserve and getting agreement from project partners as well as information owners.

Who is Responsible and How Can They Accomplish It

There are numerous partners in the quest to preserve the world’s digital heritage. The information contained in this paper highlights but a few of those involved and what portion of the overall project they are focusing on.
1) Stanford University Libraries
In 2006 the Library of Congress entered into a three-year agreement with Stanford University funding the CLOCKSS (Controlled Lots of Copies Keep Stuff Safe) digital archiving pilot. This project is intended to provide a secure, long-term archiving solution that is decentralized. CLOCKSS is based on another Stanford program, LOCKSS (Lots of Copies Keep Stuff Safe), described as “open-source software that provides libraries with an easy and inexpensive way to collect, store, preserve, and provide access to their own local copy of authorized content” (NDIIPP, 2006a).
2) Library of Congress (LOC)
One of the several programs under the leadership of the Library of Congress is the National Digital Information Infrastructure and Preservation Program (NDIIPP). The goals of this program, as dictated by the United States Congress, are threefold. The first is to identify a national network of libraries and other organizations with responsibilities for collecting digital materials that will provide access to and maintain those materials. The second is to set forth, in concert with the Copyright Office, the policies, protocols and strategies for the long-term preservation of such materials, including the technological infrastructure required at the Library of Congress. The last is to advance digital preservation methods and determine best practices. Another LOC program is IRENE (Image, Reconstruct, Erase Noise, Etc.), a conservation technology that creates digital audio files by taking high-resolution images of media such as early phonograph records. In the same vein is SAMMA (System for the Automated Migration of Media Assets), a robotic system that creates preservation-quality digital files from cassette-based media. The Library of Congress has also undertaken the Preserving Creative America initiative along with eight partners. This initiative, part of NDIIPP, targets digital preservation of creative media such as movies, sound recordings, digital photography, and video games (LC, 2007). Finally, there is the Web Capture Program, which creates archives of websites on topics such as Supreme Court nominations; Hurricane Katrina; the papal transition following the death of John Paul II (NDIIPP, 2006a); the Iraq war; the elections of 2000, 2002, 2004, and the upcoming 2008; 9/11 remembrance; the 107th Congress; and the Winter Olympic Games of 2002 (Web capture, n.d.).
3) North Carolina State University Libraries in partnership with the North Carolina Center for Geographic Information and Analysis
In February 2003 NC OneMap was unveiled as a combined state, federal, and local initiative focused on providing viewing access to geographical data across North Carolina. In addition, it allows users to search for and download data, view and query metadata, and identify who is in possession of what data (Morris, 2006). Later that year the North Carolina State University Libraries, in partnership with the North Carolina Center for Geographic Information and Analysis and NC OneMap, announced the North Carolina Geospatial Data Archiving Project, which will focus on the collection and preservation of digital geospatial data resources from state and local government agencies in North Carolina. The project objectives are to identify resources using OneMap, gather at-risk data, develop a reliable method for storing the data, enhance the metadata for better identification, and develop a model for data archiving. The initial project plan is to retain the data objects in the format received, and then export the content into a more reliable commercial vector format.
4) University of California at Santa Barbara and Stanford University
One of the eight projects funded for the NDIIPP is the National Geospatial Digital Archive (NGDA), whose goal is to design repository infrastructures at each university and to collect materials across a broad spectrum of geographic formats (Sweetkind-Singer, 2006). Geospatial data is particularly complex when it comes to archiving, due in part to the multiple file format layers that accompany each object. Each layer is needed in order to view the file, but each may be stored in a completely different format than the main file. The NGDA will include prototype archives for data housing as well as a geospatial format registry that will describe the stored data.
5) PREMIS – Preservation Metadata Implementation Strategies
PREMIS is a data dictionary developed by the PREMIS working group as a specification with the goal of creating a set of core preservation metadata elements (Guenther, 2007). Metadata elements are pieces of information about digital objects, such as identifiers, size, relationships with other objects, and creating application information. The intent is to establish a dictionary to be used by archiving institutions in order to adequately and uniformly identify the objects that are being archived. The dictionary is meant not only for ‘after’ creation but for ‘during’ creation as well. If objects are created with metadata that sufficiently describe what they are and how they came to be, preservation of those objects is greatly simplified.
6) JSTOR in partnership with Ithaka, The Andrew W. Mellon Foundation, and The Library of Congress
JSTOR and partners have launched a project, Portico, which is a not-for-profit electronic archiving service established in order to address the scholarly community’s critical and urgent need for a robust, reliable means to preserve electronic scholarly journals (Fenton, 2006). Portico’s mission is to preserve scholarly literature published in electronic form and to ensure that these materials remain accessible to future generations of scholars, researchers, and students (NDIIPP, 2006b).
7) UNESCO – United Nations Educational, Scientific and Cultural Organization
The UNESCO Charter on the Preservation of Digital Heritage, adopted in 2003 as one way of safeguarding documentary heritage, is closely connected to the Memory of the World Programme which aims to preserve and promote cultural heritage through digitization projects, the publication of guidelines, and the Memory of the World Register of over a hundred works of exceptional importance. The charter defines cultural heritage as cultural, educational, scientific, administrative, technical and medical resources created in digital format or converted into digital format from print sources. Resources include text, databases, still and moving images, audio, graphics, software, and web pages (Lusenet, 2007). The charter is important because it affirms the role of archival institutions and extends existing preservation systems to include digital media.
8) The British Library (BL)
The British Library is working with the Internet Archive on using the Heritrix crawler and harvester tool. The focus of the BL is to create a digital archive infrastructure that would allow storage of information harvested from websites in the UK domain (Hawkins, 2007).
9) International Internet Preservation Consortium (IIPC)
The IIPC involves the Library of Congress and the national libraries of Australia, Canada, Denmark, Finland, France, Iceland, Italy, Norway, Sweden, The British Library, and the Internet Archive (USA) (Mission, n.d.). The consortium's goals are to collect Internet content; develop common tools, standards and techniques for archiving; be a strong advocate for initiatives and legislation that support archiving Internet content; and support preservation of Internet content by libraries, archives, museums, and cultural heritage institutions around the world.
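
To make the idea of core preservation metadata from item 5 more concrete, here is a minimal sketch in Python. It is only an illustration: the class name, the field names, and the sample values are hypothetical and are not taken from the actual PREMIS data dictionary. It simply records the kinds of elements mentioned above (an identifier, a size, the creating application, and relationships to other objects) and reports which of them are missing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationRecord:
    """Simplified, hypothetical metadata record; not the real PREMIS schema."""
    identifier: str
    size_bytes: int
    creating_application: str
    related_identifiers: List[str] = field(default_factory=list)

    def missing_elements(self) -> List[str]:
        """Return the names of core elements that are empty or invalid."""
        missing = []
        if not self.identifier:
            missing.append("identifier")
        if self.size_bytes <= 0:
            missing.append("size_bytes")
        if not self.creating_application:
            missing.append("creating_application")
        return missing


# A record created with its metadata filled in at creation time (made-up values).
record = PreservationRecord(
    identifier="map-0001",
    size_bytes=20480,
    creating_application="ExampleGIS 1.0",
    related_identifiers=["map-0001-legend"],
)

# A record that arrived at the archive with no usable metadata.
incomplete = PreservationRecord(identifier="", size_bytes=0, creating_application="")

print(record.missing_elements())
print(incomplete.missing_elements())
```

An archive could run a check like this at intake: the first record passes with nothing missing, while the second would have to be researched and described by hand, which is the extra work that metadata captured during creation is meant to avoid.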

Digital preservation is a complex undertaking, both from the perspective of archiving data created in obsolete formats and from that of preparing newly created data objects for archiving. Because of the complexity and the sheer volume of data that does or will need archiving, it is a process that requires partnerships in order to divide and conquer. There are many hurdles for the digital preservation project: establishing criteria to determine what data objects will be preserved, obtaining copyright agreements that will allow duplication of information, and establishing a strong foundation that future preservation actions will build upon. With project deadlines of 2010 through 2015, the projects will have to clear these hurdles in order to deliver a repeatable, reliable, and secure method for long-term preservation of the world’s digital heritage.


References

Fenton, E. (2006, April). Preserving electronic scholarly journals: Portico. Ariadne 47. Retrieved December 5, 2007 from http://www.ariadne.ac.uk/issue47/fenton/intro.html

Guenther, R. (2007, April). PREMIS what it stands for: Preservation metadata implementation strategies [Electronic version]. Computers in Libraries, 27(4), 19.

Hawkins, D. (2007, May). The incredible digital journey [Electronic version]. Information Today, 24(5), 22-23.

Importance of digital preservation. (n.d.). Library of Congress website. Retrieved December 2, 2007 from http://www.digitalpreservation.gov/importance/

LC announces digital preservation partnerships [Electronic version]. (2007, September). American Libraries, 38(8), 40.

Library of Congress. (2007). Sustainability of digital formats, Planning for the Library of Congress collections. Retrieved December 5, 2007 from http://www.digitalpreservation.gov/formats/sustain/sustain.shtml

Lusenet, Y. (2007, Summer). Tending the garden or harvesting the fields: Digital preservation and the UNESCO charter on the preservation of the digital heritage [Electronic version]. Library Trends 56(1), 164-182.

Mission statement. (2007, August). International Internet Preservation Consortium website. Retrieved December 8, 2007 from http://netpreserve.org/about/index.php

Morris, S. (2006, Fall). Geospatial web services and geoarchiving: New opportunities and challenges in geographic information services [Electronic version]. Library Trends, 55(2), 285-303.

NDIIPP supports CLOCKSS: Library makes preservation award to Stanford [Electronic version]. (2006, July/August). Library of Congress Information Bulletin, 65(7/8), 176.

NDIIPP supports Portico: Nonprofit electronic archiving service receives award [Electronic version]. (2006, January). Library of Congress Information Bulletin, 65(1), 13.

New LC audiovisual center boasts state-of-the-art technology [Electronic version]. (2007, September). American Libraries, 38(8), 40.

Sweetkind-Singer, J., Larsgaard, M., & Erwin, T. (2006, Fall). Digital preservation of geospatial data [Electronic version]. Library Trends, 55(2), 304-314.

Web capture. (n.d.). Library of Congress website. Retrieved December 8, 2007 from http://www.loc.gov/webcapture/index.html