Nancy Bell is head of research at the National Archives in the United Kingdom. In this role she is charged with developing and implementing a program of research, including conservation research. The conservation science agenda at the National Archives is focused on developing and translating predictive modeling techniques in order to address large-scale challenges in collections. Nancy is particularly interested in fostering better understanding of the role science can play in interpreting material culture.
Jana Kolar is head of the Institute for Cultural Heritage at the National and University Library, Slovenia. She is coordinator of the European Commission co-funded project PaperTreat, which aims to establish the effectiveness of various paper deacidification methods and cool storage. She also serves as coordinator of the ICOM-CC Working Group on Graphic Documents, as editor of e-Preservation Science and as a member of the editorial board of the journal Restaurator.
Dianne van der Reyden is the director for preservation at the U.S. Library of Congress, responsible for conservation, binding, and mass deacidification programs, as well as environment and storage; a reformatting program for digitizing newspapers; a research program studying longevity of traditional, audiovisual, and digital materials; and new technologies to digitally reformat damaged audio recordings. An author and speaker, she was an early lecturer for the "School for Scanning" lecture series organized by the Northeast Document Conservation Center.
They spoke with Jeffrey Levin, editor of Conservation, The GCI Newsletter, and James Druzik, a senior scientist with the GCI.
Jeffrey Levin: Each of you is very involved in large collections of material that require a variety of ways to care for them. How well are libraries and archives coping with the challenge of dealing with large amounts of paper material that have become extremely fragile, such as brittle books?
Dianne van der Reyden: An assessment on that topic was recently done in the United States—the Heritage Health Index—which looked at a cross section of institutions to see what the needs were. The British Library has sent out a questionnaire that seeks information from national libraries around the world on how they are dealing with some of these issues. One of the first steps, of course, is to get an assessment of the scope of the problem, and that's being done by a lot of places. There are also movements to share resources and to get help with funding mass treatments. Of course, the most cost-beneficial mass treatment you can do is appropriate storage. Whether we deacidify books or we digitize them, we still have to store things. And people forget that physical storage of electronic media is also important. The other important point here is the distinction between material that is born digital and material that has been converted to digital. Conversion is simply adding to the needs.
Levin: Once something has been converted to digital media, from that point onward, doesn't it have the same issues as born-digital materials?
Van der Reyden: Right. But for every dollar that is spent converting something, you've added to the challenge.
Nancy Bell: The National Archives, to some extent, is taking a risk-based approach toward paper material. We've undertaken a three-year risk assessment following Robert Waller's protocols. The brittle paper problem is not as acute here in the National Archives as it is in some libraries.
Jana Kolar: The National and University Library of Slovenia is a relatively small national library compared to some other national libraries, so it's somewhat easier for us to handle all of these materials than it is for bigger institutions. We've done a condition survey of our book collection, and the survey showed that about a third of the material is in poor condition. Per decade, roughly 5 percent of the books in our collection, printed on acidic paper, are expected to reach this condition over the next eighty years. This gives us the approximate time we have to solve this issue, either by transferring information to other media or by prolonging the life span of the originals. We are trying to use all of the approaches. We would like to deacidify books, which we think is a valid option, but we do not have all the information we would like—such as the cost-effectiveness of a variety of available processes—so we proposed that the European Commission co-fund a research project, and we've been working on the project for the last two years with eleven partners. The results should give us the cost-effectiveness of various mass deacidification processes, but they will also tell us the effect of storing a variety of paper-based materials at lower temperatures. So it will be easier for us to plan new storage areas, as well as to decide which materials to deacidify. We will be building a new library, presumably next year, where cool storage is planned for the majority of books.
Levin: Are you suggesting that with improvements in storage, some materials slated for deacidification would be less likely to need that immediately?
Kolar: I think at the end of our research project—which runs until August 2008—we will decide on a combined approach for our library. We ought to keep as many books at as low a temperature as possible, and at the same time, we should deacidify those books that still have some mechanical strength and those books which by law we are bound to preserve. We could do that by deacidifying about six thousand books per year for the next twenty-five years.
Van der Reyden: We have a thirty-year plan funded by Congress annually, and as part of that plan, we deacidify over a million manuscript sheets a year. In addition, we're doing at least a quarter million books a year. Combined, we plan by the end of the thirty-year period to have done about forty million items.
Levin: Has your planning been affected by consideration of digitization for some of these materials?
Van der Reyden: Thus far it has not been either-or but, rather, what's best for the problem. There is a program at the library, which I'm not involved with, called the National Digital Information Infrastructure and Preservation Program, which is looking at best practices for digitization. We do some digitization as part of the Preservation Directorate, which includes a preservation reformatting division that does large volumes of microfilming of newspapers. We also have the National Digital Newspaper Program that is digitizing newspaper content not from the original newspapers but from microfilm. The newspapers themselves are kept and boxed. This particular project is a model because we have the originals, we have the "preservation copy" with microfilm, and we have the access version, which is digital. You can do searches with it that you could never do with the microfilm, so it's much more functional and much more accessible. But the microfilm is the preservation copy.
Levin: The implication of what you've said is that you don't consider digitization as preservation.
Van der Reyden: Digitization in libraries has more to do with information management and sustainability of information than with the materials science of substrates and media. When we in the Preservation Directorate talk about preservation of digital assets, we're talking about tapes, CDs, and DVDs. Eventually we'll look at flash drives and other sorts of storage media and substrates. When libraries talk about digital preservation, they are often referring to metadata and migration. One way to think about the distinction is that in the Preservation Directorate, we're dealing with the hardware aspects of things—the actual matter, not the development of software solutions.
Bell: I can agree with Dianne. We're engaged in massive digitization projects, but we do not see it as a preservation tool in and of itself. Because we are the archive of the central government, we have a statutory obligation to keep the original as well. One of the massive problems that we're going to face in terms of digitization and the preservation of born-digital material is just the sheer volume of it all. For example, just one of the many digitization projects we have going is the 1911 census, which is digitizing twenty-six thousand volumes of census material to be released in 2011.
Kolar: I agree with Dianne and Nancy with respect to the role of digitization in preservation. Digitization is, at the moment, still quite expensive, although the costs are decreasing rapidly. We are dealing with masses of materials, and that will probably be solved in the future with machines that will automatically scan books and create digital data from books when, for example, the customers return them into storage. Currently for our library, there is additional cost in maintaining this digital information because we still need to preserve the originals. We are bound by law to keep a copy of books written in our language, written by Slovenian authors, or published by Slovenian publishing companies in our national library as a legal deposit.
Bell: Certainly our digitization program aims to improve access. It's the National Archives' aim to have our online services as good as our on-site services. That means digitizing the most popular classes of material and also having the support online to help people unlock those collections. So, yes, digitization is not an end in itself. It's another tool.
Levin: What you all are saying is that digitization is really for access and that it has not cut costs associated with preservation.
Van der Reyden: Yes, although I need to add one point. Having digitization viewed as a preservation activity has allowed foundations to target digital projects for funding under a preservation umbrella. I think my colleagues would agree that whenever a digital surrogate eliminates the need to access the original object, you have helped preserve the object.
Bell: I couldn't agree more. It's preservation from that perspective.
Van der Reyden: The interesting thing about that is that studies have shown that once people know about the materials, then they want to access the originals.
Kolar: It's like the Mona Lisa in the Louvre. Most people have seen a digital copy. And they all want to see the original.
Van der Reyden: Right. And this might be a good time to distinguish between the digital surrogate and the original. Sometimes people refer to the original substrate and media as the containers of content—content being the intellectual information that you can get from it. This is not completely accurate because there's much more to the original than meets the eye. No matter how well you digitize or scan, you're not going to capture the chemistry of the object. I once had a long conversation with a webmaster about this, and he said, "Our resolution is so good we can see the paper fibers." But you can't see the chemistry and the fingerprints of every author who wrote on the original. There's no way you're going to be able to digitize that. But one day, science will be able to access those fingerprints from the original object. We're not talking about the symbolic importance of the original. There's evidential value to the original, and it's not just the legal evidential value.
Bell: Artifactual value is the term I use. The original exists as an artifact, and that artifact conveys value that is important to a range of people and communities. And they all interpret the artifact in different ways. Too often people say it's only information that's required, but we all interpret in different ways—whether it's a file in an archive or a book or a photograph. That original conveys all sorts of meanings to all sorts of people at different times.
Levin: So something critical is lost by removing access to the original object.
Van der Reyden: Yes. The entire authenticity of the piece is lost.
Bell: I'm very interested in virtual collections and how they are interpreted. The three of us are of a generation that's used to looking at real objects. But we're working with a new generation of researchers who sometimes only know about an artifact through a virtual image. They're going to have a completely different interpretation of it. The context in which it is set will be lost. Color rendition isn't always the same. All sorts of artifactual values will be different. For the next generation, there will be a very different interpretation of material culture.
Van der Reyden: There is debris on these objects, evidence of what they've been through—pollen or smoke from fires or gunpowder from battles. Digitization simply cannot capture those three-dimensional chemical components of the materials, which have to be extracted through examination.
Bell: I'm concerned that context can be lost. The relationship of one item to the next—say, how a collection of prints and drawings is put together or how an archive is put together. If you look at a single image in isolation, you potentially lose how that single item relates to that which came before it or after it.
Van der Reyden: That associational value is very important. And done the right way, digitization can foster that. There have been some programs at the library that have tried hard to do things based on themes and topics. But every researcher will have a different context. You can't predict which associational value a particular researcher will want to extract out of a collection.
Levin: One question we haven't addressed is the problem of data preservation and migration, which is posed by this large digitization of material.
Van der Reyden: I was at a conference once where they asked what would it take for us to have confidence in digital preservation, and I thought to myself that what it would take is that the process be self-replicating, self-sustaining, and self-correcting. It's a medium like DNA—and there is DNA computing that's being done. Whether or not it will ever serve as a substrate remains to be seen. When I ask people about that, some say that it's twenty years in the future—as though that's so far off. The point is that we haven't reached the end in terms of types of substrates and systems. The digital era is just one period of time, and with biotechnology and development of wetware and things like that, there are all sorts of potential solutions out there. So I'm not pessimistic about storage mediums or substrates for digital-type information. We're going through a period that has certain problems, but it will be a finite period. I'm not sure that there's motivation at the moment to look at new technologies that go outside the realm of the ones they're using now and to change the whole paradigm.
Bell: A phenomenal amount of work has been done on digital mediums and migration and understanding systems. We've gone far in twenty years. But now we're facing new technologies. For example, government decisions are being sent by email. Decisions are being made through Facebook and text messaging. How do we capture another area of born-digital material, which is the history of the future? It's not just a question about preserving it—how do we get it in the first place? What are we going to keep? What is the record? It's hugely complex, and there will be black holes in our history if we don't face these things.
Van der Reyden: I agree. The born-digital issue is just what the research and resources should be going into. These are all machine-dependent technologies. When you start looking at the things that are being created on the Internet and how all these things interoperate, you're faced with huge complications.
Bell: We're all looking at questions of the preservation of Web sites, because increasingly they are the documentary evidence of our history. They have aspects to them that we want to preserve, but this presents huge storage requirements and is therefore resource draining in a major way.
Kolar: And they continue to change.
Van der Reyden: Exactly—what you capture today is not what's happening tomorrow.
Kolar: The Netherlands National Archives made a calculation on the cost per year of storing a terabyte of information, and their estimate is that it costs about ten thousand euros per year. Now, the number of images in a terabyte depends, of course, on the format and the quality of the image. But it's quite expensive.
Bell: I can add one staggering statistic. They're using 24-bit TIFF files in digitizing this 1911 census of twenty-six thousand volumes. The census currently takes up two kilometers of paper. The DVDs will also occupy two kilometers of shelving and require an estimated 512 terabytes of storage. And that's one tiny collection. I know there's talk about using digital video storage and retrieval. These things are very much driven by competition, so maybe future competition will create more alternatives.
Van der Reyden: That's a really good point. Right now the competition isn't looking at a stable substrate because manufacturers are more focused on new innovations in functionality. One thing we need to make clear—we all love the aspect of accessibility provided by digitization; it gives you a lot of information very quickly and very conveniently, and there's never been anything like that in history. It's wonderful. But we're concerned because it's also so ephemeral.
Bell: And then there's this whole question of volume. In the National Libraries we're taking in books at a great rate, but we're also taking in data that's going to be very much a part of the collection. For example, we have police collecting all sorts of data from cameras that need to be used for criminal cases. These cameras are on the road, they're used for interrogation, and they even have them on dogs that go out sniffing. In the end, it's a selection process—and we really haven't tackled the selection element of it all.
Levin: Hasn't selection always been an issue for custodians of libraries and archives?
Van der Reyden: It always has been. It hasn't always been dealt with well, and the problems are even more massive now.
Levin: So how are those decisions being made?
Van der Reyden: What we recommend—and I'm not saying that people follow that recommendation—is to look at value, use, and risk. The value can be associational value, monetary value, research value, evidential value—there's a whole range of values one needs to think about. Examples of the risk could be video materials that are deteriorating or CDs being stolen. And use encompasses use by researchers, use for exhibition, use for digital conversion. We like people to consider at least those three things when making preservation selections.
James Druzik: If the selection criteria are as simple as value, use, and risk, what stops retention policy from saving everything forever?
Bell: Well, I don't think it's that simple. It's enormously complicated.
Druzik: Those three simple words allow everything in the universe to be either valued by someone, used for some theoretical or practical purpose, and be at risk because—as we know in preventive conservation—there are more risks than we have solutions for.
Van der Reyden: But each of those three factors can have hierarchical rankings that are dependent on the institution or the curator or the librarian.
Kolar: And then, of course, you have to get all of them to agree on the same priorities.
Van der Reyden: Yes. Wouldn't life be a lot more efficient if that would happen? That's exactly the problem.
Druzik: And that's my point. The problem with agreeing is that you would like everyone to agree on what the values, uses, and risks are, but then you end up with an unresolvable problem.
Van der Reyden: Agreement isn't what's necessary—it's understanding the consequences. As we all know, there's not a perfect solution here. There's probably not even a good solution. But there can be a solution, to the best of our abilities, if we understand the consequences.
Kolar: And there's also a good chance that because we're looking at the present value, use, and risk of a certain material, we're missing something. Because the value of something may turn up later—for example, in twenty years—and we've not preserved it because no one thought it was valuable.
Levin: How often are those criteria effectively employed in making decisions about what you're going to preserve and what you're going to let go?
Van der Reyden: In the Preservation Directorate at the library, it is how we decide what work we do. We have to work with the curators to determine their collections' values, uses, and risks, and then we have to look at all the things that fall out from that in terms of priorities. We have to make decisions, and we have to be able to justify them for the work plans of the preservation staff.
Bell: We're slightly different in that we have very strict appraisal and selection processes and guidelines in place. There are representatives from the National Archives working in government departments and selecting material following standard archival selection protocols. By the time their material gets into the archive itself, it's there for permanent retention. Then we have to balance the preservation equation within that framework.
Levin: In terms of what you're receiving, have you seen a dramatic shift in the balance between materials that are on the old-fashioned medium of paper and materials that are digitally born? Can you quantify it roughly in percentages?
Bell: It's hard to say off the top of my head. The emphasis of the organization is very much toward digital preservation at the moment, because we have such a long history of dealing with paper-based or analog records. As I said, we are facing such acute problems with electronic communications—that's where the resources and thinking are focused; it's about what we are going to select, capture, and preserve. Running parallel to that is preservation of paper-based collections. But at the moment, the balance has shifted toward digital, and it's always an issue keeping the problems of paper-based collections in the forefront of people's minds. We still do have problems with some modern materials, particularly photographs and the vulnerabilities of some photographic processes. We're getting more things generated in the last thirty years—for example, records with plastic components or architectural drawings on plastics.
Kolar: Preservation of library and archival materials is becoming so complex with all the new media, and there's no simple answer. We have one approach to paper-based collections, and we have another approach to preservation of digital information—what to collect, how to preserve, and so on. Regarding the paper-based collection, we would like a preservation program that allows us to preserve as much as we can with the given funds. And that's why we still do materials research. There are some key questions that have not been answered yet—like rate of degradation at room temperature of a variety of materials—and we absolutely need that data in order to prepare effective preservation programs that involve those materials. When it comes to books produced from the 1990s onward, we're not dedicating any particular attention to those because most of the paper is alkaline. The technology has changed, and those books are stable.
Van der Reyden: Can I say something here? First, the amount of print material is increasing, not decreasing. While the number of digital materials is increasing—I think it's about 1 percent right now at the Library of Congress—the number of print materials is not being replaced or decreased by those media. A new problem that is looming with paper-based materials is recycled paper. While we have papers that are more alkaline, they may have a larger percentage of shorter fibers. This could be a problem in the future because of green technology. There's a move to use greater and greater amounts of weaker fibers, which is good for ecology, but it might not be good for document history.
Bell: One thing this dialogue hasn't discussed is the debate about the validity of international standards for the storage of paper-based collections in view of the impact of climate change. We all know things can be saved longer in colder temperatures—we're there on that one—but which is the greater risk to collections worldwide: global warming and the impact of that and the potential for 100 percent loss of archival collections, or perhaps taking a more flexible approach, based on a sound understanding of materials science, and reducing carbon emissions? We're going to need to address this at some point.
Kolar: This is a very important issue—and not just for library and archival materials. It is for museums and historical houses as well. In our case, as I mentioned, we are going to start building a new library, and storage is planned underground, where we estimate that the naturally ambient temperature will be about fifteen degrees centigrade. Because there will be an automatic delivery system, we can afford not to heat the naturally cool storage underground. But we will need to dry the air. So we do hope to really decrease energy consumption.
Van der Reyden: We also have an underground storage facility, and many other agencies are moving in that direction. You can create a good, fairly controlled passive environment without a whole lot of mechanical overhead, and that will help you preserve things that are already on a somewhat stable substrate. The problem with digital materials is that they're already on an inherently unstable substrate, and they require huge mechanical intervention both to preserve them and to access the data on them. So that's an interesting cost differential to think about in the future as well.
Levin: We talked a bit about things that might offer promise for dealing with some of these issues in the future. Are there any other things that have the potential for confronting some of these problems?
Van der Reyden: I think dialogues like this help a lot.
Bell: The first word that comes to my mind is global. One of the really positive things that has happened in this profession is that none of us thinks local anymore. Whether we're in London or Washington or Los Angeles or Slovenia, we have very similar problems. There might be different solutions, but we're all grappling with the same problems, so it's a very positive step having dialogues in this way.
Kolar: I think there are many more international projects now than there were in the past because, as Nancy said, we're dealing with similar problems. It's absolutely the way to go.
Van der Reyden: Yes. Because these problems are going to take big groups to solve them.