Future Historical Collections: Archiving the 2014 Ebola Outbreak

Christie Moffatt spoke today at the National Library of Medicine on “Future Historical Collections: Archiving the 2014 Ebola Outbreak.” Ms. Moffatt is an Archivist & Manager of the Digital Manuscripts Program in the History of Medicine Division of the National Library of Medicine. Circulating Now interviewed her about her work.

Circulating Now: Tell us a little about yourself. Where are you from? What do you do? What is your typical workday like?

Christie in the NLM HMD reading room.

Christie Moffatt: I was born in Kitchener-Waterloo, Ontario, where my father was attending university and my mother worked as a nurse. My father’s work brought the family to Atlanta, Georgia, and I grew up there and attended the University of Georgia, studying history and Spanish. A few years later I earned my Master’s Degree in Library Science at the University of North Carolina at Chapel Hill, with a concentration in archives and manuscripts. I worked there as a graduate assistant on an early digitization/text-encoding project called Documenting the American South. When I graduated from UNC I moved to Washington, DC and was fortunate to obtain a position as an archivist at NLM in the History of Medicine Division’s Digital Manuscripts Program. I started and continue to work on NLM’s Profiles in Science, an online archive featuring the manuscript collections of 20th-century leaders in science, medicine, and public health. I very much see web archiving as a natural extension of this work, and feel fortunate to be a part of preserving today’s history for the future.

My typical workday involves balancing work on the correspondence, photographs, and laboratory notebooks of health workers and scientists of the last hundred years with work on the web content created by and about health workers and scientists and their work today. Both types of collections serve to document stories of learning, collaboration, discovery, and challenge, but the workflows and strategies to collect and preserve them are quite different and continue to evolve. The preservation of web and social media is indeed a major concern of the digital stewardship community; the National Digital Stewardship Alliance (NDSA) identified such content as one of four specific types of content that pose “urgent challenges to stewardship” in its 2015 National Agenda.

CN: In your talk “Future Historical Collections: Archiving the 2014 Ebola Outbreak” you explained how NLM took the initiative to capture and preserve selected born-digital web content documenting the 2014 Ebola outbreak. What motivated this initiative?

Archived CDC homepage running four features on Ebola and one on Enterovirus D68.

CDC homepage on October 2, 2014
NLM Web Archive

CM: In October 2014, as the volume of news and information on the Web about the Ebola outbreak grew (we saw this especially as the crisis hit home here in the United States), the need to gather and preserve this information for future researchers became increasingly clear. It’s hard to imagine how one might fully explore and understand reactions and responses to this crisis without access to the web and social media. Web and social media were both how news was communicated and the means by which our reactions and experiences were shared. Unlike printed news publications, paper correspondence, and diaries of the past (some of which are lost to history too!), it was unlikely that this type of material would ever make its way into the Library for future researchers. Workflows for gathering and preserving these kinds of documents were, and continue to be, in development. So we began to collect content that October, even though the scope of collecting was not yet fully understood: Did we want to document Ebola in the United States or the epidemic more broadly? The epidemic and its aftermath? Through the rebuilding of a healthcare infrastructure in West Africa? What about the development of the vaccine? What would researchers of this archive want to see? We knew the story was big, but we didn’t know how big it would get, or for how long it would go on. Deciding when to begin collecting around future events was also a consideration.

While we followed the news to a great extent, especially in the beginning, we ultimately decided that we needed more than the presence and frequency of Ebola in the news cycle to determine our boundaries around collecting. We are in fact still collecting, and the situation with Ebola continues to evolve. We’ve since decided that the World Health Organization’s declaration of a Public Health Emergency of International Concern (PHEIC) would be a trigger for collecting. For Ebola, a PHEIC had been declared on August 8, 2014. Identifying triggers such as this enabled us to act more quickly around the Zika virus when it, too, was declared a PHEIC, on February 1, 2016. We were ready to take action that very day.

CN: The Library’s historical collections cover a wide range of physical media: postcards, films, scrapbooks, newspapers, diaries—what is unique about born-digital content?

CM: There are features and functionality essential to the understanding and use of born-digital materials that require the content to be maintained and preserved in its original form (or close to it). And as with all digital content (born-digital and digitized), the risk of losing this content rises as the distance grows between the time an object was created and the time of donation to a library or archive: web and social media content disappears, content is intentionally or unintentionally deleted, storage media deteriorate, and the software needed to support access to earlier formats is no longer available.

What I see as the real challenge for libraries and archives, beyond the important task of managing a range of formats and files over time, is the very basic identification and acquisition of the content to begin with. For the NLM manuscript collections we’ve worked with for Profiles in Science, we have collected content, for the most part, at the end of scientists’ careers, long after they established themselves in their fields. But what about the scientists and health workers whom we would like to learn more about in the future and who are just now starting their careers? How is their work being documented? How are they storing their data and engaging in conversations about their work? What will be left of their work when they retire decades from now? While I’m sure that paper is still being used to some extent, there is also evidence of much scientific discourse happening online. Access to this content in the future will depend on actions we take now, through web archiving efforts, through engagement with the creators of content to preserve their content, and through larger collaborative efforts aimed at building national strategies for digital preservation, such as the National Digital Stewardship Alliance.

CN: How do you select web content to archive? Is choosing this material similar to selecting other types of materials for the collection?

Screenshot of the archived webpage.

Archive of the MSF UK site, October 28, 2014
NLM Web Archive

CM: Once we decided to collect around the Ebola outbreak, we were able to get up to speed fairly quickly by leveraging content identified by NLM’s Disaster Information Management Research Center (DIMRC) for their Information Resource page on the Ebola outbreak, which they continued to update throughout the epidemic. We knew early on, however, that we wanted to go beyond this list and include even more perspectives on the outbreak in the web archive collection, so we identified blogs and individual blog posts of health workers, photojournalists, public health policy researchers, scientists, and others who were communicating their experiences in the field, or sharing their work and other reflections on Ebola from afar. We also wanted to capture some of the news documenting the epidemic. For this additional content we followed the news, set up relevant Google Alerts, and welcomed recommendations from other archivists, historians, and librarians in the Library. We also added content referenced in online articles, blogs, and tweets, often discovered in the quality review process.

CN: Web archiving is a relatively recent endeavor. What areas of growth do you see coming in the near future? What do you find most exciting?

CM: I find the development of case studies and tools for research and use of web archives really exciting, and hope that the more we can see examples of how researchers are able to use web resources to understand the past, the more these resources will be recognized and preserved as valuable parts of our cultural heritage.

Christie Moffatt’s presentation was part of our ongoing history of medicine lecture series, which promotes awareness and use of NLM and other historical collections for research, education, and public service in biomedicine, the social sciences, and the humanities. All lectures are live-streamed globally, and subsequently archived, by NIH VideoCasting. Stay informed about the lecture series on Twitter at #NLMHistTalk.