A screenshot from a YouTube video by ASLMeredith of an American Sign Language instructional video.

COVID-19 Web Collecting: Reflections at One Year

By Christie Moffatt ~

A screen capture of the CDC website on COVID-19 from January 2020.
U.S. Centers for Disease Control and Prevention captured January 30, 2020. Archived pages available at: https://wayback.archive-it.org/4887/*/https://www.cdc.gov/coronavirus/2019-nCoV/

One year ago, on January 30, 2020, the World Health Organization declared the current coronavirus disease (COVID-19) pandemic a Public Health Emergency of International Concern. The disease was yet to be officially named and there were 5 positive cases in the United States. With this designation, following National Library of Medicine Collection Development Guidelines, the NLM Web Collecting and Archiving Working Group began collecting web and social media content to document the emerging global pandemic. While we had no idea how long the pandemic would last, nor the extent of its impact, we began this effort with a belief that web content would be a significant part of the primary historical record for future research and understanding of this historic time. As the nature of web content is ephemeral, future access to the web and social media documenting the pandemic (its impact, our reactions and response, and how it evolved over time) depends on actively identifying and collecting material in real time, before it changes or disappears entirely.

Screen capture from the WHO Twitter account naming the new disease.
World Health Organization Tweet announcing the official name of the disease caused by the novel coronavirus, February 11, 2020

Over the course of the past year, Working Group members met weekly (virtually) to discuss the evolving story of the pandemic and web sources to include in the collection. We began collecting the COVID-19 sites of the U.S. Centers for Disease Control and Prevention, the National Institutes of Health (of which NLM is a part), and the World Health Organization, and expanded our scope to related content from a wide range of public health, medical, and scientific government and non-government organizations sharing information on the web about the virus and the disease for health professionals, individuals, and their communities. We also selected news articles documenting the story of the virus spreading to and across the United States and the significant challenges we continue to face. Our scope is broad: the many social dimensions, efforts to control and prevent the spread, health disparities, the impact on vulnerable populations, tracking the disease, vaccine development, health worker and patient experiences, misinformation, and more. We also aim to select content documenting the many stories of innovation, creativity, and hope in the face of this crisis.

A screenshot from YouTube of an American Sign Language instructional video.
“How to sign QUARANTINE, SELF-ISOLATE, and STAY HOME in American Sign Language for Beginners” by ASLMeredith, April 26, 2020. Archived in July 2020

With input and recommendations from across NLM, from colleagues in the web archiving, museum, and library communities, including the Office of NIH History, and from the general public, we’ve archived nearly 6,000 resources (over 1 terabyte of data) so far, including websites, blogs, news articles, and video recordings. Recommendations reflect personal and professional interests from diverse perspectives because the impacts of COVID-19 are intensely personal as well as collective. As we all have our own view of what is important to preserve, broadening opportunities for input has enriched the collection in many ways. Among the thousands of items in the collection are sign language videos for deaf communities on how to prevent the spread of COVID-19, blog posts documenting the experiences of medical trainees, and remembrances of lives lost.

Homepage offering essays, poetry, clinical vignettes, and writings from medical students and professionals around COVID-19.
Pandemic Perspectives: The COVID-19 Journal for Medical Trainees. Archived November 04, 2020

Identifying and reviewing content for inclusion in the collection is an important step, but much work follows to prepare the collection for future researchers. Working Group members have been actively crawling content using the Internet Archive’s Archive-It service and Conifer, reviewing for quality, notifying content owners when needed according to NLM policies, and adding descriptions to support discovery. This work is ongoing as we continue to identify content related to vaccine distribution, new and evolving policies regarding prevention and control, and the long-term impacts of the disease and the incredible toll it is taking on our society. As we approach January 30, 2021, we see over 25 million confirmed cases in the United States, and many more around the world, and mourn the loss of over 400,000 lives. We hope that this web collecting effort around COVID-19, along with the work of the International Internet Preservation Consortium and many others, serves to document this historic pandemic and support research on a broad range of important issues and questions that will no doubt be raised for many years to come.

Page from COVID Memorial Quilt project describing its origins as a school project.
Covid Memorial Quilt. Archived January 2021

We invite you to read more about NLM’s COVID-19 web archive collection, part of a larger Global Health Events web archive collection, in the recently published Journal of the Medical Library Association article “The National Library of Medicine Global Health Events web archive, coronavirus disease (COVID-19) pandemic collecting,” and the broader context of documenting the pandemic published in Nature on December 17 “What are COVID archivists keeping for tomorrow’s historians?”  We have also written a number of posts about NLM web archiving on the Ebola Outbreak, Zika Virus, and the Opioid Epidemic, available here on Circulating Now.  NLM continues to welcome suggestions for content to include in the collection at nlmwebcollecting@nlm.nih.gov.

To learn more about NLM’s broader role in the response to COVID-19 please visit https://www.nlm.nih.gov/

Get the latest public health information from CDC: https://www.coronavirus.gov
Get the latest research information from NIH: https://covid19.nih.gov
Learn more about COVID-19 and you from HHS: https://combatcovid.hhs.gov

An informal portrait of Christie Moffatt.
Christie Moffatt is Manager of the Digital Manuscripts Program in the History of Medicine Division at the National Library of Medicine and Chair of NLM’s Web Collecting and Archiving Working Group.

The NLM Web Collecting and Archiving Working Group includes NLM Associate Fellow Brianna Chatmon, Delia Golden, Marielle Gage (History Associates Incorporated), Christie Moffatt, Katie Platt (HAI), John Rees, Susan Speaker, Caitlin Sullivan, Erica Williams (HAI), and Kristina Womack.

9 comments

  1. This is an important effort….is there a particular email if we have suggestions about content?
