COVID-19 Web Collecting: Reflections at Two Years

By Christie Moffatt ~

Screen capture of the CDC webpage for refugees arriving in the United States.
U.S. Centers for Disease Control and Prevention, captured March 22, 2021 | View Archived Pages

The NLM Web Collecting and Archiving Working Group began collecting web content to document the Coronavirus Disease (COVID-19) pandemic in January 2020, when the World Health Organization declared the outbreak a Public Health Emergency of International Concern. This collection is part of NLM’s broader Global Health Events web archive, which includes web and social media content documenting the 2014 Ebola Virus Outbreak, the 2016 Zika Virus Outbreak, and more. You can read about NLM’s COVID-19 pandemic web collecting in previous posts written at the start, in 2020, and one year on.

The working group is collecting web content according to NLM Collection Development Guidelines, understanding that this content changes frequently and is valuable as part of the primary historical record of this time. Throughout the pandemic there has been a large volume of web and social media communications on a variety of topics representing a wide range of perspectives. The collection features new and evolving guidelines and recommendations, websites of organizations serving specific populations and concerns, news and information documenting research and progress in biomedicine, resources documenting unique and personal experiences, and ongoing discussion and debate regarding public health measures. Members of the group, including archivists, historians, librarians, and subject-matter experts, have selected a vast amount of content during the course of their collaborative work.

A montage of portraits of black men and women wearing surgical masks.
Black Coalition Against COVID-19 captured July 20, 2021 | View Archived Pages
Captured page of the NIH Director's Blog
NIH Director’s Blog, captured August 30, 2021 | View Archived Page

In this second year, the working group collected content on many important new topics, including the transition to a new presidential administration, the approval of vaccines and treatments, the progress of vaccination efforts, and the availability of at-home testing kits. Likewise, we tracked the impact of WHO-designated variants of concern, especially Delta and Omicron. Our ongoing collecting on the pandemic's impacts on vulnerable populations, and on education and safety, mental health, misinformation, and global health equity, continued as well. We anticipate this content will be meaningful to future research and understanding of this historic event.

As the group embarks on its third year of collecting web content to document the pandemic, we continue to select web content in key areas, and to run, review, and describe archival crawls of web content. We also continue to collaborate with colleagues from across the Library and in the Office of NIH History, members of the history of medicine museum and library community, fellow web archivists in Federal libraries and archives, in the Archive-It community, and in the International Internet Preservation Consortium, as well as with members of the public.

Ad Council's homepage featuring a montage of people wearing masks.
Ad Council Homepage captured September 1, 2021 | View Archived Pages

The challenges of web archiving are ongoing, and are both technical (some types of content, such as video and social media, can be difficult to collect) and intellectual (what, really, will future researchers want to examine?). So far, the Global Health Events COVID-19 pandemic web archive collection features nearly 12,000 resources (just over 3 TB of data). The group will continue to collect content while the pandemic remains an international public health emergency and beyond, as we will likely be facing the effects of the pandemic well into the future.

We invite you to explore NLM’s Global Health Events Coronavirus Disease Outbreak Collection and let us know what you think! What resources do YOU think would be valuable for future research and understanding of the pandemic? The NLM Web Collecting and Archiving Working Group continues to welcome recommendations for content to archive at nlmwebcollecting@mail.nlm.nih.gov.

To learn more about NLM’s broader role in the response to COVID-19 please visit https://www.nlm.nih.gov/

Get the latest public health information from CDC: https://www.coronavirus.gov
Get the latest research information from NIH: https://covid19.nih.gov
Learn more about COVID-19 and you from HHS: https://combatcovid.hhs.gov

An informal portrait of Christie Moffatt.
Christie Moffatt is Manager of the Digital Manuscripts Program in the History of Medicine Division at the National Library of Medicine and Chair of NLM’s Web Collecting and Archiving Working Group.

The NLM Web Collecting and Archiving Working Group includes Delia Golden, Marielle Gage (History Associates Incorporated), Christie Moffatt, Katie Platt (HAI), John Rees, Shirleon Sharron (HAI), Susan Speaker, Caitlin Sullivan, Erica Williams (HAI), and Kristina Womack.
