By Christie Moffatt
The research underway on the 1918 flu epidemic described in Dr. Thomas Ewing’s recent post here on Circulating Now is a fascinating example of the long-term research value of news communications. Dr. Ewing describes how health officials conducted a publicity and education campaign in the newspapers to prevent the spread of the disease. The digitization of these papers and their accessibility online open this large body of data to researchers working on all kinds of interesting questions, some of which the authors and publishers of the day might never have imagined.
We now see important news information surfacing routinely and as a matter of course on the web and in social media. At the end of April of this year, information about the threat of a new flu epidemic, the Avian Influenza A (H7N9) Virus, began to circulate via Twitter, blogs, and the web sites of the U.S. Government and news organizations. While there were certainly communications in print media, the information shared on the web was a unique expression and documentation of the public health concern. The web communications on H7N9 could become a treasure trove for researchers seeking to better understand the spread of the virus, its impact, and the nation’s response to the threat in the early 21st century.
Knowing that web information can disappear quickly and without notice, the NLM Web Collecting and Archiving Working Group set out to identify and collect web content documenting the H7N9 virus, with a particular focus on the U.S. Department of Health and Human Services (HHS) response. The Working Group conducted web crawls on 86 selected URLs using the Internet Archive’s Archive-It service, beginning shortly after the Centers for Disease Control and Prevention (CDC) issued a health advisory on April 5 and continuing through the CDC’s announcement via Twitter on June 18 that it was deactivating its Emergency Operations Center response. NLM collected the transcript and audio of the April 5, 2013 CDC Telebriefing on H7N9 Influenza Cases, the Food and Drug Administration’s Medical Devices and Flu Emergencies emergency authorizations, and consumer health information distributed through the National Library of Medicine in both English and Spanish. Efforts are currently underway to analyze the content and the quality of what was collected, and to add descriptive information to facilitate access and research.
And just as those who decided to keep and preserve the newspapers of 1918 may not have predicted the future uses of the content they were saving, the long-term value of the web content collected by NLM and by many other organizations may not be known until much later. Collecting and preserving this content now will make it possible for future researchers to explore a variety of important research questions, and perhaps even find the answers they are seeking for the benefit of medicine, science, and public health.
Learn more about the NLM Web Collecting and Archiving Working Group through this guest post on The Signal Digital Preservation blog of the Library of Congress.