Circulating Now welcomes guest bloggers E. Thomas Ewing, PhD, Anna Pletch, and Brooke Breighner from Virginia Tech to share their research on French statistician Jacques Bertillon’s data-driven investigation into how many deaths could be associated with the 1889–1890 influenza epidemic in Paris.
Using maps to reveal pandemic data is a contemporary practice with historical antecedents that include French statistician Bertillon’s efforts to analyze and visualize an influenza pandemic which spread across Europe in late 1889 and 1890.
Jacques Bertillon (1851–1922) was the chief statistician of Paris best known for his work on disease classification, a precursor to the International Classification of Causes of Sickness and Death, which is now called the International Classification of Diseases. Bertillon’s report, “La Grippe in Paris and in Other Cities in France and Abroad in 1889–1890” (“La Grippe a Paris et Dans Quelques Autres Villes de France et de l’Etranger en 1889–1890”) first appeared in the 1890 edition of Annuaire statistique de la ville de Paris, released in 1892. This report was also published separately as a pamphlet, available from the National Library of Medicine. This historical document anticipates one of the important developments observed during the Covid-19 pandemic, as government agencies, health organizations, and media platforms have used tables, charts, and maps to alert the public to the danger of the disease outbreak, document the number of victims, and predict possible outcomes.
The main contribution of Bertillon’s study was the attempt to identify the number of excess deaths that could reasonably be attributed to the epidemic. According to Bertillon’s calculations, the total number of deaths in Paris from December 15, 1889 to January 31, 1890 increased to 12,500, as compared to the average of 7,500 for the same period over the previous three years. This increase coincided with the worst stages of the influenza pandemic, yet only 243 deaths were attributed directly to La Grippe, amounting to less than 5% of the 5,000 excess deaths during these six weeks. Bertillon thus argued that the number of excess deaths was a more reliable indicator of the actual impact of the epidemic, because it included many diseases with increased death totals which could be reasonably associated with the epidemic.
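The arithmetic behind this comparison can be restated as a short sketch. It uses only the rounded figures quoted above; the function and variable names are illustrative assumptions, not terms from Bertillon’s report.

```python
def excess_deaths(observed, baseline):
    """Deaths above the baseline expected for the same period."""
    return observed - baseline


# Rounded figures quoted in the text for Paris,
# December 15, 1889 to January 31, 1890.
observed = 12_500    # total deaths during the epidemic period
baseline = 7_500     # average for the same period over the previous three years
attributed = 243     # deaths attributed directly to la grippe

excess = excess_deaths(observed, baseline)
share = attributed / excess   # fraction of excess deaths labeled as influenza

print(f"Excess deaths: {excess}")
print(f"Share attributed directly to influenza: {share:.1%}")
```

The gap between the two printed numbers is the point of Bertillon’s argument: the officially labeled influenza deaths account for only a small fraction of the rise in mortality.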
As indicated in this chart, drawn from a table in Bertillon’s report, diseases with a very high number of excess deaths included pneumonia, bronchitis, tuberculosis, and many other respiratory diseases. Based on these statistics, Bertillon concluded that while influenza is a direct cause of a small proportion of deaths, the real toll of the epidemic could be measured by counting deaths from a range of diseases worsened by the conditions of the pandemic. This method of calculating the actual costs of a pandemic is familiar to us now, as health statisticians compare the number of deaths directly attributable to Covid-19 to the much larger number of excess deaths recorded from spring 2020 through summer 2022. These calculations have led the World Health Organization to estimate the full death toll of Covid-19 at nearly 15 million in May 2022, while data from the Centers for Disease Control and Prevention reveals several peaks during the Covid-19 pandemic when excess deaths from all causes greatly exceeded predicted averages.
Both the statistical volume and the pamphlet version of the report included a map visualizing Bertillon’s method of measuring excess deaths, shown above. The map’s legend provides a detailed explanation of how data was analyzed and then displayed using visual markers that conveyed essential information. The “deaths attributable to la grippe” were calculated by subtracting the total deaths in the same three months of the preceding year from the total number of deaths in November 1889, December 1889, and January 1890. This approach indicates the “excess number” of deaths that can reasonably be attributed to the disease epidemic. The number of excess deaths was then expressed as a rate per 100,000 population. The number of lines under the name of the city indicates the rate of excess deaths: one line is 50–99 deaths per 100,000 population, two lines is 100–149 deaths, three lines is 150–199, and so on. Each of the thirteen small circles under the name of the city indicates a week during the pandemic, with four circles in November 1889, four circles in December 1889, and five circles in January 1890. A circle indicates that the death rate for the week is only slightly above the rate for the same thirteen weeks in the preceding year. A circle with a line indicates that the death rate is one-third higher than in the comparable weeks a year earlier. A solid black circle indicates the death rate is double the rate in the preceding year.
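The legend’s rules can be sketched as a few small functions. This is an illustrative reading of the legend as described here, not a reconstruction of Bertillon’s own procedure: the function names, the symbol labels, the handling of values that fall exactly on a threshold, and the sample city are all assumptions.

```python
def excess_rate_per_100k(deaths_epidemic, deaths_prior_year, population):
    """Excess deaths over the same months a year earlier, per 100,000 people."""
    return (deaths_epidemic - deaths_prior_year) * 100_000 / population


def lines_for_rate(rate):
    """One line per full 50 excess deaths per 100,000:
    50-99 gives 1 line, 100-149 gives 2, 150-199 gives 3, and so on."""
    return max(0, int(rate // 50))


def week_symbol(weekly_rate, comparison_rate):
    """Choose the weekly marker from the ratio to the prior-year rate.
    Treating the thresholds as inclusive is an assumption."""
    ratio = weekly_rate / comparison_rate
    if ratio >= 2:
        return "solid circle"        # at least double the earlier rate
    if ratio >= 4 / 3:
        return "circle with line"    # at least one-third higher
    return "open circle"             # below the one-third threshold


# Hypothetical city, not taken from the map: 200,000 residents,
# 1,500 deaths in the epidemic months against 1,100 a year earlier.
rate = excess_rate_per_100k(1_500, 1_100, 200_000)
print(rate, lines_for_rate(rate))
print(week_symbol(45, 30))
```

Expressing the thresholds as ratios and the lines as multiples of 50 makes it easy to see why a city’s symbol depends on its rate rather than on its raw death count.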
A closer look at several cities indicates how these symbols can be read to tell a story of the epidemic. In St. Petersburg (which is framed by a box, indicating that it is not in the actual geographic location), the two lines indicate a death rate of 100–149 deaths per 100,000 above normal for these three months, while the single circle with a line indicates that the only week when deaths were one-third above average was the final week of November. In Berlin, the two lines indicate that the death rate was 100–149 above normal for the epidemic, while five circles with lines indicate several weeks in December and January with deaths one-third above normal rates. London has only one line, indicating just 50–99 deaths above the normal rate, while the four circles with lines indicate that the death rate increase of one-third above normal happened in January 1890. In contrast to these other three capital cities, Paris has four lines, indicating that the rate was at least 200 deaths per 100,000 higher than the same period in the preceding year, while the four solid black circles indicate that the final weeks of December and first weeks of January had deaths at double the rate of the preceding year.
When looking at this map, of course, the cities that stand out are those with the greatest number of lines, suggesting death rates were several times the average in the preceding year. Zara (now Zadar in Croatia), located along the Adriatic coast, has more than a dozen lines (the lines are difficult to count because they run together), suggesting a remarkable rate of more than 500 deaths above the average. Other cities in the Balkans, such as Laibach (now Ljubljana in Slovenia) and Temesvar (now Timișoara in Romania), have almost as many lines. Cities with at least five lines, indicating death rates at least 250 over the rate in the preceding year, are spread more broadly across Europe.
The use of death rates, rather than totals, provides a basis for making comparisons between cities about the relative impact of the pandemic. Missing from this map, of course, is any information about the size of the population of the cities. Several cities which indicated very high death rates were also relatively small, including cities in the Austro-Hungarian empire such as Zara, with an estimated population of 20,000 people; Laibach, with an estimated population of 31,000; or Temesvar, with an estimated population of 46,000. By contrast, the largest cities in Europe tended to have lower death rates, including London (approximately 5 million), Paris (3 million), Berlin (1 million), and St. Petersburg (1 million). One possible cause of this range of outcomes is variation in the ways governments and public health organizations collected, categorized, and published statistical records, which remains a challenge in current efforts to deal with epidemics. It seems equally plausible, however, that the high rates were a result of the relative size of the population in these cities, as rates calculated from smaller populations usually vary more, a consequence of the “law of large numbers.” The fact that the Croatian city of Pola, with a population of 30,000 and located just eighty miles along the coast from Zara, was marked with just two lines, suggesting a death rate one-fifth that of Zara, illustrates the potential variability of data for cities with relatively small populations.
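The effect of population size on the stability of a death rate can be illustrated with a simple calculation. The sketch below assumes, purely for illustration, that deaths in a three-month period follow a Poisson distribution and that every city has the same underlying rate of 600 deaths per 100,000; neither assumption comes from Bertillon’s report. Only the population estimates are taken from the text.

```python
import math


def rate_sd_per_100k(population, true_rate_per_100k):
    """Chance variation (one standard deviation) in an observed death rate,
    assuming deaths are Poisson distributed around the expected count."""
    expected_deaths = population * true_rate_per_100k / 100_000
    sd_deaths = math.sqrt(expected_deaths)   # Poisson: variance equals mean
    return sd_deaths * 100_000 / population


# Population estimates quoted in the text.
cities = {
    "Zara": 20_000,
    "Laibach": 31_000,
    "Temesvar": 46_000,
    "Berlin": 1_000_000,
    "Paris": 3_000_000,
    "London": 5_000_000,
}

ASSUMED_RATE = 600   # hypothetical deaths per 100,000 per quarter

for city, population in cities.items():
    sd = rate_sd_per_100k(population, ASSUMED_RATE)
    print(f"{city:10s} chance variation of about {sd:.1f} per 100,000")
```

Under these assumptions the chance variation shrinks with the square root of the population, so a city the size of Zara fluctuates far more from year to year than London does. The sketch shows only the direction of the effect; it does not establish how much of any city’s recorded excess was due to chance.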
The use of maps was a common approach by scholars after the outbreak of influenza in late 1889 to understand the spread of disease, including maps by the German scholar Adolph Baginsky (see analysis in Circulating Now), the British scholars Frank Parsons and Theophilius Thompson, and the Russian military statistical office. Each of these maps adopted a distinctive approach to visualizing the spread of the disease across a broad geographical region and during a set period of time. In an era when data visualizations, such as those associated with John Snow and Florence Nightingale, were increasingly influential on public health policies, these maps of the influenza epidemic demonstrated the value of understanding the ways that infectious diseases spread across time and space.
During the Covid-19 pandemic, visualizations of data about disease outcomes have become an inescapable part of the American information ecosystem, published by government agencies, health organizations, and media platforms. Maps showing the number of cases, hospitalizations, vaccinations, and deaths by country, state, county, city, and even community have made it possible for health officials, medical staff, government organizations, and citizens to observe the course of the disease, understand the current situation, and anticipate future developments. Data visualizations have also been used to mobilize both agencies and individuals to comprehend potential dangers, adopt protective measures, and make decisions about appropriate behaviors. Whereas most maps of the Russian influenza appeared after the pandemic was over, the global response to Covid-19 used data visualizations in real time. Future historical research will reveal the value, and the limitations, of using data visualizations during a pandemic to predict possible outcomes.