The CDC’s COVID death tracker has received a major software update
In 2020 and 2021, COVID-19 was the third leading cause of death in the United States. In May, the country passed the grim milestone of one million known COVID deaths. Although fewer people are dying from the virus now than at the height of Omicron’s surge this winter or during previous waves, new strains continue to claim lives.
As the pandemic drags on, understanding how many people are dying and who is most vulnerable remains crucial to efforts to prevent further deaths. To that end, the Centers for Disease Control and Prevention (CDC) recently updated the software it uses to process all mortality data nationwide. The change, powered by advanced computing techniques like machine learning, could provide health officials and the public with more up-to-date information about the disease.
“Civil registration of births and deaths and understanding the causes of death are really essential for a functioning health system,” says Emily Smith, assistant professor of global health at George Washington University. “There are many ways to use this information.”
Tracking the leading causes of death in a community and identifying where those deaths are concentrated helps public health officials direct resources, she adds. During a crisis like the COVID pandemic, timely information is especially crucial. But the national statistics system has been slow to process and publish mortality figures. When the United States surpassed one million deaths from the virus earlier this year, the CDC tracker was still weeks behind.
“An effective response to epidemics is about getting the right resources, whether it’s drugs, vaccines or prevention programs, to the right people at the right time,” Smith says. “Data helps us do that.”
The CDC upgrade represents significant progress. “It’s great to see the United States moving forward,” Smith says. “More transparent and faster data is a big step forward.”
For decades, the CDC has relied on computers to analyze death certificates and assign a four-digit code to each record based on the underlying cause, so that deaths can be tracked by the National Vital Statistics System.
However, only about 70-75% of the country’s death certificates could be coded automatically; the rest were flagged for review, meaning a staff member had to manually enter the cause of death into the system. “When you’re dealing with 2 to 3 million deaths [every year], 25 to 30 percent of records is a pretty large number and resource-intensive,” says Robert Anderson, chief of the mortality statistics branch at the National Center for Health Statistics.
The updated cause-of-death coding system, known as MedCoder, can handle a greater proportion of these records: it currently codes 85% of records automatically and, with continued improvements, “has the potential to code better than 90% of records,” Anderson said. “These records can be automatically coded in minutes, while manual review can take weeks,” he adds. “It just means more information is available faster.”
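To put those percentages in perspective, here is a back-of-the-envelope calculation using only the round figures quoted above (2 to 3 million deaths a year; 70 to 75 percent automated coding before, 85 percent now, 90 percent as a potential target). The function name is made up for illustration, and the results are rough estimates, not official CDC workload numbers.

```python
def manual_review_count(total_deaths: int, auto_coded_share: float) -> int:
    """Estimate how many records are left for manual coding."""
    return round(total_deaths * (1 - auto_coded_share))


# Round figures quoted in the article; real annual totals vary.
scenarios = [
    ("older system (70% automated)", 0.70),
    ("older system (75% automated)", 0.75),
    ("MedCoder today (85% automated)", 0.85),
    ("MedCoder potential (90% automated)", 0.90),
]

for total_deaths in (2_000_000, 3_000_000):
    for label, share in scenarios:
        remaining = manual_review_count(total_deaths, share)
        print(f"{total_deaths:,} deaths, {label}: {remaining:,} manual reviews")
```

At 3 million deaths a year, for instance, moving from 70 percent to 85 percent automated coding would cut the manual workload from roughly 900,000 records to roughly 450,000.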
MedCoder is better able than earlier systems to deal with variations in the terms doctors, medical examiners and coroners use to describe deaths, Anderson says. The computer assigns one of 10,000 possible cause-of-death codes to a record. For example, when COVID is mentioned on a death certificate, it chooses U07.1. To improve the results, Anderson and his team used machine learning techniques that drew on a decade of national death certificate data to train MedCoder to recognize errors and other aberrations. So when a doctor fills in the death certificate with “Coronavirus 2019,” “SARS-CoV-2,” “Delta variant” or another name for the disease, the computer still codes it as U07.1. “The old system would say, ‘I can’t find that term in the dictionary,’ and put it out there for someone to look at,” Anderson explains. “[Now] the computer says, ‘Okay, I know what to do with this and what code to assign.’”
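The difference Anderson describes can be illustrated with a toy example. The sketch below is not MedCoder’s actual method, which the CDC trained on a decade of death certificate data and which is not detailed here. It simply contrasts a strict dictionary lookup with a more forgiving one that tolerates misspellings, using Python’s standard `difflib` module. The term list and function names are invented for illustration; only the code U07.1 comes from the article.

```python
import difflib

# Hypothetical mini-dictionary mapping normalized terms to a code.
# Only U07.1 (COVID-19) is taken from the article; the terms are illustrative.
KNOWN_TERMS = {
    "covid-19": "U07.1",
    "covid": "U07.1",
    "coronavirus 2019": "U07.1",
    "sars-cov-2": "U07.1",
}


def normalize(text):
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def exact_lookup(text):
    """Strict behavior: return a code only on an exact dictionary hit."""
    return KNOWN_TERMS.get(normalize(text))


def tolerant_lookup(text, cutoff=0.8):
    """Forgiving behavior: accept the closest known term above a similarity cutoff."""
    term = normalize(text)
    if term in KNOWN_TERMS:
        return KNOWN_TERMS[term]
    close = difflib.get_close_matches(term, list(KNOWN_TERMS), n=1, cutoff=cutoff)
    return KNOWN_TERMS[close[0]] if close else None  # None means "flag for manual review"


for entry in ("SARS-CoV-2", "Coronavirius 2019", "heart attack"):
    print(entry, "| strict:", exact_lookup(entry), "| tolerant:", tolerant_lookup(entry))
```

String similarity only catches spelling variants. A term like “Delta variant” shares almost no letters with “COVID-19,” which is why recognizing it takes a model that has learned from real certificates rather than a simple matching rule.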
While installing the upgrades between June 6 and June 24, the National Center for Health Statistics suspended its processing of state-reported death data and did not update COVID surveillance datasets on the public page of the National Vital Statistics System. Counts from previous weeks in 2022 may temporarily appear low while the system catches up and reprocesses those records, the agency’s website notes.
“Once we get over this backlog here, the system will work pretty much like the old system,” Anderson says. “I don’t want people to worry that the data we release now is not comparable to the data we released before. It’s comparable; it’s just going to be a little more timely.”
Mortality figures matter
It is unusual for death certificates to mention which variant of SARS-CoV-2 afflicted the deceased. But looking for patterns in more precise mortality data can help health experts understand how dangerous a new strain may be and whether additional precautions are needed.
“If deaths increase, it increases urgency,” Anderson says. “If the data isn’t as timely, then our situational awareness degrades by a week or two or maybe three.”
It’s also possible that faster data would have allowed the United States to recognize earlier that it had reached 1 million COVID-19 deaths. “Having better real-time data should hypothetically matter on many different fronts,” Smith says. “It’s important for public perception; it is important for political will.”
Reported deaths tend to lag behind other warning signs such as an increase in positive COVID tests or hospitalizations. However, these measurements can be difficult to interpret. An increase in hospitalizations may indicate that more people are becoming seriously ill, but may not capture the full extent of the problem because not everyone with serious illness has access to hospitals.
“These are softer outcomes that incorporate both disease severity and other social and economic factors, whereas death is a hard outcome,” Smith says. “Mortality is the ultimate indicator – it’s black and white.”