Machine learning spots language disparities to improve COVID-19 tracing
Thanks to researchers at Stanford University, public health officials in Santa Clara County, California, can better predict individuals’ language needs, helping contact tracers resolve cases faster.
Among the challenges posed by the pandemic were language barriers public health agencies faced as they struggled to trace infection spread among Latino communities.
In Santa Clara County, Calif., only 25% of the population is Latino, but that group accounted for more than 56% of the county’s COVID cases. That put Spanish-speaking contact tracers – who call patients with diagnoses, identify and notify their contacts and assist with isolation and quarantine – in high demand.
These Spanish-speaking contact tracers have been key to reaching potentially infected individuals as quickly as possible, but with thousands of cases per day -- and limited numbers of Spanish speakers and interpreters -- it can take days to alert a patient’s contacts.
An additional challenge is that Spanish-speaking residents may be reluctant to talk with government employees asking for complex, personal information -- especially through someone not fluent in the language, according to a report in Stanford University’s Human-Centered AI News.
To improve contact tracing in the Latino community, Santa Clara County health officials partnered with experts from Stanford University’s RegLab -- a group that designs and evaluates programs, policies and technologies to modernize government. They wanted to see if they could predict when a contact speaks only Spanish or has limited English proficiency. With that insight, the county could then assign the patient to one of the county’s native Spanish speakers.
A team led by Daniel Ho, faculty director of the RegLab and associate director of the Stanford Institute for Human-Centered Artificial Intelligence, used machine learning to predict people’s language needs, helping contact tracers resolve cases faster and narrowing the health gap between the county’s Latino and other communities.
Contact tracers usually start with only the most basic information about the people they call, such as the patient’s name, address, date of birth and test result.
Researchers combined that bare-bones data with demographic information from the census and other administrative data. A machine learning algorithm analyzed and weighed data like census block group, age and name-based race and ethnicity information from census and mortgage data and identified patterns that would predict a language preference. Contacts were scored as to which language they would likely prefer before they were assigned to a tracer.
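As a rough illustration of how such scoring could work, the sketch below combines a few of the inputs mentioned above into a single probability-like score using a logistic function. It is not the RegLab model: the feature names, the weights and the 0.5 routing threshold are all hypothetical values chosen for the example, and the real system learned its patterns from data rather than using hand-set numbers.

```python
import math
from dataclasses import dataclass


@dataclass
class Contact:
    # All fields are hypothetical stand-ins for the kinds of inputs described above.
    name_hispanic_prob: float         # name-based ethnicity estimate, 0..1
    block_group_spanish_share: float  # share of Spanish speakers in the census block group, 0..1
    age: int                          # age in years


# Illustrative weights only; a real model would learn these from labeled cases.
WEIGHTS = {"intercept": -3.0, "name": 3.5, "block": 2.5, "age_per_decade": 0.1}


def spanish_preference_score(contact: Contact) -> float:
    """Return a score between 0 and 1; higher means Spanish is more likely preferred."""
    z = (
        WEIGHTS["intercept"]
        + WEIGHTS["name"] * contact.name_hispanic_prob
        + WEIGHTS["block"] * contact.block_group_spanish_share
        + WEIGHTS["age_per_decade"] * contact.age / 10
    )
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def route(contact: Contact, threshold: float = 0.5) -> str:
    """Send high-scoring contacts to bilingual tracers (the threshold is an assumption)."""
    if spanish_preference_score(contact) >= threshold:
        return "language_specialty_team"
    return "general_team"


likely_spanish = Contact(name_hispanic_prob=0.9, block_group_spanish_share=0.6, age=40)
likely_english = Contact(name_hispanic_prob=0.05, block_group_spanish_share=0.1, age=30)
print(route(likely_spanish), route(likely_english))
```

Scoring before assignment means the decision can be made from the data a tracer already has, without a first call to find out which language the contact speaks.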
To test the algorithm’s effectiveness, the RegLab worked with Santa Clara County to conduct a test that randomly routed half of the cases to a “language specialty team” with bilingual speakers and treated the other half with the county’s typical process.
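The random routing described above can be sketched in a few lines. This is a minimal illustration, assuming cases are identified by simple IDs and split evenly between two arms; the function name, arm labels and fixed seed are hypothetical and do not come from the study.

```python
import random


def randomize_cases(case_ids, seed=0):
    """Shuffle case IDs and split them evenly into two arms.

    The seed makes the split reproducible, which helps when auditing a trial.
    """
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "language_specialty_team": shuffled[:half],  # test group
        "standard_process": shuffled[half:],         # control group
    }


arms = randomize_cases(range(10))
print({arm: len(ids) for arm, ids in arms.items()})
```

Because chance alone decides which arm a case lands in, differences in outcomes between the two groups can be attributed to the routing rather than to the kinds of cases each team happened to receive.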
In just two months, the benefits became clear. In the test group, the time it took to complete cases dropped by nearly 14 hours over the control. Same-day completions rose by 12%, and the number of people refusing to be interviewed dipped by 4%.
“Based on the results and success of this trial, Santa Clara County has expanded language matching to all of [Santa Clara Public Health Department’s Case Investigation and Contact Tracing], and the state of California is contemplating adoption in the statewide system,” the authors said in their paper.
“When we connect with people in their preferred language, it makes a huge difference in their willingness to share information about themselves, their health, and their families and friends,” said study co-author Dr. Sarah Rudman, director of contact tracing for the County of Santa Clara Public Health Department.
The new approach has not only improved people’s willingness to engage in the process, but it has also allowed Rudman to ensure the county’s bilingual tracers could be assigned to the contacts most likely to need them.
“Before the algorithm you could hear frustration in the voices of our tracers when they would get mismatched with a contact,” Rudman said. “After the algorithm, there would be talk of the families they had connected with, many of whom stayed on the phone only because the tracer spoke Spanish and pronounced their name correctly.”
When every missed contact can mean additional infections, these are significant improvements, Ho said, noting that the partnership between people and machine was a surprising -- and refreshing -- outcome for him.
“There’s much worry in the AI community about whether machines will displace human judgment,” he said. “But this case is a model for how machines and people can integrate in complex ways that make both better.”
The full study is available in the Proceedings of the National Academy of Sciences.