Google has rolled out AI technology capable of predicting early signs of disease based on sound. HeAR is a bioacoustic foundation model trained on 300 million pieces of audio data from around the world that can detect subtle differences in cough patterns. The AI can be loaded onto a smartphone and could become a critical tool in early detection of diseases, such as tuberculosis, in areas where expensive diagnostic hardware isn’t available. Google is also researching a model that uses ultrasound for early breast cancer detection.
Earlier this year, Google introduced HeAR, a bioacoustic foundation model designed for researchers building task-specific models that learn to listen to sounds and flag early signs of disease. The model was trained on 300 million pieces of diverse, multinational audio data, including roughly 100 million cough sounds.
The goal is to deploy the technology on smartphones to screen populations in underserved regions, as an alternative to expensive diagnostics such as X-rays and radiologist review, and so to tackle undiagnosed tuberculosis (TB), the world’s top infectious killer.
The target business model is to offer the AI to health workers with limited training, who would use it to triage patients for further investigation and treatment.
Yet HeAR is one of several projects applying digital biomarkers, and all of them face the same limitation on the path to real-life impact as many software medical devices: they lack evidence of the tools’ clinical robustness. At the same time, the translational chasm must be bridged to realize the significant knowledge mobilization potential of foundation-model technology.
Access to large and diverse data sets is critical for deploying and refining the model’s learning capabilities across several connected downstream applications where anchoring can take place. Specifically, such data are needed to:
- Test the foundation model with different clinical data sets, patient contexts, and demographics, e.g., to test its ability to distinguish “TB sounds” from other aetiologies associated with underlying comorbidities that may be “masking” them.
- Offer the opportunity to test the model’s clinical efficacy in specific downstream applications submitted for marketing approval as medical devices, while improving the foundation model’s performance for specific vulnerable groups.
- Produce the volume and quality of real-world clinical evidence needed to assess the technology’s cost-effectiveness and add it to reimbursable clinical pathways.
- Ensure that the model does not collapse or hallucinate when it learns from synthetic data it has produced, and that specific patient groups are not disadvantaged by racial bias and thus excluded from the early treatment that devices built on the model are meant to enable.
For this purpose, the HeAR team aims to collaborate with several clinical ecosystems, including the Stop TB Partnership, a UN-hosted organization that aims to end TB by 2030.
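
To make the subgroup testing described in the list above more concrete, here is a minimal Python sketch of how a downstream screening model's sensitivity and specificity could be compared across patient groups. It is an illustration only, not Google's or any partner's evaluation method: the function name `subgroup_metrics`, the record layout, and the group labels are all hypothetical, and a real clinical study would also need confidence intervals and much larger samples.

```python
from collections import defaultdict


def subgroup_metrics(records):
    """Compute sensitivity and specificity per patient group.

    `records` is an iterable of (group, has_tb, flagged) tuples, where
    `has_tb` is the confirmed diagnosis and `flagged` is the screening
    tool's output. This layout is an assumption made for illustration.
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for group, has_tb, flagged in records:
        if has_tb:
            key = "tp" if flagged else "fn"
        else:
            key = "fp" if flagged else "tn"
        counts[group][key] += 1

    result = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        result[group] = {
            # Share of true TB cases the tool flagged.
            "sensitivity": c["tp"] / positives if positives else None,
            # Share of non-TB cases the tool correctly left unflagged.
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return result


# Hypothetical toy data: (group, confirmed TB, flagged by the tool).
example = [
    ("group_a", True, True),
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_b", True, True),
    ("group_b", False, True),
]
metrics = subgroup_metrics(example)
```

A large gap in sensitivity between groups would suggest that some patients are being missed more often than others, which is the kind of exclusion the list above warns against.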