Tracking COVID-19 using online search data

i-sense researchers from University College London, led by Dr Vasileios Lampos, in collaboration with Public Health England, Microsoft Research, and Harvard Medical School, are looking at ways of tracking COVID-19 using online search data to better understand the true extent of community spread.

Their current analysis, which uses machine learning models to estimate the potential prevalence of COVID-19 in a population, focuses on a number of countries, including the United Kingdom, United States of America, Canada, Australia, France, and Italy.



How does the model identify prevalence of COVID-19?

The approach analyses the daily frequency of a specific set of key search terms in Google search queries. The report proposes unsupervised machine learning models for COVID-19 based on weighted symptom categories taken from the National Health Service (NHS) first few hundred (FF100) survey, as well as an expanded version that also incorporates the symptom ‘loss of smell’ and generic terms about COVID-19.
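As a rough illustration of how weighted symptom categories could be combined into a single daily signal, the sketch below rescales each category's search frequency to the range 0 to 1 and takes a weighted average. This is a minimal sketch under stated assumptions, not the project's implementation: the function names, the category names, and the weights are all hypothetical, and the actual models are described in the project repository.

```python
def min_max(series):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no signal; map it to zeros.
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def weighted_symptom_score(frequencies, weights):
    """Combine per-category daily search frequencies into one daily score.

    frequencies: dict mapping a symptom category to a list of daily
                 search frequencies (all lists must have the same length).
    weights:     dict mapping the same categories to non-negative weights
                 (hypothetical values, e.g. how often a symptom is reported).
    """
    categories = list(weights)
    total = sum(weights[c] for c in categories)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    lengths = {len(frequencies[c]) for c in categories}
    if len(lengths) != 1:
        raise ValueError("all categories must cover the same days")
    n_days = lengths.pop()

    normalised = {c: min_max(frequencies[c]) for c in categories}
    # Weighted average of the normalised category signals for each day.
    return [
        sum(weights[c] * normalised[c][t] for c in categories) / total
        for t in range(n_days)
    ]


# Hypothetical example with two symptom categories over three days.
example_frequencies = {"cough": [1, 2, 3], "fever": [10, 10, 30]}
example_weights = {"cough": 1.0, "fever": 3.0}
example_scores = weighted_symptom_score(example_frequencies, example_weights)
```

Because no confirmed-case data is used to fit the weights, a score of this kind is unsupervised: it ranks days by relative search activity rather than predicting case counts.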

It also proposes a preliminary approach for minimising the effect that news media may have on online searches.
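The article does not say how the media effect is reduced. Purely to make the idea concrete, the sketch below shows one simple possibility: estimate how strongly search frequency moves with a daily news-coverage signal and subtract that component. The function name, the inputs, and the choice of a linear adjustment are all assumptions made for illustration and should not be read as the authors' method.

```python
def remove_news_component(search, news):
    """Subtract the part of a search signal that is linearly explained by news.

    search: list of daily search frequencies.
    news:   list of daily news-coverage values (hypothetical input, e.g. the
            share of news articles mentioning COVID-19), same length as search.
    """
    if len(search) != len(news):
        raise ValueError("search and news must cover the same days")
    n = len(search)
    mean_s = sum(search) / n
    mean_n = sum(news) / n

    # Ordinary least-squares slope of search on news.
    sxy = sum((x - mean_n) * (y - mean_s) for x, y in zip(news, search))
    sxx = sum((x - mean_n) ** 2 for x in news)
    slope = sxy / sxx if sxx > 0 else 0.0
    # Only remove a positive association; never add signal back in.
    slope = max(0.0, slope)

    # Clip at zero so the adjusted frequency stays non-negative.
    return [max(0.0, y - slope * x) for x, y in zip(news, search)]


# Hypothetical example: searches rise in step with news coverage.
adjusted = remove_news_component([1, 3, 5, 7], [0, 1, 2, 3])
```

In this toy example the whole rise in searches tracks news coverage, so the adjusted signal is flat. Real data would rarely separate this cleanly.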

In addition, the report describes a transfer learning method, whereby a supervised model stemming from daily confirmed cases and search query frequencies from a country that is further along in the epidemic curve (e.g. Italy) is mapped to other countries.
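The transfer learning idea can be sketched in a deliberately simplified form: fit a supervised model on the source country, where confirmed cases are available, and apply it to search frequencies from a target country. The sketch below uses a single aggregated search frequency per day and a straight-line fit; both are assumptions made for brevity, as are all function and variable names, and the project's own models are more involved.

```python
def min_max(series):
    """Rescale a list of numbers to [0, 1] so countries are comparable."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, mean_y - slope * mean_x


def transfer_estimates(source_freq, source_cases, target_freq):
    """Train on the source country, then score the target country.

    source_freq:  daily search frequency in the source country.
    source_cases: daily confirmed cases in the source country.
    target_freq:  daily search frequency in the target country.
    Returns estimates for the target country on the source's case scale.
    """
    if len(source_freq) != len(source_cases):
        raise ValueError("source inputs must cover the same days")
    slope, intercept = fit_line(min_max(source_freq), source_cases)
    return [slope * x + intercept for x in min_max(target_freq)]


# Hypothetical numbers: a source country further along the epidemic curve.
estimates = transfer_estimates(
    source_freq=[0, 1, 2, 3, 4],
    source_cases=[10, 20, 30, 40, 50],
    target_freq=[5, 10, 15],
)
```

The output is expressed on the source country's case scale, so in this simplified form it is better read as a relative trend for the target country than as an absolute case count.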

This research project is ongoing and will continue to evolve over the outbreak, with results posted on a biweekly basis at https://github.com/vlampos/covid-19-online-search.

How does tracking online search data support surveillance?

This research could help to better understand community spread by identifying potential positive cases among individuals who may never present to their doctor. Outcomes of this project are shared directly with Public Health England on a weekly basis.

A recent opinion piece in The New York Times also discussed the importance of this research paradigm, particularly for understanding disease spread in parts of the world with poor testing infrastructure, and how this research can be used to better understand symptom patterns of the disease.

This work builds on an existing i-sense project, i-sense Flu Detector, which estimates the prevalence of influenza-like illness in England.

Acknowledgements

The team working on this research is an interdisciplinary group of researchers, including Dr Vasileios Lampos, Dr Simon Moura, and Prof Ingemar Cox (UCL Computer Science), Prof Rachel McKendry (UCL London Centre for Nanotechnology and Division of Medicine), Dr Michael Edelstein (Public Health England), Dr Elad Yom-Tov (Microsoft Research), and Maimuna Majumder (Harvard Medical School).