
Why Didn’t AI Contribute More to COVID-19 Research?

July 14, 2021 by Robert Grossman

Figure 1. The staircase of understanding data.

Over the last several months, a number of reports and technical publications have appeared describing the lack of success in applying machine learning and AI to COVID-19 research. Although many people are surprised, readers of this blog should not be. Getting data is hard, understanding data is hard, building models is hard, embedding AI models in systems is hard, and extracting value from AI systems is hard. This is one of the recurring themes of this blog, and it is why there is a big difference between applying an AI algorithm to a dataset and developing a successful AI application.

BMJ review of 232 predictive models

If you haven’t seen any of these reports, a good place to start is the review article that appeared in the April 2021 issue of BMJ, which analyzed 169 studies describing 232 COVID-19 prediction models [1]. The title of the article, “Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal,” gives a good idea of what is covered. Of the 232 predictive models, the authors found that:

  1. “[the] models are poorly reported and at high risk of bias, raising concern that their predictions could be unreliable when applied in daily practice.”
  2. “Two prediction models (one for diagnosis and one for prognosis) were identified as being of higher quality than others and efforts should be made to validate these in other datasets.”

In other words, of the 232 predictive models studied, all but two could be unreliable when applied in clinical practice, and the remaining two still needed to be validated with data from additional datasets. This is not good news, but it is not surprising either.
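
To make the second finding concrete, external validation simply means checking a model on data it was never fit on, typically from a different site or time period. Below is a minimal sketch of that workflow, assuming two hypothetical CSV files and made-up column names; it illustrates the idea only and is not a recipe taken from the BMJ review.

```python
# A minimal sketch of "validation in other datasets": fit a prognostic model
# on one cohort and check its discrimination on a completely separate cohort.
# The file names, feature columns, and outcome label are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

FEATURES = ["age", "crp", "lymphocyte_count"]   # illustrative predictors only
OUTCOME = "deteriorated_within_30d"             # illustrative outcome label

# Development cohort: the data the model is fit on.
dev = pd.read_csv("cohort_a.csv")
model = LogisticRegression(max_iter=1000)
model.fit(dev[FEATURES], dev[OUTCOME])

# External cohort: data collected at a different site or time period.
ext = pd.read_csv("cohort_b.csv")
ext_scores = model.predict_proba(ext[FEATURES])[:, 1]

# Discrimination on the external cohort is usually worse than on the
# development data; a large gap is one symptom the review warns about.
print("external AUC:", roc_auc_score(ext[OUTCOME], ext_scores))
```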

Turing Institute Report

The Alan Turing Institute is a UK research institute focused on data science and AI, with the tagline “We believe data science and artificial intelligence will change the world.” (So do I, by the way, but I also believe that this data-driven revolution has been going on for the past thirty years.) The Turing Institute ran a series of workshops on COVID-19 research and produced a summary report [2]. Here are some of the conclusions of the report:

  1. “… the single most consistent message across the workshops was the importance – and at times lack – of robust and timely data. Problems around data availability, access and standardisation spanned the entire spectrum of data science activity during the pandemic. The message was clear: better data would enable a better response.”
  2. “… conventional data analysis has been at the heart of the COVID-19 response, not AI”.

Perhaps most noteworthy is what the report didn’t say. It didn’t describe successful AI tools and their impact, but rather described challenges getting data, difficulties standardizing the data for analysis, and the many issues related to inequality and inclusion. It also described the “difficulty of communicating transparently with other researchers, policy makers and the public, particularly around issues of modelling and uncertainty” [2].

Don’t Be Surprised When Black Box Frameworks Fail to Bring Value

Why should readers of this blog not be surprised? First, two of the recurring themes of this blog are 1) the difficulties of getting data (see Chapter 6, Crossing the Data Chasm, of my Primer), building models, deploying models in AI systems, and extracting value from those systems, and 2) how to develop analytic strategies and implement best practices to overcome these difficulties. For example, my post “Why do so many analytic and AI projects fail” suggests using a simple radar plot to track over time the increase or decrease in the likelihood that an AI project will fail.
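
As a rough illustration of that idea, here is a hedged sketch of such a radar plot in Python. The risk dimensions, the 1-to-5 scores, and the two time points are placeholders invented for the example, not the ones from that post.

```python
# Sketch of a project-risk radar plot: score a project on a few risk
# dimensions at two points in time and overlay the polygons.
# All dimension names and scores below are made up for illustration.
import numpy as np
import matplotlib.pyplot as plt

dims = ["data access", "data quality", "modeling", "deployment", "value capture"]
q1 = [2, 3, 4, 2, 1]   # risk scores (1 = low risk, 5 = high risk) at quarter 1
q2 = [2, 2, 3, 3, 2]   # the same project re-scored a quarter later

# Angles for each dimension; repeat the first point to close the polygon.
angles = np.linspace(0, 2 * np.pi, len(dims), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"polar": True})
for label, scores in [("Q1", q1), ("Q2", q2)]:
    values = scores + scores[:1]
    ax.plot(angles, values, label=label)
    ax.fill(angles, values, alpha=0.1)

ax.set_xticks(angles[:-1])
ax.set_xticklabels(dims)
ax.set_ylim(0, 5)
ax.legend()
plt.show()
```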

Today, we have good tools and software frameworks for building AI models, but it still requires a lot of work and a good understanding of the practice of analytics to build and deploy a useful AI system. As Figure 1 shows, one level of understanding comes from cleaning and preparing data for modeling, another from building a model with the data, another from deploying the model, and a still higher level when you can extract value from the model. Even for skilled modelers, data leakage is always a problem and can be quite hard to track down. Often, you do not understand how to clean, prepare, and harmonize data appropriately until you have built a model. Similarly, it can be hard to determine which features are actually available to a model running in the field until you build and deploy your first system that uses the model. Almost always, you do not understand how to build the right model, deploy it correctly, and create the right actions around its output until you see the system performing in practice, understand what works and what doesn’t, and start again.
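
Data leakage often enters through preprocessing rather than the model itself. The sketch below, on a synthetic dataset, contrasts one common leaky pattern, fitting a scaler on the full dataset before splitting, with the safer pipeline version; the numeric gap may be small here, but the structural difference is the point.

```python
# Illustration of a common data leakage pattern with synthetic data:
# fitting a scaler on the full dataset before splitting lets information
# from the test rows leak into the training preprocessing.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Leaky version: the scaler sees every row, including future test rows.
X_all_scaled = StandardScaler().fit_transform(X)
Xs_tr, Xs_te, ys_tr, ys_te = train_test_split(X_all_scaled, y, random_state=0)
leaky = LogisticRegression().fit(Xs_tr, ys_tr)

# Safer version: the pipeline fits the scaler only on the training fold.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clean = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_tr, y_tr)

print("leaky test accuracy:", leaky.score(Xs_te, ys_te))
print("clean test accuracy:", clean.score(X_te, y_te))
```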

With deep learning frameworks such as PyTorch, Keras, and TensorFlow, building models does not require engineering your own features. You can take someone else’s data to use in transfer learning and someone else’s modeling software, and treat both as black boxes. With this approach, you shouldn’t be surprised when the deployed model fails to bring value in the real world. The staircase of data understanding (Figure 1) is about the real-world practice of analytics and the engineering of analytic systems, yet most of what we teach these days is machine learning and AI algorithms applied to someone else’s data.
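
For example, a typical black-box transfer learning workflow looks something like the sketch below: download a pretrained backbone, freeze it, and attach a new classification head. The class count and the omitted training loop are placeholders, and the snippet assumes a recent torchvision; it is only meant to show how little of the staircase this workflow forces you to climb.

```python
# Sketch of a "black box" transfer learning setup with a pretrained backbone.
# NUM_CLASSES and the training loop are placeholders; requires torchvision >= 0.13.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 2  # e.g., "finding" vs "no finding"; placeholder only

backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False  # treat the pretrained weights as a black box

# Replace the final layer with a new, trainable task-specific head.
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Training loop elided: iterate over your own DataLoader, compute loss_fn on
# backbone(images), and step the optimizer. The hard parts -- where the images
# come from, what the labels mean, how the model will be used -- live outside
# this snippet.
```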

References

[1] Laure Wynants, Ben Van Calster, Gary S. Collins, Richard D. Riley, Georg Heinze, Ewoud Schuit, Marc MJ Bonten, et al. “Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal.” BMJ 369 (2020).

[2] Alan Turing Institute, Data science and AI in the age of COVID-19: Reflections on the response of the UK’s data science and AI community to the COVID-19 pandemic, retrieved from https://www.turing.ac.uk/research/publications/data-science-and-ai-age-covid-19-report on July 2, 2021.

