Clinical prediction models in the COVID-19 pandemic – helpful or misleading?
What for?
With coronavirus disease 2019 (COVID-19) spreading worldwide, hospitals overburdened and medical equipment in short supply in several countries, diagnostic and prognostic models might help to identify patients with COVID-19 and to predict the likely outcome of the disease (e.g. severity, recovery, death).
Among patients with confirmed disease, who is likely to progress to severe disease, who is likely to recover, and who is likely to die (prognosis)?
Can we diagnose COVID-19 in patients with suspected disease (diagnosis)?
Modelling
Models of various types have recently been proposed for the diagnosis and prognosis of COVID-19, including rule-based scoring systems, multivariable logistic regression and advanced machine learning models. Machine learning models are also used to extract features from computed tomography (CT) images. When building a predictive model, adequate selection of predictors is crucial. It can be supported by selection or shrinkage procedures (backward selection, ridge regression, LASSO etc.) but should also take expert opinion into account. The review by Wynants, Van Calster et al. (2020) systematically screened 2696 titles of COVID-19 studies and critically assessed the 31 prediction models for the diagnosis and prognosis of COVID-19 that it identified.
Relevant predictors that were identified in more than one model are shown below:
Predictor | Diagnostic Model | Prognostic Model |
---|---|---|
Demographics | | |
Age | X | X |
Sex | X | |
Disease symptoms | | |
Body temperature/fever | X | |
(Respiratory) signs/symptoms (such as shortness of breath, headache, shivering, sore throat, and fatigue) | X | |
Laboratory parameters | | |
C-reactive protein | X | |
Lactic dehydrogenase | X | |
Lymphocyte count | X | |
Albumin or albumin/globin | X | X |
Direct bilirubin | X | X |
Red blood cell distribution width | X | X |
CT | | |
Features derived from CT scans | X | |
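To make the data-driven selection step mentioned above more concrete, here is a minimal sketch of LASSO-type selection for a logistic model, written in plain Python so that it runs without extra packages. Everything in it is an assumption for illustration: the data are synthetic, the predictor names (`age`, `crp`, `noise_1`, `noise_2`) and the penalty value are made up, and `fit_lasso_logistic` is a hypothetical helper, not the method used in any of the reviewed studies. In practice one would rely on an established statistics package and choose the penalty by cross-validation.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink a value towards zero; anything within +/- threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, penalty, step=0.5, n_iter=300):
    """Fit an L1-penalised logistic regression by proximal gradient descent."""
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_intercept, grad = 0.0, [0.0] * p
        for row, outcome in zip(X, y):
            linear = intercept + sum(c * v for c, v in zip(coefs, row))
            residual = 1.0 / (1.0 + math.exp(-linear)) - outcome
            grad_intercept += residual / n
            for j in range(p):
                grad[j] += residual * row[j] / n
        intercept -= step * grad_intercept  # the intercept is not penalised
        coefs = [soft_threshold(c - step * g, step * penalty)
                 for c, g in zip(coefs, grad)]
    return intercept, coefs


# Synthetic, standardised data (an assumption for illustration only):
# two predictors carry signal, two are pure noise.
names = ["age", "crp", "noise_1", "noise_2"]
rng = random.Random(2020)
X, y = [], []
for _ in range(400):
    row = [rng.gauss(0.0, 1.0) for _ in names]
    risk = 1.0 / (1.0 + math.exp(-(-0.5 + 1.5 * row[0] + 1.0 * row[1])))
    X.append(row)
    y.append(1 if rng.random() < risk else 0)

intercept, coefs = fit_lasso_logistic(X, y, penalty=0.1)
selected = [name for name, c in zip(names, coefs) if c != 0.0]
print(dict(zip(names, [round(c, 3) for c in coefs])))
print("selected:", selected)
```

The L1 penalty shrinks all coefficients and sets weak ones to exactly zero, which is what makes LASSO usable for selection. Ridge regression, by contrast, only shrinks. Whether a predictor that survives the penalty is clinically plausible remains a question for expert judgement.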
Pitfalls
Most models proposed for the diagnosis and prognosis of COVID-19 report excellent discriminative performance. However, all models reviewed by Wynants, Van Calster et al. (2020) were judged to be at high risk of bias. This is mainly because the models were fitted on data that were not representative of the target population. For instance, people without COVID-19 (controls) were underrepresented in diagnostic models. In prognostic models, most studies excluded patients who had neither recovered nor died by the end of the study period. To avoid this sampling bias, such patients should be kept in the analysis and their censoring accounted for.
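The effect of dropping patients with an unresolved outcome can be illustrated with a small, entirely made-up cohort. The sketch below compares a naive mortality estimate, which keeps only patients who have already died or recovered, with a Kaplan-Meier estimate that keeps everyone and treats patients still in hospital as censored at their last day of follow-up. The cohort, the function names and the 14-day horizon are assumptions for illustration. Recovered patients are assumed to have been observed alive up to day 28. Kaplan-Meier is used here as one common way of handling censoring, not as the approach prescribed by the review.

```python
def naive_mortality(cohort):
    """Proportion of deaths among patients whose outcome is already known."""
    resolved = [status for _, status in cohort if status != "in_hospital"]
    return resolved.count("died") / len(resolved)


def kaplan_meier_mortality(cohort, horizon):
    """Kaplan-Meier estimate of the probability of death by day `horizon`."""
    survival = 1.0
    death_days = sorted({day for day, status in cohort
                         if status == "died" and day <= horizon})
    for t in death_days:
        # Everyone still under observation at day t counts as at risk,
        # including patients who are censored later on.
        at_risk = sum(1 for day, _ in cohort if day >= t)
        deaths = sum(1 for day, status in cohort
                     if status == "died" and day == t)
        survival *= 1.0 - deaths / at_risk
    return 1.0 - survival


# Hypothetical cohort: (days of follow-up, status at last observation).
cohort = ([(5, "died"), (8, "died"), (12, "died")]
          + [(14, "in_hospital")] * 6   # outcome unknown at data cut-off
          + [(28, "recovered")] * 2)    # assumed observed alive to day 28

naive = naive_mortality(cohort)
km = kaplan_meier_mortality(cohort, horizon=14)
print(f"naive estimate (unresolved patients excluded): {naive:.2f}")
print(f"Kaplan-Meier 14-day mortality: {km:.2f}")
```

In this toy cohort the naive estimate is much higher than the Kaplan-Meier estimate, because the six patients who survived at least 14 days in hospital are simply thrown away. A model trained on the reduced sample would learn from a distorted picture of mortality.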
Currently, most studies use data from China. Hence, generalizing these findings to other countries with differing ethnic groups, living conditions, and health care systems might be difficult. Moreover, many of the manuscripts had not been peer reviewed at the time of the systematic review.
Now what?
Facing the COVID-19 pandemic, authors worldwide have developed an astonishing number of diagnostic and prognostic models in a short time. These models should be validated and updated in larger, international datasets representative of the target population and thoroughly peer reviewed before they are used in clinical practice. Eventually, they might help to detect COVID-19 in patients with symptoms and to predict the course of a diagnosed infection with high discriminative performance.
However, if not carefully validated for a representative population, models could do more harm than good.
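What validating a model in new data could look like is sketched below, under stated assumptions: the predicted risks and observed outcomes are invented numbers standing in for a hypothetical external cohort, and the two helper functions are illustrative, not taken from the review. The sketch computes the c-statistic (area under the ROC curve) as a measure of discrimination and compares the mean predicted risk with the observed event rate as a crude check of calibration.

```python
def c_statistic(risks, outcomes):
    """Probability that a random case gets a higher predicted risk than a random control."""
    cases = [r for r, o in zip(risks, outcomes) if o == 1]
    controls = [r for r, o in zip(risks, outcomes) if o == 0]
    concordant = sum(1.0 if case > control else 0.5 if case == control else 0.0
                     for case in cases for control in controls)
    return concordant / (len(cases) * len(controls))


def calibration_in_the_large(risks, outcomes):
    """Return (mean predicted risk, observed event rate)."""
    return sum(risks) / len(risks), sum(outcomes) / len(outcomes)


# Hypothetical external validation cohort (made-up numbers).
predicted_risk = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
observed_death = [1, 0, 1, 0, 0, 0, 0, 0]

auc = c_statistic(predicted_risk, observed_death)
mean_predicted, observed_rate = calibration_in_the_large(predicted_risk, observed_death)
print(f"c-statistic: {auc:.2f}")
print(f"mean predicted risk: {mean_predicted:.2f}, observed rate: {observed_rate:.2f}")
```

In this toy cohort the model ranks patients well but predicts, on average, a much higher risk than was observed. High discrimination alone therefore does not show that predicted risks can be trusted in a new population, which is one reason why updating a model on representative data matters.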
References
Wynants, L., B. Van Calster, M. M. J. Bonten, G. S. Collins, T. P. A. Debray, M. De Vos, M. C. Haller, G. Heinze, K. G. M. Moons, R. D. Riley, E. Schuit, L. J. M. Smits, K. I. E. Snell, E. W. Steyerberg, C. Wallisch and M. van Smeden (2020). “Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal.” BMJ 369: m1328.
Picture: @alexkich/AdobeStock.com