Thromb Haemost 2022; 122(04): 570-577
DOI: 10.1055/a-1525-7220
Stroke, Systemic or Venous Thromboembolism

Machine Learning to Predict Outcomes in Patients with Acute Pulmonary Embolism Who Prematurely Discontinued Anticoagulant Therapy

Damián Mora
1   Department of Internal Medicine, Hospital Virgen de la Luz, Cuenca, Spain
José A. Nieto
1   Department of Internal Medicine, Hospital Virgen de la Luz, Cuenca, Spain
Jorge Mateo
2   Neurobiological Research Group, Institute of Technology, Universidad de Castilla-La Mancha, Cuenca, Spain
Behnood Bikdeli
3   Cardiovascular Medicine Division, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States
4   Yale/YNHH Center for Outcomes Research and Evaluation, New Haven, Connecticut, United States
5   Cardiovascular Research Foundation (CRF), New York, New York, United States
6   Clinic of Angiology, University Hospital Zurich, Zurich, Switzerland
7   Center for Thrombosis and Hemostasis, University Hospital Mainz, Mainz, Germany
Javier Trujillo-Santos
8   Department of Internal Medicine, Hospital General Universitario Santa Lucía, Universidad Católica de Murcia, Murcia, Spain
Silvia Soler
9   Department of Internal Medicine, Hospital Olot i Comarcal de la Garrotxa, Gerona, Spain
Llorenç Font
10   Department of Haematology, Hospital de Tortosa Verge de la Cinta, Tarragona, Spain
11   Faculty of Medicine, University Cardiology Clinic, Skopje, Republic of Macedonia
Manuel Monreal
12   Department of Internal Medicine, Hospital Germans Trias i Pujol, Badalona, Barcelona, Spain
13   Department of Medicine, Universidad Católica de Murcia, Murcia, Spain
the RIETE Investigators
Funding The sponsors of RIETE had no role in study design, data collection, data analysis, data interpretation, or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication. B.B. was supported by the National Heart, Lung, and Blood Institute, National Institutes of Health (NIH), through grant number T32 HL007854. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Dr. Bikdeli reports that he has been a consulting expert (on behalf of the plaintiff) for litigation related to a specific type of IVC filter. The current study is the idea of the investigators and has not been performed at the request of a third party.


Background Patients with pulmonary embolism (PE) who prematurely discontinue anticoagulant therapy (<90 days) are at an increased risk for death or recurrences.

Methods We used data from the RIETE (Registro Informatizado de Pacientes con Enfermedad TromboEmbólica) registry to compare the prognostic ability of five machine-learning (ML) models and logistic regression to identify patients at increased risk for the composite of fatal PE or recurrent venous thromboembolism (VTE) within 30 days after discontinuation. The ML models were decision tree, k-nearest neighbors, support vector machine, ensemble, and neural network (NN). A “full” model with 70 variables and a “reduced” model with 23 variables were analyzed. Model performance was assessed with confusion-matrix metrics on the testing data for each model and with a calibration plot.
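The two evaluation tools named in the Methods, confusion-matrix metrics and a calibration plot, can be illustrated with a minimal sketch. This is not the authors' code: the function names `confusion_metrics` and `calibration_bins`, the 0.5 decision threshold, the equal-width binning, and the toy data are all assumptions made for illustration, with the outcome coded 1 for the composite endpoint and 0 otherwise.

```python
from typing import Dict, List, Sequence, Tuple


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for a binary outcome (1 = event, 0 = no event)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events flagged as events
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-events flagged as non-events
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-events flagged as events
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events that were missed

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, len(pairs)),
    }


def calibration_bins(
    y_true: Sequence[int], y_prob: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Per non-empty equal-width bin: (mean predicted risk, observed event rate, count).

    Plotting observed rate against mean predicted risk gives a calibration plot;
    a perfectly calibrated model lies on the diagonal.
    """
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for t, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((t, p))
    rows = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(t for t, _ in members) / len(members)
            rows.append((mean_pred, observed, len(members)))
    return rows


# Toy data (hypothetical, not from RIETE): 10 held-out patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.6, 0.1, 0.2, 0.1, 0.05, 0.3, 0.2]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
rows = calibration_bins(y_true, y_prob, n_bins=2)
```

With a rare outcome such as the one studied here, accuracy alone can look high even when few events are detected, which is one reason to report sensitivity and the predictive values alongside it.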

Results Among 34,447 patients with PE, 1,348 (3.9%) discontinued therapy prematurely. Of these, 51 (3.8%) developed fatal PE or sudden death and 24 (1.8%) had nonfatal VTE recurrences within 30 days after discontinuation. ML-NN was the best method for identifying patients who experienced the composite endpoint, with an area under the receiver operating characteristic (ROC) curve of 0.96 (95% confidence interval [CI]: 0.95–0.98) using either 70 or 23 variables captured before discontinuation. Similar numbers were obtained for sensitivity, specificity, positive predictive value, negative predictive value, and accuracy. The discrimination of logistic regression was inferior (area under the ROC curve, 0.76 [95% CI: 0.70–0.81]). Calibration plots showed similar deviations from the perfect line for ML-NN and logistic regression.
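The area under the ROC curve and its confidence interval can be sketched as follows. The abstract does not state how the 95% CI was derived, so the percentile bootstrap used here is an assumption, as are the function names `roc_auc` and `bootstrap_auc_ci`, the number of resamples, the seed, and the toy data. The sketch assumes outcomes coded 1 (event) and 0 (no event).

```python
import random
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen event has a higher score
    than a randomly chosen non-event, with ties counted as one half
    (the Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    y_true: Sequence[int],
    y_score: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs: List[float] = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        yt = [y_true[i] for i in idx]
        if 0 < sum(yt) < n:  # skip resamples that contain only one class
            aucs.append(roc_auc(yt, [y_score[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(int((1 - alpha / 2) * n_boot), n_boot - 1)]
    return lower, upper


# Toy data (hypothetical, not from RIETE): 3 events among 10 patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.3, 0.6, 0.1, 0.2, 0.1, 0.05, 0.3, 0.2]

auc = roc_auc(y_true, y_score)
ci_low, ci_high = bootstrap_auc_ci(y_true, y_score)
```

The pairwise loop is quadratic in sample size and is written for readability; a rank-based implementation would be preferable for cohorts as large as the one described.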

Conclusion The ML-NN method predicted the composite outcome after premature discontinuation of anticoagulation with high discrimination and outperformed traditional logistic regression.

* A full list of the RIETE investigators is given in Supplementary Appendix A.


Publication History

Received: 03 November 2020

Accepted: 08 June 2021

Accepted Manuscript online: 09 June 2021

Article published online: 13 July 2021

© 2021. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany
