DOI: 10.1055/s-0044-1787185
Comparing Clinician Estimates versus a Statistical Tool for Predicting Risk of Death within 45 Days of Admission for Cancer Patients
Funding Supported in part by the National Institutes of Health/National Cancer Institute Cancer Center Support grant P30 CA008748.
Abstract
Objectives While clinical practice guidelines recommend that oncologists discuss goals of care with patients who have advanced cancer, fewer than 20% of individuals admitted to the hospital with high-risk cancers are estimated to have end-of-life discussions with their providers. Although there has been interest in developing mortality prediction models to trigger such discussions, few studies have examined how such models compare with clinical judgment in determining a patient's mortality risk.
Methods This study is a prospective analysis of 1,069 solid tumor medical oncology hospital admissions (n = 911 unique patients) from February 7 to June 7, 2022, at Memorial Sloan Kettering Cancer Center. Electronic surveys were sent to hospitalists, advanced practice providers, and medical oncologists on the first afternoon following a hospital admission, asking them to estimate the probability that the patient would die within 45 days. Provider estimates of mortality were compared with those from a predictive model developed using a supervised machine learning methodology that incorporated routine laboratory, demographic, biometric, and admission data. Area under the receiver operating characteristic curve (AUC), calibration, and decision curves were compared between clinician estimates and model predictions.
Results Within 45 days following hospital admission, 229 (25%) of 911 patients died. The model performed better than the clinician estimates (AUC 0.834 vs. 0.753, p < 0.0001). Integrating clinician predictions with the model's estimates further increased the AUC to 0.853 (p < 0.0001). Clinicians overestimated risk whereas the model was extremely well-calibrated. The model demonstrated net benefit over a wide range of threshold probabilities.
Conclusion The inpatient prognosis at admission model is a robust tool to assist clinical providers in evaluating mortality risk, and it has recently been implemented in the electronic medical record at our institution to improve end-of-life care planning for hospitalized cancer patients.
Background and Significance
Discussions between oncology patients and clinicians regarding end-of-life care play an important role in guiding informed decision-making related to palliative services, treatment cessation, hospice, and other aspects of advance care planning.[1] Although the importance of having these discussions is widely recognized, the majority of inpatients with cancer do not have advance directives completed on admission, and fewer than 20% of high-risk cancer patients discuss their wishes in advance with their outpatient oncologists.[1] [2] [3] Having an accurate assessment of a patient's prognosis is one of the challenges faced by providers when initiating these difficult conversations.[4]
Several studies have found that clinical providers are accurate in estimating patient survival less than one-fourth of the time, when accuracy is defined as a prediction within 33% of the time the patient actually survived.[5] [6] Although the majority of studies show that oncologists overestimate how long a patient will survive, there is substantial variability in the literature, and some studies find that clinicians give more pessimistic estimates relative to actual survival outcomes.[5] [7] [8] [9] While the degree of predictive inaccuracy has not been shown to depend on clinical experience or the number of years in practice,[7] [10] there is evidence that a close doctor–patient relationship is associated with lower prognostic accuracy.[5]
Prediction models offer significant advantages over clinician-based assessments because the prognostic capabilities of individual clinicians vary significantly and because statistical tools are able to integrate a large number of factors and apply them consistently and without subjectivity.[11] Variables that have previously demonstrated strong signal with respect to survival across a range of cancers include patient demographics, tumor characteristics,[12] and laboratory findings.[13] Machine learning (ML) algorithms based on electronic health record data are an emerging methodology that may complement clinical decision-making in predicting cancer mortality[14] [15] as well as other illnesses[16] [17] [18] and prompt end-of-life discussions between patients and their providers.
Objectives
We developed a regression analysis model to predict mortality risk over the subsequent 45 days in solid tumor cancer patients using routine laboratory, demographic, biometric, and admission data at the time of hospital admission. This time interval was selected because patient outcomes are not easily predicted by clinical providers over this intermediate-term time interval, yet end-of-life planning and interventions are still feasible. We compared the model's performance with the judgement of inpatient and outpatient providers identified as being part of the patient's care team using an automated survey-based approach completed within 24 hours of hospital admission. We also assessed whether patient or provider characteristics correlated with clinician risk estimates and performed decision curve analysis to evaluate the clinical utility of using the prediction model in practice.
Methods
Study Design and Outcomes
The purpose of this study was to examine whether clinical judgment or a predictive model utilizing routine laboratory, demographic, biometric, and admission data would better predict intermediate-term clinical outcomes of solid tumor oncology patients at the time of hospital admission. In the prospective phase of this study, the cohort consisted of adult solid tumor patients admitted to the Department of Medicine at Memorial Sloan Kettering Cancer Center (MSKCC) between February 7 and June 7, 2022. Surgical, pediatric, hematologic, and neurology patients were excluded because these patients were expected to have different mortality risks and disease trajectories than adult solid tumor medicine patients.
Each patient's clinical providers were surveyed by email to assess their ability to predict patient mortality within 45 days of admission. The primary endpoint used to compare the accuracy of clinician and model predictions was death of the patient within 45 days of the admission. The clinician and model predictions were compared with actual patient outcomes at this time point in the retrospective phase of the study. The study was determined to be exempt by the institutional review board (IRB) of MSKCC under study protocol X21-030 A(3), which compared clinicians' estimates with the statistical model's predictions of patient mortality. A second protocol (IRB #18-491A, LABMED WA-005-22) was submitted to compare the predictions with actual patient outcomes after 45 days.
Clinician Surveys
At MSKCC, a summary of a patient's providers is automatically generated from activity in the patient's chart, such as clicks, orders placed, documents generated, and medications ordered. This tool is called ROSTR (Real-time Online Summary of Team Resources), and in this study it was used to identify a patient's primary care team members within 24 hours of admission. Consents for participation and surveys were emailed to the providers overseeing a patient's clinical care on the first afternoon following admission. We collected information regarding the provider's role in patient care (hospitalist, nurse practitioner or physician assistant, inpatient medical oncologist, outpatient primary oncologist); the duration of time the clinical provider had known the patient (1 day, less than a week, less than a month, less than a year, greater than a year, or do not know them); and the mortality prediction at 45 days following admission (likelihood of dying within 45 days of this admission, entered as a percentage using a slider) prospectively in real time following patient admission. The survey data were compiled using a Research Electronic Data Capture (REDCap) database, a secure platform for designing clinical research surveys.
Model Development
To build a model to predict inpatient mortality within 45 days of admission, we extracted retrospective data from our institutional data warehouse: biometric (height, weight, body surface area, body mass index [BMI]), demographic (age, sex), admission history (days since the most recent Memorial Sloan Kettering admission and the number of admissions in the prior 2 months), and the most recent laboratory test results within 24 hours of admission (comprehensive metabolic panel analytes, complete blood count [CBC] with differential analytes, magnesium, phosphorus, prothrombin time, international normalized ratio, activated partial thromboplastin time, and lactate dehydrogenase). We chose the first 24 hours of admission for model prediction because we expected to incorporate the model into the electronic medical record to help guide clinical decision-making in real time.
When building the model and making predictions on patients, records with insufficient data (either missing patient identifiers or unavailable CBC or complete metabolic panel results) were excluded. Primarily, we used continuous values for laboratory results. However, for laboratory tests that demonstrated a nonlinear relationship with mortality risk, we also added categorical values (very low, low, normal, high, very high). Individual missing numeric laboratory values were imputed using the median of the admitted solid tumor patient population at MSKCC.
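To make the preprocessing concrete, the following is a minimal R sketch of the two steps described above (median imputation and categorical binning for nonlinear labs). The data frame name, the reference median, and the sodium cutoffs are hypothetical placeholders for illustration, not the institutional values used by the authors.

```r
# Impute an individual missing numeric laboratory value with the median of
# the admitted solid tumor population (reference_median supplied externally)
impute_median <- function(x, reference_median) {
  x[is.na(x)] <- reference_median
  x
}

# For labs with a nonlinear relationship to mortality, also add a categorical
# version (very low / low / normal / high / very high)
bin_lab <- function(x, very_low, low, high, very_high) {
  cut(x, breaks = c(-Inf, very_low, low, high, very_high, Inf),
      labels = c("very_low", "low", "normal", "high", "very_high"))
}

# `labs` is an assumed data frame of admission laboratory results;
# 138 and the bin edges below are illustrative only
labs$sodium     <- impute_median(labs$sodium, 138)
labs$sodium_cat <- bin_lab(labs$sodium, 120, 130, 145, 155)
```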
A logistic regression model that uses the least absolute shrinkage and selection operator (lasso) was developed. Using a supervised ML methodology, the lasso lambda parameter was iteratively tuned to maximize performance. The modeling dataset was divided into a training cohort (n = 14,591 admissions between 2017 and 2018), a validation cohort (n = 3,945 admissions during the first half of 2019), and a test cohort (n = 4,015 admissions during the second half of 2019). We used time-based partitions to help account for changes in cancer treatment and survival rates over time. Eligibility criteria included all adult patients admitted to solid tumor medicine services at MSKCC. Patients admitted to surgical, pediatric, hematologic, and neurology services were excluded based on the expectation that these patients might have different mortality risks and disease trajectories.
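The sketch below illustrates this training procedure in R with the glmnet package (lasso-penalized logistic regression) and lambda tuning on the chronologically later validation cohort. The object names (x_train, y_train, and so on) and the AUC-based selection rule are assumptions made for illustration; they are not the authors' production code.

```r
library(glmnet)   # lasso-penalized logistic regression
library(pROC)     # AUC calculation

# x_train, x_valid, x_test: numeric feature matrices (labs, demographics,
# biometrics, admission history); y_*: 1 = death within 45 days, 0 = alive
fit <- glmnet(x_train, y_train, family = "binomial", alpha = 1)  # alpha = 1 -> lasso

# Tune lambda by scoring every candidate on the later (validation) time period
p_valid   <- predict(fit, newx = x_valid, type = "response")     # one column per lambda
valid_auc <- apply(p_valid, 2, function(p) as.numeric(auc(y_valid, p)))
best_lambda <- fit$lambda[which.max(valid_auc)]

# Score the held-out test period with the selected lambda
p_test <- predict(fit, newx = x_test, s = best_lambda, type = "response")
as.numeric(auc(y_test, as.numeric(p_test)))
```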
Although the logistic model with lasso was selected due to its strong performance and the interpretability of such models, two tree-based ML algorithms, XGBoost (extreme gradient boosted decision trees) and random forest (random decision trees), were also evaluated using the same training and validation cohorts, with performance assessed by comparing the areas under the curve (AUCs) at 45 days. The logistic regression model with lasso then underwent additional validation in the clinical trial comparing its performance with the assessments of clinical providers, as described in the Study Design and Outcomes and Results sections of this article. The model was developed using R version 3.6.3 ( http://www.r-project.org/ ).
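For completeness, a brief sketch of how the tree-based comparators could be fit and scored on the same validation cohort is shown below; it reuses the assumed x_train/x_valid and y_train/y_valid objects from the previous sketch, and the hyperparameter values are illustrative defaults rather than the tuned settings.

```r
library(xgboost)
library(randomForest)
library(pROC)

# Gradient boosted trees (hyperparameters illustrative only)
xgb <- xgboost(data = x_train, label = y_train, objective = "binary:logistic",
               nrounds = 200, max_depth = 4, eta = 0.1, verbose = 0)
auc_xgb <- as.numeric(auc(y_valid, predict(xgb, x_valid)))

# Random forest classifier; outcome coded as a factor
rf <- randomForest(x = x_train, y = factor(y_train), ntree = 500)
auc_rf <- as.numeric(auc(y_valid, predict(rf, x_valid, type = "prob")[, "1"]))

c(xgboost = auc_xgb, random_forest = auc_rf)   # compare with the lasso model's AUC
```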
Statistical Methods
We assumed AUCs close to 0.75, an event rate of 25%, and a correlation between clinician and model predictions of 0.5, and calculated that a sample size of 1,000 patients would yield a confidence interval for the difference between AUCs of approximately ±0.05. We estimated that we would accrue at least 1,000 patients in 4 months and stopped the study once 4 months had passed. The primary endpoint was death within 45 days of admission. All patients whose deaths were known to have occurred within 45 days were included; however, individuals with a final follow-up less than 45 days after admission were excluded because we could not establish accuracy for clinician or model predictions. Patients could have multiple admissions and more than one clinician risk prediction per admission (if multiple care team members replied to their surveys for the same admission). As most patients had only a single admission and evaluation, and to avoid the problem of correlated observations, we evaluated only the patient's first admission in the primary analysis, and when there were multiple clinician predictions for a given admission, one survey was selected randomly. As a sensitivity analysis, we repeated our analyses on the full dataset.
A linear regression model was created to assess whether clinicians were influenced by patient characteristics when estimating the risk of death within 45 days. Clinician-predicted risk was the dependent variable; race, ethnicity, sex, primary language, marital status, age, and religion were the predictors, with admitting service and model-predicted risk included as covariates to reflect health status. We then repeated this analysis to evaluate whether clinical provider type or duration of the relationship with the patient impacted risk estimates, using the prediction from the first analysis as a covariate.
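The structure of this adjustment is sketched below in R (the analyses themselves were performed in Stata); the data frame and variable names are illustrative assumptions, with p_clin denoting the clinician-predicted risk and p_model the model-predicted risk.

```r
# Clinician-predicted risk regressed on patient characteristics, adjusting for
# admitting service and model-predicted risk as a proxy for health status.
bias_fit <- lm(p_clin ~ race + ethnicity + sex + language + marital_status +
                 age + religion + service + p_model, data = d)
summary(bias_fit)   # each coefficient = adjusted difference in predicted risk
```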
We then calculated discrimination (as AUC), calibration, and net benefit separately for clinician estimates and model estimates, and assessed whether discrimination was improved by combining model and clinician estimates. The 95% confidence interval (CI) for the difference between clinician and model estimates was calculated by bootstrapping with 5,000 resamples.[19] Net benefit was calculated across the full range of threshold probabilities.[20] [21] Finally, we compared discrimination and calibration between different types of clinicians and estimated the size of the difference in predictions where more than one clinician gave a risk estimate. All statistical analyses were conducted using Stata version 17.0 (Stata Corp., College Station, Texas, United States).
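As an illustration of the bootstrap comparison and of one simple way the two estimates could be combined, the following R sketch assumes a data frame d with one row per patient and columns died45 (1 = death within 45 days), p_model, and p_clin; the actual analyses were performed in Stata, and the combination model shown here is an illustrative approach rather than the authors' exact specification.

```r
library(pROC)

# Paired bootstrap (5,000 resamples) for the difference in AUC
set.seed(20220607)
auc_diff <- replicate(5000, {
  b <- d[sample(nrow(d), replace = TRUE), ]
  as.numeric(auc(b$died45, b$p_model)) - as.numeric(auc(b$died45, b$p_clin))
})
quantile(auc_diff, c(0.025, 0.975))   # 95% CI for the AUC difference

# One way to assess whether combining the estimates improves discrimination:
# refit a logistic model on both predictions and compute the combined AUC.
combined <- glm(died45 ~ p_model + p_clin, data = d, family = binomial)
as.numeric(auc(d$died45, predict(combined, type = "response")))
```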
Results
In the 4-month survey period, 1,771 unique patients had 2,287 solid tumor hospital admissions, resulting in 6,363 email surveys to clinicians. The overall response rate was 31% (including a small proportion of clinicians who responded but subsequently declined to participate) and 86% of surveys were completed within 24 hours of receipt. Only survey responses completed within 24 hours of admission were used to capture the clinician's initial assessments of the patient—this best ensured a fair comparison with the model, which was also generating its prediction within the first 24 hours.
The 1,689 surveys completed within 24 hours represented 1,292 admissions and 1,095 unique patients. Admissions were excluded from the final dataset if there were errors with the survey or if patients were lost to follow-up (n = 154 admissions). Hospital admissions were also excluded if the model failed to generate a risk score due to incomplete patient data (n = 69 admissions). The remaining 1,069 admissions (n = 911 patients) were used for further analysis. Within this group, 661 patients had one clinician prediction survey, 219 patients had two provider surveys, and 31 patients had three completed clinical surveys ([Fig. 1]).


The demographic data are provided in [Table 1]. The average age for the 911 patients included in the study was 65 years, and the group was composed of 43% males and 57% females. The study population was 12% African American, 11% Asian American, and 69% white. Patients of Hispanic ethnicity comprised 8.7% of the cohort. Other covariates included marital status, religion, and language. Statistical analysis of how patient demographics impacted clinician risk predictions is included in [Table 2]. Clinical providers estimated a higher risk of death by 6% for African American and 8% for Asian American patients relative to white patients. Non-English-speaking patients were given a 7% higher risk of death relative to English-speaking patients. Other factors, including age, sex, Hispanic ethnicity, religion, and marital status, did not have a statistically significant association with clinicians' risk predictions. Clinician risk estimates did not differ significantly by type of clinician or by length of the patient–provider relationship ([Table 2]).
Note: Data are presented as median (quartiles) or frequency (percentage).
Abbreviations: APP, advanced practice provider; CI, confidence interval.
Note: For patient characteristics, the coefficient is the difference in risk given to patients with that characteristic, after multivariable adjustment. For clinicians, the coefficient is the difference given by providers with that characteristic after multivariable adjustment.
The top 20 numeric data features identified by the regression analysis model are shown in [Fig. 2]. The x-axis represents both the direction and magnitude of the coefficients. Negative risk coefficients are "protective," meaning that a higher value decreases the risk of mortality; these features are shown in green. Positive risk coefficients confer risk, meaning that a higher value increases the risk of mortality; these features are shown in red.


Major influences on the model include abnormalities in blood electrolytes (such as calcium, potassium, sodium, chloride, and the anion gap), biomarkers of renal status (including blood urea nitrogen and creatinine), hematologic parameters (including the mean corpuscular volume, mean corpuscular hemoglobin, red cell distribution width, eosinophils, platelets, neutrophils, and lymphocytes), liver enzymes (including both aspartate aminotransferase and alanine aminotransferase), and indicators of cachexia and poor health (including albumin, alkaline phosphatase, body surface area, and number of nonelective admissions in the prior 60 days). The pattern of biochemical abnormalities identified by the model to predict patient mortality after hospitalization is indicative of broad dysfunction across multiple organ systems. In other words, a patient with compromised renal and liver function, along with abnormal hematologic parameters, who is also becoming cachectic may have small changes in numerous laboratory tests on admission; the model is able to identify this subtle pattern indicative of poor prognosis whereas a clinician may not.
One of the major findings of this study is that the statistical model outperformed clinical providers, with an AUC of 0.834 versus 0.753 for clinicians; this difference was statistically significant by bootstrapping, with a 95% CI of 0.036 to 0.125 (p < 0.0001). Combining the risk of death at 45 days predicted by the statistical model with clinical judgement yielded a statistically significant (p < 0.0001) improvement of the AUC to 0.853. The AUC boost from a combined clinician–model approach is important because the prognostic model implemented in the electronic medical record is intended to complement the clinical judgement of providers.
As a sensitivity analysis, the AUC calculations were computed using the entire dataset, including instances when more than one clinician gave a risk estimate or when a patient had multiple admissions during the study, and similar results were found (AUCs of 0.829 for the model vs. 0.761 for clinicians). The model was well calibrated relative to clinician estimates, with clinicians generally overestimating risk ([Fig. 3A, C]). That said, the AUC boost mentioned above from a combined clinician–model approach suggests that clinicians are using prognostic information that is not captured by the model.


When subgroups of clinical providers were compared, outpatient medical oncologists were able to predict mortality more accurately relative to other clinicians (AUC 0.804 vs. 0.744). However, this observation may be confounded by the length of the relationship with the patient (AUC 0.804 for clinicians with a relationship with the patient for a week or more, compared with 0.728 for clinicians who had known the patient for less than a week). In general, calibration remained extremely poor for clinician estimates, even when restricting to clinicians with a longer relationship to the patient ([Fig. 3C]), as clinicians consistently overestimated risk of death relative to actual patient outcomes in our study.
The decision curve comparing the predictive model and clinical prediction is shown in [Fig. 4]. The predictive model depicted in orange has a positive net benefit across most of the range of threshold probabilities except for a limited set of end-of-life decisions requiring probabilities at or above 85%. In contrast, the clinician predictions illustrated with the green line have net harm for any decision that requires at least a 50% risk. The net benefit for the model is always higher than that for clinician prediction, suggesting that utilization of the statistical model would result in a net increase in the proportion of patients directed toward end-of-life care planning relative to clinician prediction.
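For readers unfamiliar with decision curves, the quantity plotted is net benefit as defined by Vickers and Elkin,[20] NB(pt) = TP/n − FP/n × pt/(1 − pt) at threshold probability pt. The short R sketch below shows how such a curve could be computed for the model and clinician predictions; it reuses the assumed data frame d introduced earlier and is illustrative rather than the authors' Stata code.

```r
# Net benefit at a single threshold probability pt
net_benefit <- function(pred, outcome, pt) {
  n  <- length(outcome)
  tp <- sum(pred >= pt & outcome == 1)   # flagged and died within 45 days
  fp <- sum(pred >= pt & outcome == 0)   # flagged but survived
  tp / n - fp / n * pt / (1 - pt)
}

# Evaluate across a range of thresholds for both prediction sources
thresholds <- seq(0.05, 0.95, by = 0.05)
nb_model <- sapply(thresholds, net_benefit, pred = d$p_model, outcome = d$died45)
nb_clin  <- sapply(thresholds, net_benefit, pred = d$p_clin,  outcome = d$died45)
```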


Integrating this tool into the electronic health record so it can provide timely, actionable information to clinical providers has been a primary goal for this project. As shown in [Fig. 5], patients are stratified into low-risk, high-risk, and very-high-risk categories. The electronic medical record tool shows the patient's qualitative risk category, the percent risk of death within 45 days, and suggested order sets, such as pain management, supportive care, hospice education, and social work consults. In addition, an email is sent to the attending doctor prompting the provider to use the advanced illness support order set and arrange a goals-of-care conversation with high-risk and very-high-risk patients within 48 hours. Providers currently have full discretion over whether to integrate their clinical intuition with the results provided by the model; the intent is for the model to serve as a decision support tool that informs, rather than supplants, clinical judgement.
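Conceptually, the mapping from predicted probability to the qualitative categories displayed in the EHR can be expressed as a simple cut of the risk score, as in the hypothetical R sketch below; the cutoffs shown are placeholders and are not the thresholds used in the production tool.

```r
# Map a predicted 45-day mortality probability to a display category
# (0.20 and 0.45 are illustrative cutpoints only)
risk_category <- function(p) {
  cut(p, breaks = c(-Inf, 0.20, 0.45, Inf),
      labels = c("low risk", "high risk", "very high risk"))
}

risk_category(c(0.05, 0.30, 0.70))   # -> low risk, high risk, very high risk
```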


Discussion
We found that a statistical prediction tool was significantly more accurate than clinician predictions of 45-day mortality in adult patients with advanced solid tumor cancers. Decision curve analysis demonstrated that using the model to guide decision-making would lead to better clinical outcomes than using clinician predictions. Although one might expect that cancer staging and performance status would be major drivers of patient prognosis, tumor registry data can be several years old by the time a patient is admitted. This study found that combining subtle changes in multiple laboratory, demographic, and admission data features can be predictive of patient survival without the additional components of cancer stage, diagnosis, performance status, and BMI.
The integration of this predictive model into our electronic health record and its acceptance by clinical staff have been a significant undertaking. Prior to implementation, presentations were held with service chiefs and other major stakeholders to educate clinical staff about the appropriate use of this predictive model. In addition, we designed a page on the hospital SharePoint site with a video on how to use this Prognosis at Admission prediction tool, along with descriptions of the top features used to make predictions, to increase transparency and understanding and to facilitate use of this clinical tool. We have also created a monitoring dashboard to evaluate whether there is any drift in the model's performance in real time.
Building a statistical model that uses logistic regression rather than a more complex ML model may have been advantageous in facilitating model interpretability, because the prognostic tool uses laboratory findings and demographics that are already familiar to clinicians. While more complex ML models such as neural nets and tree-based models can have better performance, they can also be more challenging to interpret. Performing this pilot study comparing the performance of the model relative to clinician judgement, building a model based on information routinely used by clinical staff, and having web-based resources available on an ongoing basis have helped facilitate understanding of the model's output.
While most patients with advanced cancers do not discuss their end-of-life care preferences with their providers, a recent study found that 87% of patients support a policy of having admitting physicians initiate a conversation about advance directives upon hospital admission.[2] One of the barriers to initiating these discussions is that clinicians are not able to accurately predict patient survival.[5] [7] [8] [9] [22] [23] The incorporation of laboratory data to evaluate patient prognosis may improve the reliability of mortality estimates[13] [14] [15] [24] [25] and the accuracy of subjective evaluation.
In addition to the humanistic benefit of more accurately predicting intermediate-term mortality, there are economic implications as well. A recent study using a ML tool embedded into the electronic health record system at the University of Pittsburgh found a sustained improvement in palliative care practices and a doubling of the number of goals-of-care conversations between patients and their providers.[26] Patients with advance directives are more likely to use hospice care rather than die in an intensive care unit, which is also associated with better quality of care for both patients and their caregivers[27] and significant health care-related cost savings.[28]
Predictive models have the advantage of being able to integrate numerous biochemical abnormalities across multiple organ systems, as well as other patient variables, in a reproducible and quantitative way. In our study population, the clinicians tended to overestimate risk relative to real-life outcomes, which is evident from the clinician calibration curve falling below the solid diagonal line that represents perfect concordance at higher predicted probabilities ([Fig. 3A]). We also observed that clinical providers use rough estimates that are inherently imprecise, with mortality predictions clustered along quartiles and deciles ([Supplementary Fig. S1] [available in the online version]). Moreover, we observed a lack of concordance between clinician estimates when comparing surveys completed by two or more providers on the same patients. Only 19% of the paired surveys had a difference in predicted risk of mortality of less than 5%, while nearly half (43%) had a risk difference greater than 20% ([Supplementary Fig. S2] [available in the online version]). Incorporating a regression model that prompts clinical staff to initiate end-of-life and palliative care order sets is advantageous because it can reduce the subjectivity inherent in relying entirely on physician estimates of patient mortality risk.
One potential weakness of this study is that the response rate of 31% raises the possibility of nonresponse bias. However, unlike the classic examples of nonresponse bias, in which the method for obtaining a response makes it easier to obtain a response from some participants than others, there is no obvious mechanism whereby responders would be systematically different from nonresponders. Rather, the low response rate may reflect the inherent difficulty in obtaining responses from clinicians working in busy clinical environments. While some studies have investigated clinician response rates for surveys with no time cutoff, to our knowledge, no studies have specifically established how many clinicians working in the hospital setting should be expected to respond to an email survey within 24 hours. Moreover, methods used in the literature to increase response rates, such as sending multiple emails, using paper rather than electronic surveys, following up with telephone calls, and offering monetary incentives,[29] [30] were not feasible: the added time and costs would be prohibitive, it would be nearly impossible to obtain clinician responses within the required 24-hour timeframe using these methods, and it is unclear whether these efforts would have yielded more valid results.[31]
Multiple studies have investigated mortality prediction using statistical methods,[12] [14] [15] [24] [25] [26] [32] [33] [34] [35] [36] [37] [38] and our work validates the utility of these predictive tools in an acute cancer care hospital focusing on intermediate-term mortality. Our survey methodology of predicting the probability of death within a set intermediate timeframe is unique, as many of the studies that investigate the accuracy of clinical prediction use categorical scoring systems, specific temporal estimates,[9] [10] or the "surprise question" asking providers "Would I be surprised if this patient died in the next year?"[12] [33] [39] [40] Our automated survey-based approach ([Supplementary Fig. S3] [available in the online version]) enabled us to send surveys to many more clinicians than might otherwise have been possible given time and resource constraints. We hypothesize that the simplicity of our survey design, requiring just three multiple-choice responses and one sliding-scale response, minimized the time burden for clinicians and was thereby a factor in achieving our response targets. Our survey approach also yields descriptive information such as role in patient care and length of the patient–provider relationship, while minimizing the potential for recall bias by stipulating survey completion within 24 hours of admission as an inclusion criterion.
Although we cannot prove that our training dataset was completely free of all social biases, our analysis suggests that the predictive model may be less subject to bias relative to mortality estimates performed by clinical providers, who gave African American and Asian American patients higher risks of death than White patients, and non-English-speaking patients higher risks of death than English-speaking patients. This study suggests that while predictive models can undoubtedly incorporate bias and exacerbate disparities,[41] they may also be used to counteract bias and alleviate disparities. The inpatient prognosis prediction model provides a robust tool to assist clinical providers in evaluating mortality risk and this study was an important step in launching the model in a clinical inpatient setting.
In parallel with this model, we have also created a monitoring dashboard to track its performance, investigate its stability across demographic groups, and observe whether the major categorical and numeric features comprising the model drift over time ([Supplementary Fig. S4] [available in the online version]). Using this tool and comparing current performance with historical data over the past 5 years, we have found that model performance has been extremely stable. We have also been evaluating the baseline mortality rate over time, because changes due to new cancer therapies could impact model performance, and found that this metric has also been stable for the duration of the study.
Conclusion
Future directions for this work include evaluating the model performance over a longer time interval to determine how incorporating the model impacts patient care. For instance, the ability to predict life expectancy within a 6-month time interval may help patients qualify for hospice care benefits from Medicare and other insurers.[42] Model performance in a larger and more diverse patient cohort also requires evaluation, and further exploration of clinician biases through surveys or qualitative interviews of patients and providers would be insightful. Another important downstream consideration is how clinicians incorporate the model's predictions into their clinical practice. The model levels the field so that every clinician has the same information about the patient's prognosis, regardless of skill or clinical experience. Evaluating how this information is used to care for our patients and whether services are provided equally across demographic groups is another important future direction.
Generalizing this model to additional clinical services and other health care environments is another promising future direction that may be feasible, since the clinical and demographic features underlying the model are not specific for cancer patients. Understanding whether clinicians perform better at predicting deaths from disease progression relative to secondary issues such as infections and other complications would also be of interest. Development of a model that predicts survival in outpatients might help identify additional individuals who could benefit from end-of-life supportive care and resources.
Clinical Relevance Statement
We have built a statistical model that uses laboratory and admission data to predict mortality risk in cancer patients. It has been implemented to assist clinical providers in guiding patients toward end-of-life care planning.
Multiple Choice Questions
- If the statistical model described in the article predicts that a patient is at high risk and the clinical provider believes the patient has a low probability of dying at 45 days, then the best practice is to:
  - (a) Allow the predictive model to initiate an advanced illness support order set because the model was shown to be more accurate than clinical judgement. Initiating the order set will save time for clinical providers with minimal risk for patients.
  - (b) Allow the clinician to decide whether the advanced illness support order set and goals-of-care conversations are appropriate in light of the statistical model prediction and their own judgement.
  - (c) Suppress the information provided by the statistical model because conflicting perspectives will confuse other members of the clinical team as well as the patient and their family.
  - (d) Assume the statistical model is correct because clinical providers are more likely to underestimate mortality risk, so patients who would benefit from advanced illness support orders and goals-of-care conversations might be missed.

Correct Answer: The correct answer is option b. The statistical model is best used to assist providers by helping them integrate complex data, but this tool is not intended to supplant clinical judgment.
- An advantage of using a statistical model over a machine learning model is:
  - (a) Mathematical relationships between input variables are clearly defined in a statistical model.
  - (b) Statistical models can handle a greater number of inputs relative to machine learning models.
  - (c) Statistical models can integrate a wide variety of categorical and numerical data inputs whereas the data inputs are more limited for machine learning models.
  - (d) Traditional statistical models outperform clinical judgement more frequently than machine learning models.

Correct Answer: The correct answer is option a. Traditional statistical models define the mathematical relationships between variables, whereas the output from machine learning models is not always clearly explainable.
Conflict of Interest
None declared.
Acknowledgments
Andrew Zarski for support in building the clinical trial online survey. Sarah McCaskey for early clinical guidance. Natalia Summervile for data science expertise and project support. Nicole DiBacco, Brittany Gross-Jolly, Julie Lee, Vanessa Rodriguez, William Rosa, and Kenneth Rosenblatt for their clinical communications expertise. Josiah Chung for work on the model monitoring dashboard. For their work to put this model in production: Christine Fitzpatrick, Kimberly Gould, Gregory Jordan, Bigyan K C, Rashmi Kashyap, Jessica Kochan, Linda Li, Ian Morgan, Reshma Nevrekar, Ron Pearson, Jithin Thomas, Yuri Turin, Surendranatha Reddy Vellipalem, and Everett Weiss.
Protection of Human and Animal Subjects
The study was reviewed by the institutional review board of Memorial Sloan Kettering Cancer Center and is in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects.
* Shared first authorship.
References
- 1 Knutzen KE, Sacks OA, Brody-Bizar OC. et al. Actual and missed opportunities for end-of-life care discussions with oncology patients: a qualitative study. JAMA Netw Open 2021; 4 (06) e2113193
- 2 Dow LA, Matsuyama RK, Ramakrishnan V. et al. Paradoxes in advance care planning: the complex relationship of oncology patients, their physicians, and advance medical directives. J Clin Oncol 2010; 28 (02) 299-304
- 3 Lambden J, Zhang B, Friedlander R, Prigerson HG. Accuracy of oncologists' life-expectancy estimates recalled by their advanced cancer patients: correlates and outcomes. J Palliat Med 2016; 19 (12) 1296-1303
- 4 Deelen J, Kettunen J, Fischer K. et al. A metabolic profile of all-cause mortality risk identified in an observational study of 44,168 individuals. Nat Commun 2019; 10 (01) 3346
- 5 Christakis NA, Lamont EB. Extent and determinants of error in physicians' prognoses in terminally ill patients: prospective cohort study. West J Med 2000; 172 (05) 310-313
- 6 Hui D, Park M, Liu D. et al. Clinician prediction of survival versus the Palliative Prognostic Score: which approach is more accurate?. Eur J Cancer 2016; 64: 89-95
- 7 Cheon S, Agarwal A, Popovic M. et al. The accuracy of clinicians' predictions of survival in advanced cancer: a review. Ann Palliat Med 2016; 5 (01) 22-29
- 8 Glare P, Virik K, Jones M. et al. A systematic review of physicians' survival predictions in terminally ill cancer patients. BMJ 2003; 327 (7408) 195-198
- 9 Chu C, Anderson R, White N, Stone P. Prognosticating for adult patients with advanced incurable cancer: a needed oncologist skill. Curr Treat Options Oncol 2020; 21 (01) 5
- 10 Hui D, Kilgore K, Nguyen L. et al. The accuracy of probabilistic versus temporal clinician prediction of survival for patients with advanced cancer: a preliminary report. Oncologist 2011; 16 (11) 1642-1648
- 11 Liao L, Mark DB. Clinical prediction models: are we building better mousetraps?. J Am Coll Cardiol 2003; 42 (05) 851-853
- 12 Gupta S, Tran T, Luo W. et al. Machine-learning prediction of cancer survival: a retrospective study using electronic administrative records and a cancer registry. BMJ Open 2014; 4 (03) e004007
- 13 Kawai N, Yuasa N. Laboratory prognostic score for predicting 30-day mortality in terminally ill cancer patients. Nagoya J Med Sci 2018; 80 (04) 571-582
- 14 Manz CR, Chen J, Liu M. et al. Validation of a machine learning algorithm to predict 180-day mortality for outpatients with cancer. JAMA Oncol 2020; 6 (11) 1723-1730
- 15 Parikh RB, Manz C, Chivers C. et al. Machine learning approaches to predict 6-month mortality among patients with cancer. JAMA Netw Open 2019; 2 (10) e1915997
- 16 Temple MW, Lehmann CU, Fabbri D. Natural language processing for cohort discovery in a discharge prediction model for the neonatal ICU. Appl Clin Inform 2016; 7 (01) 101-115
- 17 Chopannejad S, Sadoughi F, Bagherzadeh R, Shekarchi S. Predicting major adverse cardiovascular events in acute coronary syndrome: a scoping review of machine learning approaches. Appl Clin Inform 2022; 13 (03) 720-740
- 18 Farion KJ, Wilk S, Michalowski W, O'Sullivan D, Sayyad-Shirabad J. Comparing predictions made by a prediction model, clinical score, and physicians: pediatric asthma exacerbations in the emergency department. Appl Clin Inform 2013; 4 (03) 376-391
- 19 DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44 (03) 837-845
- 20 Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making 2006; 26 (06) 565-574
- 21 Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ 2016; 352: i6
- 22 Selby D, Chakraborty A, Lilien T, Stacey E, Zhang L, Myers J. Clinician accuracy when estimating survival duration: the role of the patient's performance status and time-based prognostic categories. J Pain Symptom Manage 2011; 42 (04) 578-588
- 23 Viganò A, Dorgan M, Bruera E, Suarez-Almazor ME. The relative accuracy of the clinical estimation of the duration of life for patients with end of life cancer. Cancer 1999; 86 (01) 170-176
- 24 Bertsimas D, Dunn J, Pawlowski C. et al. Applied informatics decision support tool for mortality predictions in patients with cancer. JCO Clin Cancer Inform 2018; 2: 1-11
- 25 Manz CR, Parikh RB, Small DS. et al. Effect of integrating machine learning mortality estimates with behavioral nudges to clinicians on serious illness conversations among patients with cancer: a stepped-wedge cluster randomized clinical trial. JAMA Oncol 2020; 6 (12) e204759
- 26 Oo TH. et al. Improved palliative care practices through machine-learning prediction of 90-day risk of mortality following hospitalization. NEJM Catal 2023; 4 (01) 1-20
- 27 Wright AA, Zhang B, Ray A. et al. Associations between end-of-life discussions, patient mental health, medical care near death, and caregiver bereavement adjustment. JAMA 2008; 300 (14) 1665-1673
- 28 Zhang B, Wright AA, Huskamp HA. et al. Health care costs in the last week of life: associations with end-of-life conversations. Arch Intern Med 2009; 169 (05) 480-488
- 29 Cronin S, Li A, Bai YQ. et al. How do respondents of primary care surveys compare to typical users of primary care? A comparison of two surveys. BMC Prim Care 2023; 24 (01) 80
- 30 Wong ST, Hogg W, Burge F, Johnston S, French I, Blackman S. Using the CollaboraKTion framework to report on primary care practice recruitment and data collection: costs and successes in a cross-sectional practice-based survey in British Columbia, Ontario, and Nova Scotia, Canada. BMC Fam Pract 2018; 19 (01) 87
- 31 Hendra R, Hill A. Rethinking response rates: new evidence of little relationship between survey response rates and nonresponse bias. Eval Rev 2019; 43 (05) 307-330
- 32 Cheng L, DeJesus AY, Rodriguez MA. Using laboratory test results at hospital admission to predict short-term survival in critically ill patients with metastatic or advanced cancer. J Pain Symptom Manage 2017; 53 (04) 720-727
- 33 Owusuaa C, van der Padt-Pruijsten A, Drooger JC. et al. Development of a clinical prediction model for 1-year mortality in patients with advanced cancer. JAMA Netw Open 2022; 5 (11) e2244350
- 34 Zhu Z, Li L, Ye Z. et al. Prognostic value of routine laboratory variables in prediction of breast cancer recurrence. Sci Rep 2017; 7 (01) 8135
- 35 Curtin D, Dahly DL, van Smeden M. et al. Predicting 1-year mortality in older hospitalized patients: external validation of the HOMR model. J Am Geriatr Soc 2019; 67 (07) 1478-1483
- 36 van Walraven C, Forster AJ. The HOMR-Now! Model Accurately Predicts 1-Year Death Risk for Hospitalized Patients on Admission. Am J Med 2017; 130 (08) 991.e9-991.e16
- 37 van Walraven C, McAlister FA, Bakal JA, Hawken S, Donzé J. External validation of the Hospital-patient One-year Mortality Risk (HOMR) model for predicting death within 1 year after hospital admission. CMAJ 2015; 187 (10) 725-733
- 38 Wegier P, Kurahashi A, Saunders S. et al. mHOMR: a prospective observational study of an automated mortality prediction model to identify patients with unmet palliative needs. BMJ Support Palliat Care 2021; 14 (e1): e969-e975
- 39 Davis M, Vanenkevort E, Young A. et al. Validation of the surprise question and the development of a multivariable model. J Pain Symptom Manage 2023; 65 (05) 456-464
- 40 Moss AH, Lunney JR, Culp S. et al. Prognostic significance of the “surprise” question in cancer patients. J Palliat Med 2010; 13 (07) 837-840
- 41 Aquino YSJ, Carter SM, Houssami N. et al. Practical, epistemic and normative implications of algorithmic bias in healthcare artificial intelligence: a qualitative study of multidisciplinary expert perspectives. J Med Ethics 2023; (e-pub ahead of print)
- 42 Medicare.gov. 2022. Accessed May 13, 2024 at: https://www.medicare.gov/coverage/hospice-care
Publication History
Received: 15 December 2023
Accepted: 29 April 2024
Article published online: 26 June 2024
© 2024. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany