Key-words:
Coronavirus disease 2019 - differential neutrophil count - logistic regression model
- random blood sugar - receiver operating characteristic curve
Introduction
Coronavirus disease 2019 (COVID-19) is a disease caused by severe acute respiratory
syndrome-coronavirus-2 (SARS-CoV-2). The disease has emerged in more
than 200 countries of the world.[[1]] The demographic profile of the SARS-CoV-2 infection varies widely across age, gender,
and socioeconomic strata. Similarly, the clinical spectrum of the disease encompasses
asymptomatic infection, mild upper respiratory tract illness, severe viral pneumonia
with respiratory failure, and even death.[[2]] There is a continuous search for efficient indicators of disease diagnosis, disease
severity, therapeutic response, and disease outcome. According to the 5th edition
of National Treatment Guidelines, the disease severity of COVID-19 is classified into
four stages on the basis of pulmonary imaging. The present study was undertaken to
develop a predictive model of disease mortality using easily available, cost-effective
blood indicators that include random blood sugar (RBS) and complete blood count. This
can be used as a quick screening tool, and patients with high risk would be evaluated
for more precise indicators of mortality. During the epidemic, this strategy could
be beneficial to triage patients and provide adequate management to these patients.
Patients and Methods
A hospital-based, retrospective, case–control study was designed in the SMS Medical
College and Hospital, Jaipur, to develop a prediction model for mortality risk using
logistic regression analysis in COVID-19 patients. The patients were managed as per
standard protocol of the institute. The study included case records of 23 nonsurvivors
(33%) and 47 survivors (67%) with laboratory-confirmed SARS-CoV-2 infection. The demographic
and laboratory details at the time of admission were collected to create a database.
In case of missing data, the whole observation was removed from the analysis.
The dependent variable was qualitative, either survivor or nonsurvivor. The regressors
(or predictors) included age, gender, presence of symptoms, RBS, and complete blood
count.
The aim of the present study is to develop a predictor model, where the choice of
regressors is not important in the sense that two models based on different regressors
can be equally good in prediction. There are no implications of causality or even
association, which are in the purview of explanatory and causal models.[[3]]
The current scenario necessitates the early development of a prediction model. To
develop a model from a small sample, multiple analyses were performed to extract a
subset of regressors from the full set. The general rule is to have at least ten participants
for each category of regressor.[[4]] The preliminary analysis of the data includes univariate logistic regression analysis
and comparison of means of all regressors in survivor and nonsurvivor groups. The
regressors that showed a significant difference of means between the two groups were
selected, and correlations among them were computed. Of any two regressors that were
significantly correlated, the one with the higher odds ratio (OR) was selected. The top five regressors
(as the sample has 57 patients) having the highest ORs were selected to fit a multivariate
logistic regression model using a step-wise method [[Figure 1]]. The best model was chosen with Akaike information criterion (AIC). The best model
is one with minimum AIC value. The regressors contributing significantly in the prediction
model were used to train a 5-fold cross-validation logistic regression model, and
cutoff probability, area under the receiver operating characteristic (ROC) curve,
sensitivity, specificity, and accuracy were calculated.[[5]]
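The AIC-based comparison described above can be illustrated with a minimal sketch. It assumes that fitted probabilities are already available from each candidate model; the function names and the toy outcomes and probabilities are hypothetical and are not taken from the study data.

```python
import math

def log_likelihood(y_true, p_pred):
    """Bernoulli log-likelihood of binary outcomes given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(y_true, p_pred)
    )

def aic(y_true, p_pred, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood(y_true, p_pred)

# Hypothetical toy outcomes (1 = nonsurvivor, 0 = survivor); not the study records.
y = [1, 1, 0, 0, 0, 1]
# Hypothetical fitted probabilities from two candidate models.
p_model_a = [0.9, 0.8, 0.2, 0.1, 0.3, 0.7]  # intercept + 2 regressors, k = 3
p_model_b = [0.6, 0.6, 0.4, 0.4, 0.5, 0.5]  # intercept + 1 regressor, k = 2

aic_a = aic(y, p_model_a, n_params=3)
aic_b = aic(y, p_model_b, n_params=2)
best = "A" if aic_a < aic_b else "B"
print(f"AIC A = {aic_a:.2f}, AIC B = {aic_b:.2f}, preferred model: {best}")
```

The model with the smaller AIC is preferred; the penalty term 2k discourages adding regressors that do not improve the likelihood enough.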
Figure 1: Flowchart of the study
In the present logistic regression model, the success was defined when dependent variable
took the value nonsurvival (or mortality).
The present study has included 21 regressors that are age (in years), gender (male
or female), presence or absence of symptoms (symptomatic or asymptomatic), RBS in
milligrams per deciliter, hemoglobin (Hb) in grams%, total leukocyte count (TLC) in
10³ cells per cubic millimeter, total red blood cell count in million cells per cubic
millimeter, mean corpuscular volume in femtoliters per cell, mean corpuscular hemoglobin
(MCH) in picograms per cell, mean corpuscular Hb concentration in grams per deciliter,
red blood cell distribution width-coefficient of variation, platelet count (PLT) in
10⁵ per cubic millimeter, packed cell volume in percent, differential neutrophil
count (NPHIL) in percent, differential lymphocyte count (LYMP) in percent, absolute
neutrophil count (ANC) in 10³ cells per cubic millimeter, absolute lymphocyte count
(ALC) in 10³ cells per cubic millimeter, differential monocyte count (MONO) in percent,
absolute monocyte count in 10³ cells per cubic millimeter, differential neutrophil
count-to-differential lymphocyte count ratio (NLR), and ANC-to-absolute lymphocyte
count ratio (ANLR).
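The two derived ratios in the list above can be computed as in the following minimal sketch. The function name and the example values are hypothetical; the inputs are assumed to be in the units stated in the text (percent for differential counts, 10³ cells per cubic millimeter for absolute counts).

```python
def neutrophil_lymphocyte_ratios(nphil_pct, lymp_pct, anc, alc):
    """Return (NLR, ANLR) from differential and absolute counts.

    NLR  = differential neutrophil count / differential lymphocyte count
    ANLR = absolute neutrophil count / absolute lymphocyte count
    """
    if lymp_pct <= 0 or alc <= 0:
        raise ValueError("lymphocyte counts must be positive to form a ratio")
    return nphil_pct / lymp_pct, anc / alc

# Hypothetical patient values, for illustration only.
nlr, anlr = neutrophil_lymphocyte_ratios(nphil_pct=80.0, lymp_pct=16.0, anc=6.4, alc=1.3)
print(f"NLR = {nlr:.2f}, ANLR = {anlr:.2f}")
```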
As the outcome (mortality) in the study was <5% (case fatality rate 2%–3%), we used
OR as a fairly good approximation of relative risk (RR) of death.[[4]],[[6]] To predict the outcome, the test values of the regressors are entered into the prediction
model to obtain a probability. This probability is then compared with the cutoff
probability of the ROC curve. A predicted probability greater than the cutoff
probability favors the outcome (mortality).
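The use of OR as an approximation of RR at the start of the preceding paragraph can be checked numerically. The sketch below uses the standard conversion RR = OR / (1 - p0 + p0 × OR), where p0 is the outcome risk in the reference group. The function name is hypothetical, the OR of 1.4 is used only as an illustration, and the baseline risk of 0.30 is a hypothetical contrast value.

```python
def odds_ratio_to_relative_risk(odds_ratio, baseline_risk):
    """Convert an OR to an RR given the outcome risk in the reference group."""
    return odds_ratio / (1.0 - baseline_risk + baseline_risk * odds_ratio)

# Baseline risks of 2% and 3% follow the case fatality rate quoted in the text;
# 0.30 is a hypothetical high-risk setting shown only for contrast.
for baseline in (0.02, 0.03, 0.30):
    rr = odds_ratio_to_relative_risk(1.4, baseline)
    print(f"baseline risk {baseline:.2f}: OR = 1.40, RR = {rr:.3f}")
```

When the outcome is rare, the denominator stays close to 1 and the OR and RR nearly coincide; the two diverge as the baseline risk grows.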
Statistical analysis
The continuous random variables were expressed as mean (standard deviation) or median
(interquartile range [IQR]) and compared with Mann–Whitney test. The categorical variables
were expressed as proportions and compared with Chi-square test or Z test for proportions.
Before applying the above tests, the assumptions of normality and equality of variances were checked.
The significance of logistic coefficients was tested using Wald test. The best model
was selected using AIC and tested with Chi-square test. The level of statistical significance
was considered at 5%.
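The Wald test mentioned above can be sketched as follows for a single logistic coefficient. This is a minimal illustration, not the output of the software used in the study: the function name is hypothetical, and the standard error of 0.12 is an assumed value chosen only to demonstrate the calculation.

```python
import math

def wald_test(coef, std_err):
    """Two-sided Wald z-test for a single logistic regression coefficient."""
    z = coef / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Coefficient 0.33 is the reported estimate for differential neutrophil count;
# the standard error 0.12 is hypothetical.
z, p = wald_test(coef=0.33, std_err=0.12)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```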
The logistic regression model analysis was performed on the JASP (Version 0.11.1.0
[Computer software]; JASP Team, 2020; University of Amsterdam, Netherlands) and MATLAB
(R2016a, Version 9.0.0.341360 [Mathematical computing software]; MathWorks, Natick,
Massachusetts, USA) software platforms.
Results
Seventy patients were enrolled in the study with a median age of 50 years (IQR 30–60
years). The number of males was two times that of females. Symptomatic and asymptomatic
cases were equal. The comparison of regressors between survivor and nonsurvivor groups
was performed [Supplementary Table 1] and [Supplementary Table 2]. There was a statistically
significant difference in the mean age of survivors and nonsurvivors (P = 0.01), but
gender differences were not statistically significant in the two groups (P = 0.81).
The RBS, TLC, PLT, NPHIL, ANC, LYMP, ALC, NLR, and ANLR showed statistically significant
differences between the survivor and nonsurvivor groups. Significant correlations
were observed among various regressors [Supplementary Table 3].
The univariate analysis of regressors showed significant ORs for age (OR = 1.035),
RBS (OR = 1.025), TLC (OR = 1.265), NPHIL (OR = 1.22), ANC (OR = 1.4), LYMP (OR =
0.838), ALC (OR = 0.223), NLR (OR = 1.218), and ANLR (OR = 1.198) [[Table 1]]. The multivariate analysis was performed with the ANC, NPHIL, ANLR, age, and RBS regressors
using a step-wise method. The best model was the one with the minimum AIC value of 29.9 (P
< 0.001). The two regressors that contributed significantly to the prediction
model were differential neutrophil count and RBS [[Table 1]]. The ROC curves for differential neutrophil count and RBS are shown separately,
with areas under the curve (AUC) of 0.932 and 0.837, respectively [[Figures 2] and [3]]. The combined use of the regressors NPHIL and RBS in the prediction of nonsurvivors
had AUC, sensitivity, specificity, and inherent accuracy of 0.96, 90.5%, 88.9%, and
89.5%, respectively [[Table 2]]. These regressors and their estimates of logistic coefficients are given below:
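$$\ln\left(\frac{p}{1-p}\right) = -32.77 + 0.33 \times \text{differential neutrophil count} + 0.04 \times \text{random blood sugar}$$
where $\ln\left(\frac{p}{1-p}\right)$ is the log odds of the outcome and $p$ is the probability of the outcome. The 5-fold cross-validation logistic regression model was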
trained with differential neutrophil count and RBS regressors to calculate model performance
metrics. The AUC, sensitivity, specificity, and accuracy were 0.95, 90%, 92%, and
70%, respectively. The cutoff probability was 0.30 for the mortality risk.
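As a worked illustration, the reported coefficients can be turned into a predicted probability and compared with the cutoff. This is a minimal sketch under stated assumptions: it uses the rounded coefficients printed above, it applies the cross-validated cutoff of 0.30 to that equation, and the function names and patient values are hypothetical.

```python
import math

# Rounded estimates as reported in the text.
INTERCEPT = -32.77
COEF_NPHIL = 0.33  # per percentage point of differential neutrophil count
COEF_RBS = 0.04    # per mg/dL of random blood sugar
CUTOFF = 0.30      # reported cutoff probability for mortality risk

def mortality_probability(nphil_pct, rbs_mg_dl):
    """Predicted probability of nonsurvival from the logistic equation."""
    log_odds = INTERCEPT + COEF_NPHIL * nphil_pct + COEF_RBS * rbs_mg_dl
    return 1.0 / (1.0 + math.exp(-log_odds))

def is_high_risk(nphil_pct, rbs_mg_dl, cutoff=CUTOFF):
    """True when the predicted probability exceeds the cutoff."""
    return mortality_probability(nphil_pct, rbs_mg_dl) > cutoff

# Hypothetical patients, for illustration only.
for nphil, rbs in [(85.0, 150.0), (60.0, 100.0)]:
    prob = mortality_probability(nphil, rbs)
    print(f"NPHIL {nphil}%, RBS {rbs} mg/dL: p = {prob:.4f}, high risk: {is_high_risk(nphil, rbs)}")
```

Because the printed coefficients are rounded, probabilities obtained this way may differ slightly from those of the fitted model.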
Figure 2: Receiver operating characteristic curve (red line) for differential neutrophil count
is shown. The area under the curve is 0.932
Figure 3: Receiver operating characteristic curve (red line) for random blood sugar is shown
with area under the curve of 0.837
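The area under an ROC curve can be computed directly as the probability that a randomly chosen nonsurvivor has a higher score than a randomly chosen survivor. The sketch below shows this rank-based calculation; the function name and the toy scores are hypothetical and do not reproduce the study data.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for sp in scores_positive:
        for sn in scores_negative:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical differential neutrophil counts (%), for illustration only.
nonsurvivors = [88, 92, 79, 85]
survivors = [60, 72, 81, 65, 70]
auc = roc_auc(nonsurvivors, survivors)
print(f"AUC = {auc:.2f}")
```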
Table 1: Odds ratio of various regressors with univariate analysis
Table 2: Confusion matrix
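The performance metrics derived from a confusion matrix can be computed as in the sketch below. The cell counts are an assumption: they are hypothetical values chosen to be consistent with the reported 90.5%, 88.9%, and 89.5% for 57 patients, and are not copied from Table 2, whose entries are not reproduced in this text.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion matrix counts."""
    sensitivity = tp / (tp + fn)   # nonsurvivors correctly identified
    specificity = tn / (tn + fp)   # survivors correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts consistent with the reported percentages (assumption).
sens, spec, acc = classification_metrics(tp=19, fn=2, tn=32, fp=4)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}, accuracy {acc:.1%}")
```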
Discussion
The outbreak of the novel COVID-19 forces the medical fraternity around the world
to discover the many unfamiliar facets of the disease. The risk factors of mortality
are one of the important dimensions of clinical research. The risk factors also
direct the health authorities to utilize medical infrastructure and human resource
optimally to reduce the number of deaths during the epidemic. The present study showed
that differential neutrophil count and RBS have significant contribution as indicators
of mortality in COVID-19 patients. The above two hematologic parameters have 70% validation
accuracy in predicting the outcome. Ruan et al. suggested age, the underlying diseases, and increased inflammatory indicators as
mortality indicators.[[8]] They showed significant differences in white blood cell counts, absolute lymphocyte
counts, PLTs, albumin, total bilirubin, blood urea nitrogen, blood creatinine, myoglobin,
cardiac troponin, C-reactive protein (CRP), and interleukin-6 in death and recovered
groups. In the present study, differential lymphocyte count is highly negatively correlated
with differential neutrophil count [Supplementary Table 3]. Green in a retrospective
study of 150 COVID-19 patients found significant differences in ferritin and interleukin-6
levels between nonsurvivors and survivors, suggesting that the cause of mortality
may be hyperinflammation due to viral infection.[[9]] Tan et al. retrospectively reviewed the time-course complete blood count reports of deceased and
recovered cases. They suggested a time-lymphocyte (%) model as a prognostic indicator of
disease severity. The disease severity is proportional to the decrease in lymphocyte count
over time.[[10]] Zhou et al. carried out a multivariable regression analysis and found increasing
odds of in-hospital death associated with older age (OR 1.10, 95% confidence interval
1.03–1.17, per year increase; P = 0.0043), higher Sequential Organ Failure Assessment
score (5.65, 2.61-12.23; P < 0.0001), and d-dimer >1 μg/mL (18.42, 2.64–128.55; P
= 0.0033) on hospital admission.[[11]] Du et al. identified four risk factors including age ≥65 years, cardiovascular
comorbidity, CD3 + CD8 + T cells ≤75 cell/μL, and cardiac troponin I levels ≥0.05
ng/mL as predictors of mortality. They noted that the last two factors
are more specific.[[12]] Gupta et al. reported preventive measures for patients with diabetes and mentioned
diabetes as an important risk factor for mortality in patients with other influenza
epidemics. The present study also identified RBS as an important predictor of mortality
risk.[[13]] Singh reported that ACE-2 receptors are expressed on pancreatic islets and that infection
with SARS-CoV-1 causes hyperglycemia in people without pre-existing diabetes mellitus.
The hyperglycemia was seen to persist 3 years after recovery from SARS, indicating
damage to the beta cells of the pancreas. Similar effects may be produced by SARS-CoV-2,
leading to an increase in blood sugar levels.[[14]] Henry (2020) emphasized that extracorporeal membrane oxygenation (ECMO) therapy
can be an additional risk factor in COVID-19 patients, as it causes additional decrease
in lymphocyte population in these patients.[[14]] Li et al. designed a meta-analysis involving six studies that include the prevalence
of cardiovascular disease in COVID-19, and they compared the incidence in non-intensive
care unit (ICU)/severe and ICU/severe groups. The proportions of hypertension, cardio-cerebrovascular
disease, and diabetes in patients with COVID-19 were 17.1%, 16.4%, and 9.7%, respectively.[[15]] Vaduganathan et al. hypothesized the beneficial role of angiotensin-converting
enzyme-2 (ACE-2) inhibitors in COVID-19; according to them, SARS-CoV-2 may cause activation
of ACE-2 receptors and hypertension, which may be the risk factor for mortality in
COVID-19.[[16]] Vincent and Taccone emphasized the role of specific cause in COVID-19 deaths and
also stressed that therapeutic limitations are contributing factors in case fatality
rates.[[17]] Lippi et al. assessed the relationship between COVID-19 and hypertension in a pooled
analysis of COVID-19 patients and found 2.5-fold increased risk of severity and mortality
in patients above 60 years of age.[[18]] Pal and Bhansali discussed the role of ACE2 inhibitors as contributing factors
in mortality in diabetes mellitus patients. The use of ACE2 inhibitors causes overexpression
of ACE receptors, which is the entry port of SARS-CoV-2.[[19]] Liu et al. demonstrated that NLR is an independent risk factor of mortality in
COVID-19. They showed 8% higher risk of in-hospital mortality for each unit increase
in NLR (OR = 1.08).[[17]] These results are consistent with those of our study, as NLR is highly correlated with
differential neutrophil count.[[20]] Du et al. in a retrospective observational study observed that the median age of
the patients was 65.8 years and 72.9% were male. Hypertension, diabetes, and coronary
heart disease were the most common comorbidities.[[21]] Zhao et al. in a meta-analysis reported predictors of disease severity as old age
(≥50 years, OR = 2.61), male gender (OR = 1.348), smoking (OR = 1.734), and any comorbidity
(OR = 2.635), especially chronic kidney disease (OR = 6.017), chronic obstructive
pulmonary disease (OR = 5.323), and cerebrovascular disease (OR = 3.219). In terms
of laboratory results, increased lactate dehydrogenase, CRP, and D-dimer and decreased
blood platelet and lymphocyte counts were highly associated with severe COVID-19 (P
< 0.001 for all).[[22]] Muniyappa et al. also mentioned older age, diabetes mellitus,
hypertension, and obesity as significant risk factors for hospitalization and death
in COVID-19 patients.[[23]] Most of the studies identify diabetes mellitus as an important risk factor
for mortality, which corresponds to the RBS finding of the present study. Second,
a decrease in differential lymphocyte count was observed in most of the studies, and this
count correlates strongly and negatively with the differential neutrophil count in the present
study. Although the validation accuracy of the present model is 70%, it could be a
good screening tool. To increase the accuracy of the predictive model, additional regressors
mentioned in the above studies could be used, but they should be weighed
against ease of availability and cost.
Conclusion
The management of COVID-19 patients during the epidemic is a challenge due to limited
medical resources. The present study is an effort to extract more information from
routine laboratory investigations and thus develop a screening tool that guides caregivers
to utilize specialized diagnostic and therapeutic procedures for a subset of patients
at higher mortality risk.
Limitations of the study
The present study is a retrospective case–control study aimed to predict the mortality
risk, though prospective studies are more accurate for prediction of risk. The sample
size of the study is not large enough and, therefore, may affect the performance metrics
of the prediction model.
Authors' contributions
All authors contributed equally.
Compliance with ethical principles
The reporting quality, formatting, and reproducibility guidelines of the present study
are set forth by the EQUATOR Network. As the study was retrospective in nature, de-identified
data were used, and consent of the patients was presumed, the approval of the institutional
review board/ethics committee has been taken.
Reviewers:
Khalid B Akkari (Dammam, Saudi Arabia)
Khadija Hajidh (Dubai, UAE)
Ahmed Elhassi (Benghazi, Libya)
Nazeer Khan (Karachi, Pakistan)
Editors:
Salem A Beshyah (Abu Dhabi, UAE)
Elmahdi A Elkhammas (Columbus OH, USA)