Open Access
CC BY 4.0 · Indian J Med Paediatr Oncol
DOI: 10.1055/s-0046-1817795
Original Article

Development and Validation of a Risk Assessment Tool for Detecting Oral Cavity Cancer in India: A Case–control Study Design Approach

Authors

  • Monica Mocherla

    1   Department of Public Health Dentistry, Sri Sai College of Dental Surgery, Vikarabad, Telangana, India
    2   Department of Public Health Dentistry, Faculty of Dental Sciences, MS Ramaiah University of Applied Sciences, Bengaluru, Karnataka, India
  • Pushpanjali Krishnappa

    2   Department of Public Health Dentistry, Faculty of Dental Sciences, MS Ramaiah University of Applied Sciences, Bengaluru, Karnataka, India
  • Denny John

    3   Faculty of Life and Allied Health Sciences, MS Ramaiah University of Applied Sciences, Bengaluru, Karnataka, India
 

Abstract

Introduction

Oral cancer is a significant health issue in India, often diagnosed late, resulting in poor outcomes/prognosis. Early identification of high-risk individuals is crucial for preventing complications, and focusing on these populations can significantly improve screening efforts. Implementing a risk assessment tool for oral cancer may enhance examination strategies across the country.

Objectives

Our objective was to develop a risk assessment model for oral cavity cancer based on a comprehensive understanding of risk factors, to ensure generalizability across Indian populations.

Materials and Methods

A multicenter case–control study was conducted from October 2022 to July 2023 across three cancer hospitals in Telangana, India to identify oral cancer risk factors. A risk score for each predictor was derived from the respective odds ratios (OR). The predictive ability of the regression model and the cut-off risk score were determined by calculating sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Calibration plots and the Hosmer–Lemeshow goodness-of-fit test were used to assess how well each model's predicted probabilities align with primary and secondary outcomes. Brier score was used as a measure of the model's overall accuracy. Decision curve analysis evaluated the model's clinical utility and net benefit for risk prediction. The models were validated using a bootstrap sample and OR from pooled studies from a systematic review.

Results

Years of smoked and smokeless tobacco use, alcohol frequency, vegetable consumption, and history of chronic oral trauma were the predictors. Risk scores ranged from −1 to 2. The area under the receiver operating characteristic curve for the risk scores was good (0.76–0.84). Sensitivity was highest for the upper socioeconomic class and pooled models, while specificity was highest for the multivariable and bootstrapped upper socioeconomic class models. Brier scores of 0.1322 for the upper class and 0.1673 for the lower class indicated optimal model performance, while those for the multivariable and pooled data models indicated suboptimal performance.

Conclusion

The risk scoring model showed the ability to identify individuals at high risk for oral cancer, demonstrating good predictive ability for the Indian population. It needs validation in other populations to accurately pinpoint subgroups needing further clinical evaluation.


Introduction

Oral cavity cancer is a major health issue, especially in areas with high tobacco and alcohol use. Early identification of high-risk individuals is essential to prevent complications from late diagnoses.[1] In India, the India State-Level Disease Burden Initiative Cancer Collaborators reported 79,979 oral cancer deaths in 2022, ranking third in cancer mortality.[2] Tobacco, particularly smokeless varieties, and areca nut are the main causes, along with inadequate nutrition, oral HPV infection, and poor oral hygiene.[3]

Oral cancer in India poses a serious public health challenge, often diagnosed late, resulting in poor outcomes and unaffordable costs.[4] A lack of qualified healthcare professionals in rural areas leads to delays and advanced stages of cancer at diagnosis.[5] Early detection enhances outcomes and reduces costs, improving survival chances. Oral cancer mainly affects lower socioeconomic groups with high tobacco and alcohol use.[6] India has insufficient oral cancer screening programs, resulting in an increasing prevalence.[7] Implementing these programs is difficult as healthcare workers prioritize maternal and child health, immunization, and feel overburdened by added oral examinations.[8] There are no clear policies for health workers regarding oral health responsibilities, and they lack training in oral cavity examinations.[9] Targeting high-risk groups can enhance screening efforts.[10] A risk assessment tool for identifying high-risk individuals could improve examination strategies.[11]

We reviewed risk prediction models for early oral cancer detection and identified six studies, including two from India.[12] These models categorize risk factors into four types: sociodemographic history, medical history, dental health, and behavioral history (alcohol and tobacco use, and diet). The models demonstrated strong discrimination ability, ranging from acceptable (0.7–0.8) to excellent (0.8–0.9), with good predictive values. Calibration was performed only in Rao et al using the Hosmer–Lemeshow test.

Both risk assessment models were easy to use in resource-limited settings like India, albeit with several research gaps. The first was the absence of an inclusive risk factor model for oral cavity cancer in India. Gupta et al and Rao et al focused on oropharyngeal and upper aerodigestive tract cancers without addressing oral cavity specifics.[13] [14] Second, generalizability is limited due to region-specific risks. Gupta et al conducted a hospital-based case–control study in Pune with 240 control pairs, yielding a risk score of 0 to 26. Rao et al conducted an unmatched study in Karnataka with 180 cases and 272 controls, reporting scores of 0 to 28.[13] Third, Rao et al performed external validation to check the reliability of their risk score models in 200 bootstrap samples.[14] Fourth, both studies exhibited high bias risk regarding participants and analysis, raising concerns about applicability, as noted by the PROBAST tool.[12] Lastly, Gupta et al achieved a predictive value of AUC = 0.86 with 74.6% sensitivity and 84.6% specificity, while Rao et al had a higher predictive value (AUC = 0.9) with 93.5% sensitivity but lower specificity of 71.1%.[13] [14] The high false positivity could burden healthcare systems if used for screening of oral cavity cancer.

A 2004 workshop convened by the National Cancer Institute recommended that cancer risk models be revised and strengthened for improved accuracy.[15] Our objective was to create a risk assessment model for oral cavity cancer using a detailed understanding of risk factors, aiming for generalizability across Indian populations.


Methods

Study Design

A multicenter case–control study was conducted from October 2022 to July 2023 across three cancer hospitals in Telangana, India, to identify oral cancer risk factors. The identification process involved six steps. Step 1: A literature review of studies on oral cancer risk predictors. Step 2: Development of a 25-item instrument drawn from various studies. Step 3: Content validation with 16 experts (two medical oncologists, four surgical oncologists, five oral cancer surgeons, and five oral medicine specialists) through face-to-face interviews; experts rated the relevance of each item on a Likert-type scale from 1 (not relevant) to 4 (highly relevant), and the ratings were used to refine the questionnaire. Step 4: Calculation of the item-level content validity index (I-CVI) as the number of experts rating an item 3 or 4 divided by the total number of experts; items with an I-CVI between 0.70 and 0.90 were revised, those with an I-CVI >0.90 were retained, and those with an I-CVI <0.70 were discarded. Step 5: Training of the investigator by an oral medicine specialist to diagnose oral cancer lesions using online photographs. Step 6: A pilot study using the revised questionnaire among 20 oral cancer cases and 20 controls.
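The I-CVI rule in Step 4 can be illustrated with a minimal Python sketch. The expert ratings and item names below are hypothetical and are not study data; only the 1 to 4 rating scale, the panel size of 16, and the retain/revise/discard thresholds come from the text.

```python
def item_cvi(ratings):
    """I-CVI: share of experts who rated the item 3 or 4 on the 1-4 scale."""
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


def classify_item(i_cvi):
    """Apply the thresholds from Step 4: >0.90 retain, 0.70-0.90 revise, <0.70 discard."""
    if i_cvi > 0.90:
        return "retain"
    if i_cvi >= 0.70:
        return "revise"
    return "discard"


# Hypothetical ratings from 16 experts for three illustrative items.
ratings_by_item = {
    "item_a": [4] * 15 + [2],      # 15 of 16 experts rate it relevant
    "item_b": [3] * 12 + [2] * 4,  # 12 of 16
    "item_c": [4] * 8 + [1] * 8,   # 8 of 16
}

for name, ratings in ratings_by_item.items():
    cvi = item_cvi(ratings)
    print(name, round(cvi, 3), classify_item(cvi))
```

An instrument-level CVI computed as the mean of the item-level values would follow the same pattern.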


Patient and Participant Selection and Recruitment

A case was defined as a person newly diagnosed with oral cancer, confirmed histopathologically, who visited a designated cancer hospital during the study period. Anatomical sites included C00 to C06 of the International Classification of Diseases for Oncology, 3rd edition.[16] Inclusion criteria required patients to be over 18 years of age, provide informed consent, and have a confirmed diagnosis. Patients with cancer recurrence, cognitive impairments, or advanced metastatic stages were excluded.

Controls were participants without oral cancer, including caregivers, relatives, and hospital visitors from the same hospitals as the cases. Their criteria were similar, except they must show no signs of cancer and could not have malignancies related to tobacco or alcohol, like liver, lung, or esophageal cancers, or cognitive impairments.

The detailed patient recruitment process, sample size estimation, data collection process, ethics and informed consent are documented elsewhere.[17]


Potential Predictors

Consumption of fruits and family history of cancer were deleted after consultation with subject experts because of their low frequency in the target population. After the initial round of content validation, the questionnaire had 17 questions with an item-level CVI above 0.9: four demographic questions, eight items on history of adverse habits, two on dietary habits, one on family history, and two on dental history. Three questions had a CVI between 0.7 and 0.9 and were removed after repeat consultation with the experts, and five questions had a CVI below 0.7 and were also removed. After the relevance of each item was assessed, the instrument-level CVI was calculated as the average item-level CVI of the 17 items. The CVI for the newly developed instrument was 0.86. No changes were required after the initial pilot test, and all the variables were included in the final questionnaire ([Supplementary Material], available in the online version only).


Primary and Secondary Outcomes

The primary outcomes include adjusted odds ratio with 95% confidence intervals and corresponding risk scores for the association between each predictor and the risk of oral cavity cancer. Secondary outcomes include performance metrics of the risk scores for each of these predictors in relation to the risk of oral cavity cancer.



Statistical Analysis

The association between each predictor and the risk of oral cavity cancer was assessed using multivariable binary logistic regression, controlling for age, gender, socioeconomic status, and place of residence. A significance threshold of p < 0.05 was applied. Inverse probability of treatment weighting was also performed, using the IPW package in R, to evaluate the impact of different risk factors on oral cancer.[18] The weights were computed from estimated propensity scores to balance measured baseline covariates between cases and controls. Multiple logistic regression was used to calculate each person's propensity score. Covariates were considered balanced when the absolute standardized mean difference was less than 0.1. A weighted regression analysis then determined odds ratios (OR) with 95% confidence intervals.
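The balance check described above can be sketched in Python. This is an illustration only: the study used the IPW package in R, whereas the function names, the toy ages, and the propensity scores below are hypothetical, and the propensity scores are assumed to have been estimated beforehand by logistic regression.

```python
import math


def ip_weights(group, propensity):
    """Weight 1/p for group 1 (cases) and 1/(1-p) for group 0 (controls)."""
    return [1.0 / p if g == 1 else 1.0 / (1.0 - p)
            for g, p in zip(group, propensity)]


def _weighted_mean_var(values, weights):
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def abs_smd(values, group, weights):
    """Absolute standardized mean difference of one covariate between groups."""
    stats = {}
    for g in (0, 1):
        vals = [v for v, gi in zip(values, group) if gi == g]
        wts = [w for w, gi in zip(weights, group) if gi == g]
        stats[g] = _weighted_mean_var(vals, wts)
    pooled_sd = math.sqrt((stats[0][1] + stats[1][1]) / 2.0)
    return abs(stats[1][0] - stats[0][0]) / pooled_sd


# Hypothetical data: age for three cases (1) and three controls (0).
age = [60, 60, 40, 60, 40, 40]
group = [1, 1, 1, 0, 0, 0]
# Hypothetical propensity scores, assumed already estimated.
propensity = [2 / 3, 2 / 3, 1 / 3, 2 / 3, 1 / 3, 1 / 3]

unweighted = abs_smd(age, group, [1.0] * len(age))
weighted = abs_smd(age, group, ip_weights(group, propensity))
print(round(unweighted, 3), round(weighted, 3))
```

In this toy example the unweighted difference exceeds the 0.1 threshold, while the weighted difference is essentially zero, which is the behavior the weighting aims for.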

Derivation of the Risk Scores

Before including them in the risk assessment model, we assessed the variance inflation factor for multicollinearity among predictors. A stepwise backward elimination technique removed predictors with the highest p-values until only significant predictors (p < 0.05) remained, alongside clinically relevant ones ([Supplementary Material], available in the online version only). Risk scores were derived by combining the weight of each predictor to gauge oral cavity cancer risk. A scoring system based on regression coefficients rounded to integers assigned points. Each patient's score summed points of all predictors, categorizing them into high and low-risk groups. For external comparison, predictors with ORs beyond 10% from the pooled estimate were excluded, and revised ORs were calculated.[18] Pooled ORs from meta-analysis transformed into regression coefficients served as weights for each risk factor, contributing to the overall risk score.[18]
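A minimal Python sketch of this points-based scoring follows. It uses the coefficients of the fitted model reported in the Results; the dictionary keys, the function names, and the rule that a score at or above the cut-off counts as high risk are assumptions made for illustration, since the text does not state whether the cut-off is inclusive.

```python
# Coefficients (log-odds scale) from the fitted model reported in the Results.
COEFFICIENTS = {
    "smoking_over_10y": 2.25,
    "smokeless_over_10y": 2.23,
    "daily_alcohol": 1.69,
    "vegetables_over_3_per_week": -0.94,
    "chronic_trauma": 1.67,
}

# Points: each coefficient rounded to the nearest integer.
POINTS = {name: round(beta) for name, beta in COEFFICIENTS.items()}


def risk_score(profile):
    """Sum the points of every predictor that is present for this person."""
    return sum(POINTS[name] for name, present in profile.items() if present)


def is_high_risk(score, cutoff=2):
    """Assumption: a score at or above the cut-off is classed as high risk."""
    return score >= cutoff


# Hypothetical person: long-term smoker who eats vegetables more than 3 times a week.
person = {"smoking_over_10y": True, "vegetables_over_3_per_week": True}
score = risk_score(person)
print(POINTS, score, is_high_risk(score))
```

Rounding keeps the tool usable without a calculator, at the cost of treating predictors with similar coefficients as equally weighted.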


Model Performance

The model's performance was evaluated using AUC for discrimination ability, sensitivity, and specificity, along with calibration plots comparing predicted risks with observed primary and secondary outcomes. Receiver operating characteristic curves were created for the case–control study and pooled estimate model from the weighted dataset. Sensitivity and 1-specificity were plotted in both original and bootstrapped samples, resulting in an optimal cut-off score of 2.
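How an AUC and a cut-off can be read from integer risk scores is sketched below in Python. The scores and labels are hypothetical, and choosing the cut-off by maximizing Youden's J (sensitivity + specificity − 1) is an assumption; the text does not name the criterion used to arrive at the cut-off of 2.

```python
def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as positive and compare with true status."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    return tp / n_cases, tn / n_controls


def youden_cutoff(scores, labels):
    """Return the observed score value that maximizes Youden's J."""
    def j(cutoff):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        return sens + spec - 1.0
    return max(sorted(set(scores)), key=j)


# Hypothetical risk scores: six cases (label 1) followed by six controls (label 0).
scores = [0, 1, 2, 2, 3, 4, 0, 0, 1, 2, -1, 1]
labels = [1] * 6 + [0] * 6
print(round(auc(scores, labels), 3), youden_cutoff(scores, labels))
```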

Disease prevalence affects the predictive values (PPV and NPV). Because the model relies on a case–control dataset, in which the ratio of cases to controls is fixed by design, the true disease prevalence cannot be measured directly. Therefore, we adjusted for the control sampling fraction and the prevalence in India.[19] [20]
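The dependence of predictive values on prevalence can be made concrete with Bayes' rule. The Python sketch below is illustrative: the sensitivity and specificity are the multivariable model's reported values, while the two prevalence figures are hypothetical and are not the values used in the study's adjustment.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    tn = specificity * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Reported sensitivity and specificity; prevalence values are hypothetical.
for prevalence in (0.01, 0.30):
    ppv, npv = predictive_values(0.76, 0.72, prevalence)
    print(prevalence, round(ppv, 3), round(npv, 3))
```

With sensitivity and specificity held fixed, the PPV rises and the NPV falls as prevalence increases, which is why predictive values taken directly from a case–control sample cannot be carried over to the general population.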

AUC values, sensitivity, and specificity were compared to identify the model with superior discriminatory power. Calibration plots and the Hosmer–Lemeshow goodness-of-fit test were used to assess how well the predicted probabilities aligned with the primary and secondary outcomes. The Brier score assessed overall accuracy using the mean squared error between predictions and observed results. Decision curve analysis evaluated the model's clinical utility and net benefit in risk prediction.
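The Brier score described above is the mean squared difference between predicted probabilities and observed 0/1 outcomes. A minimal Python sketch with hypothetical predictions:

```python
def brier_score(predicted, observed):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)


# Hypothetical predicted probabilities and observed outcomes (1 = case).
predicted = [0.9, 0.7, 0.2, 0.1]
observed = [1, 1, 0, 0]
print(round(brier_score(predicted, observed), 4))

# A model that always predicts 0.5 scores exactly 0.25 on any 0/1 outcomes.
print(brier_score([0.5] * 4, observed))
```

The 0.25 reference point is what an uninformative prediction of 0.5 for everyone would achieve, so lower values indicate predictions closer to the observed outcomes.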


Model Validation

Internal validation of the model's predictive performance for risk scores was conducted by bootstrapping 1,000 samples to estimate confidence intervals and evaluate model stability. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for the final model at the optimum cut-off after bootstrapping. External validation used pooled ORs from a systematic review and meta-analysis.[18]
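The bootstrap idea can be sketched in Python for a single metric. This is a simplified illustration, assuming a percentile confidence interval for sensitivity; it does not refit the model in each resample, and the 100 hypothetical cases below are not study data.

```python
import random


def bootstrap_percentile_ci(flags, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of 0/1 flags (a proportion)."""
    rng = random.Random(seed)
    n = len(flags)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement and recompute the proportion.
        resample = [flags[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(resample) / n)
    estimates.sort()
    lower = estimates[int(n_resamples * alpha / 2)]
    upper = estimates[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical: 100 cases, 76 of whom score at or above the cut-off.
screen_positive_among_cases = [1] * 76 + [0] * 24
low, high = bootstrap_percentile_ci(screen_positive_among_cases)
print(low, high)
```

Specificity, PPV, and NPV could be handled the same way by resampling the relevant group and recomputing the proportion.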

Subgroup analysis by socioeconomic status (upper vs. lower) investigated differences in risk factors for oral cavity cancer, examining whether specific predictors like tobacco use, diet, and chronic oral trauma varied by socioeconomic group.

Required permissions to collect data from the patients were obtained at the start of the study from respective hospital authorities. Permission was also obtained to access the patient's medical records to ascertain the diagnosis and treatment plan. Identified cases were contacted initially by the receptionist or ward nurses to confirm their willingness to participate in the study. To each potential subject, the researcher then explained the purpose of the study, procedures to be performed, and freedom to withdraw from the data collection process at any time during the study. The subjects were assured about the confidentiality of the data collected. They were informed about the oral examination to be performed and the time required for the interview in a language that they could understand. After clarifying their doubts/questions, informed consent was obtained from the subjects, and the interview was scheduled at their convenience.

The STROBE guidelines for case–control studies were used to report the methods and results of this study, along with the TRIPOD checklist for the risk assessment model.[21] [22]


Ethics

Ethical clearance for the study was obtained from the Human Research Ethics Committee of MNJ Cancer Hospital, Hyderabad (ECR/227/Inst/AP/2013/RR-19, dated 04 August 2022). All procedures performed in studies involving human participants were in compliance with the ethical standards of the 1964 Helsinki Declaration and ICMR-National Guidelines for Biomedical and Health Research.



Results

Of 250 eligible cases and 500 controls, 238 (95.2%) cases and 450 (90%) controls consented to participate. An initial imbalance in covariates was noted. After trimming extreme weights, balance was achieved regarding age, gender, socioeconomic status, and residence. A standardized mean difference of less than 0.1 was considered acceptable for balance. Our sample size estimation was 175 cases and 350 controls, as described elsewhere.[17] Therefore, the sample sizes of 214 cases and 420 controls were considered adequate for further analysis.

[Table 1] presents the final predictors and their risk scores based on β coefficients. The Hosmer–Lemeshow test indicated good fit (p = 0.456). Risk scores were derived from the natural log of the regression model's ORs. The risk prediction model is defined as: Risk prediction model = −1.49 + 2.25 (smoking > 10 years) + 2.23 (smokeless tobacco > 10 years) + 1.69 (daily alcohol use) − 0.94 (vegetable consumption > 3 times/week) + 1.67 (chronic trauma history). The multivariable model from case–control data demonstrated good discrimination (AUC = 0.84, 95% CI: 0.78–0.86), with a sensitivity of 76% and a specificity of 72%. Models based on upper and lower economic classes demonstrated strong discriminatory ability (AUC = 0.81, 95% CI: 0.72–0.84 for the upper class and AUC = 0.81, 95% CI: 0.75–0.86 for the lower class) with optimal sensitivity and specificity. The pooled data model had fair discrimination (AUC = 0.79, 95% CI: 0.75–0.82) and low specificity. Bootstrapped models from multivariable analysis exhibited good discrimination compared with the pooled model (AUC = 0.78), with all showing optimal sensitivity and specificity.
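As an illustration of how the fitted expression above could be evaluated for one person, the Python sketch below treats it as the linear predictor (log-odds) of a logistic model and converts it to a probability. That interpretation, the variable names, and the example profile are assumptions; in addition, an intercept estimated from case–control data reflects the sampled case-to-control ratio, so the resulting probability should not be read as an absolute population risk.

```python
import math

# Intercept and coefficients as reported in the fitted model above.
INTERCEPT = -1.49
COEFFICIENTS = {
    "smoking_over_10y": 2.25,
    "smokeless_over_10y": 2.23,
    "daily_alcohol": 1.69,
    "vegetables_over_3_per_week": -0.94,
    "chronic_trauma": 1.67,
}


def linear_predictor(profile):
    """Intercept plus the coefficient of every predictor present in the profile."""
    return INTERCEPT + sum(
        beta for name, beta in COEFFICIENTS.items() if profile.get(name, False)
    )


def predicted_probability(profile):
    """Logistic transform of the linear predictor (assumed log-odds scale)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(profile)))


# Hypothetical person: smoked for more than 10 years and drinks alcohol daily.
person = {"smoking_over_10y": True, "daily_alcohol": True}
print(round(linear_predictor(person), 2), round(predicted_probability(person), 3))
```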

Table 1

Risk scores based on case–control study and pooled estimate

| Characteristics | Case–control: beta coefficient | Case–control: adjusted OR | Case–control: risk score | Pooled estimates: beta coefficient | Pooled estimates: adjusted OR | Pooled estimates: risk score |
| --- | --- | --- | --- | --- | --- | --- |
| Years of smoking | | | | | | |
| Never (Ref) | | | | | | |
| <10 y | 0.4291816 | 1.536 (0.525–4.135) | 0 | 0 | 1.00 (0.73–1.38) | 0 |
| >10 y | 2.277574 | 9.753 (4.864–20.265) | 2 | 1.050 | 2.86 (1.82–4.48) | 1 |
| Years of smokeless tobacco | | | | | | |
| Never (Ref) | | | | | | |
| <10 y | 1.99320 | 7.339 (3.625–15.217) | 2 | 0.620 | 1.86 (1.47–2.34) | 1 |
| >10 y | 2.19098 | 8.944 (4.587–18.0245) | 2 | 1.532 | 4.19 (3.56–5.09) | 2 |
| Alcohol frequency | | | | | | |
| Never (Ref) | | | | | | |
| Occasional | 0.18564 | 1.204 (0.670–2.216) | 0 | 0.086 | 1.09 (0.81–1.47) | 0 |
| Daily | 1.77393 | 5.894 (3.236–11.019) | 2 | 0.760 | 2.14 (1.67–2.75) | 1 |
| Vegetables | | | | | | |
| <3 times/week (Ref) | | | | | | |
| >3 times/week | –1.00239 | 0.367 (1.414–6.208) | 1 | 1.108 | 0.33 (0.22–0.47) | 1 |
| History of chronic trauma | | | | | | |
| Absent (Ref) | | | | | | |
| Present | 1.2345 | 4.437 (1.281–9.103) | 2 | 0.672 | 1.96 (1.05–3.62) | 1 |

Abbreviation: OR, odds ratio.


A Brier score close to 0, ideally under 0.25, signifies optimal model performance. The multivariable model had a Brier score of 0.291, indicating suboptimal performance despite optimal sensitivity and specificity. The upper-class and lower-class models scored 0.1322 and 0.1673, respectively, indicating optimal performance ([Table 2]). The pooled model scored 0.345, reflecting suboptimal performance ([Table 2]).

Table 2

Performance metrics of risk scores derived from multivariable model, upper- and lower-class models, pooled model, and bootstrapped multivariable, upper and lower, and pooled models

| Type of model | AUC | Sensitivity | Specificity | Brier score |
| --- | --- | --- | --- | --- |
| Multivariable model | 0.821 (95% CI: 0.794–0.858) | 0.7184 (95% CI: 0.656–0.774) | 0.7666667 (95% CI: 0.7247–0.804) | 0.291 |
| Bootstrapped multivariable model | 0.841 (95% CI: 0.781–0.868) | 0.7692 (95% CI: 0.670–0.900) | 0.7234 (95% CI: 0.568–0.8104) | |
| Upper class model | 0.7637 (95% CI: 0.714–0.813) | 0.8231707 (95% CI: 0.763–0.832) | 0.654321 (95% CI: 0.632–0.781) | 0.1322 |
| Bootstrapped upper class model | 0.81552 (95% CI: 0.724–0.849) | 0.7211155 (95% CI: 0.721–0.725) | 0.7871854 (95% CI: 0.787–0.887) | |
| Lower class model | 0.8158702 (95% CI: 0.759–0.872) | 0.7702703 (95% CI: 0.717–0.824) | 0.7439614 (95% CI: 0.663–0.799) | 0.1673 |
| Bootstrapped lower class model | 0.8152876 (95% CI: 0.754–0.869) | 0.7676145 (95% CI: 0.636–0.882) | 0.75224 (95% CI: 0.616–0.871) | |
| Pooled model | 0.7803595 (95% CI: 0.745–0.815) | 0.8151261 (95% CI: 0.725–0.846) | 0.6377778 (95% CI: 0.641–0.729) | 0.345 |
| Bootstrapped pooled model | 0.7937 (95% CI: 0.750–0.822) | 0.7689868 (95% CI: 0.714–0.822) | 0.6328187 (95% CI: 0.589–0.674) | |

Abbreviations: AUC, area under curve; CI, confidence interval.


The upper- and lower-class models were well calibrated, with predicted probabilities matching the observed outcomes in the population. In the lower class, the model performed well at the lower risk thresholds; in the intermediate range, it under- or overestimated risk before calibrating well again at the higher thresholds.

Decision curve analysis revealed a net benefit in using the multivariable model for predicting the risk of oral cavity cancer at a risk threshold between 0.2 and 0.4 ([Fig. 1]). The model had the highest net benefit compared with all treated at a threshold of 0.2. The pooled model had a higher net benefit in the mid-range thresholds between 0.2 and 0.4. Thresholds beyond 0.6 will diminish benefits and give no additional benefit of using the model ([Fig. 1]).

Fig. 1 Decision curve analysis for case–control study and pooled analysis.
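The net benefit plotted in a decision curve can be computed as the true-positive rate minus the false-positive rate weighted by the odds of the risk threshold. The Python sketch below uses hypothetical predictions and outcomes; the function names are illustrative.

```python
def net_benefit(predicted, observed, threshold):
    """Net benefit of acting on everyone whose predicted risk is >= threshold."""
    n = len(observed)
    tp = sum(1 for p, y in zip(predicted, observed) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(predicted, observed) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(observed, threshold):
    """Reference strategy: refer every person for examination."""
    prevalence = sum(observed) / len(observed)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and outcomes (1 = oral cavity cancer).
predicted = [0.9, 0.8, 0.1, 0.2]
observed = [1, 1, 0, 0]
for threshold in (0.2, 0.4, 0.6):
    print(threshold,
          round(net_benefit(predicted, observed, threshold), 3),
          round(net_benefit_treat_all(observed, threshold), 3))
```

A model adds value at a given threshold when its net benefit exceeds both the refer-everyone strategy and the refer-no-one strategy, whose net benefit is zero.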

Discussion

Identifying high-risk individuals for oral cavity cancer in resource-constrained settings is crucial. While some models show good predictive value, logistical constraints affect real-world applications.[12] There is a trade-off between simplicity and accuracy in mass screening risk assessments for serious diseases like oral cavity cancer. Simpler models may be more feasible for widespread use but can compromise sensitivity and specificity compared with more complex models.[12]

This model lacks transformations, interactions, or continuous variables and does not include an oral examination component. Due to this reason, although there is evidence of an association between tooth loss (a multifactorial process involving dental caries, periodontal disease, and various socio-economic factors) and oral cancer,[23] dental caries assessment was not considered for the model. Similarly, while the primary data from the case–control study on mouth rinsing were considered, they were not included in the risk scoring. The developed risk score model was created to be straightforward and user-friendly, requiring minimal logistics. Where trained health professionals are available, this information can be directly gathered using straightforward questions, and risk scores can be computed in future studies. In rural areas where trained health professionals are scarce, this model would be particularly beneficial.

The screening risk model developed in the present research had a good discriminatory ability (AUC= 0.84 for the multivariable model and 0.79 for the pooled model). This aligns with earlier studies, which reported an AUC ranging from 0.7 to 0.9.[12] The screening models developed by Rao et al and Gupta et al from hospital-based case–control studies in other regions of India also reported similar findings. Chewing quid with tobacco emerged as the most important predictor in all these models.[12] [13] [14]

The risk assessment model developed in the present study was well calibrated at lower risk thresholds but showed slight overestimation and underestimation in the middle range, before showing better calibration at higher thresholds. When risk models were developed separately based on the socioeconomic status of the target population, the upper- and lower-class models demonstrated good discrimination and were well-calibrated. Low SES feeds a vicious cycle that results in poor lifestyles, health behaviors, and educational outcomes.[24] In a meta-analysis by Conway et al, evidence suggested that socioeconomic conditions play a role in the risk of oral cavity cancer.[25] Thus, we hypothesize that the macroenvironment linked to low SES, including the impact of inadequate education on health, lack of access to healthcare, poor nutrition, poor hygiene, an unfavorable work environment, and substandard living conditions, may act in concert with other known risk behaviors frequently found in low SES groups to cause oral cancer through complex social interactions.[26]

A screening program's sustainability and effectiveness are based on its PPV and NPV, which are influenced by the population's disease prevalence.[27] Almost 86% of the adults in the study sample who had a risk score higher than the cutoff of 2 were free of oral cancer. At the national prevalence, the NPV of the oral cancer risk score was almost 70% higher than its PPV. A higher NPV is likely to reassure a person with a lower score (negative test) that they are unlikely to have oral cancer in a population where the disease is more common.[13] However, in populations where prevalence is low, it may not provide useful information if applied to plausibly related populations with comparable behaviors that are highly susceptible to developing oral cavity cancer.

One of the common drawbacks of any risk assessment model is a high false positivity rate. Anxiety among the patients and their families, along with the trauma of undergoing further testing unnecessarily, may result when the false positivity rates are high. However, for a disease like oral cancer, in which early detection significantly improves survival outcomes, it is preferable to have a higher false positivity rate rather than a higher false negative rate.[28] With a cut-off score of 2, the false positive rate of the multivariable risk model was 23.3%.

Using the data on which the model was created, internal model validation measures the model's statistical performance and evaluates optimism. A risk prediction model's performance is likely to be overly optimistic in the data sample from which it was created. For internal validation, k-fold cross-validation or bootstrapping are the recommended methods.[29] Bootstrap resampling with 1,000 resamples revealed similar discrimination and calibration of the developed model.

To apply the risk model, predicting risk after identifying oral cavity cancer risk factors is crucial. Beta coefficients from these factors calculated risk scores in this study, ranging from 0 to 8, with a cutoff score of 2 achieving the highest sensitivity (76%) and specificity (72%). The pooled model, which included smoking history, scored from 0 to 9, also performing best at a cutoff of 2. Although its sensitivity (76%) and specificity (63%) were slightly lower than those of the multivariable model, this outcome was expected, as it was based on pooled estimates from various studies and assessed using a new dataset.

Risk prediction models should be based on cohort studies. For diseases like oral cancer, cohort studies can be costly and time-consuming due to long induction periods. Comprehensive databases with detailed information on participant risk factors are currently lacking in India. Thus, a case–control study is a viable alternative for developing a risk prediction model. The key issue with such studies is the selection of control.

A strength of this study was that it aimed to ensure controls accurately represented the disease's prevalence in the original population, selecting non-cancer participants from the same hospitals as the cases. Caregivers were involved to simplify the process and encourage patient participation while maintaining data confidentiality, which limited public involvement in later analyses. Propensity scores used for inverse probability weighting in this study help reduce dimensions with multiple confounders, addressing interactions and non-linearity, a strength over previous studies.[30] Minimal missing data occurred because a single examiner conducted direct interviews, thereby reducing bias and fostering rapport with participants. The findings indicated that an individual's socioeconomic status can mediate the effects of tobacco and alcohol consumption on oral cavity cancer causation. The models were well calibrated and more accurate when stratified by socioeconomic class.

One limitation of case–control studies is the use of closed-ended questionnaires, which limit the collection of supplementary data on oral cavity cancer variables. However, focusing on a few specific factors minimizes recall bias.[31] The small sample size in each socioeconomic class, along with samples drawn from a single urban city in South India, restricts the model's generalizability, requiring validation on larger samples derived from multiple centers in the country. While we had data on the type of smoked form (cigarette or bidi/chutta) or smokeless form (Gutkha/Khaini or Quid without tobacco), we did not consider these individual forms for the risk scores due to limited sample sizes across the two categories. The most reliable test of a risk assessment model is external validation, which evaluates its statistical performance in comparable patient cohorts. This external validation can occur over different timeframes, locations, or populations.

There is a need for greater coverage of public health initiatives to prevent smoking, chewing tobacco, and alcohol consumption, as these increase oral cancer risk in the country. Effective tobacco control legislation and media campaigns have reduced tobacco use.[31] However, the rising popularity of carcinogenic products like areca nut and the evasion of bans contribute to increased oral cancer rates. Future research should investigate their potential carcinogenic effects. Tobacco use in India is declining due to heightened awareness and local laws, but risk factors remain consistent. The increase in oral cancer among non-tobacco users necessitates further investigation.[32] [33] Chronic trauma from poorly fitting dentures is another significant risk. Clinicians should ensure proper denture fit, educate patients, and monitor precancerous changes. Public health policies must improve dental care access, raise awareness, and implement preventive measures for high-risk groups, particularly the elderly and socioeconomically disadvantaged.[34] [35]


Conclusion

A predictive screening model utilizing risk scores was developed and validated in a population from a country with a high rate of oral cancer. Additional research is needed to confirm the model's effectiveness in different populations before it can be recommended to identify subgroups for more comprehensive screening strategies.



Conflict of Interest

None declared.

Authors' Contributions

M.M., P.K., and D.J. contributed to conception and design of the study. D.J. conceived the pooled analysis component which was used for external validation. M.M. led the contribution to data acquisition for the case–control study and for the pooled analysis component under the supervision of D.J. M.M. led the analysis and interpretation under the supervision of D.J. M.M. drafted the first manuscript and D.J. reviewed and modified the draft manuscript and prepared the final document. All authors critically reviewed the manuscript and gave final approval. M.M., P.K., and D.J. agree to be accountable for all aspects of work, ensuring integrity and accuracy.


Ethical Approval

The study has received ethics approval from Institutional Ethics Committee of MNJ Institute of Oncology and Regional Cancer Center, on August 04, 2022 (ECR/227/Inst/AP/2013/RR-19).


Data Availability Statement

Dataset can be obtained through written request to the corresponding author.


Patient Consent

Patient consent is not applicable for this study.



Address for correspondence

Denny John, BPT, MPH, MHA, PhD
Faculty of Life and Allied Health Sciences, MS Ramaiah University of Applied Sciences
New BEL Road, Bengaluru 560054, Karnataka
India   

Publication History

Article published online:
28 February 2026

© 2026. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)

Thieme Medical and Scientific Publishers Pvt. Ltd.
A-12, 2nd Floor, Sector 2, Noida-201301 UP, India

