Abstract
Background Breast cancer is the most prevalent heterogeneous disease among women, characterized
by distinct molecular subtypes and varied clinicopathological features. With the
emergence of artificial intelligence techniques, especially machine learning,
breast cancer research has made considerable progress in cancer detection and prognosis.
Objective Recent developments in computer-driven diagnostic systems have enabled clinicians
to improve the accuracy of detecting various types of breast tumors. Our study aims
to develop such a computer-driven diagnostic system to help clinicians distinguish
breast tumor types more accurately.
Methods In this article, we propose a breast cancer classification model based on the hybridization
of machine learning approaches for classifying triple-negative and non-triple-negative
breast cancer patients using clinicopathological features collected from multiple
tertiary care hospitals/centers.
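As a rough illustration of what a hybrid of this kind can look like, the sketch below wraps a genetic algorithm around an SVM (the GA-SVM combination named in the Results) for feature selection, scoring each candidate feature subset by 10-fold cross-validated accuracy. It is a minimal sketch and not the authors' implementation: the synthetic data from `make_classification`, the RBF kernel, the function names `fitness` and `ga_select`, and all GA settings (population size, number of generations, mutation rate, tournament selection, single-point crossover) are assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fitness(mask, X, y, cv):
    """Mean cross-validated accuracy of an SVM restricted to the masked features."""
    if not mask.any():  # an empty feature subset cannot be scored
        return 0.0
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))  # kernel is an assumption
    return cross_val_score(model, X[:, mask], y, cv=cv, scoring="accuracy").mean()


def ga_select(X, y, pop_size=10, generations=5, mutation_rate=0.1, seed=0):
    """Hypothetical GA wrapper: evolve boolean feature masks, return the best one seen."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)  # 10-fold CV
    pop = rng.random((pop_size, n_features)) < 0.5  # random initial masks
    best_mask, best_score = None, -1.0

    for gen in range(generations):
        scores = np.array([fitness(ind, X, y, cv) for ind in pop])
        i = int(scores.argmax())
        if scores[i] > best_score:
            best_mask, best_score = pop[i].copy(), float(scores[i])
        if gen == generations - 1:
            break  # no need to breed after the last evaluation

        children = [best_mask.copy()]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            parents = []
            for _ in range(2):  # binary tournament selection
                a, b = rng.integers(0, pop_size, size=2)
                parents.append(pop[a] if scores[a] >= scores[b] else pop[b])
            cut = rng.integers(1, n_features)  # single-point crossover
            child = np.concatenate([parents[0][:cut], parents[1][cut:]])
            flip = rng.random(n_features) < mutation_rate  # bit-flip mutation
            children.append(np.where(flip, ~child, child))
        pop = np.array(children)

    return best_mask, best_score


# Synthetic stand-in for a clinicopathological feature table (not real patient data).
X, y = make_classification(n_samples=200, n_features=12, n_informative=4, random_state=0)
mask, score = ga_select(X, y)
print("selected features:", np.flatnonzero(mask), "CV accuracy:", round(score, 3))
```

Because the same cross-validation score is used both to guide the search and to report the result, the printed accuracy is optimistic; a held-out set or nested cross-validation would be needed for an unbiased estimate.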
Results The genetic algorithm and support vector machine (GA-SVM) hybrid model
was compared with classic feature-selection SVM hybrid models such as support vector
machine-recursive feature elimination (SVM-RFE), LASSO-SVM, Grid-SVM, and linear SVM.
The GA-SVM hybrid model outperformed the other models when applied to two distinct
hospital-based datasets of patients investigated for breast cancer in the North West
of the African subcontinent. To validate predictive accuracy, 10-fold cross-validation
was applied to all models on the same multicenter datasets. Model performance was
evaluated with well-known metrics: mean squared error, logarithmic loss, F1-score,
area under the ROC curve, and the precision–recall curve.
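For readers less familiar with the metrics listed above, the following sketch computes four of them for a binary classifier in plain Python. It is illustrative only: the labels and predicted probabilities are made-up toy values, coding triple-negative as class 1 is an assumption, the 0.5 decision threshold for the F1-score is an assumption, and the precision-recall curve is omitted for brevity.

```python
import math


def mean_squared_error(y_true, p):
    """Mean squared difference between labels (0/1) and predicted probabilities."""
    return sum((t - q) ** 2 for t, q in zip(y_true, p)) / len(y_true)


def log_loss(y_true, p, eps=1e-15):
    """Average negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, q in zip(y_true, p):
        q = min(max(q, eps), 1 - eps)
        total -= t * math.log(q) + (1 - t) * math.log(1 - q)
    return total / len(y_true)


def f1_score(y_true, p, threshold=0.5):
    """Harmonic mean of precision and recall at the given probability threshold."""
    pred = [1 if q >= threshold else 0 for q in p]
    tp = sum(1 for t, h in zip(y_true, pred) if t == 1 and h == 1)
    fp = sum(1 for t, h in zip(y_true, pred) if t == 0 and h == 1)
    fn = sum(1 for t, h in zip(y_true, pred) if t == 1 and h == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, p):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    pos = [q for t, q in zip(y_true, p) if t == 1]
    neg = [q for t, q in zip(y_true, p) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy values, not study data: 1 = triple-negative (assumed coding), 0 = non-triple-negative.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1]

mse = mean_squared_error(y_true, y_prob)
ll = log_loss(y_true, y_prob)
f1 = f1_score(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(f"MSE={mse:.4f}  log loss={ll:.4f}  F1={f1:.4f}  AUC={auc:.4f}")
```

Lower is better for mean squared error and logarithmic loss, while higher is better for the F1-score and the area under the ROC curve.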
Conclusion The hybrid machine learning model can be employed for breast cancer subtype classification,
which could help medical practitioners improve treatment planning and disease
outcomes.
Keywords
triple-negative breast cancer - clinicopathological parameters - hybrid machine learning
models - classification - genetic algorithm - support vector machine