Keywords AI - CNN - Squash - glioma - CNS tumor - Artificial Intelligence - Convolutional Neural
Network - neurosurgery - neuro-oncology - neuropathology
Introduction
Central nervous system (CNS) malignancies account for approximately 1.3 to 2%
of all tumors in India, with a global incidence rate of 5 to 10 per 100,000 persons.[1][2][3] Accurate and timely diagnosis of these tumors is crucial for optimal patient management.
Computer-aided diagnosis has been widely used in neurosurgery, particularly in neuroradiology,
assisting in interpreting images from X-ray, computed tomography (CT), and magnetic
resonance imaging (MRI). Artificial Intelligence (AI) has been successfully applied
in medical imaging, including generating reports for traumatic brain injury, segmenting
brain tumor morphology, and differentiating high- and low-grade brain tumors using
MRI.[4][5][6] However, these methods predominantly rely on complex feature extraction and preprocessing
techniques, which can be time-consuming and error-prone.
Intraoperative squash smear cytology, introduced in the early 1930s, remains a valuable
diagnostic tool for CNS tumors, with reported accuracies ranging from 76 to 96%.[7 ] The technique involves rapid preparation and microscopic evaluation of tumor tissue
and typically requires 30 minutes to 2 hours for interpretation. While this timeframe
may seem reasonable, it can be a significant delay in the context of an ongoing neurosurgical
procedure. Furthermore, access to an expert neuropathologist is often limited, particularly
during off-hours, weekends, and in resource-constrained settings. In some cases, an
initial sample may not be representative of the tumor, necessitating additional sampling,
which prolongs surgery. In stereotactic biopsies, immediate confirmation of whether
the obtained sample is from the target site could prevent unnecessary negative biopsies.
Research Gap and Problem Statement
Despite advancements in AI-driven pathology, limited research has explored its application
in intraoperative cytology. Existing AI-based diagnostic tools primarily focus on
radiological imaging and histopathology slides, with little emphasis on real-time
cytological assessment. This study seeks to address this gap by developing an AI-driven
system capable of providing rapid, reliable, and automated intraoperative squash smear
cytology classification for CNS tumors.
Literature Review
AI in medical imaging and histopathology: prior studies have demonstrated the effectiveness
of deep learning models in medical imaging. For example, Chilamkurthy et al (2018)[8 ] applied CNNs to non-contrast CT scans, achieving high accuracy in detecting intracranial
hematomas and skull fractures. Xu et al (2015)[9 ] explored CNN-based classification of brain tumor histopathology slides, reporting
a 97.5% accuracy. However, these studies primarily focus on MRI, CT, and histological
specimens, lacking applications in intraoperative smear cytology.
CNN applications in tumor classification: CNN models have been employed for various
oncological applications, such as breast cancer histology classification (Araújo et
al. 2017)[10] and cervical cancer detection using Pap smears (Bora et al. 2016).[11] These studies validate CNNs as efficient diagnostic tools, yet they do not address
the unique challenges of intraoperative squash smear cytology.
Squash smear cytology in brain tumors: Squash smear cytology remains a widely used
intraoperative diagnostic technique with accuracies between 76 and 96% (Jindal et
al. 2017; Mitra et al. 2010).[7][12] Despite its effectiveness, reliance on neuropathologist expertise limits its widespread
availability, particularly in emergency scenarios and underresourced settings.
AI in cytopathology and squash smear analysis: while AI has been applied to cytopathology
in limited capacities, such as thyroid fine-needle aspiration cytology classification
(Sanyal et al. 2018),[13 ] its application in intraoperative CNS tumor cytology is underexplored. Carneiro
et al. (2016)[14 ] discussed deep learning approaches for medical image labeling but did not specifically
address intraoperative CNS tumor diagnosis.
This study aims to bridge the existing research gap by:
(1) Developing a CNN-based AI model specifically trained on a large dataset of 10,000
intraoperative squash smear images for glioma classification.
(2) Addressing the unmet need for AI-assisted intraoperative cytology by automating the
differentiation between high- and low-grade gliomas.
(3) Implementing feature visualization techniques to enhance model interpretability and
provide insight into AI-based decision-making.
(4) Introducing a rapid, reproducible AI-driven diagnostic tool that can assist neurosurgeons
in centers with limited neuropathology expertise, improving intraoperative decision-making
and patient outcomes.
By leveraging AI to enhance intraoperative diagnostic capabilities, this study may
contribute by reducing diagnostic delays and increasing the accessibility of high-quality
tumor classification in neurosurgery.
Methods
This study aimed: (1) to construct a CNN model specifically for analyzing brain tumor
tissue squash smear cytology and (2) to assess the efficacy of such AI agents in discerning
between high-grade versus low-grade gliomas.
Our intent was to build a standardized dataset of CNS tumor cytology slides for training
and testing, which is absent at present. Once the dataset was formed, an AI agent, a CNN
model tailored for handling such medical images, was developed. The study design workflow
is described in [Fig. 1].
Fig. 1 Design workflow for the study. (A ) Image acquisition: intraoperative squash smear slides of gliomas were prepared and
digitized using a high-resolution microscope. (B ) Image preprocessing: the images were resized, normalized, and augmented to enhance
model generalization. (C ) Convolutional neural network (CNN) model development: a CNN was designed and trained
using a dataset of 10,000 labeled images. (D ) Model evaluation: the trained model was validated and tested to assess accuracy,
sensitivity, and specificity. (E ) Feature visualization and interpretation: heatmaps were generated to understand
how the model makes predictions, ensuring transparency and reliability.
Image Acquisition and Data Collection
Intraoperative squash smear slides of CNS neoplasms from our center's Department of
Neuropathology were gathered. The study included retrospective cases of squash smears
prepared between June 2017 and December 2018. A minimum of 50 representative images
were obtained from each slide. Only nonoverlapping, uniformly spread, and well-stained
areas on the slide were selected for imaging. These slide images were then digitized
using a digital microscope for artificial neural network (ANN) training. The data collected
included images of the squash smear slides and the final histopathological diagnosis,
which only considered gliomas. These obtained images were divided into two sets: a
training set and a validation set/testing dataset ([Fig. 2 ]). The training set images were exclusively used for network training, whereas the
rest were used solely for validation/testing. Of the total 10,000 images collected,
6,000 were high-grade gliomas and 4,000 were low-grade gliomas. Of these, 3,200 were
used for training, whereas the remainder were used for testing/validation.
Fig. 2 Random distribution of images across the datasets. The images not used for training
are reserved for validation and testing; the model never sees them during training, so
they provide an estimate of how well the model generalizes.
Details of Artificial Neural Network Training
Image Preprocessing
The acquired images had a height of 480 pixels and a width of 612 pixels. They were
encoded in the RGB color model and therefore comprised three channels (612 × 480 × 3);
image dimensions given below follow the same convention. Initially, the image resolution
was reduced to (150 × 150 × 3) to make computation feasible on a local system. Each image
was then vectorized, pixel by pixel, into large arrays, which were normalized using
min-max scaling and used as inputs. The inputs were passed through the Image Data
Generator, which applies random noise, rotations, and other transformations to augment
the data and improve generalization. [Fig. 3] illustrates an image before (A) and after (B) preprocessing.
Fig. 3 Image before (A) and after (B) preprocessing. Some color changes may be noticeable as a result of the preprocessing
algorithm.
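The preprocessing steps described above can be sketched as follows. This is a minimal NumPy stand-in, not the Keras pipeline used in the study: the function names, the nearest-neighbor resizing, the flip and 90-degree rotation choices, and the noise level are all assumptions made for illustration, and the input is a synthetic array rather than a real smear image.

```python
import numpy as np

def resize_nearest(image, out_h=150, out_w=150):
    """Nearest-neighbor resize of an (H, W, C) array (assumed resizing method)."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols[None, :]]

def min_max_scale(image):
    """Linearly rescale pixel values to the range [0, 1]."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

def augment(image, rng, noise_std=0.02):
    """Random horizontal flip, random 90-degree rotation, and Gaussian noise.

    These transformations and the noise level are illustrative assumptions.
    """
    if rng.random() < 0.5:
        image = image[:, ::-1]
    image = np.rot90(image, k=int(rng.integers(0, 4)))
    noisy = image + rng.normal(0.0, noise_std, size=image.shape)
    return np.clip(noisy, 0.0, 1.0).astype(np.float32)

rng = np.random.default_rng(0)
# Synthetic stand-in for a 612 x 480 RGB smear image (height 480, width 612).
raw = rng.integers(0, 256, size=(480, 612, 3), dtype=np.uint8)
x = augment(min_max_scale(resize_nearest(raw)), rng)
print(x.shape, x.dtype)
```

In practice, augmentation is applied afresh on every epoch so that the network rarely sees exactly the same pixel array twice.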
Building the Convolutional Neural Network Model
A specialized form of ANN, the Convolutional Neural Network (CNN), utilizes a convolution
function:
Y_k = f(W_k ∗ x)
This formula simplifies the full convolution formula[15] for the purposes of this paper. Here, x denotes the input, W_k signifies the filter for
the kth feature map, ∗ denotes the convolution operator, f(·) denotes a nonlinear activation function, and Y_k represents the
kth output feature map computed from the input x. Convolution, at its
core, is a linear operator; the nonlinearity is introduced by f.
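A small worked example may make the formula concrete. The sketch below applies one 2 × 2 filter to a 3 × 3 input and passes the result through a ReLU activation; the input values, the filter, and the choice of ReLU for f are assumptions for illustration only. As in most deep learning libraries, the filter is slid over the input without being flipped.

```python
import numpy as np

def conv2d_valid(x, w):
    """Slide filter w over a 2-D input x with stride 1 and no padding."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    y = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the filter and the current patch, summed.
            y[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return y

def relu(z):
    """Assumed choice for the activation function f."""
    return np.maximum(z, 0.0)

x = np.array([[3.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])
w_k = np.array([[1.0, 0.0],
                [0.0, -1.0]])

pre_activation = conv2d_valid(x, w_k)  # [[2, -1], [0, 0]]
y_k = relu(pre_activation)             # [[2, 0], [0, 0]]
print(y_k)
```

The negative response at the top right is suppressed by the activation, which is the only nonlinear step in the computation.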
A CNN model was assembled based on VGG19,[15] and additional layers were integrated to handle the processed images ([Fig. 4]). The researcher (B.T.) built the model in Python using the Keras library
with a TensorFlow backend, all of which are open source. As illustrated in
[Fig. 4], the input shape is (150, 150, 3). The layers were constructed
as alternating convolution and pooling layers, connected via dense (fully connected)
layers. An additional set of ANN layers was constructed atop this model
to form the final five layers of the network; these were designed specifically with
the nature of the input and data in mind. Finally, the model was compiled with the
stochastic gradient descent (SGD) optimizer, a “binary crossentropy” loss function,
and “accuracy” as the evaluation metric, resulting in a total of 28,939,329 trainable
parameters.
Fig. 4 CNN model based on VGG19 with additional layers. All parameters were retrained to
adapt the model to the squash smear images.
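The reported parameter count can be checked with simple arithmetic. The sketch below counts the weights of the standard VGG19 convolutional blocks for a 150 × 150 × 3 input and then adds a dense head. The head widths (1024, 512, and 1 units after flattening) are an assumption: the text does not list them, and they are used here only because they happen to reproduce the reported total of 28,939,329.

```python
def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Standard VGG19 convolutional blocks: (filters, number of 3x3 conv layers).
VGG19_BLOCKS = [(64, 2), (128, 2), (256, 4), (512, 4), (512, 4)]

def vgg19_base(side=150, channels=3):
    """Return (parameter count, flattened feature length) of the conv base."""
    total = 0
    for filters, n_convs in VGG19_BLOCKS:
        for _ in range(n_convs):
            total += conv_params(channels, filters)
            channels = filters
        side //= 2  # 2x2 max pooling halves the side, rounding down
    return total, side * side * channels

base_params, flat_features = vgg19_base()

# ASSUMED head widths; not stated in the text.
head_units = [1024, 512, 1]
head_params, n_in = 0, flat_features
for n_out in head_units:
    head_params += dense_params(n_in, n_out)
    n_in = n_out

total_params = base_params + head_params
print(base_params, flat_features, head_params, total_params)
```

Under these assumptions the convolutional base contributes 20,024,384 parameters and the head 8,914,945. Layers without weights, such as flatten or dropout, would not change the total.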
The final output layer produced a single binary label: “1” for high-grade glioma
or “0” for low-grade glioma. This layer used the “Sigmoid” activation function,
which yields an output between 0 and 1 that can be read as the predicted probability
of high-grade glioma; one minus this value is the corresponding probability of
low-grade glioma.
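The relationship between the sigmoid output, the binary label, and the binary cross-entropy loss can be illustrated with a few lines of NumPy. The pre-activation values, the ground-truth labels, and the 0.5 decision threshold below are hypothetical; the text does not state which threshold was used.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean binary cross-entropy; p is the predicted probability of class 1."""
    p = np.clip(p, eps, 1.0 - eps)  # guard against log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

# Hypothetical pre-activation outputs of the final layer for three images.
scores = np.array([2.0, -1.0, 0.0])
p_high = sigmoid(scores)              # probability of high-grade glioma
p_low = 1.0 - p_high                  # probability of low-grade glioma
labels = (p_high >= 0.5).astype(int)  # assumed threshold: 1 = high grade, 0 = low grade

y_true = np.array([1, 0, 1])          # hypothetical ground truth
loss = binary_crossentropy(y_true, p_high)
print(labels, round(loss, 4))
```

Confident correct predictions contribute little to the loss, whereas a prediction near 0.5 is penalized even when the thresholded label is correct.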
Training the Model
Next, the CNN model was trained on an Intel-architecture workstation with 16 GB of RAM
and an NVIDIA RTX 2060 graphics processor with 6 GB of memory for acceleration. Training
took 121.02 minutes, in batches of 32 images across 100 epochs. Notably,
the training was performed without any manual feature extraction or human intervention.
Validation and Statistics
After training the ANN using the images from the training set, the accuracy and cross-validation
matrices were computed on the validation sets, which had not been previously analyzed
by the neural network. This provided estimates of accuracy and precision on novel images,
helping to validate the model's fitness for real-world practice.
Statistical analysis was performed using the scikit-learn library, used alongside
the Keras library.
Results
Results during Dataset Training
[Fig. 5] displays line charts of the loss and accuracy for the training and validation
data. Lower values are better for loss and higher values are better
for accuracy. The model converges by the 20th epoch, with minimal overfitting. Techniques such as learning rate regulation and
dropout were employed to achieve these results, and multiple models were tested before
settling on the final model. While tuning hyperparameters can be tedious
and time-consuming, it is essential. By the 100th epoch, the loss was 0.0950 on the training set and 0.1016 on the validation
set, with accuracies of 96.2% on the training set and 96.39% on the
validation set. Note that this validation dataset serves for internal
comparison during training but is not used to fit the model.
Fig. 5 Training and validation. (A ) Loss and (B ) accuracies over 100 training epochs.
Results during the Testing Phase
The testing dataset, which includes 1,800 images, was used to produce the final results.
The model output “1” for “High Grade Glioma” and “0” for “Low
Grade Glioma,” and could classify images individually or in batches.
The results discussed below cover all images in the testing dataset.
A confusion matrix, shown in [Fig. 6], cross-tabulates the true labels against the labels predicted
by the model on previously unseen images from the testing dataset. Since the outcome is
binary, either high grade or low grade, the confusion matrix is a simple 2 × 2 table.
Sensitivity, specificity, and positive and negative predictive values were computed as
shown in [Table 1]. The model correctly identified 91% of high-grade gliomas (sensitivity) and 77% of
low-grade gliomas (specificity). Furthermore, each report can be generated within a few milliseconds.
Table 1 Sensitivity, specificity, and positive and negative predictive values of the model on the testing dataset
| Metric | Value |
| --- | --- |
| True positive rate (sensitivity) | 91% |
| True negative rate (specificity) | 77% |
| Positive predictive value (PPV) | 86.6% |
| Negative predictive value (NPV) | 83.89% |
| F1-score | 0.887 |
Fig. 6 Confusion matrix built on the testing dataset. hgg, high-grade glioma; lgg,
low-grade glioma/others.
The F1-score, the harmonic mean of precision (PPV) and recall (sensitivity), summarizes
the overall performance of the model. It ranges from 0 to 1, with higher values being
better. The F1-score recorded here is 0.887.
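The metrics above all derive from the four cells of the 2 × 2 confusion matrix. The sketch below shows the formulas and then recomputes the F1-score from the reported PPV and sensitivity. The cell counts passed to `binary_metrics` are hypothetical and are not the study's confusion matrix, which is given only in Fig. 6; the function names are likewise illustrative.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from a 2 x 2 confusion matrix (positive = high grade)."""
    sensitivity = tp / (tp + fn)   # true positive rate, also called recall
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value, also called precision
    npv = tn / (tn + fn)           # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "npv": npv, "f1": f1}

def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for illustration only, NOT the study's data.
toy = binary_metrics(tp=90, fp=20, tn=80, fn=10)

# Consistency check using the values reported in Table 1.
reported_f1 = f1_from(precision=0.866, recall=0.91)
print(toy)
print(round(reported_f1, 3))
```

Using the reported PPV of 86.6% and sensitivity of 91%, the harmonic mean rounds to 0.887, which agrees with the F1-score in Table 1.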
Feature Visualization
This section illuminates how the network perceives each slide image and how it can
be instrumental in locating regions of interest for review by neuropathologists. Screening
thousands of high-power fields on a given slide can be a challenging task for neuropathologists.
This process may prove beneficial if AI models screen the regions of interest, thereby
limiting the neuropathologist's scrutiny to specific regions of interest on the entire
slide. Different stages of image features visualized by the model are showcased in
[Fig. 7 ].
Fig. 7 Various stages of images seen by the model. (A) Original image given as input to the CNN model. (B) Visualization of the first layer (block1_conv1) parameters. (C) Heatmap of the output from the final layer (block5_conv5) of the CNN model,
superimposed on a grayscale version of the original image. Red
indicates the regions given the greatest importance in the decision that
the image shows high-grade glioma, followed by yellow and green. The background and normal cells
are given the least importance in the model's final decision.
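The overlay step described in the caption of Fig. 7 can be sketched as follows. The text does not state how the heatmap itself was derived from the network, so this example starts from a hypothetical coarse importance map and shows only how such a map could be upsampled, normalized, colored from green through yellow to red, and blended over a grayscale copy of the image. The function names, the blending weight, and the color ramp are assumptions.

```python
import numpy as np

def to_grayscale(rgb):
    """Luminance-weighted grayscale of an (H, W, 3) image with values in [0, 1]."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def upsample_nearest(heat, out_h, out_w):
    """Nearest-neighbor upsampling of a coarse 2-D map to the image size."""
    rows = np.arange(out_h) * heat.shape[0] // out_h
    cols = np.arange(out_w) * heat.shape[1] // out_w
    return heat[rows[:, None], cols[None, :]]

def overlay_heatmap(rgb, heat, alpha=0.4):
    """Blend a green-yellow-red importance map over the grayscale image."""
    h, w = rgb.shape[:2]
    heat = upsample_nearest(heat, h, w)
    span = heat.max() - heat.min()
    heat = (heat - heat.min()) / span if span > 0 else np.zeros_like(heat)
    # Color ramp: 0 -> green, 0.5 -> yellow, 1 -> red.
    red = np.clip(2.0 * heat, 0.0, 1.0)
    green = np.clip(2.0 - 2.0 * heat, 0.0, 1.0)
    colour = np.stack([red, green, np.zeros_like(heat)], axis=-1)
    gray = to_grayscale(rgb)[..., None]
    return (1.0 - alpha) * gray + alpha * colour

rng = np.random.default_rng(0)
image = rng.random((150, 150, 3))   # synthetic stand-in for a preprocessed image
coarse_map = rng.random((4, 4))     # hypothetical importance map from a late layer
blended = overlay_heatmap(image, coarse_map)
print(blended.shape)
```

A smoother interpolation would give a less blocky overlay; nearest-neighbor upsampling is used here only to keep the sketch dependency-free.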
Discussion
This study demonstrates that AI agents built with CNN can be applied to determine
the diagnosis of CNS malignant tumor squash smears. We have found that an AI model
trained from scratch can efficiently differentiate and diagnose High-Grade Gliomas
from Low-Grade Gliomas with human-comparable accuracies of 91 and 77%, respectively.
With the development of a suitable infrastructure, we can deliver appropriate intraoperative
diagnoses. We can also achieve preliminary reports on the nature of the tumor with
reasonable accuracy and speed.
Squash smear, prepared from tumor tissue obtained during surgery and sent for analysis,
provides insight into the nature and type of the tumor a neurosurgeon is dealing with.
Reported accuracy from various studies comparing squash diagnosis with the final pathological
diagnosis ranges from 83 to 95%, with varying accuracies in squash obtained from stereotactic
biopsy procedures and direct tumor decompression procedures.[7][11][12][16]
AI can simplify diagnosis. Previous studies, reliant on some form of feature selection,
suggest cervical cancers can be diagnosed with accuracies ranging from 85 to 90% on
an external testing dataset.[11] Furthermore, in thyroid cancer, fine-needle aspiration cytology diagnosis
can deliver a sensitivity of 90.48%.[13] However, many of these studies were limited in their sample size, leading to concerns
about the generalizability of results.
Chilamkurthy et al.[8 ] demonstrated the utility of CNN networks in predicting various abnormalities from
noncontrast CT scans. Their study suggested accuracies up to 92% can be achieved in
detecting pathologies such as intracranial hematoma, subdural or extradural hematoma,
and bony abnormalities like skull fractures (92.0%), midline shift, and mass effect
(93.0%). In another study, histopathological slides for CNS tumor classification and
segmentation were analyzed using CNN.[9] The authors used only 23 histopathological slides of glioblastoma multiforme and 22 images
of low-grade glioma and achieved accuracies of up to 97.5%. In contrast,
our study used 10,000 images to build the CNN model, which is, to the best of our
knowledge, presently the only available tool for diagnosing squash smear cytology of brain tumors.
Our CNN AI agent detected high-grade glioma and low-grade glioma with sensitivities
of 91 and 77%, respectively. These results are comparable to
conventional detection by human pathologists (83–95%)[7][12] and to those reported in various other studies (77–80%).[17]
As revealed in the results with feature visualization, we can gain insight into how
the CNN model operates. This offers us an understanding of which parts of an image
our model prioritizes during diagnosis. With feature visualization, we can screen
an entire slide, providing the expert with an overview and more focused observations.
This process saves substantial human effort and expert time.
Limitations and Future Possibilities
The study relies on images taken from a microscopic camera by a pathologist. Only
nonoverlapping, well-stained, well-spread smears were imaged, which may not represent
actual real-time conditions when digitized images are subjected to AI. However, this
study points toward better machine learning and echoes the need for retraining and
refinement of the AI process to overcome practical, real-time challenges and offer
superior results and guidance.
Looking forward, there are many possibilities for expansion in this research area.
The same model can be made more accurate and generalized for diagnosing all types
of brain tumors. Similar technology, if implemented in surgical cellular microscopes,
can someday help differentiate normal from abnormal brain tissue in real time, particularly
in the case of low-grade gliomas. However, further well-funded research with more
available data is needed to make all this possible. In the next stage of the study,
we plan not only to distinguish normal from abnormal tissue with a binary classifier
but also to distinguish among other common neurosurgical pathologies
such as meningioma, schwannoma, and pituitary macroadenoma.
Conclusion
If properly standardized images of CNS tumor squash smear cytology are obtained, AI
can be reliably used. It can achieve accuracies comparable to human experts in diagnosing
neuropathological slides and could prove useful in future endeavors to accelerate
the diagnosis of smear slides by these experts.
It is important to note that this project is a pilot study in the realm of CNS tumor
cytology analysis using deep learning techniques. The promising results advocate for
the continuation of further research on the subject, with an emphasis on using larger
datasets and scaling the model to enhance its generalizability.