Introduction
The adoption and implementation of digital health in gastroenterology have occurred
more quickly than was expected. This has been primarily related to the application
of artificial intelligence (AI) utilizing deep learning for image analysis in the
field of gastrointestinal endoscopy. AI for polyp detection has been proven effective
in standalone performance studies as well as in randomized clinical trials. In a very
short time, several products have been approved by regulatory agencies and introduced
into clinical practice [1]. Similar algorithms are now available to detect early cancer in upper gastrointestinal
endoscopy and small bowel endoscopy [2] [3] [4]. More recently, algorithms for neoplasia characterization, mainly in the lower gastrointestinal
field, have become available, opening the door to AI-assisted optical diagnosis [5].
In addition to the technology’s widespread use in gastrointestinal endoscopy, artificial
neural network applications have expanded to data analytics that may lead to a more
precise or personalized diagnostic and therapeutic process based on a multidimensional
assessment of disease severity and predictors of either prognosis or therapeutic responses
[6] [7] [8]. This may apply to inflammatory bowel disease (IBD), liver diseases, and other immune-related
or oncological disorders of the gastrointestinal tract.
The primary reason for the rapid translation of AI into clinical practice is that
these plug-in software algorithms only require the generation and
validation of a mathematical algorithm. This is a major difference from hardware
innovations, such as new endoscopic platforms or advanced imaging, and from new drugs, which
tend to appear relatively infrequently, over a 5- to 10-year span. However, the development
of AI algorithms still requires the collection of cases and supervised annotation,
which is a time-consuming process that may require several years to be finalized.
For this reason, it is important to understand how this translational process works.
For instance, observational studies are needed to collect large and robust databases
to train AI algorithms. On the other hand, prospective interventional trials are required
before such algorithms can be clinically implemented. Clustering of studies on
one specific topic may be expected to lead to rapid implementation in clinical practice,
while some other fields may be marginalized due to paucity of data or challenges in
algorithm generation.
To predict the future of AI in gastroenterology, we systematically reviewed all the
relevant studies registered on the ClinicalTrials.gov platform, recording their main field and
purpose.
Methods
This systematic review was based on data retrieved from ClinicalTrials.gov (https://clinicaltrials.gov),
a free database listing 390,330 studies (November 2021). This database has been widely
used in the past when conducting systematic reviews to analyze research trends in
gastroenterology (such as in IBD) [9] [10] and daily endoscopic practice [11] [12] [13], forecasting future research efforts and drawing patient-centered management plans
or guideline proposals. The search-filter strategy, data extraction sub-classifications,
and study nomination process are reported in the Appendix.
Inclusion and exclusion criteria
For our study, only study entries that focused on the use of AI-based applications in
the field of gastroenterology/hepatology were selected. The term AI, formally defined
by Medical Subject Headings (MeSH) in 1986 and elaborated upon further in the Appendix, refers to any task completed by a computer system that would ordinarily be achieved
using human intellect and reasoning. The terms machine learning and deep learning
fall under the broad class of AI as per MeSH definitions. Study entries that analyzed
machine learning algorithms to produce models that predict disease progression, prognosis,
medical treatment outcomes, and/or improve the quality of current clinical practices
in the field were included. Study entries that utilized AI but had no relevance to
the field and vice versa were excluded. If multiple versions of the same entry existed,
only the most recent version was included, and earlier versions were excluded. Studies
that had remained inactive for more than 2 years were automatically excluded. No language
restrictions were needed during the selection process as all study entries registered
to ClinicalTrials.gov were in English.
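The selection rules above can be expressed as a short filter. The following Python sketch is illustrative only: the record fields (registry_id, version, uses_ai, gi_hepatology, last_update) are hypothetical and do not mirror the registry's schema, and measuring inactivity from a last-update date is an assumption, since the text does not state how inactivity was determined.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StudyEntry:
    # All field names are hypothetical, chosen for illustration only.
    registry_id: str
    version: int
    uses_ai: bool
    gi_hepatology: bool
    last_update: date


def select_entries(entries, search_date, max_inactive_years=2):
    """Apply the inclusion/exclusion rules to a list of study entries."""
    # Keep only the most recent version of each registry entry.
    latest = {}
    for entry in entries:
        kept = latest.get(entry.registry_id)
        if kept is None or entry.version > kept.version:
            latest[entry.registry_id] = entry

    selected = []
    for entry in latest.values():
        # Exclude entries that use AI outside the field, or vice versa.
        if not (entry.uses_ai and entry.gi_hepatology):
            continue
        # Assumption: inactivity is measured from the last update date.
        inactive_days = (search_date - entry.last_update).days
        if inactive_days > max_inactive_years * 365:
            continue
        selected.append(entry)
    return selected


entries = [
    StudyEntry("A", 1, True, True, date(2021, 5, 1)),
    StudyEntry("A", 2, True, True, date(2021, 9, 1)),   # newer version of A
    StudyEntry("B", 1, True, False, date(2021, 6, 1)),  # AI, but outside the field
    StudyEntry("C", 1, True, True, date(2018, 1, 1)),   # inactive for > 2 years
]
kept = select_entries(entries, search_date=date(2021, 11, 1))
print([(e.registry_id, e.version) for e in kept])
```

In this toy example, only the second version of entry A passes all three rules.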
Data extraction
Each study entry was searched for the following information: 1) geographical setting;
2) funding (academic/hospital); 3) year of registration; 4) field of application (gastrointestinal
endoscopy/IBD/liver disease/other); 5) type and design of study (interventional/observational,
single-center/multicenter); 6) patient population (number, age); 7) state of recruitment
(completed/ongoing/not started); 8) study-endpoints; and 9) study status (published/unpublished).
One-dimensional frequency distributions (absolute and relative) were determined for the
analyzed study characteristics. Quantitative data acquisition, processing, and
statistical evaluation were carried out using Microsoft Excel for Microsoft
Windows.
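For readers who wish to reproduce this type of tabulation outside a spreadsheet, a one-dimensional frequency distribution can be sketched in a few lines of Python. This is not the authors' workflow (the analysis was done in Microsoft Excel); the function name frequency_table is hypothetical, and the input uses the study-type counts reported in the Results (48 interventional and 55 observational entries).

```python
from collections import Counter


def frequency_table(values):
    """Return {category: (absolute count, relative frequency in whole percent)}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        category: (n, round(100 * n / total))
        for category, n in counts.most_common()
    }


# Study-type counts as reported in the Results section.
designs = ["interventional"] * 48 + ["observational"] * 55
table = frequency_table(designs)
for category, (n, pct) in table.items():
    print(f"{category}: {n}/{len(designs)} ({pct} %)")
```

The relative frequencies are rounded to whole percentages, matching the way proportions are reported in the text.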
Results
Study characteristics
An initial search of the ClinicalTrials.gov database identified
115 related study records. Eight study entries were excluded as unrelated to gastroenterology,
two due to lack of AI technology, while one study was excluded as a duplicate. Thus,
103 study entries were included in our analysis. [Fig. 1] shows the methodological process used by the authors to select study entries. Twenty-one
of 103 (20 %) and 27 of 103 (26 %) had already closed or completed enrollment at the
time of the search, while 49 of 103 (48 %) were open (status not reported for six
study entries). The study end points and area of focus of the individual AI studies
are summarized in [Table 1] and Supplementary Table 2, respectively. Supplementary Table 1 shows the publication status of completed studies. Supplementary Fig. 1 shows the recruitment age bracket for the selected study entries.
Fig. 1 Flowchart of the systematic review.
Table 1 End points of the included studies.
Main topics of the study entries | No. studies
Upper gastrointestinal endoscopy
Esophagus
Esophageal cancer detection: Squamous cell neoplasia-focused; incorporating probe-based confocal laser endomicroscopy (pCLE) | 2
Barrett’s early detection-focused; incorporating image retrieval methods | 1
Esophageal cancer treatment response prediction: Predicting a complete pathological response to neoadjuvant therapy; incorporating radiomics | 1
GERD assessment: Incorporating near focus narrow band imaging AI | 1
Stomach
Gastric precancer/polyp/neoplasm detection | 10
Magnetic controlled capsule endoscopy: AI platforms developed for MCCE, quality assessments | 3
Clinical decision support system: for upper gastrointestinal cancer care | 1
Gastric cancer (GC) multiomics: AI diagnostic algorithm; incorporating multiomics to characterize advanced GC in Europe, Latin American and Caribbean populations | 1
Lower gastrointestinal endoscopy
Detection polyp/neoplasm | 22
Detection & characterization polyp/neoplasm | 21
Characterization polyp/neoplasm | 9
Bowel preparation quality | 3
Colorectal cancer biomarkers and multiomics evaluation: application of AI to identify blood and stool biomarkers to detect early colorectal cancer (Freenome) | 2
Small bowel
Small bowel bleeding – capsule endoscopy: AI assistance in video reading | 1
Hepatology
Hepatocellular carcinoma detection and characterization: Incorporating imaging phenomics to create signatures for various HCC types (iBiopsy) | 2
Hepatocellular carcinoma prognostication: Incorporating MRI radiomics to predict the prognosis of early-stage HCC after minimally invasive treatment | 2
Hepatocellular carcinoma screening: Incorporating DL to clinical, biological, elastographic and ultrasonic parameters to risk stratify hepatocarcinogenesis in non-tumor liver parenchyma | 1
Hepatobiliary disease – ocular association: Incorporating DL to predict hepatobiliary diseases from ocular images | 1
Metastatic liver disease in colon cancer: Developing AI based software to predict metastatic liver nodules in patients with colorectal cancer | 1
Non-alcoholic fatty liver disease & non-alcoholic steatohepatitis: Incorporating AI to differentiate both and to stage fibrosis | 1
Polycystic liver disease: Developing a CNN for automated liver contour; segment detection in polycystic liver | 1
Pancreaticobiliary
Pancreatic neoplasm histology: AI use for rapid on-site evaluation & automated counting of Ki-67 in biopsy samples of neuroendocrine neoplasias respectively. | 2
ERCP navigation system: AI application to biliary stricture navigational instructions for guidewire direction and stent placement | 2
Acute pancreatitis (AP): AI application to determining severity of AP | 1
Pancreatic disease biomarker evaluation: Pairing AI with biomolecular analyses of markers (Berg’s Interrogative Biology Platform) | 1
Pancreatic neoplasm screening: AI-based surveillance to predict early pancreatic cancer using health records and big data | 1
Endoscopic ultrasound for pancreatic cancer staging: Lymph node metastases detection and characterization DL algorithm | 1
EUS navigation system: DL-based real-time scanning of the pancreas for lesion detection | 1
Choledocholithiasis prediction model: Symbolic regression applied to symptomatology, biochemical and imaging parameters | 1
IBD
Crohn’s and ulcerative colitis: AI application to assess disease severity via endoscopic images, Raman spectroscopy, chronic pain profiling, histology and radiomics | 4
Ulcerative colitis: DL application to automate evaluation of inflammatory activity using pCLE, red density and other big data, respectively | 3
Crohn’s disease: AI applied to signals obtained from digital wearables to forecast transition of symptoms when in stress. | 1
GERD, gastroesophageal reflux disease; AI, artificial intelligence; HCC, hepatocellular
carcinoma; MRI, magnetic resonance imaging; DL, deep learning; CNN, convolutional
neural network; ERCP, endoscopic retrograde cholangiopancreatography; EUS, endoscopic
ultrasound; IBD, inflammatory bowel disease.
Study type and design
Among the 103 study entries, 48 (47 %) were planned as interventional and 55 (53 %)
as observational studies. In detail, 30 of 48 (63 %) interventional trials were randomized,
16 of 48 (33 %) were listed as N/A (not applicable), and two of 48 (4 %) were non-randomized
trials. Of the 30 randomized trials, 29 (97 %) had a parallel design, while one
(3 %) was based on a factorial assignment. In the non-randomized interventional
group, both studies had a parallel assignment. The observational studies were predominantly
cohort-type (34/55 [62 %]), followed by case-only (5/55 [9 %]), case-control (3/55 [5 %]),
and other methodologies (13/55 [24 %]). The time frame of the
observational studies was prospective in 38 of 55 (69 %), retrospective in nine (16 %),
cross-sectional in one (2 %), and other in seven (13 %). The study types and designs
are illustrated in Supplementary Fig. 2.
Study location and funding source
According to the retrieved records, the studies were based in Asia in 45 of 103 cases
(44 %), in Europe in 44 of 103 (43 %), in the United States in 10 of 103 (10 %), and in North America
in four of 103 (4 %). [Fig. 2] depicts the geographical distribution of registered study protocols across various
countries worldwide. The majority of the studies (74/103 [72 %]) were planned as single-center.
Hospitals and universities were the primary funding sources for trials conducted in
Asia (43 of 45 [96 %]), Europe (35 of 44 [80 %]), and North America (four of four [100 %]),
whereas the primary funding source in the United States was industry (five
of 10 [50 %]). The various funding sources for each location are shown in Supplementary Fig. 3.
Fig. 2 Geographical location of the study entries. The map was created with the help of free
software https://www.mapchart.net/world.html
AI field of application
The area of focus of AI research was gastrointestinal endoscopy in 76 of
103 study entries (74 %), followed by pancreaticobiliary diseases in 10 of 103 (10 %),
hepatology in nine of 103 (9 %), and IBD in eight of 103 (8 %). Among the records
in gastrointestinal endoscopy, colorectal polyp detection was the most frequently researched
area (43 of 76 [57 %]), followed by colorectal polyp characterization (30 of 76
[39 %]). [Fig. 3] shows the main field of the proposed application for AI in gastroenterology.
Fig. 3 Main field of proposed application for AI in gastroenterology.
Image analysis was studied more commonly than data analysis in the areas of
pancreaticobiliary diseases (6/10 [60 %]), hepatology (8/9 [89 %]), and IBD (6/8 [75 %]),
as shown in the Appendix Figure.
[Fig. 4] shows the number of study entries by year for the entire period of 2007 to 2021.
The number of study entries increased progressively by year, and the ratio between
observational and interventional studies inverted over time: in 2018, seven of eight
studies (87.5 %) were observational, while in 2021, 21 of 34 (61.8 %) were interventional.
Supplementary Table 3 shows the number of studies whose objectives were to create AI-based
platforms and algorithms, to validate them, or both. Of the 103 studies, 73 % validated existing AI-based platforms
and algorithms (75/103), while only 6 % set out to create new ones (6/103). Twenty-one
percent (22 of 103) of the studies aimed to do both. Their trend from 2007 to 2021
is illustrated in Supplementary Fig. 4.
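The inversion between observational and interventional entries can be checked with simple arithmetic. The sketch below is illustrative and uses only the two years reported in the text; the 2018 interventional count of one is inferred (eight registered entries minus seven observational ones), and the function name interventional_share is hypothetical.

```python
def interventional_share(interventional, total):
    """Return the percentage of interventional entries, to one decimal place."""
    return round(100 * interventional / total, 1)


# Counts taken from the text; the 2018 interventional count (1) is inferred
# as 8 registered entries minus 7 observational ones.
by_year = {
    2018: {"interventional": 1, "total": 8},
    2021: {"interventional": 21, "total": 34},
}

shares = {
    year: interventional_share(c["interventional"], c["total"])
    for year, c in by_year.items()
}
for year, pct in shares.items():
    print(f"{year}: {pct} % interventional, {round(100 - pct, 1)} % observational")
```

Under these counts, the interventional share rises from 12.5 % in 2018 to 61.8 % in 2021.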
Fig. 4 AI studies in the field of gastroenterology by year of registration.
Discussion
According to our study, research efforts surrounding AI in gastroenterology were primarily
focused on detecting and characterizing colorectal neoplasia. At the same time, the
use of machine learning for personalized medicine received far less attention and
was found to be applied primarily in hepatology. In addition, the time of conversion
between observational and interventional study entries was approximately 2 years,
suggesting a fast translation of AI research into the clinical setting.
There may be three main reasons for the dominance of colorectal neoplasia as the primary
driver of AI-based research. First, deep learning developed for non-medical imaging,
i.e., face/object-recognition software, coupled with high-performing graphical interfaces,
was technologically optimized for real-time endoscopy. However, it could be argued
that this also applies to upper gastrointestinal neoplasia. Thus, the second reason
may be the high prevalence of colorectal polyps with a polyp-to-patient ratio up to
3:1, facilitating the collection of a critical volume of cases needed for training
an adequate deep learning system. We can estimate that thousands of frames must be
annotated to train and thoroughly test a reliable deep learning model for polyp detection
(i.e., computer-aided diagnosis for detection [CADe]) and characterization (computer-aided
diagnosis for characterization [CADx]). In this regard, upper gastrointestinal neoplasia,
such as Barrett’s esophagus or early gastric cancer, may be penalized by its very
low prevalence outside a tertiary center. Of note, there was a clear tendency for
a single-center research setting, preventing the coalescence of multicenter/national
databases that may be required for less prevalent disorders. Third and finally, CADe
implementation benefits from the clinical relevance of converting any increase in
adenoma detection rate into additional colorectal cancer prevention, as well as from the
progressive expansion of population-based colorectal cancer screening programs in
several Western and Eastern countries.
The progressive inversion of the ratio between observational and interventional
studies was expected. In the observational phase, databases are prospectively collected
to train and test AI algorithms in an artificial setting, i.e., standalone performance.
In the interventional phase, such algorithms are validated against the reference standard
in a clinical setting through randomized or sequential trials. What was somewhat
unexpected, however, was the very short time of conversion between the two
phases of AI translation into clinical practice, which is well in line with the sudden
appearance in the gastrointestinal endoscopy market of several devices approved by
regulatory agencies for colorectal polyp detection/characterization. Thus, AI research
appears to be one of the fastest channels for shifting innovation from bench to bedside.
Disappointingly, only a limited number of studies on personalized medicine based on
the use of machine learning were documented during the timeframe of our analysis.
This represents a substantial difference from oncology and genetics, where most
AI models are aimed at patient prognosis or response prediction. A possible reason
for the small number is the difficulty of collecting large enough databases
for rare and infrequently encountered diseases, such as liver or pancreatic cancer
or IBD [14]. In fact, most of the study entries in these fields were related to image analysis,
irrespective of whether that was endoscopy-based or using ultrasound or cross-sectional
imaging.
The main limitation of our analysis is that a study entry
does not necessarily indicate that the study will be executed, finalized,
and published. However, given that a substantial proportion of databases were already
completed, a relevant proportion of published studies would support the validity of
our data.
Conclusions
In conclusion, our analysis shows the dominance of CADe/CADx for colorectal neoplasia
in AI research in gastroenterology, as well as the limited time span required for
its conversion into clinical practice, mirroring what is happening in the gastrointestinal
endoscopy market. A different research approach, possibly led by scientific
societies, is required for AI application to rarer and/or clinically oriented fields.