We read with interest the editorial by Hassan et al. [1] entitled “AI everywhere in endoscopy, not only for detection and characterization,” prompted by the recent paper of Hansen et al. on “Novel artificial intelligence (AI)-driven software significantly shortens the time required for annotation in computer vision projects” [2]. As Hassan et al. point out, unlike classic machine learning methods (MLM), the main advantage of the new kid on the block (i. e., deep learning [DL]) is its capability to automatically extract image features that computers can then use to characterize image content [3]. Essentially, this means that the accuracy of this unsupervised approach depends primarily on the aptness and quality of the training data provided.
Especially in the field of capsule endoscopy (CE), where imaging data are readily available, it remains to be determined who will plough through the images, delineate/annotate and comment on regions of interest, and make sure that DL training is performed with high-quality material. With this in mind, and at the cost of a substantial amount of human effort (our own and that of colleagues) [1], we set out to create a series of CE databases, i. e., KID, CAD-CAP, and Kvasir Capsule [4] [5] [6], for the benefit of computer scientists. These databases are enriched with CE images from different manufacturers and contain numerous classes of normal and abnormal gastrointestinal findings that have been prepared in various ways. The level of cleanliness in the databases is therefore diverse, and they offer a unique opportunity and point of reference for AI software developers. This approach sets the scene for structured delivery of a series of much-needed solutions for accurate detection and characterization of abnormal CE findings. These include reliably producing thumbnails of anatomical landmarks (i. e., stomach, small bowel, colon), which is of tremendous importance, especially with the emerging trend of panenteric CE [7]; reproducible assessment of bowel cleanliness [8], which can easily surpass that of human readers [9], thus allowing crucial decisions to be made on repeating a procedure; and, crucially, the relevance of findings according to the clinical setting [10].
The experience to date shows that the time required to create such databases is substantial, and the overall effort, cleverly described as human slavery to the inanimate AI master [1], is far from negligible. Instinctively one would want to consider maximizing the use of all the existing databases. However, planning for the future, we need concrete standards/guidelines for creating new databases according to user needs and international General Data Protection Regulation requirements. At present, though, our Sword of Damocles is sorting out any related medico-legal issues about data protection and eventually merging all the databases into a unique, free-to-access, single database for DL training in CE [11].