CC BY-NC-ND 4.0 · Yearb Med Inform 2019; 28(01): 120-127
DOI: 10.1055/s-0039-1677911
Section 5: Decision Support
Survey
Georg Thieme Verlag KG Stuttgart

Artificial Intelligence in Clinical Decision Support: a Focused Literature Survey

Stefania Montani¹, Manuel Striani¹
¹ DISIT, Computer Science Institute, University of Piemonte Orientale, Alessandria, Italy

Correspondence to

Prof. Stefania Montani
DISIT, Computer Science Institute, University of Piemonte Orientale
Viale Michel 11, Alessandria
Italy   

Publication History

Publication Date:
16 August 2019 (online)

 

Summary

Objectives: This survey analyses the latest literature contributions to clinical decision support systems (DSSs) over a two-year period (2017-2018), focusing on the approaches that adopt Artificial Intelligence (AI) techniques in a broad sense. The goal is to analyse the distribution of data-driven AI approaches with respect to “classical” knowledge-based ones, and to consider the issues raised and their possible solutions.

Methods: We included PubMed and Web of Science™ publications, focusing on contributions describing clinical DSSs that adopted one or more AI methodologies.

Results: We selected 75 papers, 49 of which describe approaches in the data-driven AI area, 20 present purely knowledge-based DSSs, and 6 adopt hybrid approaches relying on both formalized knowledge and data.

Conclusions: Recent studies in the clinical DSS area demonstrate a prevalence of data-driven AI, which can be adopted autonomously in purely data-driven systems, or in cooperation with domain knowledge in hybrid systems. Such hybrid approaches, able to conjugate all available knowledge sources through proper knowledge integration steps, represent an interesting example of synergy between the two AI categories. This synergy can lead to the resolution of some existing issues, such as the need for transparency and explainability, nowadays recognized as central themes to be addressed by both AI and medical informatics research.



1 Introduction

Clinical decision-support systems (DSSs) aim to enhance healthcare decision-making, with the final objective of improving the quality of care provided by healthcare organizations [1], [2]. Since their early years, clinical DSSs have often been conceived as applications of Artificial Intelligence (AI) methodologies to medical domains [3]. Examples range from the so-called “expert systems” based on rules [4], to approaches relying on models of deeper and more principled human reasoning, such as causal reasoning [5], to more recent proposals exploiting fuzzy logic, Bayesian networks, case-based reasoning (CBR), and other techniques (see, e.g., [6], [7]). AI-based DSSs have been used as aids across a wide range of medical tasks among various medical specialties.

According to the Barcelona Declaration for the Proper Development and Usage of Artificial Intelligence in Europe [3], AI methodologies can be divided into two fundamentally different categories: knowledge-based AI and data-driven AI. Knowledge-based AI consists in “an attempt to model human knowledge in computational terms, starting in a top-down fashion from human self-reporting of what concepts and knowledge individuals use to solve problems or answer queries in a domain of expertise, including common sense knowledge”, and “formalizes and operationalizes this knowledge in terms of software. It rests primarily on highly sophisticated but now quite standard symbolic computing technologies and has already had a huge impact” [3]. Data-driven AI, on the other hand, is characterized by “starting in a bottom-up fashion from large amounts of data of human activity, which are processed with statistical machine learning methods [...] in order to abstract patterns that can then be used to make predictions, complete partial data, or emulate human behaviour in similar conditions in the past. Data-driven AI requires big data and very substantial computing power to reach adequate performance levels” [3]. Knowledge-based methodologies are well established, but less able to exploit large volumes of data, and to automatically build a knowledge model by generalizing from data themselves. In fact, knowledge models often remain human-developed. Knowledge acquisition and formalization is thus a “bottleneck”, which consumes development time and requires a significant initial effort. On the other hand, data-driven methodologies are currently receiving a lot of attention, thanks to the large amount of data available in electronic form, to the availability of powerful computing architectures, and to the significant advancement of machine learning techniques that are able to extract characteristic features and to identify patterns from data with a high level of accuracy.

Interestingly enough, Chen et al. [8] have recently analysed this AI category distinction, referring to the field of clinical DSSs. They have observed that the knowledge-based approach to clinical decision support is limited in scale, due to the lack of evidence in some domains as well as to the cost of human knowledge authoring processes. On the contrary, taking advantage of the accumulated clinical data, stored, e.g., in electronic patient records, powerful data-driven clinical decision support systems can be implemented, possibly leading to more effective outcomes in practice.

Data-driven approaches, however, might lack transparency and explainability, while, as highlighted by Shortliffe et al. [9], a clinical DSS should enable users to “understand the basis for any advice or recommendations that are offered”.

In this survey, we analyse the latest literature contributions to clinical DSSs (years 2017-2018), focusing on approaches that adopt AI techniques. Our main goal is to understand if “classical” knowledge-based DSSs are still being proposed, or if, as in other domains, earlier AI principles and methods are less represented, in favour of big data analytics and machine learning approaches. The challenges of the surveyed methodological choices are analysed, with a particular focus on the transparency and explainability issues, nowadays recognized as central themes to be addressed by AI and medical informatics research, as mentioned above. Promises deriving from synergies between the two AI categories are also discussed.



2 Methods

We have considered two bibliographic repositories, namely PubMed and Web of Science™ (WoS), restricting searches to years 2017 and 2018 (up to November).

Given the focus of our survey, PubMed was queried specifying (in AND) the MeSH terms “Decision Support System, Clinical” and “Artificial Intelligence”. Considering that some contributions may not adopt MeSH terms, we have also searched PubMed by specifying some related keywords to be searched for (one at a time) in the title, namely “machine learning”, “expert system”, “information retrieval”, “cognitive aid”. The search with the keyword “machine learning” actually proved to be too generic (as commented in the Results section below), so we repeated a more focused query that searched for the keywords “machine learning” and “decision” (in AND) in the title. We admitted only papers in English, where the abstract was available. We first assessed the appropriateness of the retrieved papers by reading their abstracts. Those papers that were not excluded in this phase were read in full, by dividing the effort equally between the two co-authors.
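The PubMed searches described above can be sketched as query strings. This is only an illustration: the field tags (`[MeSH Terms]`, `[Title]`, `[dp]`, `[la]`) and the `hasabstract` filter follow common PubMed search syntax, but the survey does not report the exact strings that were submitted, so the helper names `mesh_query` and `title_query` and the assembled queries are assumptions.

```python
# Illustrative reconstruction of the PubMed searches described above.
# The exact query strings used in the survey are not reported; the
# filters and helper functions below are assumptions.

# Years 2017-2018, English only, abstract available.
COMMON_FILTERS = "2017:2018[dp] AND english[la] AND hasabstract"


def mesh_query(*mesh_terms: str) -> str:
    """AND together MeSH terms and apply the shared filters."""
    terms = " AND ".join(f'"{t}"[MeSH Terms]' for t in mesh_terms)
    return f"({terms}) AND {COMMON_FILTERS}"


def title_query(*keywords: str) -> str:
    """AND together keywords that must all appear in the title."""
    terms = " AND ".join(f'"{k}"[Title]' for k in keywords)
    return f"({terms}) AND {COMMON_FILTERS}"


queries = [
    # MeSH terms spelled as quoted in the text above.
    mesh_query("Decision Support System, Clinical", "Artificial Intelligence"),
    title_query("expert system"),
    title_query("information retrieval"),
    title_query("cognitive aid"),
    title_query("machine learning"),              # too generic on its own
    title_query("machine learning", "decision"),  # the more focused variant
]

for q in queries:
    print(q)
```

Keeping the shared filters in one constant makes it easy to see that every search was subject to the same date, language, and abstract restrictions.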

WoS does not provide MeSH term filtering; therefore, we indicated “Decision Support” (“Clinical Decision Support” was considered less inclusive) and “Artificial Intelligence” (in AND) as topics, and limited our areas of interest to “Medical Informatics”, “Computer Science Artificial Intelligence”, “Computer Science Information Systems”, and “Computer Science Interdisciplinary Applications”. By reading the title and the abstract, we could focus on the papers of interest. In this case also, papers that were not excluded were read in full, by dividing the effort equally between the two co-authors.

As a general policy, we excluded works that did not describe a DSS, but we kept the papers that (1) were reviews or existing tools comparisons, or (2) coped with related issues, such as the need for data interoperability, or (3) were limited to the presentation of a preliminary design work.

Papers were categorized on the basis of the adopted methodology and task.



3 Results

As regards PubMed, through the MeSH term search we retrieved 68 papers, 29 of which were excluded after the abstract/full paper reading assessment.

The “cognitive aid” keyword search returned 6 papers, but none of them was devoted to the description of a DSS.

The “machine learning” keyword search returned 1,778 papers; however, this list was too general, since it included aids of very different complexity, typically much simpler than a DSS. We then restricted the search by adding the “decision” keyword in AND. This solution returned 15 papers. After the abstract/full paper reading assessment that led us to identify 8 papers not meeting the study criteria, and after having excluded 3 additional papers already retrieved by the MeSH term search, we kept 4 works.

The “expert system” keyword search provided us with 36 papers, but 19 of them were out of scope (e.g., focused on chemistry/pharmacology), or did not describe a DSS. After having excluded 2 additional papers already retrieved by the MeSH term search, we kept 15 works.

The “information retrieval” keyword search produced 31 papers. Only 2 of them met the selection criteria.

The WoS search returned 113 papers, 17 of which were pertinent and fulfilled our selection criteria. Two of these 17 papers had been retrieved by the PubMed search as well, so we kept 15 works.

In summary, 75 papers were considered in our analysis.
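The total can be checked against the per-search figures reported above. In the sketch below, the retrieved counts and duplicate counts are taken from the text; the exclusion counts for the "cognitive aid", "information retrieval", and WoS searches are derived here by subtraction (retrieved minus papers meeting the criteria) rather than stated explicitly in the text.

```python
# Each entry: (retrieved, excluded after assessment, duplicates of an earlier search).
searches = {
    "PubMed MeSH terms": (68, 29, 0),
    "PubMed title: cognitive aid": (6, 6, 0),              # none described a DSS
    "PubMed title: machine learning AND decision": (15, 8, 3),
    "PubMed title: expert system": (36, 19, 2),
    "PubMed title: information retrieval": (31, 29, 0),    # 2 met the criteria
    "Web of Science topics": (113, 96, 2),                 # 17 pertinent, 2 duplicates
}

# Papers kept per search = retrieved - excluded - duplicates.
kept = {name: r - e - d for name, (r, e, d) in searches.items()}
total = sum(kept.values())

for name, n in kept.items():
    print(f"{name}: {n}")
print("Total:", total)
```

The per-search results are 39, 0, 4, 15, 2, and 15 papers, which sum to the 75 papers considered in the analysis.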

[Figure 1] illustrates the overall review flow.

Fig. 1 Flow of the review process.

When reading the papers, we distinguished the retrieved works on the basis of the adopted methodology, considering the two main categories (i.e., data-driven vs. knowledge-based DSSs), and refining the distinction by identifying appropriate subgroups. The presence of hybrid approaches was identified as well. We also highlighted the main supported task.

Our results are reported in the following subsections.

3.1 Data-driven Decision Support

Out of the 75 retrieved papers, 49 can be categorized as data-driven decision support approaches [10]–[58]. As regards the addressed task, a large part of these contributions deals with prediction, intended as classification or regression [10]–[15], [17]–[19], [21]–[36], [38]–[42], [44]–[46], [48], [49], [52]–[54], [57]. One work is focused on association rule mining [20], and one adopts statistics for risk analysis [55]. A set of papers deals with information extraction from natural language texts [16], [33], [43], or more specifically from electronic patient record data [32], [37], [44], [50], [56]. The extracted information can then be exploited for classification purposes [33], for statistical correlation and knowledge integration [32], [43], or for outlier detection [37]. Retrieval and interpretation of similar complex data, such as time series, images, voice recordings, or radiotherapy plans, are addressed in [17], [30], [31], [34], [38], [47], [53], [58].

As regards the adopted methodologies, interestingly, a large part of the classification approaches adopts more than one machine learning technique [10]–[14], [16], [21], [22], [24]–[27], [33], [39], [42], [45], [48]–[51], [54], [57], such as Support Vector Machines (SVM), Neural Networks (NN), Decision Trees, and Bayesian models, in synergy or in competition. The work in [14], for instance, adopts machine learning to identify relevant Single Nucleotide Polymorphisms (SNPs) related to Type 2 diabetes, and to predict patient risk. The authors use the Random Forest technique to search for the most important attributes related to diabetes. SVM and Logistic Regression (LR) are also adopted, and their performance has been compared to that achieved by Random Forest. Moreover, the relevance of the attributes obtained through Random Forest has then been used to perform predictions with the k-Nearest Neighbour method, weighting the attributes in the similarity measure according to the relevance provided by Random Forest itself. Working on 677 subjects, Random Forest has outperformed all the other tested machine learning techniques in terms of prediction accuracy (0.853 with respect to 0.835 with LR and 0.825 with SVM on raw data), and in terms of the stability of the estimated relevance of the attributes.
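The idea of weighting attributes in a k-Nearest Neighbour similarity measure by their estimated relevance can be illustrated with a minimal sketch. It is not the implementation used in [14]: the data, the labels, and the relevance scores (which stand in for importances that a Random Forest would provide) are all hypothetical, and a weighted Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Euclidean distance in which each attribute is scaled by its relevance."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(query, samples, labels, weights, k=3):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(
        zip(samples, labels),
        key=lambda pair: weighted_distance(query, pair[0], weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data: three attributes per subject and a binary risk label.
samples = [(0.1, 0.9, 0.2), (0.2, 0.8, 0.9), (0.9, 0.1, 0.3), (0.8, 0.2, 0.7)]
labels = ["low", "low", "high", "high"]

# Hypothetical relevance scores standing in for Random Forest importances;
# the third attribute is treated as almost irrelevant.
relevance = (0.6, 0.35, 0.05)

prediction = knn_predict((0.85, 0.15, 0.9), samples, labels, relevance, k=3)
print(prediction)
```

Because the third attribute carries little weight, the query is matched mainly on the first two attributes, so its two nearest neighbours are the "high" subjects even though its third attribute resembles one of the "low" subjects.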

NNs deserve special consideration, since they are the key technique adopted in many of the retrieved papers [15], [17]–[19], [28]–[30], [35], [40], [41], [52], [58]. Indeed, image interpretation and classification are fields where NN/deep learning approaches work well [59]. As an example, the paper in [30] exploits a convolutional NN for detecting haemorrhage, mass effect, or hydrocephalus at non-contrast-enhanced head computed tomographic examinations, and for identifying suspected acute infarct. In a retrospective study analysing 2,583 representative images, the approach has provided promising results in detecting critical findings at non-contrast-enhanced head computed tomography. Suspected acute infarct detection has shown lower sensitivity (62%) with respect to hydrocephalus detection (80%), but higher specificity (96% with respect to 90%). The authors plan to conduct further investigation in a controlled and prospective clinical setting.
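For readers less familiar with the two measures, sensitivity is the fraction of truly positive cases that are detected, and specificity is the fraction of truly negative cases that are correctly ruled out. The sketch below only illustrates the definitions: the confusion-matrix counts are hypothetical, chosen per 100 positive and 100 negative cases so as to reproduce the percentages quoted above, and are not the counts reported in [30].

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual positives that the detector flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of actual negatives that the detector correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (not taken from the surveyed paper).
findings = {
    "suspected acute infarct": {"tp": 62, "fn": 38, "tn": 96, "fp": 4},
    "hydrocephalus": {"tp": 80, "fn": 20, "tn": 90, "fp": 10},
}

for name, c in findings.items():
    sens = sensitivity(c["tp"], c["fn"])
    spec = specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity={sens:.0%}, specificity={spec:.0%}")
```

With these counts, the infarct detector misses more true cases (38 versus 20 per 100) but raises fewer false alarms (4 versus 10 per 100), which is the trade-off described in the text.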

The use of NNs requires the choice of proper methods to train the network itself. The work in [17], for instance, proposes a novel algorithm based on predator-prey particle swarm optimization, to train the weights of a rather simple NN architecture (a single-hidden-layer NN). The approach, applied in the field of magnetic resonance image interpretation, has outperformed six state-of-the-art methods.
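To give a feel for swarm-based weight training, the sketch below trains a single-hidden-layer network with plain particle swarm optimization. It is deliberately not the predator-prey variant proposed in [17], whose details are not given here; the toy XOR data set, the network size, and all hyperparameters are assumptions chosen for illustration only.

```python
import math
import random

# Toy XOR data set (hypothetical; the surveyed work used MR images).
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
HIDDEN = 3
# Per hidden unit: 2 input weights + 1 bias; output layer: HIDDEN weights + 1 bias.
DIM = HIDDEN * 3 + HIDDEN + 1


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Single-hidden-layer network: 2 inputs, HIDDEN sigmoid units, 1 output."""
    hidden = []
    for h in range(HIDDEN):
        w1, w2, b = params[3 * h: 3 * h + 3]
        hidden.append(sigmoid(w1 * x[0] + w2 * x[1] + b))
    out_w = params[3 * HIDDEN: 3 * HIDDEN + HIDDEN]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + params[-1])


def loss(params):
    """Mean squared error over the data set."""
    return sum((forward(params, x) - y) ** 2 for x, y in DATA) / len(DATA)


def pso(n_particles=20, iterations=200, inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO: each particle is one candidate weight vector."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(n_particles)]
    vel = [[0.0] * DIM for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # personal bests
    best_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    initial = g_val
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(DIM):
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, initial, g_val


weights, initial_loss, final_loss = pso()
print(f"best loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

No gradients are computed: the swarm only evaluates the loss, which is why such methods can be applied when a differentiable training procedure is inconvenient. The best loss found can never increase across iterations, since the global best is only replaced by a strictly better candidate.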

Papers dealing with information extraction from text, images, time series, or voice recordings adopt suitable methodologies, such as natural language processing techniques (e.g., [32]), NN (e.g., [58]), or voice analysis [53]. [Table 1] shows the distribution of the surveyed data-driven works by task and by methodology.

Table 1

Distribution of the surveyed data-driven works by task and by methodology.

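| Task \ Methodology | Association rules | Decision trees | Hidden Markov Models (HMM) | Neural networks | NLP | Retrieval techniques | Statistics | SVM | Voice analysis | Multiple methods | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Image interpretation | | 47 | | 17, 30, 58 | | 34, 38 | | | | | 6 |
| Prediction | | 23 | 31 | 15, 17-19, 28-30, 35, 40, 41, 52 | 33, 44 | 34, 38 | 32 | 36, 46 | 53 | 10-14, 21, 22, 24-27, 39, 42, 45, 48, 49, 51, 54, 57 | 39 |
| Risk analysis | | | | | | | 55 | | | | 1 |
| Rule mining | 20 | | | | | | | | | | 1 |
| Time series interpretation | | | 31 | | | | | | | | 1 |
| Voice interpretation | | | | | | | | | 53 | | 1 |
| Text interpretation and prediction | | | | | 16, 43, 44, 50, 56 | | 32, 37, 43 | | | 16, 33, 50 | 8 |
| Total | 1 | 2 | 1 | 12 | 6 | 2 | 4 | 2 | 1 | 22 | |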



3.2 Knowledge-based Decision Support

Out of the 75 papers we analysed, 26 reported the adoption of knowledge-based AI methods [60]–[85], possibly in combination with data-driven methods [61], [63], [70], [71], [74], [76]. This latter group of 6 hybrid approaches will be specifically detailed in the next subsection.

The knowledge-based methods explicitly represent knowledge as symbols in the form of rules, ontologies, past cases, or other types of knowledge structures. Specifically, the surveyed approaches ranged from the use of ontologies [61], [62], [67]–[69], to rule-based reasoning (including the adoption of fuzzy rules) [64], [66]–[68], [72]–[75], [78]–[80], [82]–[85], to the definition of semantic or associative networks [76], [77], to temporal reasoning [65], to probabilistic models such as Bayesian Networks [60], [63], [70], [81], and to case-based reasoning (CBR) [61], [71].

As regards the addressed task, most works deal with diagnosis [60], [62], [64], [70], [73], [74], [77]–[80], [82], [84], [85] or therapy/treatment support [61]–[63], [66], [75], [81], while some works address classification [83], patient behaviour tracking or activity scheduling [65], and training [68]. A set of papers deals with computer-interpretable guidelines (CIGs) [67], [69], [71], [72].

As an example, the paper in [67] addresses the issue of reconciling multiple clinical guidelines for decision support in comorbid patients. The developed tool exploits semantic web technologies to achieve knowledge modelling and knowledge integration, by aligning multiple ontologically-modelled clinical pathways to develop a unified comorbid patient knowledge model. Guideline execution using reasoning engines to derive guideline-mediated recommendations is also supported. The tool has been analysed in the domain of Atrial Fibrillation and Chronic Heart Failure. [Table 2] shows the distribution of the surveyed knowledge-based works by task and by methodology.

Table 2

Distribution of the surveyed knowledge-based works by task and by methodology.

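| Task \ Methodology | CBR | Ontologies | Probabilistic Models | Rules | Semantic Networks | Temporal Reasoning | Total |
|---|---|---|---|---|---|---|---|
| Behavior tracking / activity scheduling | | | | | | 65 | 1 |
| Classification | | | | 83 | | | 1 |
| Diagnosis | | 62 | 60, 70 | 64, 73, 74, 78-80, 82, 84, 85 | 77 | | 13 |
| CIG | 71 | 67, 69 | | 67, 72 | | | 4 |
| Problem action pattern | | | | | 76 | | 1 |
| Therapy | 61 | 61, 62 | 63, 81 | 66, 75 | | | 6 |
| Training | | 68 | | 68 | | | 1 |
| Total | 2 | 5 | 4 | 15 | 2 | 1 | |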



3.3 Hybrid Approaches

Rather interestingly, among the 26 knowledge-based approaches we analysed in section 3.2, 6 papers report on more properly hybrid approaches, since they are able to take advantage of both formalized knowledge and available data [61], [63], [70], [71], [74], [76]. This is a quite natural situation when adopting probabilistic graphical models such as Bayesian Networks, where domain knowledge is typically exploited to build the network structure and/or to identify the key variables (i.e., the qualitative network information), but conditional probability tables (i.e., the quantitative network information) are then learned/refined from the available data [63], [70], [76]. CBR is also intrinsically “hybrid”, since it combines the use of formalized knowledge, e.g., the case structure definition, with the exploitation of past operative examples, i.e., of data [61], [71]. However, we also found other less typical proposals, such as the work in [74]. This specific paper presents a DSS to be used in the domain of Anti-Phospholipid Syndrome diagnosis, which relies on a logic programming approach for knowledge representation and reasoning, complemented with an NN, where neuron connections are learned and tuned from the data, and do not mirror any explicit or easily explainable domain knowledge.
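The division of labour in a hybrid Bayesian Network, with the structure coming from domain knowledge and the conditional probability tables estimated from data, can be sketched on a two-node example. Everything in the snippet is hypothetical: the `Disease -> Symptom` structure, the eight patient records, and the choice of Laplace smoothing (`alpha=1.0`) are assumptions for illustration and are not drawn from any surveyed system.

```python
from collections import Counter

# Qualitative part, supplied by a domain expert: Disease -> Symptom.
STRUCTURE = {"Disease": [], "Symptom": ["Disease"]}

# Hypothetical patient records; the quantitative part is learned from these.
RECORDS = [
    {"Disease": True, "Symptom": True},
    {"Disease": True, "Symptom": True},
    {"Disease": True, "Symptom": False},
    {"Disease": False, "Symptom": False},
    {"Disease": False, "Symptom": False},
    {"Disease": False, "Symptom": True},
    {"Disease": False, "Symptom": False},
    {"Disease": False, "Symptom": False},
]


def learn_cpt(node, structure, records, alpha=1.0):
    """Estimate P(node=True | parents) by counting, with Laplace smoothing."""
    parents = structure[node]
    true_counts, totals = Counter(), Counter()
    for rec in records:
        key = tuple(rec[p] for p in parents)
        totals[key] += 1
        if rec[node]:
            true_counts[key] += 1
    return {
        key: (true_counts[key] + alpha) / (totals[key] + 2 * alpha)
        for key in totals
    }


cpts = {node: learn_cpt(node, STRUCTURE, RECORDS) for node in STRUCTURE}

# Diagnostic query via Bayes' rule: P(Disease=True | Symptom=True).
p_d = cpts["Disease"][()]
p_s_given_d = cpts["Symptom"][(True,)]
p_s_given_not_d = cpts["Symptom"][(False,)]
posterior = (p_s_given_d * p_d) / (p_s_given_d * p_d + p_s_given_not_d * (1 - p_d))

print(cpts)
print(f"P(Disease | Symptom) = {posterior:.3f}")
```

The expert never specifies a number, and the data never alter which variables depend on which; each source of knowledge contributes only the part it is best placed to provide, and the resulting tables remain inspectable.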

Combinations of several machine learning methods are also frequent, as already mentioned in the “Data-driven decision support” section, and can be provided in order to compare classification results obtained through different techniques (see e.g., [11]), to realize ensemble learning (see e.g., [27]), or more generally to implement a more complex strategy able to improve on the results of simpler approaches (see e.g., [10]).



4 Discussion

The main goal of this survey was to identify a possible prevalence of data-driven AI methodologies in the development of clinical DSSs, with respect to knowledge-based approaches. In fact, 65% of the papers that we reviewed in this survey (49 out of 75) can be categorized in the data-driven methodological area. This finding indicates that knowledge-based methods are less represented than statistical and machine learning approaches in the very recent literature. Indeed, machine learning methodologies are particularly appropriate when dealing with high dimensional data, such as time series or medical images. As regards medical image classification, in particular, deep learning tools are proving particularly suitable. More generally, by taking advantage of the data accumulated across the entire care continuum in multiple data sources, medical professionals may be able to improve patient care more successfully with data-driven DSSs than with knowledge-based techniques.

As a matter of fact, the exploitation of all the available patient data is a key element to be considered in a DSS. In fact, in addition to the papers analysed in this survey, using the MeSH keywords described in the “Methods” section, we retrieved from PubMed other works (e.g., [86], [87]) that dealt with interoperability, knowledge integration, and knowledge management, as prerequisites for the development of a clinical DSS. While technically speaking these works could not be included in the survey, as they do not describe a DSS, they show the centrality of these themes. In particular, they suggest that the use of standard terminologies, such as SNOMED-CT, can help to actually implement interoperability. Moreover, developing user-friendly knowledge authoring tools and automatic knowledge acquisition facilities, and implementing dimensionality reduction and missing value imputation as data preparation steps, can support the creation of sharable and interoperable knowledge as well.

The paper in [87] also highlights another topic: data-driven AI can be cost-effective, but also potentially beyond human abilities. Building and adopting machine learning techniques can be relatively easy, but understanding the provided outcome is sometimes difficult and obscure, due to the typical black-box nature of machine learning.

The need for transparency and explainability is nowadays being recognized as a central theme to be addressed by AI research. As an example, the DARPA research funding agency in the US highlights this issue through its “Explainable Artificial Intelligence” initiative [88]. Focusing on the field of clinical DSSs, the paper in [9] also clearly states that “black boxes are unacceptable”, since a clinical DSS must enable end users to understand all the generated suggestions.

Very interestingly, DARPA [88] proposes the combination of knowledge-based and data-driven methods as a solution to meet future challenges of AI, beginning with explainability. Even though every system should implement its own explanation module, which may vary on the basis of the application domain and of the adopted methodologies and algorithms, we believe that, in general, the synergy between different knowledge types and AI methodologies can in fact represent a promising strategy to deal with transparency and explainability issues. Indeed, various examples of hybrid systems, combining formalized knowledge and learnt knowledge have been retrieved in this survey. This could be a direction to follow, in order to improve DSS competence, flexibility, and, of course, explainability. Therefore, it is also our opinion that the already mentioned theme of knowledge integration will remain a key research direction for the future.

As a final consideration, it is worth mentioning that this survey has some limitations. In particular, it has a rather narrow focus, namely AI for DSSs. This choice was motivated by the special topic of the current edition of the IMIA Yearbook (2019). As a consequence, the number of examined works is rather limited, and interesting contributions to the field of DSSs may have been left out. Similarly, some works may have been ignored, if not indexed by PubMed or WoS, or not accessible through our queries. Indeed, we are aware of some interesting contributions that were left out, such as the work in [89] in the area of deep learning for medical image interpretation, and the work in [90] in the area of CIGs. CIGs, in particular, represent a very active research field [91], attested in our survey by the works in [67], [69], [71], [72], where AI is typically adopted for domain knowledge representation, and for automated reasoning as well. In order to avoid the exclusion of further papers, we did not concentrate only on journal publications, as was done in [92], and we did not limit our search to papers focusing on a specific study design, as in [93], or to papers strictly selected on the basis of the quality of the reported study, as in [94]. Finally, it is worth noting that this study covers a much more limited time frame (two years) than more comprehensive and elaborate surveys, such as the ones in [2], [95].



5 Conclusion

Recent works in the clinical DSS area demonstrate a prevalence of data-driven AI, with respect to knowledge-based classical approaches. Despite their success, however, data-driven methods may lack transparency in how a conclusion is reached, while the capability of explaining and justifying their outcome to the end user is central in the medical domain [9]. The synergy between different knowledge types and between the two categories of AI methodologies could represent a promising strategy to deal with transparency issues. As the Barcelona Declaration for the Proper Development and Usage of Artificial Intelligence in Europe [3] states, in fact, “the full potential of AI will only be realized with a combination of these two approaches”. An effort towards the definition of hybrid systems, able to integrate knowledge-based and data-driven methods, is already witnessed in the recent literature. Such a strategy also entails data interoperability and knowledge integration issues, to allow the exploitation of different knowledge sources; recent works are approaching this theme as well. We believe that the use of hybrid approaches for DSSs will be a key direction for the future. Indeed, the powerful and promising data-driven DSSs can strongly benefit from methods based on knowledge formalization, and from their generalization and abstraction capabilities, which can prove particularly helpful in providing truly explainable decision support.

