CC BY-NC-ND 4.0 · Yearb Med Inform 2020; 29(01): 159-162
DOI: 10.1055/s-0040-1701991
Section 6: Knowledge Representation and Management
Survey
Georg Thieme Verlag KG Stuttgart

Ontologies, Knowledge Representation, and Machine Learning for Translational Research: Recent Contributions

Peter N. Robinson
1  The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
2  Institute for Systems Genomics, University of Connecticut, Farmington, CT, USA
Melissa A. Haendel
3  Oregon Clinical & Translational Research Institute, Oregon Health & Science University, Portland, OR, USA
4  Department of Environmental and Molecular Toxicology, Oregon State University, Corvallis, OR, USA

Publication Date: 21 August 2020 (online)

Summary

Objectives: To select, present, and summarize the most relevant papers published in 2018 and 2019 in the field of Ontologies and Knowledge Representation, with a particular focus on the intersection between Ontologies and Machine Learning.

Methods: A comprehensive review of the medical informatics literature was performed to select the most interesting papers published in 2018 and 2019 that document the utility of ontologies for computational analysis, including machine learning.

Results: Fifteen articles were selected for inclusion in this survey paper. The chosen articles belong to three major themes: (i) the identification of phenotypic abnormalities in electronic health record (EHR) data using the Human Phenotype Ontology; (ii) word and node embedding algorithms to supplement natural language processing (NLP) of EHRs and other medical texts; and (iii) hybrid ontology and NLP-based approaches to extracting structured and unstructured components of EHRs.

Conclusion: Unprecedented amounts of clinically relevant data are now available for clinical and research use. Machine learning is increasingly being applied to these data sources for predictive analytics, precision medicine, and differential diagnosis. Ontologies have become an essential component of software pipelines designed to extract, code, and analyze clinical information with machine learning algorithms. The intersection of machine learning and semantics is proving to be an innovative space in clinical research.