CC BY-NC-ND 4.0 · Yearb Med Inform 2022; 31(01): 165-166
DOI: 10.1055/s-0042-1742531
Section 4: Clinical Research Informatics
Best Paper Selection


Appendix: Summary of Best Papers Selected for the 2021 Edition of the IMIA Yearbook, Clinical Research Informatics Section

Bahmani A, Alavi A, Buergel T, Upadhyayula S, Wang Q, Ananthakrishnan SK, Alavi A, Celis D, Gillespie D, Young G, Xing Z, Nguyen MHH, Haque A, Mathur A, Payne J, Mazaheri G, Li JK, Kotipalli P, Liao L, Bhasin R, Cha K, Rolnik B, Celli A, Dagan-Rosenfeld O, Higgs E, Zhou W, Berry CL, Van Winkle KG, Contrepois K, Ray U, Bettinger K, Datta S, Li X, Snyder MP

A scalable, secure, and interoperable platform for deep data-driven health management

Nat Commun 2021 Oct 1;12(1):5757

This paper presents a major effort to build a secure and scalable platform for gathering big biomedical data from different sources (including genomics, EHRs, and wearable sensors). The authors focus on the technical aspects of building such a platform: security (local data storage, defense against reverse engineering, mobile app security, anonymization), scalability (authentication, messaging, machine learning, infrastructure based on the open-source tool Terraform), and analysis (data preprocessing, feature extraction). Although interoperability issues are not really mentioned, several APIs for data collection are described. The main features of the platform are data visualization, monitoring and alerts, as well as feature prediction with logistic regression (84 features). The platform can be used at the patient level or at the cohort level and has been used for the detection of pre-symptomatic COVID-19 cases and for the biological characterization of insulin-resistance heterogeneity. In conclusion, this very large-scale platform for biomedical data offers guarantees in terms of security, scalability, and data preprocessing, and provides features for visualization, monitoring, and analysis.

Cheng AC, Duda SN, Taylor R, Delacqua F, Lewis AA, Bosler T, Johnson KB, Harris PA

REDCap on FHIR: Clinical Data Interoperability Services

J Biomed Inform 2021 Sep;121:103871

This paper describes the development and evaluation of the REDCap Clinical Data Interoperability Services (CDIS) module, which provides seamless data exchange between the REDCap research Electronic Data Capture (EDC) system and any EHR system with a FHIR API, without project-by-project involvement from health information technology staff. An iterative process was used to design all aspects of the CDIS module (access control, authentication, variable selection, and mapping) so that end users could easily set up and use the module in two use cases. In the “Clinical Data Pull” (CDP) mode, the CDIS automatically pulls EHR data into user-defined REDCap fields. In the “Clinical Data Mart” (CDM) mode, the CDIS collects all specified data for a patient over a given time period. Beyond the stakeholder group initially involved, which included the Vanderbilt University Medical Center (VUMC) health IT and Epic EHR teams, other healthcare organizations and EHR vendors have been involved through the REDCap consortium. The first CDP project live at VUMC launched in Q3 2018, and REDCap was released on the Epic App Orchard in Q1 2019. As of November 2020, 82 projects were running at VUMC (55 CDP, 27 CDM), with 19.5 million data points transferred. With large-scale adoption at REDCap consortium sites (26 implementations in other institutions with Epic EHRs / 47 ongoing implementations in institutions with Epic (n=26) or Cerner (n=9) EHRs), the CDIS are a key contribution to the integration of care and research activities. Thanks to the CDIS module, which leverages the FHIR standard to use the EHR as an electronic source for clinical research, researchers can set up real-time, direct data extraction from the EHR on a self-service basis, reducing the need for manual transcription and flat-file uploads and improving the accuracy and efficiency of EHR data collection.
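To make the “Clinical Data Pull” idea concrete, the following minimal sketch shows how FHIR Observation resources might be mapped onto user-defined fields. It is an illustration only and not the CDIS implementation: the function `pull_fields`, the mapping `FIELD_MAP`, the field names, and the sample bundle are all hypothetical, and no EHR or REDCap API is called. Only the Observation element names (`code.coding`, `valueQuantity`, `effectiveDateTime`) follow the FHIR standard.

```python
from __future__ import annotations

from typing import Any

LOINC = "http://loinc.org"

# Hypothetical mapping from LOINC codes to project field names (illustrative only).
FIELD_MAP = {
    "8867-4": "heart_rate",  # LOINC: Heart rate
    "8310-5": "body_temp",   # LOINC: Body temperature
}


def pull_fields(bundle: dict[str, Any], field_map: dict[str, str]) -> dict[str, Any]:
    """Return {field_name: value}, keeping the most recent Observation per field."""
    latest: dict[str, tuple[str, Any]] = {}
    for entry in bundle.get("entry", []):
        res = entry.get("resource", {})
        if res.get("resourceType") != "Observation":
            continue
        # Assumption: timestamps share one ISO 8601 format and time zone,
        # so plain string comparison orders them correctly.
        when = res.get("effectiveDateTime", "")
        value = res.get("valueQuantity", {}).get("value")
        for coding in res.get("code", {}).get("coding", []):
            if coding.get("system") != LOINC:
                continue
            field = field_map.get(coding.get("code"))
            if field is None:
                continue
            if field not in latest or when > latest[field][0]:
                latest[field] = (when, value)
    return {name: value for name, (_, value) in latest.items()}


def _obs(code: str, value: float, unit: str, when: str) -> dict[str, Any]:
    """Build a minimal, hand-written Observation entry for the example."""
    return {
        "resource": {
            "resourceType": "Observation",
            "code": {"coding": [{"system": LOINC, "code": code}]},
            "valueQuantity": {"value": value, "unit": unit},
            "effectiveDateTime": when,
        }
    }


# Hand-written sample data, not taken from the paper.
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        _obs("8867-4", 72, "beats/min", "2020-11-01T08:00:00Z"),
        _obs("8310-5", 37.2, "Cel", "2020-11-02T08:00:00Z"),
        _obs("8867-4", 80, "beats/min", "2020-11-03T08:00:00Z"),
    ],
}

record = pull_fields(SAMPLE_BUNDLE, FIELD_MAP)
print(record)
```

In this sketch, the mapping table is the only project-specific part; this is the kind of variable selection and mapping that the paper reports end users performing themselves.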

Pedrera-Jiménez M, García-Barrio N, Cruz-Rojo J, Terriza-Torres AI, López-Jiménez EA, Calvo-Boyero F, Jiménez-Cerezo MJ, Blanco-Martínez AJ, Roig-Domínguez G, Cruz-Bermúdez JL, Bernal-Sobrino JL, Serrano-Balazote P, Muñoz-Carrero A

Obtaining EHR derived datasets for COVID-19 research within a short time: a flexible methodology based on Detailed Clinical Models

J Biomed Inform 2021 Mar;115:103697

Responding to the urgent need for health data insights during the COVID-19 pandemic, and using this as a use case for a generalizable methodology, Pedrera-Jiménez et al. report on the use of Detailed Clinical Models (DCMs) as a formalized representation of the structure and semantics of research data sets. They propose this as a data transformation pathway for generating datasets rapidly and accurately from EHRs for secondary use, without loss of meaning and without error, while allowing for frequent changes in specification and remaining easy to validate.

The authors took as their use case the need to rapidly generate a research data set conforming to the International Severe Acute Respiratory and emerging Infection Consortium (ISARIC-WHO) COVID-19 specification. Instead of the classical approach of authoring this data set in an electronic case report form (eCRF), the authors modelled it as a portfolio of Detailed Clinical Models: EHR archetypes conforming to the ISO 13606 standard. Each archetype expressed a specific data structure pattern that was a profiled subset of the generic ISO 13606 EHR interoperability reference model, and incorporated semantic constraints (e.g., value sets) drawn from SNOMED CT or LOINC, as appropriate for clinical or laboratory concepts. These DCMs were used as the data extraction mapping target from the EHR system at the Hospital Universitario 12 de Octubre in Madrid. The extraction included data on 4,489 patients hospitalised with COVID-19 over a six-month period during 2020. The flexibility and agility of this method were demonstrated through the ability to revise the data set specification easily by modifying or adding further archetypes. The authors discuss the future potential of this method to also utilise HL7 FHIR resources as alternative DCMs, through a forthcoming ISO Technical Specification on “Guidelines for implementation of HL7/FHIR based on ISO 13940 and ISO 13606”. There is a growing need and opportunity for the systematic and interoperable reuse of routinely collected (real-world) EHR data for research, through the centralised or federated querying of standardised data sets. This research, although mono-centric and exemplified through COVID-19, is included as a best paper because the methodology is generalisable to any area of clinical research for which there is relevant real-world data.
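The idea of a data set specified as a set of constrained models, each carrying a value set, can be sketched in a few lines. The example below is a simplified illustration under stated assumptions and is neither an ISO 13606 archetype nor the authors' tooling: the classes `ElementConstraint` and `ClinicalModel`, the element names, and the choice of codes are hypothetical. The two SNOMED CT concept identifiers (fever, cough) are included only to show what a value-set constraint looks like.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ElementConstraint:
    """One element of a model: its name, optionality, and allowed codes."""
    name: str
    required: bool = True
    value_set: Optional[FrozenSet[str]] = None  # None means any value is accepted


@dataclass(frozen=True)
class ClinicalModel:
    """A hypothetical, much simplified stand-in for a Detailed Clinical Model."""
    name: str
    elements: Tuple[ElementConstraint, ...]

    def validate(self, record: dict[str, str]) -> list[str]:
        """Return a list of human-readable constraint violations (empty if valid)."""
        errors: list[str] = []
        for el in self.elements:
            if el.name not in record:
                if el.required:
                    errors.append(f"missing required element: {el.name}")
                continue
            if el.value_set is not None and record[el.name] not in el.value_set:
                errors.append(f"{el.name}: code {record[el.name]!r} not in value set")
        return errors


# Illustrative value set of SNOMED CT concept identifiers.
SYMPTOMS = frozenset({
    "386661006",  # Fever
    "49727002",   # Cough
})

symptom_model = ClinicalModel(
    name="symptom_on_admission",
    elements=(
        ElementConstraint("symptom_code", value_set=SYMPTOMS),
        ElementConstraint("onset_date", required=False),
    ),
)

# Revising the specification amounts to building a new model with one more element.
revised_model = ClinicalModel(
    name=symptom_model.name,
    elements=symptom_model.elements + (ElementConstraint("severity"),),
)

valid_errors = symptom_model.validate({"symptom_code": "386661006"})
invalid_errors = symptom_model.validate({"symptom_code": "unknown-code"})
revised_errors = revised_model.validate({"symptom_code": "49727002"})
print(valid_errors, invalid_errors, revised_errors)
```

Because each model is declarative data rather than extraction code, changing the specification means editing or adding a model while the validation logic stays untouched, which mirrors the flexibility the summary attributes to the archetype-based approach.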


The authors declare that there is no conflict of interest.


Article published online:
04 December 2022

© 2022. IMIA and Thieme. This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany