CC BY-NC-ND 4.0 · Methods Inf Med 2021; 60(01/02): 009-020
DOI: 10.1055/s-0041-1724107
Original Article

Analysis of Not Structurable Oncological Study Eligibility Criteria for Improved Patient-Trial Matching

Julia Dieter*
1   Department of Medical Informatics for Translational Oncology, German Cancer Research Center, Heidelberg, Germany
Friederike Dominick*
1   Department of Medical Informatics for Translational Oncology, German Cancer Research Center, Heidelberg, Germany
Alexander Knurr
1   Department of Medical Informatics for Translational Oncology, German Cancer Research Center, Heidelberg, Germany
Janko Ahlbrandt
1   Department of Medical Informatics for Translational Oncology, German Cancer Research Center, Heidelberg, Germany
Frank Ückert
1   Department of Medical Informatics for Translational Oncology, German Cancer Research Center, Heidelberg, Germany


Background Higher enrolment rates of cancer patients into clinical trials are necessary to increase cancer survival. As a prerequisite, improved semiautomated matching of patient characteristics with clinical trial eligibility criteria is needed. Such matching depends on the computer interpretability, i.e., the structurability, of eligibility criteria texts. To increase structurability, the common content, phrasing, and structuring problems of oncological eligibility criteria need to be better understood.

Objectives We aimed to identify oncological eligibility criteria that could not be structured with our manual approach and to categorize them by the underlying structuring problem. Our results are intended to contribute to improved criteria phrasing in the future as a prerequisite for increased structurability.

Methods The inclusion and exclusion criteria of 159 oncological studies from the Clinical Trial Information System of the National Center for Tumor Diseases Heidelberg were manually structured and grouped into content-related subcategories. Criteria identified as not structurable were analyzed further and manually categorized by the underlying structuring problem.

Results The structuring of criteria resulted in 4,742 smallest meaningful components (SMCs) distributed across seven main categories (Diagnosis, Therapy, Laboratory, Study, Findings, Demographics and Lifestyle, and Others). A total of 645 SMCs (13.60%) could not be structured due to content- and structure-related issues. Of these, a subset of 415 SMCs (64.34%) was considered not remediable, either because supplementary medical knowledge would have been needed or because the linkage among the sentence components was too complex. The main categories “Diagnosis” and “Study” contained the largest shares of these two subcategories and were thus the least structurable. In the inclusion criteria, reasons for lacking structurability varied, while missing supplementary medical knowledge was the largest factor within the exclusion criteria.
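The proportions reported above can be rechecked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study: the variable and function names are hypothetical, and the last two quantities (the share of not remediable SMCs among all SMCs, and the count of potentially remediable SMCs) are derived here from the reported counts rather than stated in the abstract.

```python
# Counts as reported in the Results paragraph.
total_smcs = 4742          # all smallest meaningful components (SMCs)
not_structurable = 645     # SMCs that could not be structured
not_remediable = 415       # subset of those judged not remediable


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Percentages stated in the abstract.
share_not_structurable = share(not_structurable, total_smcs)
share_not_remediable = share(not_remediable, not_structurable)

# Derived here (assumption: simple subtraction of the reported counts).
remediable = not_structurable - not_remediable
share_remediable_overall = share(remediable, total_smcs)

print(f"Not structurable: {share_not_structurable:.2f}% of all SMCs")
print(f"Not remediable:   {share_not_remediable:.2f}% of not structurable SMCs")
print(f"Potentially remediable: {remediable} SMCs "
      f"({share_remediable_overall:.2f}% of all SMCs)")
```

Under these counts, the potentially remediable remainder is 230 SMCs, or roughly 4.85% of all SMCs, which is one way to read the conclusion that better phrasing alone offers limited gains.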

Conclusion Our results suggest that further improvement of eligibility criterion phrasing contributes only marginally to increased structurability. Instead, physician-based confirmation of the matching results and the exclusion of factors that harm the patient or bias the study are needed.

* Equally contributing authors.

Publication History

Received: 19 August 2020

Accepted: 08 December 2020

Article published online: 22 April 2021

© 2021. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany
