Representation of Information about Family Relatives as Structured Data in Electronic Health Records
Received: 12 October 2013
Accepted: 11 February 2014
Published online: 21 December 2017
Background: The ability to manage and leverage family history information in the electronic health record (EHR) is crucial to delivering high-quality clinical care.
Objectives: We aimed to evaluate how well existing standards represent relative information, examine how this information is documented in EHRs, and develop a natural language processing (NLP) application to extract relative information from free-text clinical documents.
Methods: We reviewed a random sample of 100 admission notes and 100 discharge summaries from 198 patients, and also reviewed the structured entries for these patients in an EHR system’s family history module. We investigated the two standards used by Stage 2 of Meaningful Use (SNOMED CT and the HL7 Family History Standard) and identified each standard's coverage gaps in coding relative information. Finally, we evaluated the performance of the MTERMS NLP system in identifying relative information in free-text documents.
Results: SNOMED CT and HL7 differ in several ways in the structure and content they use to represent relative information. Both have high coverage of the local relative concepts built into an ambulatory EHR system, but gaps in key concept coverage were detected; coverage rates for relative information in free-text clinical documents were 95.2% for SNOMED CT and 98.6% for HL7. Richer family history information was available only in free-text documents, not in structured entries. Using a comprehensive lexicon of concepts and terms for relative information drawn from different sources, we expanded the MTERMS NLP system to extract and encode relative information in clinical documents, achieving a precision of 100% and a recall of 97.4%.
Conclusions: Comprehensive assessment and user guidance are critical to adopting standards into EHR systems in a meaningful way. A significant portion of patients’ family history information is documented only in free-text clinical documents, and NLP can be used to extract this information.
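Illustration: The Results report lexicon-based extraction of relative information and its precision and recall. The Python sketch below shows, under stated assumptions, how such an extraction and evaluation could look in miniature. It is not the MTERMS implementation: the lexicon, the codes (written in the style of HL7 v3 RoleCode values), the sample note, and the gold annotation are all hypothetical.

```python
import re

# Hypothetical mini-lexicon: surface term -> relative code.
# Codes are written in the style of the HL7 v3 RoleCode system; this table
# is an illustration only and is not the lexicon used by MTERMS.
RELATIVE_LEXICON = {
    "mother": "MTH",
    "mom": "MTH",
    "father": "FTH",
    "dad": "FTH",
    "brother": "BRO",
    "sister": "SIS",
}

# Whole-word, case-insensitive match on any lexicon term (longest first).
_TERM_PATTERN = re.compile(
    r"\b(" + "|".join(sorted(RELATIVE_LEXICON, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def extract_relatives(text):
    """Return the set of relative codes whose terms appear in the text."""
    return {
        RELATIVE_LEXICON[match.group(1).lower()]
        for match in _TERM_PATTERN.finditer(text)
    }


def precision_recall(predicted, gold):
    """Compute precision and recall of a predicted set against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical note and manual annotation, for illustration only.
note = "Mother had breast cancer. Dad with MI at 55. Aunt has diabetes."
predicted = extract_relatives(note)
gold = {"MTH", "FTH", "AUNT"}
precision, recall = precision_recall(predicted, gold)
```

In this toy case every extracted relative is correct, but "Aunt" is missed because the term is absent from the lexicon. The same mechanism, a lexicon coverage gap, would lower recall while leaving precision unaffected.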
Citation: Zhou L, Lu Y, Vitale CJ, Mar PL, Chang F, Dhopeshwarkar N, Rocha RA. Representation of information about family relatives as structured data in electronic health records. Appl Clin Inf 2014; 5: 349–367 http://dx.doi.org/10.4338/ACI-2013-10-RA-0080