Methods of Information in Medicine
Methods Inf Med 2016; 55(04): 347-355
DOI: 10.3414/ME16-01-0012
Original Articles
Schattauer GmbH

The Importance of Context: Risk-based De-identification of Biomedical Data[*]

Fabian Prasser** 1, Florian Kohlmayer** 1, Klaus A. Kuhn 1

1 Technical University of Munich, University Hospital rechts der Isar, Institute of Medical Statistics and Epidemiology, Munich, Germany

Keywords
Information science - computer security - data protection - data anonymization - risk - data quality

References
1 Schneeweiss S. Learning from Big Health Care Data. N Engl J Med 2014; 370 (Suppl. 23) 2161-3. PubMed PMID: 24897079.
2 Murdoch T, Detsky A. The inevitable application of big data to health care. J Am Med Assoc 2013; 309 (Suppl. 13) 1351-2. PubMed PMID: 23549579.
3 Denny JC, Bastarache L, Ritchie MD, Carroll RJ, Zink R, Mosley JD, et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat Biotechnol 2013; 31 (Suppl. 12) 1102-10. PubMed PMID: 24270849.
4 Christoph J, Griebel L, Leb I, Engel I, Köpcke F, Toddenroth D, et al. Secure secondary use of clinical data with cloud-based NLP services. Methods Inf Med 2015; 54 (Suppl. 03) 276-82. PubMed PMID: 25377309.
5 US National Institutes of Health. NOT-OD-14-124: NIH Genomic Data Sharing Policy [Internet]. Genomic Data Sharing Policy Team 2014 [cited 2016 Feb 04]. Available from: https://grants.nih.gov/grants/guide/notice-files/NOT-OD-14-124.html.
6 Liu V, Musen M, Chou T. Data breaches of protected health information in the United States. J Am Med Assoc 2015; 313 (Suppl. 14) 1471-3. PubMed PMID: 25871675.
7 Hallinan D, Friedewald M, McCarthy P. Citizens’ perceptions of data protection and privacy in Europe. Comp Law Sec Rev 2012; 28 (Suppl. 03) 263-72. doi: 10.1016/j.clsr.2012.03.005.
8 Schadt EE. The changing privacy landscape in the era of big data. Mol Syst Biol 2012; 8: 612. PubMed PMID: 22968446.
9 Sweeney L. Computational disclosure control – A primer on data privacy protection [dissertation]. Cambridge (MA): Massachusetts Institute of Technology; 2001.
10 El Emam K. Guide to the de-identification of personal health information. 1st ed. Boca Raton: CRC Press; 2013.
11 El Emam K, Arbuckle L. Anonymizing health data: case studies and methods to get you started. 1st ed. Sebastopol: O’Reilly and Associates; 2014.
12 HIPAA administrative simplification statute and rules. 45 C.F.R. Parts 160, 162, and 164 (2013).
13 Samarati P. Protecting respondents’ identities in microdata release. Trans Knowl Data Eng 2001; 13 (Suppl. 06) 1010-27. doi: 10.1109/69.971193.
14 US Health insurance portability and accountability act of 1996. Pub. L. 104–191, 110 Stat. 1936 (August 21, 1996).
15 Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Official Journal L 281, 23/11/1995 P. 0031–0050 (October 24, 1995).
16 Xia W, Heatherly R, Ding X, Li J, Malin BA. R-U policy frontiers for health data de-identification. J Am Med Inform Assoc 2015; 22 (Suppl. 05) 1029-41. PubMed PMID: 25911674.
17 Sweeney L. k-anonymity: A model for protecting privacy. Int J Uncertain Fuzz 2002; 10 (Suppl. 05) 557-70. doi: 10.1142/S0218488502001648.
18 El Emam K. Risk-based de-identification of health data. IEEE Security & Privacy 2010; 8 (Suppl. 03) 64-7. doi: 10.1109/MSP.2010.103.
19 Pannekoek J. Statistical methods for some simple disclosure limitation rules. Statistica Neerlandica 1999; 53 (Suppl. 01) 55-67. doi: 10.1111/1467-9574.00097.
20 El Emam K, Dankar FK. Protecting privacy using k-anonymity. J Am Med Inform Assoc 2008; 15 (Suppl. 05) 627-37. PubMed PMID: 18579830.
21 Hoshino N. Applying Pitman’s sampling formula to microdata disclosure risk assessment. J Off Stat 2001; 17 (Suppl. 04) 499-520.
22 Chen G, Keller-McNulty S. Estimation of identification disclosure risk in microdata. J Off Stat 1998; 14 (Suppl. 01) 79-95.
23 Rinott Y. On models for statistical disclosure risk estimation. In: Proceedings of the Joint ECE/Eurostat Work Session on Statistical Data Confidentiality. 2003 Apr 7–9; Luxembourg; 2003.
24 Dankar FK, El Emam K, Neisa A, Roffey T. Estimating the re-identification risk of clinical data sets. BMC Med Inform Decis Mak 2012; 12: 66. PubMed PMID: 22776564.
25 Prasser F, Kohlmayer F. Putting statistical disclosure control into practice: The ARX data anonymization tool. In: Gkoulalas-Divanis A, Loukides G, editors. Medical Data Privacy Handbook. New York: Springer; 2015. p. 111-48.
26 Iyengar V. Transforming data to satisfy privacy constraints. In: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2002 Jul 23–26; Edmonton, Canada: ACM; 2002. p. 279-88. doi: 10.1145/775047.775089.
27 Bayardo RJ, Agrawal R. Data privacy through optimal k-anonymization. In: Aberer K, Franklin MJ, Nishio S, editors. Proceedings of the 21st International Conference on Data Engineering. 2005 Apr 5–8; Tokyo, Japan: IEEE Computer Society; 2005. p. 217-28. doi: 10.1109/ICDE.2005.42.
28 Prasser F, Kohlmayer F, Lautenschlaeger R, Eckert C, Kuhn KA. ARX – A Comprehensive tool for anonymizing biomedical data. In: Proceedings of the AMIA 2014 Annual Symposium. 2014 Nov 15–19; Washington, DC, US: AMIA; 2014. p. 984-93. PubMed PMID: 25954407.
29 El Emam K, Malin BA. Appendix B: Concepts and methods for de-identifying clinical trial data. In: Committee on Strategies for Responsible Sharing of Clinical Trial Data; Board on Health Sciences Policy; Institute of Medicine, editor. Sharing clinical trial data: Maximizing benefits, minimizing risk. Washington (DC): National Academies Press (US); 2015. p. 1-290.
30 Cox LH, Karr AF, Kinney SK. Risk-utility paradigms for statistical disclosure limitation: How to think, but not how to act. Int Stat Rev 2011; 79 (Suppl. 02) 160-83. doi: 10.1111/j.1751-5823.2011.00140.x.
31 Malin B, Karp D, Scheuermann RH. Technical and policy approaches to balancing patient privacy and data sharing in clinical and translational research. J Investig Med 2010; 58 (Suppl. 01) 11-8. PubMed PMID: 20051768.
32 El Emam K, Rodgers S, Malin B. Anonymising and sharing individual patient data. BMJ 2015; 350: h1139. PubMed PMID: 25794882.
33 El Emam K, Jonker E, Arbuckle L, Malin B. A systematic review of re-identification attacks on health data. PloS one 2011; 6 (Suppl. 12) e28071. Epub 2011 Dec 2. PubMed PMID: 22164229.
34 US Department of Health and Human Services – Office of the Assistant Secretary for Planning and Evaluation. Standards for Privacy of Individually Identifiable Health Information. Fed Regist 2000; 65 (Suppl. 250) 82462-829.
35 El Emam K, Brown A, AbdelMalik P, Neisa A, Walker M, Bottomley J, et al. A method for managing re-identification risk from small geographic areas in Canada. BMC Med Inform Decis Mak 2010; 10: 18. PubMed PMID: 20361870.
36 El Emam K, Dankar FK, Vaillancourt R, Roffey T, Lysyk M. Evaluating the risk of re-identification of patients from hospital prescription records. Can J Hosp Pharm 2009; 62 (4). PubMed PMID: 22478909.
37 Templ M, Kowarik A, Meindl B. Statistical disclosure control for micro-data using the R package sdcMicro. J Stat Softw 2015; 67 (Suppl. 01) 1-36. doi: 10.18637/jss.v067.i04.
38 Hundepool A, Wetering A, Ramaswamy R, Franconi L, Polettini S, Capobianchi A, et al. Mu-Argus, Version 4.2 User’s Manual [Internet]. The Hague, Netherlands: Statistics Netherlands; 2008 [cited 2016 Feb 04]. Available from: http://neon.vb.cbs.nl/casc/Software/MuManual4.2.pdf.
39 El Emam K, Dankar FK, Issa R, Jonker E, Amyot D, Cogo E, et al. A globally optimal k-anonymity method for the de-identification of health data. J Am Med Inform Assoc 2009; 16 (Suppl. 05) 670-82. PubMed PMID: 19567795.
40 Heatherly RD, Loukides G, Denny JC, Haines JL, Roden DM, Malin BA. Enabling genomic-phenomic association discovery without sacrificing anonymity. PloS one 2013; 8 (Suppl. 02) e53875. Epub 2013 Feb 6. PubMed PMID: 23405076.
41 Machanavajjhala A, Kifer D, Gehrke J, Venkitasubramaniam M. l-diversity: Privacy beyond k-anonymity. Trans Knowl Discov Data 2007; 1 (Suppl. 01) 3. doi: 10.1145/1217299.1217302.
42 McGraw D. Building public trust in uses of Health Insurance Portability and Accountability Act de-identified data. J Am Med Inform Assoc 2013; 20 (Suppl. 01) 29-34. PubMed PMID: 22735615.
43 Domingo-Ferrer J, Torra V. Ordinal, continuous and heterogeneous k-anonymity through microaggregation. Data Min Knowl Discov 2005; 11 (Suppl. 02) 195-212. doi: 10.1007/s10618-005-0007-5.
44 Goldberger J, Tassa T. Efficient anonymizations with enhanced utility. In: Saygin Y, Xu Yu J, Kargupta H, Wang W, Ranka S, Yu PS, Wu X, editors. Proceedings of the ICDMW’09 IEEE International Conference on Data Mining Workshops. 2009 Dec 6; Miami, USA: IEEE Computer Society; 2009. p. 106-13. doi: 10.1109/ICDMW.2009.15.
45 Soria-Comas J, Domingo-Ferrer J, Sanchez D, Martinez S. t-Closeness through microaggregation: strict privacy with enhanced utility preservation. Trans Knowl Data Eng 2015; 27 (Suppl. 11) 3098-110. doi: 10.1109/TKDE.2015.2435777.
46 Dankar FK, El Emam K. Practicing differential privacy in health care: A Review. Trans Data Priv 2013; 6 (Suppl. 01) 35-67.
47 Dwork C. Differential privacy. In: Bugliesi M, Preneel B, Sassone V, Wegener I, editors. Proceedings of the 33rd International Colloquium. ICALP 2006 Jul 10–14; Venice, Italy. Berlin; Heidelberg: Springer; 2006. p. 1-12. doi: 10.1007/11787006_1.
48 El Emam K, Álvarez C. A critical appraisal of the Article 29 Working Party Opinion 05/2014 on data anonymization techniques. Int Data Priv Law 2015; 5 (Suppl. 01) 73-87. doi: 10.1093/idpl/ipu033.