Yearbook of Medical Informatics
Yearb Med Inform 2014; 23(01): 42-47
DOI: 10.15265/IY-2014-0018
Original Article
Georg Thieme Verlag KG Stuttgart

Technical Challenges for Big Data in Biomedicine and Health: Data Sources, Infrastructure, and Analytics

N. Peek (1, 2), J. H. Holmes (3), J. Sun (4)

1 Dept. of Medical Informatics, Academic Medical Center, University of Amsterdam, The Netherlands
2 Centre for Health Informatics, Institute of Population Health, University of Manchester, Manchester, UK
3 Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
4 College of Computing, Georgia Institute of Technology, Atlanta, GA, USA

Keywords
Big Data - electronic health records - distributed computing - statistical analysis

References

1 Meystre SM, Savova GK, Kipper-Schuler KC, Hurdle JF. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb Med Inform 2008; 128-44.
2 Friedman C, Elhadad N. Natural language processing in health care and biomedicine. In: Shortliffe EH, Cimino JJ. editors. Biomedical Informatics. Computer Applications in Health Care and Biomedicine (4th ed.) London: Springer; 2014. p. 255-84.
3 Deserno TM. Biomedical Image Processing. Berlin: Springer; 2011.
4 Rubin DL, Greenspan H, Brinkley JF. Biomedical Imaging Informatics. In: Shortliffe EH, Cimino JJ. editors. Biomedical Informatics. Computer Applications in Health Care and Biomedicine (4th ed.) London: Springer; 2014. p. 285-327.
5 Eysenbach G, Köhler C. Health-Related Searches on the Internet. JAMA 2004; 291: 2946.
6 Carneiro HA, Mylonakis E. Google trends: a web-based tool for real-time surveillance of disease outbreaks. Clin Infect Dis 2009; 49 (Suppl. 10) 1557-64.
7 Mandl KD, Overhage JM, Wagner MM, Lober WB, Sebastiani P, Mostashari F. et al. Implementing syndromic surveillance: a practical guide informed by the early experience. J Am Med Inform Assoc 2004; 11: 141-50.
8 White RW, Tatonetti NP, Shah NH, Altman RB, Horvitz E. Web-scale pharmacovigilance: listening to signals from the crowd. J Am Med Inform Assoc 2013; 20 (Suppl. 03) 404-8.
9 New Tweets per second record, and how! Twitter, Inc 2014 [cited 2014 Jan 15]. Available from: https://blog.twitter.com/2013/new-tweets-per-second-record-and-how
10 Langmead B, Schatz MC, Lin J. et al. Searching for SNPs with cloud computing. Genome Biol 2009; 10: R134.
11 Wang Y, Goh W, Wong L, Montana G; Alzheimer's Disease Neuroimaging Initiative. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes. BMC Bioinformatics 2013; 14 (Suppl. 16): S6.
12 Ng K, Ghoting A, Steinhubl SR, Stewart WF, Malin B, Sun J. PARAMO: A PARAllel predictive MOdeling platform for healthcare analytic research using electronic health records. J Biomed Inform 2014; 48: 160-70.
13 Sahoo SS, Jayapandian C, Garg G, Kaffashi F, Chung S, Bozorgi A. et al. Heart beats in the cloud: distributed analysis of electrophysiological 'Big Data' using cloud computing for epilepsy clinical research. J Am Med Inform Assoc 2014; 21 (Suppl. 02) 263-71.
14 Zhao S, Prenger K, Smith L, Messina T, Fan H, Jaeger E. et al. Rainbow: A tool for large-scale whole-genome sequencing data analysis using cloud computing. BMC Genomics 2013; 14: 425.
15 Hird SM. LociNGS: A Lightweight Alternative for Assessing Suitability of next-Generation Loci for Evolutionary Analysis. PLoS One 2012; 7 (Suppl. 10) e46847.
16 Chang F, Dean J, Ghemawat S, Hsieh WC, Wallach DA, Burrows M. et al. Bigtable: A distributed storage system for structured data. ACM Trans Comput Syst 2008; 26 (Suppl. 02) 4.
17 Rosenthal A, Mork P, Li MH, Stanford J, Koester D, Reynolds P. Cloud computing: a new business paradigm for biomedical information sharing. J Biomed Inform 2010; 43 (Suppl. 02) 342-53.
18 Amazon Web Services. Creating Healthcare Data Applications to Promote HIPAA and HITECH Compliance. White paper, Amazon, August 2012. Available from: http://media.amazonwebservices.com/AWS_HIPAA_Whitepaper_Final.pdf (last accessed 20 May 2014).
19 Dean J, Ghemawat S. MapReduce: Simplified data processing on large clusters. Sixth Symposium on Operating Systems Design & Implementation (OSDI); 2004. p. 137-50.
20 Ng K, Ghoting A, Steinhubl SR, Stewart WF, Malin B, Sun J. PARAMO: A PARAllel Predictive MOdeling Platform for Healthcare Analytic Research Using Electronic Health Records. J Biomed Inform 2013; doi:10.1016/j.jbi.2013.12.012.
21 Xin RS, Rosen J, Zaharia M, Franklin MJ, Shenker S, Stoica I. Shark: SQL and rich analytics at scale. ACM SIGMOD Conference 2013; doi:10.1145/2463676.2465288.
22 Low Y, Gonzalez J, Kyrola A, Bickson D, Guestrin C, Hellerstein JM. GraphLab: A new parallel framework for machine learning. In: Grünwald P, Spirtes P. editors. Proc 26th Conference on Uncertainty in Artificial Intelligence. AUAI Press; 2010. p. 340-9.
23 McPheeters ML, Sathe NA, Jerome RN, Carnahan RM. Methods for systematic reviews of administrative database studies capturing health outcomes of interest. Vaccine 2013; 31 (Suppl. 10) K2-6.
24 Greenwald P, Friedlander BR, Lawrence CE, Hearne T, Earle K. Diagnostic sensitivity bias -- an epidemiologic explanation for an apparent brain tumor excess. J Occup Med 1981; 23 (Suppl. 10) 690-4.
25 Tessier-Sherman B, Galusha D, Taiwo OA, Cantley L, Slade MD, Kirsche SR. et al. Further validation that claims data are a useful tool for epidemiologic research on hypertension. BMC Public Health 2013; 13: 51.
26 Ritchie MD, Denny JC, Crawford DC, Ramirez AH, Weiner JB, Pulley JM. et al. Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. Am J Hum Genet 2010; 86 (Suppl. 04) 560-72.
27 D'Agostino RB Jr. Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Stat Med 1998; 17 (Suppl. 19) 2265-81.
28 De Vries H, Kemps HMC, Van Engen-Verheul MM, Kraaijenhagen RA, Peek N. Cardiac rehabilitation and survival in a large representative community cohort of Dutch patients. Submitted for publication.
29 Friedman JH. Greedy function approximation: A gradient boosting machine. Annals of Statistics 2001; 29 (Suppl. 05) 1189-232.
30 Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science 1996; 273: 1516-7.
31 Hirschhorn JN, Daly MJ. Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 2005; 6 (Suppl. 02) 95-108.
32 Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Nature 2009; 457 (7232): 1012-4.