Methods Inf Med 2013; 52(01): 91-95
DOI: 10.3414/ME11-02-0049
Focus Theme – Original Articles
Schattauer GmbH

Cost-effective GPU-Grid for Genome-wide Epistasis Calculations

B. Pütz, T. Kam-Thong, N. Karbalai, A. Altmann, B. Müller-Myhsok
MPI of Psychiatry, Statistical Genetics, Munich, Germany

Publication History

received: 01 December 2011

accepted: 13 September 2012

Publication Date:
20 January 2018 (online)

Summary

Background: Until recently, genotype studies were limited to the investigation of single SNP effects because of the computational burden incurred when studying pairwise interactions of SNPs. However, some genetic effects as simple as coloring (in plants and animals) cannot be ascribed to a single locus but can only be understood when epistasis is taken into account [1]. Such effects are also expected in complex diseases, where many genes contribute to the clinical outcome of affected individuals. Only recently have such analyses become computationally feasible.

Objectives: The inherently parallel structure of the problem makes it a perfect candidate for massive parallelization on either grid or cloud architectures. Since we are dealing with confidential patient data, a cloud-based solution was not an option; the data had to be processed in-house, so we aimed to build a local GPU-based grid.

Methods: Sequential epistasis calculations were ported to the GPU using CUDA at various levels. Parallelization on the CPU was compared to the corresponding GPU implementations with regard to performance and cost.
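To illustrate the kind of kernel such a port involves, the following CUDA sketch (not the authors' production code) assigns one SNP pair to each GPU thread and computes a simple interaction score, here the absolute difference of Pearson correlations between cases and controls. The 0/1/2 genotype encoding, the SNP-major array layout, and the choice of statistic are assumptions made purely for illustration.

// A minimal CUDA sketch: each GPU thread scores one SNP pair.
// Statistic, genotype encoding (0/1/2) and SNP-major layout are
// illustrative assumptions, not the published implementation.
#include <cuda_runtime.h>
#include <math.h>

// Pearson correlation of two genotype vectors of length n.
__device__ float pearson(const unsigned char* a, const unsigned char* b, int n)
{
    float sa = 0.f, sb = 0.f, saa = 0.f, sbb = 0.f, sab = 0.f;
    for (int i = 0; i < n; ++i) {
        float x = a[i], y = b[i];
        sa += x; sb += y; saa += x * x; sbb += y * y; sab += x * y;
    }
    float cov  = sab - sa * sb / n;
    float norm = sqrtf((saa - sa * sa / n) * (sbb - sb * sb / n));
    return norm > 0.f ? cov / norm : 0.f;
}

// Genotypes are stored SNP-major: geno[snp * nSamples + sample].
__global__ void epistasisKernel(const unsigned char* caseGeno, int nCases,
                                const unsigned char* ctrlGeno, int nCtrls,
                                int nSnps, float* score)
{
    // Map a 2D thread grid onto the upper triangle of all SNP pairs (i < j).
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nSnps || j >= nSnps || i >= j) return;

    float rCase = pearson(caseGeno + i * nCases, caseGeno + j * nCases, nCases);
    float rCtrl = pearson(ctrlGeno + i * nCtrls, ctrlGeno + j * nCtrls, nCtrls);

    // A large correlation difference flags the pair (i, j) for follow-up.
    score[i * nSnps + j] = fabsf(rCase - rCtrl);
}

On the host side, the case and control genotype matrices would be copied to device memory and the kernel launched over a 2D grid of thread blocks covering all SNP pairs; in practice the pair space is processed in tiles so that the score matrix fits into GPU memory.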

Results: A cost-effective solution was created by assembling a local grid of custom-built nodes equipped with relatively inexpensive, highly parallel consumer-level graphics cards. The GPU method outperforms current cluster-based systems on a price/performance criterion, as a single GPU delivers performance comparable to that of up to 200 CPU cores.

Conclusion: The outlined approach will work for problems that lend themselves easily to massive parallelization. Code for various tasks has been made available, and ongoing tool development will further ease the transition from sequential to parallel algorithms.