Methods Inf Med 2013; 52(01): 91-95
DOI: 10.3414/ME11-02-0049
Focus Theme – Original Articles
Schattauer GmbH

Cost-effective GPU-Grid for Genome-wide Epistasis Calculations

B. Pütz, T. Kam-Thong, N. Karbalai, A. Altmann, B. Müller-Myhsok
MPI of Psychiatry, Statistical Genetics, Munich, Germany

Publication History

received: 01 December 2011

accepted: 13 September 2012

Publication Date:
20 January 2018 (online)

Summary

Background: Until recently, genotype studies were limited to the investigation of single SNP effects because of the computational burden incurred when studying pairwise interactions of SNPs. However, some genetic effects as simple as coloring (in plants and animals) cannot be ascribed to a single locus but can only be understood when epistasis is taken into account [1]. Such effects are also expected in complex diseases, where many genes contribute to the clinical outcome of affected individuals. Only recently have such analyses become computationally feasible.

Objectives: The inherently parallel structure of the problem makes it a perfect candidate for massive parallelization on either grid or cloud architectures. Since we are dealing with confidential patient data, a cloud-based solution was not an option; the data had to be processed in-house, so we aimed to build a local GPU-based grid.

Methods: Sequential epistasis calculations were ported to the GPU using CUDA at various levels. Parallelization on the CPU was compared to the corresponding GPU implementations with regard to performance and cost.
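To illustrate the kind of kernel such a port involves, the following CUDA sketch (not the authors' production code) assigns one SNP pair to each GPU thread and computes a simple interaction score, here the absolute difference of Pearson correlations between cases and controls. The 0/1/2 genotype encoding, the SNP-major array layout, and the choice of statistic are assumptions made purely for illustration.

// A minimal CUDA sketch: each GPU thread scores one SNP pair.
// Statistic, genotype encoding (0/1/2) and SNP-major layout are
// illustrative assumptions, not the published implementation.
#include <cuda_runtime.h>
#include <math.h>

// Pearson correlation of two genotype vectors of length n.
__device__ float pearson(const unsigned char* a, const unsigned char* b, int n)
{
    float sa = 0.f, sb = 0.f, saa = 0.f, sbb = 0.f, sab = 0.f;
    for (int i = 0; i < n; ++i) {
        float x = a[i], y = b[i];
        sa += x; sb += y; saa += x * x; sbb += y * y; sab += x * y;
    }
    float cov  = sab - sa * sb / n;
    float norm = sqrtf((saa - sa * sa / n) * (sbb - sb * sb / n));
    return norm > 0.f ? cov / norm : 0.f;
}

// Genotypes are stored SNP-major: geno[snp * nSamples + sample].
__global__ void epistasisKernel(const unsigned char* caseGeno, int nCases,
                                const unsigned char* ctrlGeno, int nCtrls,
                                int nSnps, float* score)
{
    // Map a 2D thread grid onto the upper triangle of all SNP pairs (i < j).
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nSnps || j >= nSnps || i >= j) return;

    float rCase = pearson(caseGeno + i * nCases, caseGeno + j * nCases, nCases);
    float rCtrl = pearson(ctrlGeno + i * nCtrls, ctrlGeno + j * nCtrls, nCtrls);

    // A large correlation difference flags the pair (i, j) for follow-up.
    score[i * nSnps + j] = fabsf(rCase - rCtrl);
}

On the host side, the case and control genotype matrices would be copied to device memory and the kernel launched over a 2D grid of thread blocks covering all SNP pairs; in practice the pair space is processed in tiles so that the score matrix fits into GPU memory.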

Results: A cost-effective solution was created by assembling a local grid of custom-built nodes equipped with relatively inexpensive, highly parallel consumer-level graphics cards. The GPU method outperforms current cluster-based systems on a price/performance criterion, as a single GPU delivers performance comparable to that of up to 200 CPU cores.

Conclusion: The outlined approach will work for problems that lend themselves easily to massive parallelization. Code for various tasks has been made available, and ongoing tool development will further ease the transition from sequential to parallel algorithms.