Methods Inf Med 2006; 45(05): 557-563
DOI: 10.1055/s-0038-1634118
Original Article
Schattauer GmbH

Deconfounding Microarray Analysis

Independent Measurements of Cell Type Proportions Used in a Regression Model to Resolve Tissue Heterogeneity Bias
M. Jacobsen* (1), D. Repsilber* (2, 3), A. Gutschmidt (1), A. Neher (4), K. Feldmann (4), H. J. Mollenkopf (5), S. H. E. Kaufmann (1), A. Ziegler (2)

1   Department of Immunology, Max Planck Institute for Infection Biology, Berlin, Germany
2   Institute of Medical Biometry and Statistics, University at Lübeck, Lübeck, Germany
3   Current address: Institute for Biology and Biochemistry, University Potsdam, Potsdam-Golm, Germany
4   Asklepios Center for Respiratory Medicine and Thoracic Surgery, Munich-Gauting, Germany
5   Microarray Core Facilities, Max Planck Institute for Infection Biology, Berlin, Germany

Publication History

Received: 31 January 2005
Accepted: 16 January 2006
Publication Date: 07 February 2018 (online)

Summary

Objectives: Microarray analysis requires standardized specimens and evaluation procedures to achieve acceptable results. A major limitation of this method is heterogeneity in the cellular composition of tissue specimens, which frequently confounds data analysis. We introduce a linear model to deconfound gene expression data from tissue heterogeneity for genes expressed exclusively by a single cell type.

Methods: Gene expression data are deconfounded from tissue heterogeneity effects by fitting an appropriate linear regression model. In our illustrative data set, tissue heterogeneity is measured using flow cytometry. Gene expression is determined in parallel by real-time quantitative polymerase chain reaction (qPCR) and microarray analyses. The deconfounded results are verified by protein quantification for the respective marker genes.
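The summary does not spell out the regression model, so the following is only a minimal sketch of one way such a model could look, not the authors' implementation. It assumes that the measured expression of a gene expressed by a single cell type equals that cell type's proportion times a per-cell expression level, and that the per-cell level may differ between study groups. The function name, the variable names, the no-intercept design and the synthetic numbers are all hypothetical.

```python
import numpy as np

def fit_deconfounding_model(expression, proportion, group):
    """Least-squares fit of the assumed model
    expression = proportion * (base + delta * group) + noise,
    where group is 0 for controls and 1 for patients."""
    expression = np.asarray(expression, dtype=float)
    proportion = np.asarray(proportion, dtype=float)
    group = np.asarray(group, dtype=float)
    # Design columns: per-cell baseline level, per-cell group effect.
    # No intercept: zero cells of the type implies zero expression (assumption).
    design = np.column_stack([proportion, proportion * group])
    coef, *_ = np.linalg.lstsq(design, expression, rcond=None)
    base, delta = coef
    return base, delta

# Synthetic illustration (made-up numbers, not data from the study):
# patients carry about twice the cell type proportion of controls,
# but the per-cell expression level is identical in both groups.
rng = np.random.default_rng(0)
n = 40
group = np.repeat([0, 1], n // 2)
proportion = np.where(group == 1, 0.20, 0.10) + rng.uniform(-0.03, 0.03, n)
true_base, true_delta = 50.0, 0.0
expression = proportion * (true_base + true_delta * group) + rng.normal(0.0, 0.2, n)

base, delta = fit_deconfounding_model(expression, proportion, group)

# Naive group comparison that ignores the cell type proportions.
naive_diff = expression[group == 1].mean() - expression[group == 0].mean()

print(f"naive group difference: {naive_diff:.2f}")
print(f"per-cell baseline: {base:.2f}, per-cell group effect: {delta:.2f}")
```

In this sketch the naive comparison shows a clear group difference that comes entirely from the differing proportions, while the fitted per-cell group effect stays close to zero. A per-cell group effect clearly different from zero would instead point to differential expression at the single-cell level.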

Results: For our illustrative data set, quantification of cell type proportions in peripheral blood mononuclear cells (PBMC) from tuberculosis patients and controls revealed differences in B cell and monocyte proportions between the two study groups, and thus heterogeneity of the tissue under investigation. Gene expression analyses reflected these differences in cell type distribution. Fitting an appropriate linear model allowed us to deconfound measured transcriptome levels from tissue heterogeneity effects. For monocytes, the model additionally suggested differential expression at the single-cell level. Protein quantification verified these deconfounded results.

Conclusions: Deconfounding transcriptome analyses for cellular heterogeneity greatly improves the interpretability, and hence the validity, of transcriptome profiling results.

* These authors contributed equally to this work.


 