Summary
Objectives:
Microarray analysis requires standardized specimens and evaluation procedures to
achieve acceptable results. A major limitation of this method arises from heterogeneity
in the cellular composition of tissue specimens, which frequently confounds data analysis.
We introduce a linear model to deconfound gene expression data from tissue heterogeneity
for genes exclusively expressed by a single cell type.
Methods:
Gene expression data are deconfounded from tissue heterogeneity effects by analyzing
them with an appropriate linear regression model. In our illustrative data set, tissue
heterogeneity is measured by flow cytometry. Gene expression is measured
in parallel by real-time quantitative polymerase chain reaction (qPCR) and microarray
analyses. Deconfounding is verified by protein quantification of
the respective marker genes.
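
The summary does not state the exact form of the regression model, so the following is only a minimal sketch of one plausible form. It assumes that, for a gene expressed exclusively by one cell type, the measured expression y of a sample is proportional to that cell type's proportion p (e.g. from flow cytometry), with a per-cell expression level alpha in controls and an additional per-cell shift beta in patients: y = p * (alpha + beta * g), where g is 0 for controls and 1 for patients. The function name `deconfound`, the variable names, and the synthetic numbers are hypothetical and serve illustration only.

```python
import numpy as np

def deconfound(expression, proportion, group):
    """Least-squares fit of y = p * (alpha + beta * g), without intercept.

    alpha: per-cell expression level in the reference group (g = 0)
    beta:  additional per-cell expression in the other group (g = 1);
           a beta near zero means the group difference in bulk expression
           is explained by cell type proportions alone.
    This model form is an assumption, not taken from the source.
    """
    y = np.asarray(expression, dtype=float)
    p = np.asarray(proportion, dtype=float)
    g = np.asarray(group, dtype=float)
    # Design matrix: proportion, and proportion-by-group interaction.
    X = np.column_stack([p, p * g])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    alpha, beta = coef
    return alpha, beta

# Synthetic, noise-free illustration (not real measurements).
proportion = np.array([0.10, 0.15, 0.20, 0.25, 0.30, 0.35])
group = np.array([0, 0, 0, 1, 1, 1])  # 0 = control, 1 = patient
expression = proportion * (2.0 + 0.5 * group)

alpha, beta = deconfound(expression, proportion, group)
```

In this sketch, a nonzero beta would indicate differential expression per cell on top of the difference in cell type proportions, while the proportion term absorbs the tissue heterogeneity effect. With real data, the coefficients would also need standard errors and a significance test, which this sketch omits.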
Results:
For our illustrative data set, quantification of cell type proportions in peripheral
blood mononuclear cells (PBMC) from tuberculosis patients and controls revealed differences
in B cell and monocyte proportions between the two study groups, and thus heterogeneity
of the tissue under investigation. Gene expression analyses reflected these differences
in cell type distribution. Fitting an appropriate linear model allowed us to deconfound
measured transcriptome levels from tissue heterogeneity effects. In the case of monocytes,
the model additionally suggested differential expression at the single-cell level. Protein
quantification verified these deconfounded results.
Conclusions:
Deconfounding transcriptome analyses for cellular heterogeneity greatly improves
the interpretability, and hence the validity, of transcriptome profiling results.
Keywords
Transcriptome - tissue heterogeneity - deconfounding