Genetic and Nongenetic Variation Revealed for the Principal Components of Human Gene Expression

Overview of attention for article published in Genetics, November 2013

Citations
22 Dimensions

Readers on
60 Mendeley
2 CiteULike

Title
Genetic and Nongenetic Variation Revealed for the Principal Components of Human Gene Expression
Published in
Genetics, November 2013
DOI 10.1534/genetics.113.153221
Authors

Anita Goldinger, Anjali K. Henders, Allan F. McRae, Nicholas G. Martin, Greg Gibson, Grant W. Montgomery, Peter M. Visscher, Joseph E. Powell

Abstract

Principal components analysis has been employed in gene expression studies to correct for population substructure and batch and environmental effects. This method typically involves the removal of variation contained in as many as 50 principal components (PCs), which can constitute a large proportion of total variation present in the data. Each PC, however, can detect many sources of variation, including gene expression networks and genetic variation influencing transcript levels. We demonstrate that PCs generated from gene expression data can simultaneously contain both genetic and nongenetic factors. From heritability estimates we show that all PCs contain a considerable portion of genetic variation while nongenetic artifacts such as batch effects were associated to varying degrees with the first 60 PCs. These PCs demonstrate an enrichment of biological pathways, including core immune function and metabolic pathways. The use of PC correction in two independent data sets resulted in a reduction in the number of cis- and trans-expression QTL detected. Comparisons of PC and linear model correction revealed that PC correction was not as efficient at removing known batch effects and had a higher penalty on genetic variation. Therefore, this study highlights the danger of eliminating biologically relevant data when employing PC correction in gene expression data.
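The two correction strategies compared in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `remove_top_pcs` and `regress_out`, the samples-by-genes matrix layout, and the simulated batch and genotype effects are all assumptions made for illustration. PC correction subtracts whatever variation the leading components capture, while linear model correction removes only the variation explained by covariates that are explicitly supplied.

```python
import numpy as np


def remove_top_pcs(expr, k):
    """Subtract the first k principal components from a samples x genes matrix."""
    centered = expr - expr.mean(axis=0)
    # SVD of the centered matrix; rows of vt are the PC loadings.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Variation captured by the first k PCs, whatever its source.
    top = (u[:, :k] * s[:k]) @ vt[:k]
    return centered - top


def regress_out(expr, covariates):
    """Remove known covariates (e.g. batch indicators) by ordinary least squares."""
    centered = expr - expr.mean(axis=0)
    design = np.column_stack([np.ones(len(covariates)), covariates])
    beta, *_ = np.linalg.lstsq(design, centered, rcond=None)
    return centered - design @ beta


# Hypothetical simulated data: a batch effect on every gene plus a
# genotype effect on gene 0 only.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 50
batch = rng.integers(0, 2, size=n_samples).astype(float)
genotype = rng.integers(0, 3, size=n_samples).astype(float)
expr = rng.normal(size=(n_samples, n_genes))
expr += np.outer(batch, rng.normal(size=n_genes))
expr[:, 0] += 2.0 * genotype

pc_corrected = remove_top_pcs(expr, k=5)
lm_corrected = regress_out(expr, batch)

# Compare how much genotype signal on gene 0 survives each correction.
for name, data in [("PC", pc_corrected), ("linear model", lm_corrected)]:
    r = np.corrcoef(genotype, data[:, 0])[0, 1]
    print(f"{name} correction: genotype correlation with gene 0 = {r:.2f}")
```

In this sketch the linear model touches only the supplied `batch` covariate, whereas the PC step has no way to tell batch-driven components from genetically driven ones. How much genetic signal each approach retains in real data depends on the data set and on the number of PCs removed.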

Mendeley readers

The data shown below were compiled from readership statistics for 60 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
United States 4 7%
Colombia 1 2%
Sweden 1 2%
Norway 1 2%
Spain 1 2%
Canada 1 2%
Unknown 51 85%

Demographic breakdown

Readers by professional status Count As %
Researcher 19 32%
Student > Ph. D. Student 17 28%
Student > Master 6 10%
Professor > Associate Professor 3 5%
Student > Bachelor 2 3%
Other 7 12%
Unknown 6 10%

Readers by discipline Count As %
Agricultural and Biological Sciences 35 58%
Biochemistry, Genetics and Molecular Biology 9 15%
Medicine and Dentistry 6 10%
Psychology 2 3%
Computer Science 1 2%
Other 2 3%
Unknown 5 8%