Title | SNP interaction detection with Random Forests in high-dimensional genetic data |
---|---|
Published in | BMC Bioinformatics, July 2012 |
DOI | 10.1186/1471-2105-13-164 |
Pubmed ID | |
Authors | Stacey J Winham, Colin L Colby, Robert R Freimuth, Xin Wang, Mariza de Andrade, Marianne Huebner, Joanna M Biernacka |
Abstract | Identifying variants associated with complex human traits in high-dimensional data is a central goal of genome-wide association studies. However, complicated etiologies such as gene-gene interactions are ignored by the univariate analysis usually applied in these studies. Random Forests (RF) are a popular data-mining technique that can accommodate a large number of predictor variables and allow for complex models with interactions. RF analysis produces measures of variable importance that can be used to rank the predictor variables. Thus, single nucleotide polymorphism (SNP) analysis using RFs is gaining popularity as a potential filter approach that considers interactions in high-dimensional data. However, the impact of data dimensionality on the power of RF to identify interactions has not been thoroughly explored. We investigate the ability of rankings from variable importance measures to detect gene-gene interaction effects and their potential effectiveness as filters compared to p-values from univariate logistic regression, particularly as the data becomes increasingly high-dimensional. |
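
The abstract's point that univariate analysis can ignore gene-gene interactions can be illustrated with a small sketch. This is not the authors' method and does not implement a Random Forest; it is a toy example under stated assumptions: a hypothetical balanced cohort, two SNPs coded as carrier status (0/1), and case status set by a pure XOR-style interaction. All names (`cohort`, `marginal_rates`, `joint_rates`) are made up for illustration.

```python
from itertools import product

# Hypothetical toy cohort: two SNPs coded as carrier status (0/1), with
# 25 individuals in each of the four genotype combinations. Case status
# follows a pure XOR-style interaction, an assumption chosen so that the
# marginal effect of each SNP vanishes exactly.
cohort = [
    (snp_a, snp_b, snp_a ^ snp_b)
    for snp_a, snp_b in product((0, 1), repeat=2)
    for _ in range(25)
]


def case_rate(rows):
    """Fraction of individuals in `rows` that are cases."""
    return sum(status for _, _, status in rows) / len(rows)


def marginal_rates(rows, index):
    """Case rate per genotype of a single SNP (the univariate view)."""
    return {
        g: case_rate([r for r in rows if r[index] == g])
        for g in (0, 1)
    }


def joint_rates(rows):
    """Case rate per two-SNP genotype combination (the joint view)."""
    return {
        (a, b): case_rate([r for r in rows if r[0] == a and r[1] == b])
        for a, b in product((0, 1), repeat=2)
    }


marginal_a = marginal_rates(cohort, 0)
marginal_b = marginal_rates(cohort, 1)
joint = joint_rates(cohort)

print("SNP A alone:", marginal_a)
print("SNP B alone:", marginal_b)
print("A and B jointly:", joint)
```

In this constructed case each SNP on its own shows the same case rate for both genotypes, so a single-SNP test has nothing to detect, while the joint table separates cases from controls completely. Real data are noisier and rarely this extreme; the sketch only motivates why a method that models SNPs jointly might rank such SNPs differently from a univariate filter.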
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 2 | 29% |
United Kingdom | 1 | 14% |
Unknown | 4 | 57% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 6 | 86% |
Members of the public | 1 | 14% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 3 | 2% |
Belgium | 2 | 1% |
Netherlands | 1 | <1% |
Brazil | 1 | <1% |
Malaysia | 1 | <1% |
India | 1 | <1% |
Turkey | 1 | <1% |
Sweden | 1 | <1% |
United Kingdom | 1 | <1% |
Other | 0 | 0% |
Unknown | 123 | 91% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 38 | 28% |
Researcher | 28 | 21% |
Student > Master | 22 | 16% |
Student > Bachelor | 8 | 6% |
Professor > Associate Professor | 8 | 6% |
Other | 20 | 15% |
Unknown | 11 | 8% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 29 | 21% |
Computer Science | 22 | 16% |
Biochemistry, Genetics and Molecular Biology | 18 | 13% |
Mathematics | 10 | 7% |
Engineering | 9 | 7% |
Other | 29 | 21% |
Unknown | 18 | 13% |