Title | Simulating autosomal genotypes with realistic linkage disequilibrium and a spiked-in genetic effect |
---|---|
Published in | BMC Bioinformatics, January 2018 |
DOI | 10.1186/s12859-017-2004-2 |
PubMed ID | |
Authors | M. Shi, D. M. Umbach, A. S. Wise, C. R. Weinberg |
Abstract | To evaluate statistical methods for genome-wide genetic analyses, one needs to be able to simulate realistic genotypes. Here we describe a method, applicable to a broad range of association study designs, that can simulate autosome-wide single-nucleotide polymorphism (SNP) data with realistic linkage disequilibrium and with spiked-in, user-specified, single-SNP or multi-SNP causal effects. Our construction uses existing genome-wide association data from unrelated case-parent triads, augmented by including a hypothetical complement triad for each triad (same parents but with a hypothetical offspring who carries the non-transmitted parental alleles). We assign offspring qualitative or quantitative traits probabilistically through a specified risk model and show that our approach destroys the risk signals from the original data. Our method can simulate genetically homogeneous or stratified populations and can simulate case-parent studies, case-control studies, case-only studies, or studies of quantitative traits. We show that allele frequencies and linkage disequilibrium structure in the original genome-wide association sample are preserved in the simulated data. We have implemented our method in an R package (TriadSim), which is freely available from the Comprehensive R Archive Network (CRAN). We have proposed a method for simulating genome-wide SNP data with realistic linkage disequilibrium. Our method will be useful for developing statistical methods for studying genetic associations, including higher-order effects like epistasis and gene-by-environment interactions. |
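
The complement-triad idea in the abstract can be illustrated with a minimal Python sketch. It is not the TriadSim implementation (which is an R package) and does not reproduce the authors' simulation procedure. It assumes genotypes are coded as counts (0, 1, 2) of a reference allele at one SNP, so the hypothetical complement offspring, who carries the non-transmitted parental alleles, has genotype mother + father - child. The function names and the multiplicative per-allele risk model used for trait assignment are hypothetical choices made for illustration only.

```python
import random


def transmissible(genotype):
    """Allele counts (0 or 1) that a parent with this genotype can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def complement_genotype(mother, father, child):
    """Genotype of the hypothetical complement offspring at one SNP.

    The complement carries the alleles the parents did NOT transmit,
    so its allele count is mother + father - child.
    """
    possible = {m + f for m in transmissible(mother) for f in transmissible(father)}
    if child not in possible:
        raise ValueError("child genotype is not Mendelian-consistent with parents")
    return mother + father - child


def assign_case_status(genotype, baseline_risk, relative_risk, rng):
    """Draw a binary trait under an ASSUMED multiplicative per-allele risk model."""
    # Risk is capped at 1 so that it remains a valid probability.
    risk = min(1.0, baseline_risk * relative_risk ** genotype)
    return rng.random() < risk


if __name__ == "__main__":
    rng = random.Random(2018)  # fixed seed so repeated runs give the same draws
    mother, father, child = 1, 2, 1
    comp = complement_genotype(mother, father, child)
    print("complement genotype:", comp)
    for label, g in (("observed", child), ("complement", comp)):
        print(label, "offspring is a case:", assign_case_status(g, 0.05, 1.5, rng))
```

In this sketch the trait is drawn from the user-supplied risk model alone, which reflects the abstract's point that traits are assigned probabilistically through a specified model rather than taken from the original data.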
Twitter Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 1 | 50% |
Unknown | 1 | 50% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Scientists | 2 | 100% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 23 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 5 | 22% |
Student > Postgraduate | 5 | 22% |
Student > Ph. D. Student | 3 | 13% |
Other | 2 | 9% |
Student > Master | 1 | 4% |
Other | 3 | 13% |
Unknown | 4 | 17% |
Readers by discipline | Count | As % |
---|---|---|
Biochemistry, Genetics and Molecular Biology | 8 | 35% |
Computer Science | 4 | 17% |
Decision Sciences | 2 | 9% |
Agricultural and Biological Sciences | 2 | 9% |
Medicine and Dentistry | 2 | 9% |
Other | 2 | 9% |
Unknown | 3 | 13% |