Chapter title | Allele identification in assembled genomic sequence datasets. |
---|---|
Chapter number | 12 |
Book title | Data Production and Analysis in Population Genomics |
Published in | Methods in Molecular Biology, June 2012 |
DOI | 10.1007/978-1-61779-870-2_12 |
Pubmed ID | |
Book ISBNs | 978-1-61779-869-6, 978-1-61779-870-2 |
Authors | Katrina M. Dlugosch, Aurélie Bonin |
Abstract | Allelic variation within species provides fundamental insights into the evolution and ecology of organisms, and information about this variation is becoming increasingly available in sequence datasets of multiple and/or outbred individuals. Unfortunately, identifying true allelic variants poses a number of challenges, given the presence of both sequencing errors and alleles from other closely related loci. We outline the key considerations involved in this process, including assessing the accuracy of allele resolution in sequence assembly, clustering of alleles within and among individuals, and identifying clusters that are most likely to correspond to true allelic variants of a single locus. Our focus is particularly on the case where alleles must be identified without a fully resolved reference genome, and where sequence depth information cannot be used to infer the putative number of loci sharing a sequence, such as in transcriptome or post-assembly datasets. Throughout, we provide information about publicly available tools to aid allele identification in such cases. |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 13 | 100% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Master | 6 | 46% |
Researcher | 6 | 46% |
Student > Ph. D. Student | 1 | 8% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 9 | 69% |
Biochemistry, Genetics and Molecular Biology | 3 | 23% |
Computer Science | 1 | 8% |