Title | Quality control of next-generation sequencing data without a reference |
---|---|
Published in | Frontiers in Genetics, May 2014 |
DOI | 10.3389/fgene.2014.00111 |
Pubmed ID | |
Authors | Urmi H. Trivedi, Timothée Cézard, Stephen Bridgett, Anna Montazam, Jenna Nichols, Mark Blaxter, Karim Gharbi |
Abstract | Next-generation sequencing (NGS) technologies have dramatically expanded the breadth of genomics. Genome-scale data, once restricted to a small number of biomedical model organisms, can now be generated for virtually any species at remarkable speed and low cost. Yet non-model organisms often lack a suitable reference to map sequence reads against, making alignment-based quality control (QC) of NGS data more challenging than cases where a well-assembled genome is already available. Here we show that by generating a rapid, non-optimized draft assembly of raw reads, it is possible to obtain reliable and informative QC metrics, thus removing the need for a high quality reference. We use benchmark datasets generated from control samples across a range of genome sizes to illustrate that QC inferences made using draft assemblies are broadly equivalent to those made using a well-established reference, and describe QC tools routinely used in our production facility to assess the quality of NGS data from non-model organisms. |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United Kingdom | 7 | 24% |
United States | 5 | 17% |
India | 2 | 7% |
Ireland | 1 | 3% |
Sweden | 1 | 3% |
Brazil | 1 | 3% |
Norway | 1 | 3% |
Switzerland | 1 | 3% |
Germany | 1 | 3% |
Other | 0 | 0% |
Unknown | 9 | 31% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 19 | 66% |
Scientists | 10 | 34% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 4 | 1% |
India | 3 | <1% |
France | 2 | <1% |
Germany | 2 | <1% |
Chile | 1 | <1% |
Norway | 1 | <1% |
Netherlands | 1 | <1% |
United Kingdom | 1 | <1% |
Canada | 1 | <1% |
Other | 5 | 2% |
Unknown | 300 | 93% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Researcher | 64 | 20% |
Student > Ph. D. Student | 45 | 14% |
Student > Bachelor | 45 | 14% |
Student > Master | 45 | 14% |
Student > Doctoral Student | 16 | 5% |
Other | 36 | 11% |
Unknown | 70 | 22% |
Readers by discipline | Count | As % |
---|---|---|
Agricultural and Biological Sciences | 108 | 34% |
Biochemistry, Genetics and Molecular Biology | 81 | 25% |
Computer Science | 20 | 6% |
Immunology and Microbiology | 6 | 2% |
Chemistry | 3 | <1% |
Other | 22 | 7% |
Unknown | 81 | 25% |