
Comparing De Novo Genome Assembly: The Long and Short of It

Overview of attention for article published in PLOS ONE, April 2011

Mentioned by
2 X users

Citations
98 Dimensions

Readers on
610 Mendeley
21 CiteULike
Title: Comparing De Novo Genome Assembly: The Long and Short of It
Published in: PLOS ONE, April 2011
DOI: 10.1371/journal.pone.0019175
Authors: Giuseppe Narzisi, Bud Mishra

Abstract

Recent advances in DNA sequencing technology and their focal role in Genome Wide Association Studies (GWAS) have rekindled interest in the whole-genome sequence assembly (WGSA) problem, thereby inundating the field with a plethora of new formalizations, algorithms, heuristics and implementations. And yet, scant attention has been paid to comparative assessments of these assemblers' quality and accuracy. No commonly accepted and standardized method for comparison exists yet. Even worse, widely used metrics to compare the assembled sequences emphasize only size, poorly capturing the contig quality and accuracy. This paper addresses these concerns: it highlights common anomalies in assembly accuracy through a rigorous study of several assemblers, compared under both standard metrics (N50, coverage, contig sizes, etc.) and a more comprehensive metric (Feature-Response Curves, FRC) that is introduced here; FRC transparently captures the trade-off between contigs' quality and their sizes. For this purpose, most of the publicly available major sequence assemblers, both for low-coverage long (Sanger) and high-coverage short (Illumina) read technologies, are compared. These assemblers are applied to microbial (Escherichia coli, Brucella, Wolbachia, Staphylococcus, Helicobacter) and partial human genome sequences (Chr. Y), using sequence reads of various read-lengths, coverages, accuracies, and with and without mate-pairs. It is hoped that, based on these evaluations, computational biologists will identify innovative sequence assembly paradigms, bioinformaticists will determine promising approaches for developing "next-generation" assemblers, and biotechnologists will formulate more meaningful design desiderata for sequencing technology platforms. A new software tool for computing the FRC metric has been developed and is available through the AMOS open-source consortium.
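
For readers unfamiliar with the N50 metric named in the abstract, the sketch below shows how it is commonly computed: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. This follows the usual textbook definition and is not the code used in the paper; the function name `n50` and the sample contig lengths are hypothetical and chosen only for illustration.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Assumption: the common definition is used, i.e. the largest length L
    such that contigs of length >= L cover at least half of the total
    assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    if total <= 0:
        raise ValueError("need at least one contig with positive length")

    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


if __name__ == "__main__":
    # Hypothetical contig lengths (in bases), not data from the paper.
    example = [80, 70, 50, 40, 30, 20]
    print(n50(example))
```

Because the calculation looks only at contig lengths, two assemblies with identical length distributions receive the same N50 regardless of how many misassemblies they contain, which is the limitation the abstract raises when it says that size-based metrics capture contig quality poorly.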

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 610 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States        27     4%
United Kingdom       12     2%
Germany               9     1%
Brazil                7     1%
Japan                 6    <1%
Netherlands           5    <1%
Sweden                4    <1%
Australia             4    <1%
France                4    <1%
Other                26     4%
Unknown             506    83%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                        158    26%
Student > Ph. D. Student                          154    25%
Student > Master                                   79    13%
Student > Bachelor                                 44     7%
Other                                              27     4%
Other                                             101    17%
Unknown                                            47     8%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences              362    59%
Biochemistry, Genetics and Molecular Biology       80    13%
Computer Science                                   49     8%
Engineering                                        12     2%
Medicine and Dentistry                             11     2%
Other                                              39     6%
Unknown                                            57     9%

Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 09 December 2011.

All research outputs: #16,162,342 of 24,784,213 outputs
Outputs from PLOS ONE: #142,448 of 214,542 outputs
Outputs of similar age: #88,514 of 115,180 outputs
Outputs of similar age from PLOS ONE: #1,144 of 1,533 outputs

Altmetric has tracked 24,784,213 research outputs across all sources so far. This one is in the 34th percentile – i.e., 34% of other outputs scored the same or lower than it.
So far Altmetric has tracked 214,542 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 15.6. This one is in the 33rd percentile – i.e., 33% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 115,180 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 22nd percentile – i.e., 22% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 1,533 others from the same source and published within six weeks on either side of this one. This one is in the 25th percentile – i.e., 25% of its contemporaries scored the same or lower than it.