The Effects of Alignment Quality, Distance Calculation Method, Sequence Filtering, and Region on the Analysis of 16S rRNA Gene-Based Studies

Overview of attention for article published in PLoS Computational Biology, July 2010

About this Attention Score

  • Good Attention Score compared to outputs of the same age (65th percentile)
  • Average Attention Score compared to outputs of the same age and source

Mentioned by

  • X (Twitter): 8 users
  • Facebook: 1 page

Citations

  • Dimensions: 325

Readers on

  • Mendeley: 692
  • CiteULike: 19
Title: The Effects of Alignment Quality, Distance Calculation Method, Sequence Filtering, and Region on the Analysis of 16S rRNA Gene-Based Studies
Published in: PLoS Computational Biology, July 2010
DOI: 10.1371/journal.pcbi.1000844
Pubmed ID:
Authors: Patrick D. Schloss

Abstract

Pyrosequencing of PCR-amplified fragments that target variable regions within the 16S rRNA gene has quickly become a powerful method for analyzing the membership and structure of microbial communities. This approach has revealed and introduced questions that were not fully appreciated by those carrying out traditional Sanger sequencing-based methods. These include the effects of alignment quality, the best method of calculating pairwise genetic distances for 16S rRNA genes, whether it is appropriate to filter variable regions, and how the choice of variable region relates to the genetic diversity observed in full-length sequences. I used a diverse collection of 13,501 high-quality full-length sequences to assess each of these questions. First, alignment quality had a significant impact on distance values and downstream analyses. Specifically, the greengenes alignment, which does a poor job of aligning variable regions, predicted higher genetic diversity, richness, and phylogenetic diversity than the SILVA and RDP-based alignments. Second, the effect of different gap treatments in determining pairwise genetic distances was strongly affected by the variation in sequence length for a region; however, the effect of different calculation methods was subtle when determining the sample's richness or phylogenetic diversity for a region. Third, applying a sequence mask to remove variable positions had a profound impact on genetic distances by muting the observed richness and phylogenetic diversity. Finally, the genetic distances calculated for each of the variable regions did a poor job of correlating with the full-length gene. Thus, while it is tempting to apply traditional cutoff levels derived for full-length sequences to these shorter sequences, it is not advisable. Analysis of beta-diversity metrics showed that each of these factors can have a significant impact on the comparison of community membership and structure. 
Taken together, these results urge caution in the design and interpretation of analyses using pyrosequencing data.
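
To make the abstract's second point concrete, here is a minimal Python sketch of how the choice of gap treatment changes a pairwise distance between two aligned sequences. The abstract does not say which treatments were compared, so the three options below ("each", "one", "ignore"), the function name `pairwise_distance`, and the toy sequences are all illustrative assumptions rather than the author's method.

```python
def pairwise_distance(seq_a, seq_b, gaps="each"):
    """Uncorrected distance between two aligned sequences.

    Hypothetical gap treatments (assumptions for illustration):
      "each"   - every gapped column counts as one mismatch
      "one"    - a run of consecutive gapped columns counts as one mismatch
      "ignore" - gapped columns are skipped entirely
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    if gaps not in ("each", "one", "ignore"):
        raise ValueError("unknown gap treatment: %r" % gaps)

    mismatches = 0
    compared = 0
    in_gap_run = False
    for x, y in zip(seq_a, seq_b):
        x_gap, y_gap = x == "-", y == "-"
        if x_gap and y_gap:
            # Column is gapped in both sequences: it carries no information.
            continue
        if x_gap or y_gap:
            if gaps == "ignore" or (gaps == "one" and in_gap_run):
                continue
            # Simplification: a run is not split when the gap switches
            # from one sequence to the other.
            in_gap_run = True
            mismatches += 1
            compared += 1
        else:
            in_gap_run = False
            compared += 1
            if x != y:
                mismatches += 1
    return mismatches / compared if compared else 0.0


# Toy aligned pair: one two-column gap and one substitution.
seq_a = "ACGT--ACGT"
seq_b = "ACGTTTACGA"
for treatment in ("each", "one", "ignore"):
    print(treatment, pairwise_distance(seq_a, seq_b, gaps=treatment))
```

For this toy pair the three treatments give 0.3, 2/9, and 0.125 respectively, which shows why regions with more length variation (and therefore more gaps) are more sensitive to the treatment chosen.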

Mendeley readers

The data shown below were compiled from readership statistics for 692 Mendeley readers of this research output.

Geographical breakdown

Country | Count | As %
United States | 28 | 4%
United Kingdom | 5 | <1%
Brazil | 5 | <1%
Denmark | 4 | <1%
France | 3 | <1%
Spain | 3 | <1%
Belgium | 3 | <1%
Chile | 2 | <1%
Sweden | 2 | <1%
Other | 21 | 3%
Unknown | 616 | 89%

Demographic breakdown

Readers by professional status | Count | As %
Researcher | 184 | 27%
Student > Ph. D. Student | 173 | 25%
Student > Master | 83 | 12%
Student > Bachelor | 44 | 6%
Student > Doctoral Student | 36 | 5%
Other | 116 | 17%
Unknown | 56 | 8%

Readers by discipline | Count | As %
Agricultural and Biological Sciences | 369 | 53%
Biochemistry, Genetics and Molecular Biology | 74 | 11%
Environmental Science | 45 | 7%
Immunology and Microbiology | 30 | 4%
Medicine and Dentistry | 21 | 3%
Other | 78 | 11%
Unknown | 75 | 11%

Attention Score in Context

This research output has an Altmetric Attention Score of 4. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 27 February 2019.

  • All research outputs: #7,778,730 of 25,374,917 outputs
  • Outputs from PLoS Computational Biology: #5,160 of 8,960 outputs
  • Outputs of similar age: #35,786 of 104,614 outputs
  • Outputs of similar age from PLoS Computational Biology: #31 of 61 outputs

Altmetric has tracked 25,374,917 research outputs across all sources so far. This one has received more attention than most of these and is in the 69th percentile.
So far Altmetric has tracked 8,960 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 20.4. This one is in the 41st percentile – i.e., 41% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 104,614 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 65% of its contemporaries.
We're also able to compare this research output to 61 others from the same source and published within six weeks on either side of this one. This one is in the 49th percentile – i.e., 49% of its contemporaries scored the same or lower than it.