Clusterflock: a flocking algorithm for isolating congruent phylogenomic datasets

Overview of attention for article published in Giga Science, October 2016

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (91st percentile)
  • Above-average Attention Score compared to outputs of the same age and source (53rd percentile)

Mentioned by

  • 1 blog
  • 24 X users
  • 1 peer review site
  • 1 Facebook page
  • 1 Google+ user
  • 1 YouTube creator

Citations

  • 6 Dimensions

Readers on

  • 33 Mendeley
Published in: Giga Science, October 2016
DOI: 10.1186/s13742-016-0152-3
Authors: Apurva Narechania, Richard Baker, Rob DeSalle, Barun Mathema, Sergios-Orestis Kolokotronis, Barry Kreiswirth, Paul J. Planet

Abstract

Collective animal behavior, such as the flocking of birds or the shoaling of fish, has inspired a class of algorithms designed to optimize distance-based clusters in various applications, including document analysis and DNA microarrays. In a flocking model, individual agents respond only to their immediate environment and move according to a few simple rules. After several iterations the agents self-organize, and clusters emerge without the need for partitional seeds. In addition to its unsupervised nature, flocking offers several computational advantages, including the potential to reduce the number of required comparisons. In the tool presented here, Clusterflock, we have implemented a flocking algorithm designed to locate groups (flocks) of orthologous gene families (OGFs) that share an evolutionary history. Pairwise distances that measure phylogenetic incongruence between OGFs guide flock formation. We tested this approach on several simulated datasets by varying the number of underlying topologies, the proportion of missing data, and evolutionary rates, and show that in datasets containing high levels of missing data and rate heterogeneity, Clusterflock outperforms other well-established clustering techniques. We also verified its utility on a known, large-scale recombination event in Staphylococcus aureus. By isolating sets of OGFs with divergent phylogenetic signals, we were able to pinpoint the recombined region without forcing a pre-determined number of groupings or defining a pre-determined incongruence threshold. Clusterflock is an open-source tool that can be used to discover horizontally transferred genes, recombined areas of chromosomes, and the phylogenetic 'core' of a genome. Although we used it here in an evolutionary context, it is generalizable to any clustering problem. Users can write extensions to calculate any distance metric on the unit interval, and can use these distances to 'flock' any type of data.
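
The flocking idea described in the abstract can be illustrated with a toy model. The sketch below is not Clusterflock's implementation: the function name `flock`, the parameters `radius` and `speed`, the 2-D unit square, and the attraction weight `1 - 2 * d` are all assumptions chosen for brevity. That weight makes 0.5 a neutral distance, whereas the abstract states that Clusterflock needs no pre-determined incongruence threshold. The sketch only shows how agents that react to nearby neighbours, guided by pairwise distances on the unit interval, can drift into groups without seeds or a preset number of clusters.

```python
import math
import random


def flock(dist, steps=200, radius=0.5, speed=0.02, seed=0):
    """Toy flocking on the unit square (illustrative assumption only).

    dist[i][j] is a pairwise distance in [0, 1] between items i and j.
    Each agent sees only neighbours within `radius`, is pulled toward
    neighbours with small distances and pushed away from those with
    large distances, then moves a fixed `speed` in the net direction.
    """
    rng = random.Random(seed)
    n = len(dist)
    # Random start positions: no seeds and no preset number of clusters.
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    for _ in range(steps):
        new_pos = []
        for i, (xi, yi) in enumerate(pos):
            fx = fy = 0.0
            for j, (xj, yj) in enumerate(pos):
                if i == j:
                    continue
                dx, dy = xj - xi, yj - yi
                gap = math.hypot(dx, dy)
                if gap == 0.0 or gap > radius:
                    continue  # coincident or outside the local neighbourhood
                # Hypothetical weight: +1 attracts (d = 0), -1 repels (d = 1).
                w = 1.0 - 2.0 * dist[i][j]
                fx += w * dx / gap
                fy += w * dy / gap
            norm = math.hypot(fx, fy)
            if norm > 0.0:
                xi += speed * fx / norm
                yi += speed * fy / norm
            # Keep agents inside the unit square.
            new_pos.append((min(1.0, max(0.0, xi)), min(1.0, max(0.0, yi))))
        pos = new_pos  # synchronous update of all agents
    return pos


if __name__ == "__main__":
    # Two made-up groups: small distances within, large distances between.
    groups = [0, 0, 0, 1, 1, 1]
    d = [[0.0 if a == b else (0.1 if groups[a] == groups[b] else 0.9)
          for b in range(6)] for a in range(6)]
    for g, p in zip(groups, flock(d)):
        print(g, round(p[0], 2), round(p[1], 2))
```

Because each agent only looks within `radius`, items from the same group can settle into more than one flock in this toy version; final positions would still need a separate step to be read off as clusters.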

X Demographics

Demographic data were collected from the profiles of 24 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 33 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States        1     3%
Czechia              1     3%
Unknown             31    94%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                         12    36%
Student > Bachelor                                  5    15%
Student > Ph. D. Student                            4    12%
Student > Master                                    3     9%
Student > Doctoral Student                          2     6%
Other                                               6    18%
Unknown                                             1     3%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences               14    42%
Biochemistry, Genetics and Molecular Biology        7    21%
Computer Science                                    4    12%
Nursing and Health Professions                      1     3%
Psychology                                          1     3%
Other                                               4    12%
Unknown                                             2     6%

Attention Score in Context

This research output has an Altmetric Attention Score of 25. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 26 February 2020.
  • All research outputs: #1,508,471 of 25,368,786 outputs
  • Outputs from Giga Science: #256 of 1,167 outputs
  • Outputs of similar age: #27,049 of 320,930 outputs
  • Outputs of similar age from Giga Science: #6 of 13 outputs
Altmetric has tracked 25,368,786 research outputs across all sources so far. Compared to these, this one has done particularly well: it is in the 94th percentile, which places it in the top 10% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 1,167 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 21.8. This one has done well, scoring higher than 78% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 320,930 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 91% of its contemporaries.
We're also able to compare this research output to 13 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 53% of its contemporaries.
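
The percentiles quoted above are consistent with a simple calculation from each rank and total. The sketch below assumes the percentile is the share of outputs ranked below this one, rounded down. Altmetric's exact formula is not given on this page, and the function name `percentile_from_rank` is hypothetical.

```python
def percentile_from_rank(rank, total):
    """Percentage of outputs ranked below `rank`, rounded down (assumed formula)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    # Integer arithmetic avoids floating-point rounding surprises.
    return (total - rank) * 100 // total


if __name__ == "__main__":
    # Ranks and totals as reported on this page.
    contexts = {
        "All research outputs": (1_508_471, 25_368_786),
        "Outputs from Giga Science": (256, 1_167),
        "Outputs of similar age": (27_049, 320_930),
        "Outputs of similar age from Giga Science": (6, 13),
    }
    for label, (rank, total) in contexts.items():
        print(f"{label}: {percentile_from_rank(rank, total)}")
```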