
Graph mining for next generation sequencing: leveraging the assembly graph for biological insights

Overview of attention for article published in BMC Genomics, May 2016

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (90th percentile)
  • High Attention Score compared to outputs of the same age and source (95th percentile)

Mentioned by

  • 1 blog
  • 27 X users

Citations

  • 4 Dimensions

Readers on

  • 83 Mendeley
Title
Graph mining for next generation sequencing: leveraging the assembly graph for biological insights
Published in
BMC Genomics, May 2016
DOI 10.1186/s12864-016-2678-2
Pubmed ID
Authors

Julia Warnke-Sommer, Hesham Ali

Abstract

The assembly of Next Generation Sequencing (NGS) reads remains a challenging task. This is especially true for the assembly of metagenomics data that originate from environmental samples potentially containing hundreds to thousands of unique species. The principal objective of current assembly tools is to assemble NGS reads into contiguous stretches of sequence called contigs while maximizing both accuracy and contig length. The end goal of this process is to produce longer contigs, with the major focus being on assembly only. Sequence read assembly is an aggregative process, during which read overlap relationship information is lost as reads are merged into longer sequences or contigs. The assembly graph is information rich and capable of capturing the genomic architecture of an input read data set. We have developed a novel hybrid graph in which nodes represent sequence regions at different levels of granularity. This model, utilized in the assembly and analysis pipeline Focus, presents a concise yet feature-rich view of a given input data set, allowing for the extraction of biologically relevant graph structures for graph mining purposes. Focus was used to create hybrid graphs to model metagenomics data sets obtained from the gut microbiomes of five individuals with Crohn's disease and eight healthy individuals. Repetitive and mobile genetic elements are found to be associated with hybrid graph structure. Using graph mining techniques, a comparative study of the Crohn's disease and healthy data sets was conducted with a focus on antibiotic resistance genes associated with transposase genes. Results demonstrated significant differences in the phylogenetic distribution of categories of antibiotic resistance genes in the healthy and diseased patients. Focus was also evaluated as a pure assembly tool and produced excellent results when compared against the MetaVelvet, Omega, and IDBA-UD assemblers.
Mining the hybrid graph can reveal biological phenomena captured by its structure. We demonstrate the advantages of considering assembly graphs as data-mining support in addition to their role as frameworks for assembly.

X Demographics


The data shown below were collected from the profiles of 27 X users who shared this research output.
Mendeley readers


The data shown below were compiled from readership statistics for 83 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Brazil 2 2%
Norway 1 1%
Sweden 1 1%
Czechia 1 1%
United States 1 1%
Unknown 77 93%

Demographic breakdown

Readers by professional status Count As %
Researcher 20 24%
Student > Bachelor 11 13%
Student > Ph. D. Student 8 10%
Student > Master 8 10%
Student > Doctoral Student 7 8%
Other 12 14%
Unknown 17 20%

Readers by discipline Count As %
Biochemistry, Genetics and Molecular Biology 16 19%
Agricultural and Biological Sciences 16 19%
Computer Science 15 18%
Medicine and Dentistry 4 5%
Engineering 4 5%
Other 9 11%
Unknown 19 23%
Attention Score in Context


This research output has an Altmetric Attention Score of 21. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 02 September 2017.
  • All research outputs: #1,737,022 of 24,885,505 outputs
  • Outputs from BMC Genomics: #369 of 11,098 outputs
  • Outputs of similar age: #28,201 of 304,582 outputs
  • Outputs of similar age from BMC Genomics: #9 of 196 outputs
Altmetric has tracked 24,885,505 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 93rd percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 11,098 research outputs from this source. They receive a mean Attention Score of 4.8. This one has done particularly well, scoring higher than 96% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 304,582 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 90% of its contemporaries.
We're also able to compare this research output to 196 others from the same source and published within six weeks on either side of this one. This one has done particularly well, scoring higher than 95% of its contemporaries.
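
The percentile figures quoted above can be approximately reproduced from the rank and total counts. The following is a minimal sketch, assuming the percentile is the share of tracked outputs ranked below this one, rounded down to a whole percent; Altmetric's exact formula (for example, how ties are handled) is not stated on this page, and the function name `percentile_from_rank` is hypothetical.

```python
import math


def percentile_from_rank(rank: int, total: int) -> int:
    """Share of outputs ranked below `rank`, as a whole percent (rounded down).

    Assumption: this is an illustrative formula, not Altmetric's documented one.
    """
    return math.floor(100 * (total - rank) / total)


# Rank and total pairs taken from the figures listed on this page.
ranks = {
    "All research outputs": (1_737_022, 24_885_505),
    "Outputs from BMC Genomics": (369, 11_098),
    "Outputs of similar age": (28_201, 304_582),
    "Outputs of similar age from BMC Genomics": (9, 196),
}

percentiles = {
    label: percentile_from_rank(rank, total)
    for label, (rank, total) in ranks.items()
}

for label, pct in percentiles.items():
    print(f"{label}: scores higher than about {pct}% of outputs")
```

Under this assumption the four comparisons come out at 93%, 96%, 90% and 95%, which agrees with the percentages stated in the text.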