Extracting DNA words based on the sequence features: non-uniform distribution and integrity

Overview of attention for article published in Theoretical Biology and Medical Modelling, January 2016
About this Attention Score

  • Average Attention Score compared to outputs of the same age
  • Average Attention Score compared to outputs of the same age and source

Mentioned by

2 X users

Citations

4 Dimensions

Readers on

10 Mendeley
Title
Extracting DNA words based on the sequence features: non-uniform distribution and integrity
Published in
Theoretical Biology and Medical Modelling, January 2016
DOI 10.1186/s12976-016-0028-3
Authors

Zhi Li, Hongyan Cao, Yuehua Cui, Yanbo Zhang

Abstract

A DNA sequence can be viewed as an unknown language with words as its functional units. Given that most sequence alignment algorithms, such as motif discovery algorithms, depend on the quality of background information about the sequences, it is necessary to develop an ab initio algorithm for extracting the "words" based only on the DNA sequences. We considered non-uniform distribution and integrity to be two important features of a word, and on this basis we developed an ab initio algorithm to extract "DNA words" that have potential functional meaning. A Kolmogorov-Smirnov test was used to test whether the distribution in DNA sequences was consistent with a uniform distribution, and integrity was judged by sequence and position alignment. Two random base sequences were adopted as negative controls, and an English book was used as a positive control to verify our algorithm. We applied our algorithm to the genomes of Saccharomyces cerevisiae and 10 strains of Escherichia coli to show the utility of the method. The results provide strong evidence that the algorithm is a promising tool for ab initio construction of a DNA dictionary. Our method provides a fast way to screen for important DNA elements on a large scale and offers potential insights into the understanding of a genome.
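
The abstract names a Kolmogorov-Smirnov test for non-uniform distribution but gives no implementation details. As a rough illustration only, the sketch below assumes the test is applied to the start positions of a candidate word along a sequence, scaled to the interval [0, 1], and compared against a uniform distribution. The function names (`word_positions`, `ks_uniform_statistic`, `ks_asymptotic_pvalue`) are hypothetical, the p-value uses the standard asymptotic Kolmogorov series rather than anything stated by the authors, and the integrity criterion is not covered.

```python
import math


def word_positions(sequence, word):
    """Return the start index of every (possibly overlapping) occurrence of word."""
    positions = []
    start = sequence.find(word)
    while start != -1:
        positions.append(start)
        start = sequence.find(word, start + 1)
    return positions


def ks_uniform_statistic(positions, sequence_length):
    """One-sample K-S statistic D of scaled positions against Uniform(0, 1).

    Assumption: positions are scaled by the sequence length; the paper may
    define the tested quantity differently.
    """
    n = len(positions)
    if n == 0:
        raise ValueError("need at least one occurrence")
    scaled = sorted(p / sequence_length for p in positions)
    # Largest gap between the empirical CDF and the uniform CDF, on both sides.
    d_plus = max((i + 1) / n - x for i, x in enumerate(scaled))
    d_minus = max(x - i / n for i, x in enumerate(scaled))
    return max(d_plus, d_minus)


def ks_asymptotic_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value; only a rough approximation for small n."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is essentially 1.
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return max(0.0, min(1.0, 2.0 * total))


def nonuniformity_pvalue(sequence, word):
    """Convenience wrapper: small values suggest the word is not spread uniformly."""
    positions = word_positions(sequence, word)
    d = ks_uniform_statistic(positions, len(sequence))
    return ks_asymptotic_pvalue(d, len(positions))
```

Under these assumptions, a word whose occurrences cluster in one region yields a large statistic and a small p-value, while a word spread evenly along the sequence does not. With only a handful of occurrences the asymptotic p-value is unreliable, so an exact or simulation-based p-value would be a better choice in practice.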

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 10 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Unknown 10 100%

Demographic breakdown

Readers by professional status Count As %
Student > Doctoral Student 2 20%
Lecturer 1 10%
Student > Bachelor 1 10%
Professor 1 10%
Student > Ph. D. Student 1 10%
Other 1 10%
Unknown 3 30%

Readers by discipline Count As %
Medicine and Dentistry 2 20%
Agricultural and Biological Sciences 2 20%
Biochemistry, Genetics and Molecular Biology 1 10%
Computer Science 1 10%
Engineering 1 10%
Other 0 0%
Unknown 3 30%
Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 27 January 2016.
All research outputs
#14,246,461
of 22,842,950 outputs
Outputs from Theoretical Biology and Medical Modelling
#155
of 287 outputs
Outputs of similar age
#207,895
of 396,496 outputs
Outputs of similar age from Theoretical Biology and Medical Modelling
#3
of 7 outputs
Altmetric has tracked 22,842,950 research outputs across all sources so far. This one is in the 35th percentile – i.e., 35% of other outputs scored the same or lower than it.
So far Altmetric has tracked 287 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 7.4. This one is in the 43rd percentile – i.e., 43% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 396,496 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 44th percentile – i.e., 44% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 7 others from the same source and published within six weeks on either side of this one. This one has scored higher than 4 of them.