
TAPDANCE: An automated tool to identify and annotate transposon insertion CISs and associations between CISs from next generation sequence data

Overview of attention for article published in BMC Bioinformatics, June 2012

Mentioned by

twitter
1 tweeter

Citations

dimensions_citation
40 Dimensions

Readers on

mendeley
66 Mendeley
citeulike
4 CiteULike
DOI 10.1186/1471-2105-13-154
Pubmed ID
Authors

Aaron L Sarver, Jesse Erdman, Tim Starr, David A Largaespada, Kevin A T Silverstein

Abstract

Next generation sequencing approaches applied to the analyses of transposon insertion junction fragments generated in high-throughput forward genetic screens have created the need for clear informatics and statistical approaches to deal with the massive amount of data currently being generated. Previous approaches used to 1) map junction fragments within the genome and 2) identify Common Insertion Sites (CISs) within the genome are not practical due to the volume of data generated by current sequencing technologies. Previous approaches applied to this problem also required significant manual annotation. We describe Transposon Annotation Poisson Distribution Association Network Connectivity Environment (TAPDANCE) software, which automates the identification of CISs within transposon junction fragment insertion data. Starting with barcoded sequence data, the software identifies and trims sequences and maps putative genomic sequence to a reference genome using the bowtie short read mapper. Poisson distribution statistics are then applied to assess and rank genomic regions showing significant enrichment for transposon insertion. Novel methods of counting insertions are used to ensure that the results presented have the expected characteristics of informative CISs. A persistent MySQL database is generated and used to keep track of sequences, mappings and common insertion sites. Associations between phenotypes and CISs are also identified using Fisher's exact test with multiple testing correction. In a case study using previously published data we show that the TAPDANCE software identifies CISs as previously described, prioritizes them based on p-value, allows holistic visualization of the data within genome browser software and identifies relationships present in the structure of the data.
The TAPDANCE process is fully automated, performs similarly to previous labor intensive approaches, provides consistent results at a wide range of sequence sampling depth, has the capability of handling extremely large datasets, enables meaningful comparison across datasets and enables large scale meta-analyses of junction fragment data. The TAPDANCE software will greatly enhance our ability to analyze these datasets in order to increase our understanding of the genetic basis of cancers.
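The Poisson enrichment step mentioned in the abstract can be illustrated with a minimal sketch. This is not the TAPDANCE implementation: the fixed non-overlapping windows, the uniform background rate, the Bonferroni correction, and the names `poisson_upper_tail` and `find_cis_windows` are all assumptions made here for illustration, and the "novel methods of counting insertions" the authors describe are not reproduced.

```python
import math
from collections import Counter


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam) by summing the upper tail.

    Assumes a modest lam (expected insertions per window); for very
    large lam the first term can underflow and a library routine would
    be a better choice.
    """
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    i = k
    # First tail term, computed in log space to avoid overflow in k!.
    term = math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
    total = 0.0
    while True:
        total += term
        i += 1
        term *= lam / i
        # Stop once past the mode and the next term is negligible.
        if i > lam and term <= 1e-16 * total:
            break
    return min(total, 1.0)


def find_cis_windows(positions, genome_length, window, alpha=0.05):
    """Flag fixed windows with more insertions than a uniform Poisson
    background predicts (hypothetical simplification of CIS calling).

    Returns (start, end, count, p_value) tuples sorted by p-value.
    """
    n_windows = math.ceil(genome_length / window)
    lam = len(positions) / n_windows  # expected insertions per window
    counts = Counter(pos // window for pos in positions)
    hits = []
    for idx, count in counts.items():
        p = poisson_upper_tail(count, lam)
        # Bonferroni correction across all windows (an assumption here).
        if p * n_windows <= alpha:
            hits.append((idx * window, (idx + 1) * window, count, p))
    return sorted(hits, key=lambda h: h[3])


if __name__ == "__main__":
    # Toy data: ten insertions clustered in one window, nine scattered.
    cluster = [100 + 10 * i for i in range(10)]
    scattered = [w * 1000 + 500 for w in range(100, 1000, 100)]
    for start, end, count, p in find_cis_windows(
        cluster + scattered, genome_length=1_000_000, window=1000
    ):
        print(f"{start}-{end}: {count} insertions, p={p:.3g}")
```

Ranking the surviving windows by p-value mirrors the prioritization the abstract describes; a real analysis would also need to handle per-chromosome coordinates and insertion-site biases, which this sketch ignores.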

Twitter Demographics

The data shown below were collected from the profile of 1 tweeter who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 66 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States         4     6%
Brazil                3     5%
Sweden                1     2%
United Kingdom        1     2%
Unknown              57    86%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                         26    39%
Student > Ph. D. Student                           13    20%
Other                                               6     9%
Student > Master                                    6     9%
Student > Bachelor                                  4     6%
Other                                               7    11%
Unknown                                             4     6%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences               29    44%
Biochemistry, Genetics and Molecular Biology       11    17%
Computer Science                                    8    12%
Medicine and Dentistry                              7    11%
Immunology and Microbiology                         2     3%
Other                                               5     8%
Unknown                                             4     6%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 02 July 2012.
All research outputs: #3,151,273 of 4,505,482 outputs
Outputs from BMC Bioinformatics: #2,143 of 2,646 outputs
Outputs of similar age: #49,577 of 75,379 outputs
Outputs of similar age from BMC Bioinformatics: #60 of 81 outputs
Altmetric has tracked 4,505,482 research outputs across all sources so far. This one is in the 22nd percentile – i.e., 22% of other outputs scored the same or lower than it.
So far Altmetric has tracked 2,646 research outputs from this source. They receive a mean Attention Score of 4.2. This one is in the 12th percentile – i.e., 12% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 75,379 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 25th percentile – i.e., 25% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 81 others from the same source and published within six weeks on either side of this one. This one is in the 14th percentile – i.e., 14% of its contemporaries scored the same or lower than it.