Report for: Characterization of background noise in capture-based targeted sequencing data

Title	Characterization of background noise in capture-based targeted sequencing data
Published in	Genome Biology, July 2017
DOI	10.1186/s13059-017-1275-2
Pubmed ID	28732520
Authors	Gahee Park, Joo Kyung Park, Seung-Ho Shin, Hyo-Jeong Jeon, Nayoung K. D. Kim, Yeon Jeong Kim, Hyun-Tae Shin, Eunjin Lee, Kwang Hyuck Lee, Dae-Soon Son, Woong-Yang Park, Donghyun Park
Abstract	Targeted deep sequencing is increasingly used to detect low-allelic fraction variants; it is therefore essential that errors that constitute baseline noise and impose a practical limit on detection are characterized. In the present study, we systematically evaluate the extent to which errors are incurred during specific steps of the capture-based targeted sequencing process. We removed most sequencing artifacts by filtering out low-quality bases and then analyzed the remaining background noise. By recognizing that plasma DNA is naturally fragmented to be of a size comparable to that of mono-nucleosomal DNA, we were able to identify and characterize errors that are specifically associated with acoustic shearing. Two-thirds of C:G > A:T errors and one quarter of C:G > G:C errors were attributed to the oxidation of guanine during acoustic shearing, and this was further validated by comparative experiments conducted under different shearing conditions. The acoustic shearing step also causes A > G and A > T substitutions localized to the end bases of sheared DNA fragments, indicating a probable association of these errors with DNA breakage. Finally, the hybrid selection step contributes to one-third of the remaining C:G > A:T and one-fifth of the C > T errors. The results of this study provide a comprehensive summary of various errors incurred during targeted deep sequencing, and their underlying causes. This information will be invaluable to drive technical improvements in this sequencing method, and may increase the future usage of targeted deep sequencing methods for low-allelic fraction variant detection.

X Demographics

The data shown below were collected from the profiles of 21 X users who shared this research output.

Geographical breakdown

Country	Count	As %
United States	3	14%
United Kingdom	3	14%
Japan	1	5%
Germany	1	5%
Spain	1	5%
India	1	5%
Kenya	1	5%
Unknown	10	48%

Demographic breakdown

Type	Count	As %
Scientists	10	48%
Members of the public	10	48%
Science communicators (journalists, bloggers, editors)	1	5%

Mendeley readers

The data shown below were compiled from readership statistics for 94 Mendeley readers of this research output.

Geographical breakdown

Country	Count	As %
Unknown	94	100%

Demographic breakdown

Readers by professional status	Count	As %
Researcher	22	23%
Student > Ph. D. Student	20	21%
Student > Master	13	14%
Student > Bachelor	6	6%
Other	5	5%
Other	12	13%
Unknown	16	17%

Readers by discipline	Count	As %
Biochemistry, Genetics and Molecular Biology	35	37%
Agricultural and Biological Sciences	25	27%
Medicine and Dentistry	7	7%
Immunology and Microbiology	2	2%
Computer Science	2	2%
Other	7	7%
Unknown	16	17%

Attention Score in Context

This research output has an Altmetric Attention Score of 15. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 23 February 2023.

All research outputs

#2,415,498

of 25,736,439 outputs

Outputs from Genome Biology

#1,948

of 4,509 outputs

Outputs of similar age

#43,673

of 325,594 outputs

Outputs of similar age from Genome Biology

#37

of 60 outputs

Altmetric has tracked 25,736,439 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 90th percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.

So far Altmetric has tracked 4,509 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 27.6. This one has gotten more attention than average, scoring higher than 56% of its peers.

Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 325,594 tracked outputs that were published within six weeks on either side of this one in any source. This one has done well, scoring higher than 86% of its contemporaries.

We're also able to compare this research output to 60 others from the same source and published within six weeks on either side of this one. This one is in the 38th percentile – i.e., 38% of its contemporaries scored the same or lower than it.
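The percentiles quoted in this section can be reproduced from the rank and total shown for each comparison group. The sketch below is an assumption, not Altmetric's documented formula: it takes the percentile to be the share of outputs ranked below this one, rounded down to a whole number. The function name `percentile_from_rank` and the `comparisons` dictionary are hypothetical and exist only for illustration.

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Whole-number share (%) of outputs ranked below `rank`, rounded down.

    Assumed formula: (total - rank) / total. This is not taken from
    Altmetric's documentation; it happens to match the figures in this report.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) * 100 // total


# (rank, total) pairs copied from the report above.
comparisons = {
    "All research outputs": (2_415_498, 25_736_439),
    "Outputs from Genome Biology": (1_948, 4_509),
    "Outputs of similar age": (43_673, 325_594),
    "Outputs of similar age from Genome Biology": (37, 60),
}

for label, (rank, total) in comparisons.items():
    print(f"{label}: {percentile_from_rank(rank, total)}%")
```

Under this assumption the four groups give 90%, 56%, 86% and 38%, which agree with the percentages stated in the text.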
