Measuring the reproducibility and quality of Hi-C data

Overview of attention for article published in Genome Biology, March 2019

About this Attention Score

  • In the top 5% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (93rd percentile)
  • Above-average Attention Score compared to outputs of the same age and source (62nd percentile)

Mentioned by

  • 82 X users
  • 1 Facebook page

Citations

  • 122 Dimensions

Readers on

  • 240 Mendeley
Title
Measuring the reproducibility and quality of Hi-C data
Published in
Genome Biology, March 2019
DOI 10.1186/s13059-019-1658-7
Pubmed ID
Authors

Galip Gürkan Yardımcı, Hakan Ozadam, Michael E. G. Sauria, Oana Ursu, Koon-Kiu Yan, Tao Yang, Abhijit Chakraborty, Arya Kaul, Bryan R. Lajoie, Fan Song, Ye Zhan, Ferhat Ay, Mark Gerstein, Anshul Kundaje, Qunhua Li, James Taylor, Feng Yue, Job Dekker, William S. Noble

Abstract

Hi-C is currently the most widely used assay to investigate the 3D organization of the genome and to study its role in gene regulation, DNA replication, and disease. However, Hi-C experiments are costly to perform and involve multiple complex experimental steps; thus, accurate methods for measuring the quality and reproducibility of Hi-C data are essential to determine whether the output should be used further in a study. Using real and simulated data, we profile the performance of several recently proposed methods for assessing reproducibility of population Hi-C data, including HiCRep, GenomeDISCO, HiC-Spector, and QuASAR-Rep. By explicitly controlling noise and sparsity through simulations, we demonstrate the deficiencies of performing simple correlation analysis on pairs of matrices, and we show that methods developed specifically for Hi-C data produce better measures of reproducibility. We also show how to use established measures, such as the ratio of intra- to interchromosomal interactions, and novel ones, such as QuASAR-QC, to identify low-quality experiments. In this work, we assess reproducibility and quality measures by varying sequencing depth, resolution and noise levels in Hi-C data from 13 cell lines, with two biological replicates each, as well as 176 simulated matrices. Through this extensive validation and benchmarking of Hi-C data, we describe best practices for reproducibility and quality assessment of Hi-C experiments. We make all software publicly available at http://github.com/kundajelab/3DChromatin_ReplicateQC to facilitate adoption in the community.
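
One of the established quality measures named in the abstract, the ratio of intra- to interchromosomal interactions, can be illustrated with a minimal sketch. The function name `cis_trans_ratio`, the `(chrom_a, chrom_b, count)` input layout, and the toy counts are assumptions made for illustration; this is not the authors' implementation, which is available at the repository linked above.

```python
from collections import Counter


def cis_trans_ratio(contacts):
    """Return intrachromosomal / interchromosomal contact counts.

    `contacts` is an iterable of (chrom_a, chrom_b, count) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    totals = Counter()
    for chrom_a, chrom_b, count in contacts:
        # A contact is intrachromosomal (cis) when both ends fall on
        # the same chromosome, otherwise interchromosomal (trans).
        key = "cis" if chrom_a == chrom_b else "trans"
        totals[key] += count
    if totals["trans"] == 0:
        return float("inf")  # no interchromosomal contacts observed
    return totals["cis"] / totals["trans"]


# Toy data, invented for illustration only.
toy_contacts = [
    ("chr1", "chr1", 600),
    ("chr2", "chr2", 300),
    ("chr1", "chr2", 100),
]
ratio = cis_trans_ratio(toy_contacts)
print(ratio)
```

A ratio like this is usually compared across experiments rather than read against a fixed cutoff; any threshold would be an additional assumption not given in the abstract.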

X Demographics

Demographic data were collected from the profiles of 82 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 240 Mendeley readers of this research output.

Geographical breakdown

Country   Count   As %
Unknown   240     100%

Demographic breakdown

Readers by professional status                 Count   As %
Student > Ph. D. Student                       61      25%
Researcher                                     49      20%
Student > Master                               20      8%
Student > Bachelor                             17      7%
Student > Doctoral Student                     9       4%
Other                                          28      12%
Unknown                                        56      23%

Readers by discipline                          Count   As %
Biochemistry, Genetics and Molecular Biology   94      39%
Agricultural and Biological Sciences           46      19%
Computer Science                               14      6%
Physics and Astronomy                          5       2%
Medicine and Dentistry                         4       2%
Other                                          13      5%
Unknown                                        64      27%
Attention Score in Context

This research output has an Altmetric Attention Score of 42. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 19 July 2019.
All research outputs: #976,162 of 25,385,509 outputs
Outputs from Genome Biology: #689 of 4,468 outputs
Outputs of similar age: #22,768 of 364,391 outputs
Outputs of similar age from Genome Biology: #23 of 61 outputs
Altmetric has tracked 25,385,509 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 96th percentile: it's in the top 5% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 4,468 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 27.6. This one has done well, scoring higher than 84% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 364,391 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 93% of its contemporaries.
We're also able to compare this research output to 61 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 62% of its contemporaries.
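
The percentile figures quoted above are consistent with the listed ranks under one simple reading: the share of outputs ranked below this one, rounded down to a whole percent. The sketch below checks that arithmetic. The formula and the name `percent_outscored` are assumptions for illustration; the page does not state how Altmetric actually computes its percentiles.

```python
def percent_outscored(rank, total):
    """Whole-percent share of outputs ranked below `rank` (assumed formula).

    Integer arithmetic is used so the result is rounded down exactly.
    """
    return (total - rank) * 100 // total


# Ranks and totals as listed on this page.
rankings = {
    "All research outputs": (976_162, 25_385_509),
    "Outputs from Genome Biology": (689, 4_468),
    "Outputs of similar age": (22_768, 364_391),
    "Outputs of similar age from Genome Biology": (23, 61),
}

percentiles = {
    context: percent_outscored(rank, total)
    for context, (rank, total) in rankings.items()
}

for context, pct in percentiles.items():
    print(f"{context}: higher than {pct}% of outputs")
```

Under this assumption the four contexts give 96, 84, 93, and 62, matching the percentages stated in the text.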