Redundancy Control in Pathway Databases (ReCiPa): An Application for Improving Gene-Set Enrichment Analysis in Omics Studies and “Big Data” Biology

Overview of attention for article published in OMICS: A Journal of Integrative Biology, June 2013

About this Attention Score

  • Average Attention Score compared to outputs of the same age and source

Mentioned by

2 X users

Citations

40 Dimensions

Readers on

72 Mendeley
1 CiteULike
Published in
OMICS: A Journal of Integrative Biology, June 2013
DOI 10.1089/omi.2012.0083
Pubmed ID
Authors

Juan C. Vivar, Priscilla Pemu, Ruth McPherson, Sujoy Ghosh

Abstract

Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association to obesity compared to pathways identified from the original databases.
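
The abstract does not say how ReCiPa measures overlap or how it resolves it, so the following is only a minimal Python sketch of the general idea of threshold-based redundancy control, not the authors' implementation. It assumes an overlap score equal to the shared genes divided by the size of the smaller gene set, and it assumes that redundant pathways are merged greedily. The function names, the merged-name convention, and the toy gene sets are all hypothetical.

```python
from itertools import combinations


def overlap(a, b):
    """Fraction of the smaller set's genes that also appear in the other set.

    Assumption: this overlap measure is chosen for illustration only.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def control_redundancy(pathways, threshold):
    """Greedily merge the most-overlapping pair until no pair exceeds threshold."""
    merged = {name: set(genes) for name, genes in pathways.items()}
    while True:
        best_pair, best_score = None, threshold
        # Scan every pair of current gene sets for the highest overlap.
        for x, y in combinations(sorted(merged), 2):
            score = overlap(merged[x], merged[y])
            if score > best_score:
                best_pair, best_score = (x, y), score
        if best_pair is None:
            return merged
        # Replace the two redundant sets with their union under a joint name.
        x, y = best_pair
        merged[f"{x}+{y}"] = merged.pop(x) | merged.pop(y)


# Toy gene sets (hypothetical, not taken from KEGG or Reactome).
toy = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g2", "g3", "g4", "g5"},
    "pathway_C": {"g8", "g9"},
}
result = control_redundancy(toy, threshold=0.7)
print(sorted(result))
```

With a threshold of 0.7, the two toy pathways that share three of four genes are combined into one gene set, while the unrelated pathway is left alone. Raising the threshold above 0.75 would keep all three separate, which is the kind of user-defined trade-off the abstract describes.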

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 72 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Brazil 3 4%
United States 2 3%
Canada 1 1%
Hungary 1 1%
Unknown 65 90%

Demographic breakdown

Readers by professional status Count As %
Researcher 18 25%
Student > Ph. D. Student 16 22%
Student > Master 8 11%
Student > Doctoral Student 4 6%
Student > Bachelor 4 6%
Other 14 19%
Unknown 8 11%

Readers by discipline Count As %
Agricultural and Biological Sciences 20 28%
Computer Science 13 18%
Medicine and Dentistry 8 11%
Biochemistry, Genetics and Molecular Biology 7 10%
Mathematics 5 7%
Other 9 13%
Unknown 10 14%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 19 December 2016.
All research outputs
#17,286,645 of 25,377,790 outputs
Outputs from OMICS: A Journal of Integrative Biology
#436 of 705 outputs
Outputs of similar age
#133,242 of 210,218 outputs
Outputs of similar age from OMICS: A Journal of Integrative Biology
#5 of 8 outputs
Altmetric has tracked 25,377,790 research outputs across all sources so far. This one is in the 21st percentile – i.e., 21% of other outputs scored the same or lower than it.
So far Altmetric has tracked 705 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 6.8. This one is in the 27th percentile – i.e., 27% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 210,218 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 27th percentile – i.e., 27% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 8 others from the same source and published within six weeks on either side of this one. This one has scored higher than 3 of them.
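
The percentile statements above follow the pattern "N% of other outputs scored the same or lower". As a rough illustration of that definition only, here is a small Python sketch. It is not Altmetric's actual calculation, which is not described on this page; the function name and the peer scores are hypothetical.

```python
def percentile_rank(score, peer_scores):
    """Percent of peer scores that are the same as or lower than `score`."""
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    same_or_lower = sum(1 for s in peer_scores if s <= score)
    return 100.0 * same_or_lower / len(peer_scores)


# Hypothetical Attention Scores for eight peer outputs.
peers = [0, 1, 1, 3, 5, 7, 12, 40]
pct = percentile_rank(1, peers)
print(pct)
```

Because ties count as "the same or lower", an output with a low score can still land well above the bottom percentile when many peers share that score.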