Heuristic algorithms for feature selection under Bayesian models with block-diagonal covariance structure

Overview of attention for article published in BMC Bioinformatics, March 2018

Mentioned by
2 X users

Citations
11 Dimensions

Readers on
10 Mendeley
Title
Heuristic algorithms for feature selection under Bayesian models with block-diagonal covariance structure
Published in
BMC Bioinformatics, March 2018
DOI 10.1186/s12859-018-2059-8
Pubmed ID
Authors

Ali Foroughi pour, Lori A. Dalton

Abstract

Many bioinformatics studies aim to identify markers, or features, that can be used to discriminate between distinct groups. In problems where strong individual markers are not available, or where interactions between gene products are of primary interest, it may be necessary to consider combinations of features as a marker family. To this end, recent work proposes a hierarchical Bayesian framework for feature selection that places a prior on the set of features we wish to select and on the label-conditioned feature distribution. While an analytical posterior under Gaussian models with block covariance structures is available, the optimal feature selection algorithm for this model remains intractable since it requires evaluating the posterior over the space of all possible covariance block structures and feature-block assignments. To address this computational barrier, in prior work we proposed a simple suboptimal algorithm, 2MNC-Robust, with robust performance across the space of block structures.

Here, we present three new heuristic feature selection algorithms. The proposed algorithms outperform 2MNC-Robust and many other popular feature selection algorithms on synthetic data. In addition, enrichment analysis on real breast cancer, colon cancer, and Leukemia data indicates they also output many of the genes and pathways linked to the cancers under study.

Bayesian feature selection is a promising framework for small-sample high-dimensional data, in particular biomarker discovery applications. When applied to cancer data these algorithms outputted many genes already shown to be involved in cancer as well as potentially new biomarkers. Furthermore, one of the proposed algorithms, SPM, outputs blocks of heavily correlated genes, particularly useful for studying gene interactions and gene networks.

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 10 Mendeley readers of this research output.

Geographical breakdown

Country   Count   As %
Unknown   10      100%

Demographic breakdown

Readers by professional status                 Count   As %
Other                                          1       10%
Lecturer                                       1       10%
Student > Bachelor                             1       10%
Student > Ph. D. Student                       1       10%
Student > Master                               1       10%
Other                                          2       20%
Unknown                                        3       30%

Readers by discipline                          Count   As %
Biochemistry, Genetics and Molecular Biology   3       30%
Mathematics                                    1       10%
Agricultural and Biological Sciences           1       10%
Sports and Recreations                         1       10%
Social Sciences                                1       10%
Other                                          0       0%
Unknown                                        3       30%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 29 March 2018.

All research outputs: #17,934,709 of 23,028,364 outputs
Outputs from BMC Bioinformatics: #5,970 of 7,316 outputs
Outputs of similar age: #241,522 of 332,402 outputs
Outputs of similar age from BMC Bioinformatics: #79 of 112 outputs

Altmetric has tracked 23,028,364 research outputs across all sources so far. This one is in the 19th percentile – i.e., 19% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,316 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 13th percentile – i.e., 13% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 332,402 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 22nd percentile – i.e., 22% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 112 others from the same source and published within six weeks on either side of this one. This one is in the 23rd percentile – i.e., 23% of its contemporaries scored the same or lower than it.
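
The percentile figures above are described as the share of other outputs that "scored the same or lower". As a rough illustration of that definition only, the Python sketch below counts such outputs in a small list of scores. The function name, the made-up scores, and the choice to round down are all assumptions for illustration; this is not Altmetric's actual calculation, and it is not expected to reproduce the percentiles quoted on this page.

```python
def percentile_same_or_lower(score, other_scores):
    """Percentage of other outputs whose score is the same or lower.

    Hypothetical helper: rounding down to a whole percent is an
    assumption, not a documented Altmetric rule.
    """
    if not other_scores:
        raise ValueError("need at least one other output to compare against")
    # Count the outputs that scored the same as or lower than this one.
    same_or_lower = sum(1 for s in other_scores if s <= score)
    # Integer division rounds the percentage down.
    return 100 * same_or_lower // len(other_scores)


# Made-up Attention Scores for five other outputs (not real data).
others = [0, 0, 1, 5, 9]
print(percentile_same_or_lower(1, others))
```

In this toy comparison, three of the five other outputs scored 1 or lower, so the output with a score of 1 sits in the 60th percentile of that group.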