Core column prediction for protein multiple sequence alignments

Overview of attention for article published in Algorithms for Molecular Biology, April 2017
Mentioned by
1 X user

Citations
1 Dimensions

Readers on
9 Mendeley
Title
Core column prediction for protein multiple sequence alignments
Published in
Algorithms for Molecular Biology, April 2017
DOI 10.1186/s13015-017-0102-3
Authors

Dan DeBlasio, John Kececioglu

Abstract

In a computed protein multiple sequence alignment, the coreness of a column is the fraction of its substitutions that are in so-called core columns of the gold-standard reference alignment of its proteins. In benchmark suites of protein reference alignments, the core columns of the reference alignment are those that can be confidently labeled as correct, usually due to all residues in the column being sufficiently close in the spatial superposition of the known three-dimensional structures of the proteins. Typically the accuracy of a protein multiple sequence alignment that has been computed for a benchmark is only measured with respect to the core columns of the reference alignment. When computing an alignment in practice, however, a reference alignment is not known, so the coreness of its columns can only be predicted. We develop for the first time a predictor of column coreness for protein multiple sequence alignments. This allows us to predict which columns of a computed alignment are core, and hence better estimate the alignment's accuracy. Our approach to predicting coreness is similar to nearest-neighbor classification from machine learning, except we transform nearest-neighbor distances into a coreness prediction via a regression function, and we learn an appropriate distance function through a new optimization formulation that solves a large-scale linear programming problem. We apply our coreness predictor to parameter advising, the task of choosing parameter values for an aligner's scoring function to obtain a more accurate alignment of a specific set of sequences. We show that for this task, our predictor strongly outperforms other column-confidence estimators from the literature, and affords a substantial boost in alignment accuracy.
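
The definition of coreness in the abstract can be made concrete with a small example. The following is a minimal sketch, not the authors' implementation: it assumes that a "substitution" is a pair of residues from two different sequences aligned in the same column, that a substitution counts as core when the same two residues share a core column of the reference alignment, and that a column with no substitutions gets coreness 0. The function names `residue_columns` and `column_coreness`, and the representation of core columns as a list of booleans, are hypothetical choices made for illustration.

```python
from itertools import combinations

GAP = "-"


def residue_columns(alignment):
    """Map each residue, identified as (sequence index, residue index),
    to the column it occupies in the given alignment."""
    position = {}
    for s, row in enumerate(alignment):
        k = 0  # index of the next residue in sequence s
        for c, ch in enumerate(row):
            if ch != GAP:
                position[(s, k)] = c
                k += 1
    return position


def column_coreness(computed, reference, core_flags):
    """Return the coreness of every column of `computed`.

    computed, reference: lists of equal-length gapped strings over the
        same sequences, in the same order.
    core_flags: one boolean per reference column, True for core columns.
    Assumption: columns with fewer than two residues get coreness 0.0.
    """
    ref_col = residue_columns(reference)
    next_residue = [0] * len(computed)
    coreness = []
    for c in range(len(computed[0])):
        # Collect the residues that the computed alignment places in column c.
        residues = []
        for s, row in enumerate(computed):
            if row[c] != GAP:
                residues.append((s, next_residue[s]))
                next_residue[s] += 1
        pairs = list(combinations(residues, 2))
        if not pairs:
            coreness.append(0.0)
            continue
        # A pair is core if the reference aligns it in a single core column.
        core = sum(
            1
            for a, b in pairs
            if ref_col[a] == ref_col[b] and core_flags[ref_col[a]]
        )
        coreness.append(core / len(pairs))
    return coreness


if __name__ == "__main__":
    reference = ["AC-G", "ACTG", "A-TG"]
    core_flags = [True, False, False, True]  # hypothetical core labels
    computed = ["ACG-", "ACTG", "A-TG"]
    print(column_coreness(computed, reference, core_flags))
```

In practice the reference alignment is unavailable, which is why the paper predicts this quantity rather than computing it; the sketch only shows what the predictor is trying to estimate.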

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 9 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Unknown 9 100%

Demographic breakdown

Readers by professional status         Count   As %
Lecturer                               1       11%
Student > Doctoral Student             1       11%
Student > Master                       1       11%
Professor > Associate Professor        1       11%
Student > Postgraduate                 1       11%
Other                                  0       0%
Unknown                                4       44%

Readers by discipline                  Count   As %
Agricultural and Biological Sciences   3       33%
Computer Science                       1       11%
Medicine and Dentistry                 1       11%
Unknown                                4       44%

Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 25 April 2017.

All research outputs: #20,414,746 of 22,965,074 outputs
Outputs from Algorithms for Molecular Biology: #233 of 264 outputs
Outputs of similar age: #270,039 of 310,359 outputs
Outputs of similar age from Algorithms for Molecular Biology: #8 of 9 outputs

Altmetric has tracked 22,965,074 research outputs across all sources so far. This one is in the 1st percentile – i.e., 1% of other outputs scored the same or lower than it.
So far Altmetric has tracked 264 research outputs from this source. They receive a mean Attention Score of 3.1. This one is in the 1st percentile – i.e., 1% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 310,359 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 1st percentile – i.e., 1% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 9 others from the same source and published within six weeks on either side of this one.