Learning to Recognize Phenotype Candidates in the Auto-Immune Literature Using SVM Re-Ranking

Overview of attention for article published in PLOS ONE, October 2013

Mentioned by
1 X user

Citations
10 Dimensions

Readers on
38 Mendeley
2 CiteULike
Title
Learning to Recognize Phenotype Candidates in the Auto-Immune Literature Using SVM Re-Ranking
Published in
PLOS ONE, October 2013
DOI 10.1371/journal.pone.0072965
Pubmed ID
Authors

Nigel Collier, Mai-vu Tran, Hoang-quynh Le, Quang-Thuy Ha, Anika Oellrich, Dietrich Rebholz-Schuhmann

Abstract

The identification of phenotype descriptions in the scientific literature, case reports and patient records is a rewarding task for bio-medical text mining. Any progress will support knowledge discovery and linkage to other resources. However, because of their wide variation, a number of challenges remain in terms of their identification and semantic normalisation before they can be fully exploited for research purposes. This paper presents novel techniques for identifying potential complex phenotype mentions by exploiting a hybrid model based on machine learning, rules and dictionary matching. A systematic study is made of how to combine sequence labels from these modules, as well as of the merits of various ontological resources. We evaluated our approach on a subset of Medline abstracts cited by the Online Mendelian Inheritance in Man database related to auto-immune diseases. Using partial matching, the best micro-averaged F-score for phenotypes and five other entity classes was 79.9%. A best performance of 75.3% was achieved for phenotype candidates using all semantic resources. We observed the advantage of using SVM-based learn-to-rank for sequence label combination over maximum entropy and a priority list approach. The results indicate that the identification of simple entity types such as chemicals and genes is robustly supported by single semantic resources, whereas phenotypes require combinations. Altogether we conclude that our approach coped well with the compositional structure of phenotypes in the auto-immune domain.
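
The abstract reports a micro-averaged F-score under partial matching. The sketch below illustrates one common way to compute such a score; it is not taken from the paper. It assumes that a predicted span counts as correct when it has the same class label as a gold span and overlaps it by at least one character, and that counts are pooled over all classes before precision and recall are computed. The `Span` type, the `overlaps` and `micro_f1_partial` functions, and the class labels `PH`, `GG` and `CD` are hypothetical names chosen for illustration.

```python
from collections import namedtuple

# Hypothetical span representation: class label plus character offsets
# (end is exclusive). Not the paper's data format.
Span = namedtuple("Span", ["label", "start", "end"])


def overlaps(a, b):
    """True if two spans share a label and at least one character."""
    return a.label == b.label and a.start < b.end and b.start < a.end


def micro_f1_partial(gold, predicted):
    """Micro-averaged F1 with overlap-based (partial) matching.

    Counts are pooled over all entity classes, which is what makes the
    average 'micro' rather than 'macro'.
    """
    # Predicted spans that partially match some gold span (for precision).
    matched_pred = sum(1 for p in predicted if any(overlaps(p, g) for g in gold))
    # Gold spans that are partially matched by some prediction (for recall).
    matched_gold = sum(1 for g in gold if any(overlaps(g, p) for p in predicted))

    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = [Span("PH", 0, 10), Span("GG", 20, 25), Span("PH", 30, 40)]
predicted = [
    Span("PH", 5, 12),   # overlaps the first gold phenotype: partial match
    Span("GG", 20, 25),  # exact match
    Span("CD", 30, 40),  # right offsets, wrong label: no match
    Span("PH", 50, 55),  # no gold span here: no match
]
score = micro_f1_partial(gold, predicted)
print(f"micro-averaged F1 (partial matching): {score:.3f}")
```

On the toy data, two of four predictions and two of three gold spans are matched, giving a precision of 0.5 and a recall of about 0.667. Other definitions of partial matching exist (for example, requiring a shared boundary), and the paper may use a different one.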

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 38 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States    2       5%
France           1       3%
Australia        1       3%
Unknown          34      89%

Demographic breakdown

Readers by professional status                 Count   As %
Researcher                                     7       18%
Student > Ph. D. Student                       7       18%
Student > Bachelor                             4       11%
Student > Doctoral Student                     4       11%
Lecturer                                       2       5%
Other                                          7       18%
Unknown                                        7       18%

Readers by discipline                          Count   As %
Computer Science                               13      34%
Medicine and Dentistry                         7       18%
Agricultural and Biological Sciences           6       16%
Biochemistry, Genetics and Molecular Biology   3       8%
Engineering                                    2       5%
Other                                          1       3%
Unknown                                        6       16%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 27 October 2013.
All research outputs: #18,351,676 of 22,727,570 outputs
Outputs from PLOS ONE: #154,213 of 193,986 outputs
Outputs of similar age: #156,895 of 210,690 outputs
Outputs of similar age from PLOS ONE: #3,846 of 5,151 outputs
Altmetric has tracked 22,727,570 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 193,986 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 15.1. This one is in the 10th percentile – i.e., 10% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 210,690 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 12th percentile – i.e., 12% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 5,151 others from the same source and published within six weeks on either side of this one. This one is in the 14th percentile – i.e., 14% of its contemporaries scored the same or lower than it.