
Generation of Silver Standard Concept Annotations from Biomedical Texts with Special Relevance to Phenotypes

Overview of attention for an article published in PLOS ONE, January 2015

About this Attention Score

  • Good Attention Score compared to outputs of the same age (70th percentile)
  • Good Attention Score compared to outputs of the same age and source (65th percentile)

Mentioned by

  • 1 X user
  • 1 patent

Citations

  • 19 citations (Dimensions)

Readers on

  • 83 Mendeley
  • 2 CiteULike
Title
Generation of Silver Standard Concept Annotations from Biomedical Texts with Special Relevance to Phenotypes
Published in
PLOS ONE, January 2015
DOI
10.1371/journal.pone.0116040
Authors
Anika Oellrich, Nigel Collier, Damian Smedley, Tudor Groza

Abstract

Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we build a silver standard annotation set from the individual systems' outputs and assess its quality as well as the contribution of the individual systems to that quality. Our results demonstrate that mainly the NCBO Annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independently of the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when they are combined with the NCBO Annotator and cTAKES due to low recall. In conclusion, the performances of the individual systems need to be improved independently of the text types, and the leveraging strategies used to best take advantage of the individual systems' annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and the annotations assigned by the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. The textual content of the ShARe/CLEF (https://sites.google.com/site/shareclefehealth/data) and i2b2 (https://i2b2.org/NLP/DataSets/) corpora must be requested from the individual corpus providers.
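The abstract describes pooling the four systems' annotations into a silver standard and then scoring each system against it with precision, recall, and F1-measure. Below is a minimal sketch of one common way such a set can be built, majority voting over exact annotation spans; the spans, concept IDs, and two-vote threshold are invented for illustration and are not necessarily the paper's actual harmonisation strategy.

    # A minimal sketch (not the authors' pipeline): build a silver standard by
    # majority voting over span-level annotations from several concept
    # recognition systems, then score each system against it.
    from collections import Counter

    # Each system's output: a set of (start, end, concept_id) tuples.
    # All spans and concept IDs below are invented for illustration.
    system_annotations = {
        "cTAKES":         {(0, 7, "C0027051"), (15, 27, "C0011849")},
        "NCBO Annotator": {(0, 7, "C0027051"), (30, 38, "C0020538")},
        "BeCAS":          {(0, 7, "C0027051")},
        "MetaMap":        {(15, 27, "C0011849"), (30, 38, "C0020538")},
    }

    def silver_standard(annotations, min_votes=2):
        """Keep every annotation proposed by at least `min_votes` systems."""
        votes = Counter(a for spans in annotations.values() for a in spans)
        return {a for a, n in votes.items() if n >= min_votes}

    def prf1(predicted, reference):
        """Exact-match precision, recall, and F1 against a reference set."""
        tp = len(predicted & reference)
        p = tp / len(predicted) if predicted else 0.0
        r = tp / len(reference) if reference else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f

    silver = silver_standard(system_annotations)
    for name, spans in system_annotations.items():
        p, r, f = prf1(spans, silver)
        print(f"{name}: P={p:.2f} R={r:.2f} F1={f:.2f}")

A voting scheme like this also illustrates the trade-off the abstract reports: a system that proposes few but widely agreed annotations can raise the precision of the silver standard while dragging the combined F1-measure down through low recall.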

X Demographics

The data shown below were collected from the profile of the 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for the 83 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United States        2     2%
United Kingdom       1     1%
Netherlands          1     1%
Spain                1     1%
Unknown             78    94%

Demographic breakdown

Readers by professional status   Count   As %
Researcher                          19    23%
Student > Ph.D. Student             12    14%
Student > Master                     9    11%
Student > Bachelor                   8    10%
Professor                            8    10%
Other                                9    11%
Unknown                             18    22%

Readers by discipline                  Count   As %
Computer Science                          19    23%
Medicine and Dentistry                    14    17%
Agricultural and Biological Sciences       9    11%
Engineering                                5     6%
Social Sciences                            4     5%
Other                                     12    14%
Unknown                                   20    24%
Attention Score in Context

This research output has an Altmetric Attention Score of 4. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 18 February 2020.
All research outputs                   #7,206,491 of 22,778,347 outputs
Outputs from PLOS ONE                  #85,456 of 194,344 outputs
Outputs of similar age                 #101,156 of 351,728 outputs
Outputs of similar age from PLOS ONE   #1,014 of 3,055 outputs
Altmetric has tracked 22,778,347 research outputs across all sources so far. This one has received more attention than most of these and is in the 67th percentile.
So far Altmetric has tracked 194,344 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 15.1. This one has gotten more attention than average, scoring higher than 54% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 351,728 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 70% of its contemporaries.
We're also able to compare this research output to 3,055 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 65% of its contemporaries.
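The cohort comparisons above are percentile ranks. As a rough check, assuming the simple formula percentile = (total − rank) / total (Altmetric's exact rounding and tie handling are not documented on this page), the rankings reproduce the reported figures to within a point or two:

    # Rough percentile-rank check; the formula is an assumption, not
    # Altmetric's documented method, so results only approximate the page.
    def percentile(rank: int, total: int) -> float:
        return 100.0 * (total - rank) / total

    print(f"{percentile(7_206_491, 22_778_347):.0f}")  # ~68 vs reported 67th (all outputs)
    print(f"{percentile(101_156, 351_728):.0f}")       # ~71 vs reported 70th (similar age)
    print(f"{percentile(1_014, 3_055):.0f}")           # ~67 vs reported 65th (similar age, same source)

The small gaps to the reported percentiles suggest Altmetric floors the result or excludes tied scores; the sketch is only meant to show the arithmetic behind the rankings above.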