Sortal anaphora resolution to enhance relation extraction from biomedical literature

Overview of attention for article published in BMC Bioinformatics, April 2016

Mentioned by
2 X users

Citations
11 Dimensions

Readers on
44 Mendeley

Title: Sortal anaphora resolution to enhance relation extraction from biomedical literature
Published in: BMC Bioinformatics, April 2016
DOI: 10.1186/s12859-016-1009-6
Pubmed ID:
Authors: Halil Kilicoglu, Graciela Rosemblat, Marcelo Fiszman, Thomas C. Rindflesch

Abstract

Entity coreference is common in biomedical literature, and it can affect text understanding systems that rely on accurate identification of named entities, such as relation extraction and automatic summarization. Coreference resolution is a foundational yet challenging natural language processing task which, if performed successfully, is likely to enhance such systems significantly. In this paper, we propose a semantically oriented, rule-based method to resolve sortal anaphora, a specific type of coreference that forms the majority of coreference instances in biomedical literature. The method addresses all entity types and relies on linguistic components of SemRep, a broad-coverage biomedical relation extraction system. It has been incorporated into SemRep, extending its core semantic interpretation capability from sentence level to discourse level. We evaluated our sortal anaphora resolution method in several ways. The first evaluation specifically focused on sortal anaphora relations. Our methodology achieved an F1 score of 59.6 on the test portion of a manually annotated corpus of 320 Medline abstracts, a 4-fold improvement over the baseline method. Investigating the impact of sortal anaphora resolution on relation extraction, we found that the overall effect was positive, with 50 % of the changes involving uninformative relations being replaced by more specific and informative ones, while 35 % of the changes had no effect, and only 15 % were negative. We estimate that anaphora resolution results in changes in about 1.5 % of approximately 82 million semantic relations extracted from the entire PubMed. Our results demonstrate that a heavily semantic approach to sortal anaphora resolution is largely effective for biomedical literature. Our evaluation and error analysis highlight some areas for further improvements, such as coordination processing and intra-sentential antecedent selection.
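
As a rough illustration of two figures quoted in the abstract, the Python sketch below restates the standard F1 definition (the harmonic mean of precision and recall) and works out what "about 1.5 %" of "approximately 82 million" relations amounts to. The function and constant names are illustrative assumptions and do not come from the paper or from SemRep. The precision and recall values in the last line are hypothetical, since the abstract reports only the combined F1 score.

```python
# Figures quoted in the abstract (both are approximations in the source).
TOTAL_RELATIONS = 82_000_000  # "approximately 82 million semantic relations"
CHANGED_SHARE = 0.015         # "about 1.5 %"


def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def estimate_changed(total: int, share: float) -> int:
    """Approximate number of relations affected, rounded to a whole count."""
    return round(total * share)


changed = estimate_changed(TOTAL_RELATIONS, CHANGED_SHARE)
print(f"{changed:,} relations changed (approximate)")

# Hypothetical precision/recall pair, only to show how the formula is used.
print(round(100 * f1_score(0.62, 0.57), 1))
```

Under these stated figures, the estimate comes to roughly 1.23 million relations.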

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 44 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Unknown 44 100%

Demographic breakdown

Readers by professional status Count As %
Researcher 15 34%
Student > Ph. D. Student 10 23%
Student > Master 3 7%
Student > Bachelor 2 5%
Student > Postgraduate 2 5%
Other 3 7%
Unknown 9 20%

Readers by discipline Count As %
Computer Science 18 41%
Medicine and Dentistry 7 16%
Agricultural and Biological Sciences 3 7%
Biochemistry, Genetics and Molecular Biology 2 5%
Linguistics 1 2%
Other 4 9%
Unknown 9 20%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 14 April 2016.
All research outputs: #18,451,892 of 22,862,742 outputs
Outputs from BMC Bioinformatics: #6,328 of 7,295 outputs
Outputs of similar age: #220,037 of 300,620 outputs
Outputs of similar age from BMC Bioinformatics: #94 of 107 outputs
Altmetric has tracked 22,862,742 research outputs across all sources so far. This one is in the 11th percentile – i.e., 11% of other outputs scored the same or lower than it.
So far Altmetric has tracked 7,295 research outputs from this source. They typically receive a little more attention than average, with a mean Attention Score of 5.4. This one is in the 5th percentile – i.e., 5% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 300,620 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 15th percentile – i.e., 15% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 107 others from the same source and published within six weeks on either side of this one. This one is in the 3rd percentile – i.e., 3% of its contemporaries scored the same or lower than it.
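
The percentile wording used above ("X% of its peers scored the same or lower than it") can be sketched in Python as follows. This is a minimal illustration of that definition only, not Altmetric's actual calculation, which is not described on this page. The function name and the peer scores are hypothetical.

```python
from typing import Sequence


def percent_same_or_lower(score: float, peer_scores: Sequence[float]) -> float:
    """Share of peers (in percent) whose score is the same as or lower than `score`."""
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    same_or_lower = sum(1 for s in peer_scores if s <= score)
    return 100.0 * same_or_lower / len(peer_scores)


# Hypothetical Attention Scores for ten peer outputs (not real data).
peers = [0, 1, 1, 3, 5.4, 12, 40, 2, 8, 1]
share = percent_same_or_lower(1, peers)
print(f"{share:.0f}% of peers scored the same or lower")
```

With many tied low scores, a percentile defined this way need not match the figure implied by an output's rank alone.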