Incorporating domain knowledge in chemical and biomedical named entity recognition with word representations

Overview of attention for article published in Journal of Cheminformatics, January 2015
Citations: 29 (Dimensions)
Readers: 99 (Mendeley)
Title
Incorporating domain knowledge in chemical and biomedical named entity recognition with word representations
Published in
Journal of Cheminformatics, January 2015
DOI
10.1186/1758-2946-7-s1-s9
Authors

Tsendsuren Munkhdalai, Meijing Li, Khuyagbaatar Batsuren, Hyeon Ah Park, Nak Hyeon Choi, Keun Ho Ryu

Abstract

Chemical and biomedical Named Entity Recognition (NER) is an essential prerequisite for effective text mining of biochemical text. With the recent growth in the volume of biomedical literature, exploiting unlabeled text to improve system performance has become an active and challenging research topic in text mining. We present a semi-supervised learning method that efficiently exploits unlabeled data to incorporate domain knowledge into a named entity recognition model and thereby improve system performance. The proposed method comprises Natural Language Processing (NLP) tasks for text preprocessing, word representation features learned from a large amount of text data for feature extraction, and conditional random fields for token classification. Apart from free text in the domain, the method relies on neither a lexicon nor a dictionary, which keeps the system applicable to other NER tasks in bio-text data.
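
The feature-extraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which word representations or features the authors use, so everything here is an assumption made for illustration: the `WORD_CLUSTERS` table is a toy stand-in for representations learned from unlabeled text, and `tokenize`, `token_features`, and `sentence_features` are hypothetical helper names. The sketch only shows how a per-token feature dictionary might combine surface features with a word-representation feature before being handed to a conditional random field toolkit.

```python
import re

# Hypothetical toy lookup (an assumption, not from the paper): stands in for
# word representations learned from a large unlabeled corpus, here written as
# cluster bit strings.
WORD_CLUSTERS = {
    "aspirin": "1101",
    "ibuprofen": "1100",
    "inhibits": "0110",
    "cox-1": "1011",
}


def tokenize(text):
    """Simplified tokenizer: alphanumeric/hyphen runs, or single punctuation marks."""
    return re.findall(r"[A-Za-z0-9\-]+|[^\sA-Za-z0-9]", text)


def token_features(tokens, i):
    """Build a feature dict for tokens[i] from surface form, cluster, and context."""
    word = tokens[i]
    feats = {
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "has_digit": any(c.isdigit() for c in word),
        "has_hyphen": "-" in word,
        "suffix3": word[-3:].lower(),
    }

    # Word-representation features: the full cluster id plus a coarser prefix.
    cluster = WORD_CLUSTERS.get(word.lower())
    if cluster is not None:
        feats["cluster"] = cluster
        feats["cluster_prefix2"] = cluster[:2]

    # Context features from the neighbouring tokens.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of sentence

    return feats


def sentence_features(text):
    """Tokenize a sentence and return (tokens, list of per-token feature dicts)."""
    tokens = tokenize(text)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    tokens, feats = sentence_features("Aspirin inhibits COX-1.")
    for tok, f in zip(tokens, feats):
        print(tok, f)
```

Because the cluster features come from a table built on unlabeled text rather than from a curated lexicon, a sketch like this stays in the spirit of the dictionary-free design the abstract describes; a sequence of such feature dictionaries is the kind of input a CRF trainer would typically consume.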

Mendeley readers

The data shown below were compiled from readership statistics for 99 Mendeley readers of this research output.

Geographical breakdown

Country               Count   As %
India                 1       1%
Germany               1       1%
Korea, Republic of    1       1%
Unknown               96      97%

Demographic breakdown

Readers by professional status          Count   As %
Student > Ph. D. Student                17      17%
Student > Master                        15      15%
Researcher                              11      11%
Student > Doctoral Student              7       7%
Student > Bachelor                      7       7%
Other                                   18      18%
Unknown                                 24      24%

Readers by discipline                   Count   As %
Computer Science                        40      40%
Agricultural and Biological Sciences    7       7%
Psychology                              4       4%
Neuroscience                            3       3%
Social Sciences                         3       3%
Other                                   18      18%
Unknown                                 24      24%