Title | The gene normalization task in BioCreative III |
---|---|
Published in | BMC Bioinformatics, October 2011 |
DOI | 10.1186/1471-2105-12-s8-s2 |
Pubmed ID | |
Authors | Zhiyong Lu, Hung-Yu Kao, Chih-Hsuan Wei, Minlie Huang, Jingchen Liu, Cheng-Ju Kuo, Chun-Nan Hsu, Richard Tzong-Han Tsai, Hong-Jie Dai, Naoaki Okazaki, Han-Cheol Cho, Martin Gerner, Illes Solt, Shashank Agarwal, Feifan Liu, Dina Vishnyakova, Patrick Ruch, Martin Romacker, Fabio Rinaldi, Sanmitra Bhattacharya, Padmini Srinivasan, Hongfang Liu, Manabu Torii, Sergio Matos, David Campos, Karin Verspoor, Kevin M Livingston, W John Wilbur |
Abstract | We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k). |
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
Unknown | 2 | 100% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Practitioners (doctors, other healthcare professionals) | 1 | 50% |
Members of the public | 1 | 50% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 3 | 4% |
Spain | 2 | 3% |
Netherlands | 1 | 1% |
Portugal | 1 | 1% |
Germany | 1 | 1% |
Australia | 1 | 1% |
Unknown | 66 | 88% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 16 | 21% |
Researcher | 15 | 20% |
Student > Master | 11 | 15% |
Professor > Associate Professor | 6 | 8% |
Student > Bachelor | 6 | 8% |
Other | 11 | 15% |
Unknown | 10 | 13% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 35 | 47% |
Agricultural and Biological Sciences | 12 | 16% |
Biochemistry, Genetics and Molecular Biology | 3 | 4% |
Linguistics | 2 | 3% |
Engineering | 2 | 3% |
Other | 8 | 11% |
Unknown | 13 | 17% |