
Technology-assisted title and abstract screening for systematic reviews: a retrospective evaluation of the Abstrackr machine learning tool

Overview of attention for article published in Systematic Reviews, March 2018

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (91st percentile)
  • High Attention Score compared to outputs of the same age and source (86th percentile)

Mentioned by

  • 1 blog
  • 2 policy sources
  • 24 X users
  • 1 Redditor

Citations

  • 75 citations (Dimensions)

Readers on

  • 136 readers (Mendeley)
Title: Technology-assisted title and abstract screening for systematic reviews: a retrospective evaluation of the Abstrackr machine learning tool
Published in: Systematic Reviews, March 2018
DOI: 10.1186/s13643-018-0707-8
Pubmed ID:
Authors: Allison Gates, Cydney Johnson, Lisa Hartling

Abstract

Machine learning tools can expedite systematic review (SR) processes by semi-automating citation screening. Abstrackr semi-automates citation screening by predicting relevant records. We evaluated its performance for four screening projects. We used a convenience sample of screening projects completed at the Alberta Research Centre for Health Evidence, Edmonton, Canada: three SRs and one descriptive analysis for which we had used SR screening methods. The projects were heterogeneous with respect to search yield (median 9328; range 5243 to 47,385 records; interquartile range (IQR) 15,688 records), topic (Antipsychotics, Bronchiolitis, Diabetes, Child Health SRs), and screening complexity. We uploaded the records to Abstrackr and screened until it made predictions about the relevance of the remaining records. Across three trials for each project, we compared the predictions to human reviewer decisions and calculated the sensitivity, specificity, precision, false negative rate, proportion missed, and workload savings. Abstrackr's sensitivity was > 0.75 for all projects, and the mean specificity ranged from 0.69 to 0.90, with the exception of the Child Health SRs, for which it was 0.19. The precision (proportion of records correctly predicted as relevant) varied by screening task (median 26.6%; range 14.8 to 64.7%; IQR 29.7%). The median false negative rate (proportion of records incorrectly predicted as irrelevant) was 12.6% (range 3.5 to 21.2%; IQR 12.3%). The workload savings were often large (median 67.2%; range 9.5 to 88.4%; IQR 23.9%). The proportion missed (proportion of records predicted as irrelevant that were included in the final report, out of the total number predicted as irrelevant) was 0.1% for all SRs and 6.4% for the descriptive analysis. This equated to 4.2% (range 0 to 12.2%; IQR 7.8%) of the records in the final reports. Abstrackr's reliability and the workload savings varied by screening task. Workload savings came at the expense of potentially missing relevant records. How this might affect the results and conclusions of SRs needs to be evaluated. Studies evaluating Abstrackr as the second reviewer in a pair would be of interest to determine if concerns for reliability would diminish. Further evaluations of Abstrackr's performance and usability will inform its refinement and practical utility.
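The performance measures named in the abstract can all be derived from a confusion matrix of Abstrackr's predictions against the human reviewer decisions. The sketch below is illustrative only (not the study's code); the exact definitions of false negative rate and workload savings are assumptions read from the abstract's wording, and `screening_metrics` is a hypothetical helper name.

```python
# Illustrative sketch (not the authors' code): screening metrics from a
# confusion matrix of Abstrackr's predictions vs. human reviewer decisions.

def screening_metrics(predicted_relevant, human_relevant, total_records):
    """`predicted_relevant` and `human_relevant` are sets of record IDs;
    `total_records` is the search yield for the project."""
    tp = len(predicted_relevant & human_relevant)   # correctly predicted relevant
    fp = len(predicted_relevant - human_relevant)   # predicted relevant, actually irrelevant
    fn = len(human_relevant - predicted_relevant)   # predicted irrelevant, actually relevant
    tn = total_records - tp - fp - fn               # correctly predicted irrelevant
    predicted_irrelevant = fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # precision: proportion of predicted-relevant records that were truly relevant
        "precision": tp / (tp + fp),
        # assumed to be 1 - sensitivity: relevant records predicted irrelevant
        "false_negative_rate": fn / (tp + fn),
        # proportion missed: relevant records among those predicted irrelevant
        "proportion_missed": fn / predicted_irrelevant,
        # workload savings: share of records a human would not need to screen,
        # assuming predicted-irrelevant records are excluded without review
        "workload_savings": predicted_irrelevant / total_records,
    }

# Toy example: 10 records; Abstrackr predicts {1, 2, 3, 4} relevant,
# reviewers include {1, 2, 5}.
m = screening_metrics({1, 2, 3, 4}, {1, 2, 5}, 10)
```

On these toy numbers, tp = 2, fp = 2, fn = 1, and tn = 5, so precision is 0.5 and workload savings is 0.6.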

X Demographics

The data shown below were collected from the profiles of 24 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 136 Mendeley readers of this research output.

Geographical breakdown

Country | Count | As %
Unknown | 136 | 100%

Demographic breakdown

Readers by professional status | Count | As %
Student > Ph.D. Student | 25 | 18%
Student > Master | 21 | 15%
Researcher | 15 | 11%
Librarian | 6 | 4%
Student > Bachelor | 6 | 4%
Other | 24 | 18%
Unknown | 39 | 29%
Readers by discipline | Count | As %
Medicine and Dentistry | 33 | 24%
Social Sciences | 10 | 7%
Computer Science | 10 | 7%
Environmental Science | 5 | 4%
Nursing and Health Professions | 4 | 3%
Other | 24 | 18%
Unknown | 50 | 37%
Attention Score in Context

This research output has an Altmetric Attention Score of 29. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 30 November 2020.
  • All research outputs: #1,345,620 of 25,382,035 outputs
  • Outputs from Systematic Reviews: #196 of 2,227 outputs
  • Outputs of similar age: #29,153 of 339,299 outputs
  • Outputs of similar age from Systematic Reviews: #7 of 43 outputs
Altmetric has tracked 25,382,035 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 94th percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 2,227 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 13.1. This one has done particularly well, scoring higher than 91% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 339,299 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 91% of its contemporaries.
We're also able to compare this research output to 43 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 86% of its contemporaries.
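The percentile figures above can be approximately reconstructed from the rankings. The sketch below is an assumption about how rank translates to percentile (Altmetric's exact method and rounding are not documented here); it treats the percentile as the share of the *other* tracked outputs that this one outranks:

```python
# Hypothetical reconstruction (not Altmetric's documented formula) of the
# percentile figures quoted above from the "#rank of total" rankings.

def percentile(rank, total):
    """rank is 1-based (1 = highest Attention Score in the group)."""
    # Share of the other (total - 1) outputs ranked below this one.
    return 100 * (total - rank) / (total - 1)

contexts = {
    "all outputs":                     (1_345_620, 25_382_035),  # quoted as 94th
    "Systematic Reviews":              (196, 2_227),             # quoted as 91st
    "similar age":                     (29_153, 339_299),        # quoted as 91st
    "similar age, Systematic Reviews": (7, 43),                  # quoted as 86th
}

for name, (rank, total) in contexts.items():
    print(f"{name}: {percentile(rank, total):.1f}th percentile")
```

Under this reading, each computed value lands within one percentage point of the percentile quoted on the page, which suggests the figures are mutually consistent rather than contradictory.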