Unsupervised Topic Modeling in a Large Free Text Radiology Report Repository

Overview of attention for article published in Journal of Digital Imaging, September 2015

About this Attention Score

  • Above-average Attention Score compared to outputs of the same age (54th percentile)
  • Above-average Attention Score compared to outputs of the same age and source (55th percentile)

Mentioned by

twitter
3 X users

Citations

dimensions_citation
36 Dimensions

Readers on

mendeley
98 Mendeley
citeulike
1 CiteULike
Title
Unsupervised Topic Modeling in a Large Free Text Radiology Report Repository
Published in
Journal of Digital Imaging, September 2015
DOI 10.1007/s10278-015-9823-3
Pubmed ID
Authors

Saeed Hassanpour, Curtis P. Langlotz

Abstract

Radiology report narrative contains a large amount of information about the patient's health and the radiologist's interpretation of medical findings. Most of this critical information is entered in free text format, even when structured radiology report templates are used. The radiology report narrative varies in use of terminology and language among different radiologists and organizations. The free text format and the subtlety and variations of natural language hinder the extraction of reusable information from radiology reports for decision support, quality improvement, and biomedical research. Therefore, as the first step to organize and extract the information content in a large multi-institutional free text radiology report repository, we have designed and developed an unsupervised machine learning approach to capture the main concepts in a radiology report repository and partition the reports based on their main foci. In this approach, radiology reports are modeled in a vector space and compared to each other through a cosine similarity measure. This similarity is used to cluster radiology reports and identify the repository's underlying topics. We applied our approach to a repository of 1,899,482 radiology reports from three major healthcare organizations. Our method identified 19 major radiology report topics in the repository and clustered the reports according to these topics. Our results are verified by a domain expert radiologist and successfully explain the repository's primary topics and extract the corresponding reports. The results of our system provide a target-based corpus and framework for information extraction and retrieval systems for radiology reports.
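
The vector-space and cosine-similarity idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the term weighting or the clustering algorithm the authors used, so everything below is an assumption for illustration only: TF-IDF weighting, a greedy single-pass clustering rule, the 0.3 threshold, the function names, and the four invented example reports.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document to a sparse TF-IDF vector (dict of term -> weight).

    TF-IDF is an assumed weighting; the abstract only says "vector space".
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_similarity(vectors, threshold=0.3):
    """Greedy single-pass clustering (a stand-in, not the paper's method).

    Each vector joins the first cluster whose seed (first member) is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []
    for index, vector in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vector, vectors[cluster[0]]) >= threshold:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


# Invented example reports, not taken from the study's repository.
reports = [
    "chest radiograph clear lungs no pleural effusion",
    "chest radiograph shows no pleural effusion or pneumothorax",
    "mri brain no acute intracranial hemorrhage",
    "mri brain shows no intracranial mass",
]

clusters = cluster_by_similarity(tfidf_vectors(reports), threshold=0.3)
print(clusters)
```

On these four toy reports the two chest reports and the two brain reports end up in separate groups. A repository of the size described in the abstract would need a scalable clustering method and proper tokenization, which this sketch does not attempt.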

X Demographics

The data shown below were collected from the profiles of 3 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 98 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
Spain                1     1%
United States        1     1%
Brazil               1     1%
Unknown             95    97%

Demographic breakdown

Readers by professional status          Count   As %
Researcher                                 15    15%
Student > Ph. D. Student                   14    14%
Student > Master                           11    11%
Student > Doctoral Student                  9     9%
Student > Bachelor                          6     6%
Other                                      20    20%
Unknown                                    23    23%

Readers by discipline                   Count   As %
Computer Science                           23    23%
Medicine and Dentistry                     21    21%
Nursing and Health Professions              6     6%
Engineering                                 5     5%
Agricultural and Biological Sciences        4     4%
Other                                      11    11%
Unknown                                    28    29%

Attention Score in Context

This research output has an Altmetric Attention Score of 3. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 03 April 2020.
All research outputs: #13,101,688 of 22,836,570 outputs
Outputs from Journal of Digital Imaging: #594 of 1,050 outputs
Outputs of similar age: #120,791 of 267,225 outputs
Outputs of similar age from Journal of Digital Imaging: #4 of 9 outputs
Altmetric has tracked 22,836,570 research outputs across all sources so far. This one is in the 42nd percentile – i.e., 42% of other outputs scored the same or lower than it.
So far Altmetric has tracked 1,050 research outputs from this source. They receive a mean Attention Score of 4.6. This one is in the 43rd percentile – i.e., 43% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 267,225 tracked outputs that were published within six weeks on either side of this one in any source. This one has gotten more attention than average, scoring higher than 54% of its contemporaries.
We're also able to compare this research output to 9 others from the same source and published within six weeks on either side of this one. This one has scored higher than 5 of them.
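
The percentiles quoted above can be reproduced from the rank figures with simple arithmetic. The sketch below assumes the percentile is the share of tracked outputs ranked below this one, truncated to a whole number. That rule is an inference from the numbers on this page, not a documented Altmetric formula, and the function name is hypothetical.

```python
def percentile_from_rank(rank, total):
    """Whole-number share of outputs ranked below `rank` (truncated).

    Assumed rule: (total - rank) outputs rank lower, out of `total`.
    """
    return (total - rank) * 100 // total


# Rank and total for each comparison group, as reported on this page.
comparisons = {
    "All research outputs": (13_101_688, 22_836_570),
    "Outputs from Journal of Digital Imaging": (594, 1_050),
    "Outputs of similar age": (120_791, 267_225),
    "Outputs of similar age from Journal of Digital Imaging": (4, 9),
}

for label, (rank, total) in comparisons.items():
    print(f"{label}: {percentile_from_rank(rank, total)}")
```

Under this assumption the four groups give 42, 43, 54, and 55, which agree with the percentiles stated in the text and in the summary at the top of the page.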