Chapter 13: Mining Electronic Health Records in the Genomics Era

Overview of attention for article published in PLoS Computational Biology, December 2012

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (94th percentile)
  • High Attention Score compared to outputs of the same age and source (85th percentile)

Mentioned by

31 X users
1 peer review site
2 Facebook pages

Citations

152 Dimensions

Readers on

377 Mendeley
8 CiteULike
Title: Chapter 13: Mining Electronic Health Records in the Genomics Era
Published in: PLoS Computational Biology, December 2012
DOI: 10.1371/journal.pcbi.1002823
Authors: Joshua C. Denny

Abstract

The combination of improved genomic analysis methods, decreasing genotyping costs, and increasing computing resources has led to an explosion of clinical genomic knowledge in the last decade. Similarly, healthcare systems are increasingly adopting robust electronic health record (EHR) systems that not only can improve health care, but also contain a vast repository of disease and treatment data that could be mined for genomic research. Indeed, institutions are creating EHR-linked DNA biobanks to enable genomic and pharmacogenomic research, using EHR data for phenotypic information. However, EHRs are designed primarily for clinical care, not research, so reuse of clinical EHR data for research purposes can be challenging. Difficulties in use of EHR data include: data availability, missing data, incorrect data, and vast quantities of unstructured narrative text data. Structured information includes billing codes, most laboratory reports, and other variables such as physiologic measurements and demographic information. Significant information, however, remains locked within EHR narrative text documents, including clinical notes and certain categories of test results, such as pathology and radiology reports. For relatively rare observations, combinations of simple free-text searches and billing codes may prove adequate when followed by manual chart review. However, to extract the large cohorts necessary for genome-wide association studies, natural language processing methods to process narrative text data may be needed. Combinations of structured and unstructured textual data can be mined to generate high-validity collections of cases and controls for a given condition. Once high-quality cases and controls are identified, EHR-derived cases can be used for genomic discovery and validation. Since EHR data includes a broad sampling of clinically-relevant phenotypic information, it may enable multiple genomic investigations upon a single set of genotyped individuals. 
This chapter reviews several examples of phenotype extraction and their application to genetic research, demonstrating a viable future for genomic discovery using EHR-linked data.
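
The abstract's idea of combining billing codes with free-text searches, followed by manual chart review, can be illustrated with a small sketch. This is not the chapter's algorithm: the `PatientRecord` type, the `classify` function, the placeholder code `DX-001`, and the two-code threshold are all hypothetical choices made for illustration, and the keyword match does no negation handling, which a real natural language processing pipeline would need.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, simplified view of one patient's EHR data."""
    patient_id: str
    billing_codes: list = field(default_factory=list)  # structured data
    notes: list = field(default_factory=list)          # narrative text


def classify(record, target_codes, keyword, min_code_count=2):
    """Label a record as 'case', 'control', or 'review'.

    Assumed rule (illustrative only):
      - case:    at least `min_code_count` matching billing codes AND
                 the keyword appears in at least one note
      - control: no matching billing codes AND no keyword mention
      - review:  anything else, deferred to manual chart review
    """
    code_hits = sum(1 for code in record.billing_codes if code in target_codes)
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    text_hit = any(pattern.search(note) for note in record.notes)

    if code_hits >= min_code_count and text_hit:
        return "case"
    if code_hits == 0 and not text_hit:
        return "control"
    return "review"


if __name__ == "__main__":
    records = [
        PatientRecord("p1", ["DX-001", "DX-001"], ["History of atrial fibrillation."]),
        PatientRecord("p2", [], ["Routine visit, no complaints."]),
        PatientRecord("p3", ["DX-001"], ["Routine visit, no complaints."]),
    ]
    for rec in records:
        print(rec.patient_id, classify(rec, {"DX-001"}, "atrial fibrillation"))
```

Records that satisfy only one of the two signals fall into the `review` bucket, reflecting the abstract's point that simple searches are adequate mainly when followed by manual chart review.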

X Demographics

The data were collected from the profiles of 31 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 377 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States        11     3%
United Kingdom        7     2%
Spain                 4     1%
Brazil                4     1%
Canada                3    <1%
France                1    <1%
Sweden                1    <1%
Germany               1    <1%
Netherlands           1    <1%
Other                 4     1%
Unknown             340    90%

Demographic breakdown

Readers by professional status                  Count   As %
Researcher                                         90    24%
Student > Ph. D. Student                           86    23%
Student > Master                                   37    10%
Student > Bachelor                                 34     9%
Other                                              24     6%
Other                                              67    18%
Unknown                                            39    10%

Readers by discipline                           Count   As %
Medicine and Dentistry                             76    20%
Agricultural and Biological Sciences               67    18%
Computer Science                                   67    18%
Biochemistry, Genetics and Molecular Biology       33     9%
Engineering                                        18     5%
Other                                              54    14%
Unknown                                            62    16%

Attention Score in Context

This research output has an Altmetric Attention Score of 21. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 01 July 2017.

All research outputs: #1,805,439 of 25,373,627 outputs
Outputs from PLoS Computational Biology: #1,583 of 8,960 outputs
Outputs of similar age: #16,237 of 288,779 outputs
Outputs of similar age from PLoS Computational Biology: #18 of 121 outputs

Altmetric has tracked 25,373,627 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 92nd percentile: it's in the top 10% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 8,960 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 20.4. This one has done well, scoring higher than 82% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 288,779 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 94% of its contemporaries.
We're also able to compare this research output to 121 others from the same source and published within six weeks on either side of this one. This one has done well, scoring higher than 85% of its contemporaries.
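
The percentile figures quoted above are consistent with the rank and total counts reported on this page. A minimal sketch of that arithmetic, assuming the percentile is the share of tracked outputs ranked below this one, rounded down (the rounding rule and the function name `percentile_from_rank` are assumptions, not Altmetric's documented method):

```python
def percentile_from_rank(rank: int, total: int) -> int:
    """Percent of outputs ranked below `rank` out of `total`, rounded down.

    Assumption: rank 1 is the highest-scoring output.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) * 100 // total


if __name__ == "__main__":
    # (rank, total) pairs taken from the figures reported on this page.
    comparisons = {
        "All research outputs": (1_805_439, 25_373_627),
        "Outputs from PLoS Computational Biology": (1_583, 8_960),
        "Outputs of similar age": (16_237, 288_779),
        "Outputs of similar age from PLoS Computational Biology": (18, 121),
    }
    for label, (rank, total) in comparisons.items():
        print(f"{label}: {percentile_from_rank(rank, total)}%")
    # Prints 92%, 82%, 94%, and 85%, matching the percentages quoted above.
```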