
Analysis of ‘One in a Million’ primary care consultation conversations using natural language processing

Overview of attention for article published in BMJ Health & Care Informatics, April 2023

About this Attention Score

  • In the top 25% of all research outputs scored by Altmetric
  • Good Attention Score compared to outputs of the same age (78th percentile)
  • Good Attention Score compared to outputs of the same age and source (73rd percentile)

Mentioned by

13 X users

Readers on Mendeley

10 readers
Title
Analysis of ‘One in a Million’ primary care consultation conversations using natural language processing
Published in
BMJ Health & Care Informatics, April 2023
DOI 10.1136/bmjhci-2022-100659
Pubmed ID
Authors

Yvette Pyne, Yik Ming Wong, Haishuo Fang, Edwin Simpson

Abstract

Modern patient electronic health records form a core part of primary care; they contain both clinical codes and free text entered by the clinician. Natural language processing (NLP) could be employed to generate these records through 'listening' to a consultation conversation. This study develops and assesses several text classifiers for identifying clinical codes for primary care consultations based on the doctor-patient conversation. We evaluate the possibility of training classifiers using medical code descriptions, and the benefits of processing transcribed speech from patients as well as doctors. The study also highlights steps for improving future classifiers. Using verbatim transcripts of 239 primary care consultation conversations (the 'One in a Million' dataset) and novel additional datasets for distant supervision, we trained NLP classifiers (naïve Bayes, support vector machine, nearest centroid, a conventional BERT classifier and few-shot BERT approaches) to identify the International Classification of Primary Care-2 clinical codes associated with each consultation. Of all models tested, a fine-tuned BERT classifier was the best performer. Distant supervision improved the model's performance (F1 score over 16 classes) from 0.45 with conventional supervision with 191 labelled transcripts to 0.51. Incorporating patients' speech in addition to clinician's speech increased the BERT classifier's performance from 0.45 to 0.55 F1 (p=0.01, paired bootstrap test). Our findings demonstrate that NLP classifiers can be trained to identify clinical area(s) being discussed in a primary care consultation from audio transcriptions; this could represent an important step towards a smart digital assistant in the consultation room.
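
The abstract describes training several baseline classifiers (naïve Bayes, support vector machine, nearest centroid) alongside BERT to map consultation transcripts to ICPC-2 codes. The sketch below is purely illustrative and is not the authors' code: it shows how such baselines could be set up with scikit-learn, using placeholder transcripts and chapter labels in place of the 'One in a Million' data.

```python
# Illustrative sketch only, not the authors' implementation.
# Shows how the baseline classifiers named in the abstract could be trained
# on consultation transcripts labelled with ICPC-2 chapters.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline
from sklearn.metrics import f1_score

# Placeholder data: each transcript stands in for the concatenated doctor and
# patient turns of one consultation; each label is an ICPC-2 chapter letter.
transcripts = [
    "doctor: how long have you had the cough ... patient: about two weeks ...",
    "doctor: where is the knee pain worst ... patient: mostly on the stairs ...",
]
labels = ["R", "L"]  # hypothetical chapter codes (respiratory, musculoskeletal)

for clf in (MultinomialNB(), LinearSVC(), NearestCentroid()):
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=1), clf)
    model.fit(transcripts, labels)
    preds = model.predict(transcripts)
    # The paper reports F1 over 16 classes; here we only score the toy data.
    print(type(clf).__name__, f1_score(labels, preds, average="macro"))
```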


X Demographics

The data shown below were collected from the profiles of 13 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 10 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Unknown 10 100%

Demographic breakdown

Readers by professional status Count As %
Student > Master 1 10%
Unknown 9 90%

Readers by discipline Count As %
Business, Management and Accounting 1 10%
Unknown 9 90%
Attention Score in Context

This research output has an Altmetric Attention Score of 8. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 12 August 2023.
All research outputs: #4,821,737 of 26,617,918 outputs
Outputs from BMJ Health & Care Informatics: #115 of 517 outputs
Outputs of similar age: #88,957 of 421,016 outputs
Outputs of similar age from BMJ Health & Care Informatics: #4 of 15 outputs
Altmetric has tracked 26,617,918 research outputs across all sources so far. Compared to these, this one has done well and is in the 81st percentile: it's in the top 25% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 517 research outputs from this source. They typically receive more attention than average, with a mean Attention Score of 9.9. This one has done well, scoring higher than 77% of its peers.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 421,016 tracked outputs that were published within six weeks on either side of this one in any source. This one has done well, scoring higher than 78% of its contemporaries.
We're also able to compare this research output to 15 others from the same source and published within six weeks on either side of this one. This one has gotten more attention than average, scoring higher than 73% of its contemporaries.
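
For readers unfamiliar with percentile ranks, the comparisons above boil down to counting how many outputs in each cohort score lower than this one. The snippet below is a rough illustration with invented cohort scores; it is not Altmetric's actual methodology.

```python
# Rough sketch of a percentile-rank comparison, using made-up cohort scores.
def percentile_rank(score: float, cohort_scores: list[float]) -> float:
    """Percentage of the cohort that this score beats."""
    beaten = sum(1 for s in cohort_scores if s < score)
    return 100.0 * beaten / len(cohort_scores)

cohort = [0, 0, 1, 1, 2, 3, 5, 6, 9, 15]   # hypothetical contemporaries
print(percentile_rank(8, cohort))           # -> 80.0
```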