
How to assess and compare inter-rater reliability, agreement and correlation of ratings: an exemplary analysis of mother-father and parent-teacher expressive vocabulary rating pairs

Overview of attention for article published in Frontiers in Psychology, June 2014

About this Attention Score

  • Average Attention Score compared to outputs of the same age

Mentioned by

2 X users

Readers on

195 Mendeley
Title: How to assess and compare inter-rater reliability, agreement and correlation of ratings: an exemplary analysis of mother-father and parent-teacher expressive vocabulary rating pairs
Published in: Frontiers in Psychology, June 2014
DOI: 10.3389/fpsyg.2014.00509
Authors: Margarita Stolarova, Corinna Wolf, Tanja Rinker, Aenne Brielmann

Abstract

This report has two main purposes. First, we combine well-known analytical approaches to conduct a comprehensive assessment of agreement and correlation of rating pairs and to disentangle these often confused concepts, providing a best-practice example on concrete data and a tutorial for future reference. Second, we explore whether a screening questionnaire developed for use with parents can be reliably employed with daycare teachers when assessing early expressive vocabulary. A total of 53 vocabulary rating pairs (34 parent-teacher and 19 mother-father pairs) collected for two-year-old children (12 bilingual) are evaluated. First, inter-rater reliability both within and across subgroups is assessed using the intra-class correlation coefficient (ICC). Next, based on this analysis of reliability and on the test-retest reliability of the employed tool, inter-rater agreement is analyzed, and the magnitude and direction of rating differences are considered. Finally, Pearson correlation coefficients of standardized vocabulary scores are calculated and compared across subgroups. The results underline the necessity to distinguish between reliability measures, agreement and correlation. They also demonstrate the impact of the employed reliability on agreement evaluations. This study provides evidence that parent-teacher ratings of children's early vocabulary can achieve agreement and correlation comparable to those of mother-father ratings on the assessed vocabulary scale. Bilingualism of the evaluated child decreased the likelihood of raters' agreement. We conclude that future reports of agreement, correlation and reliability of ratings will benefit from better definition of terms and stricter methodological approaches. The methodological tutorial provided here holds the potential to increase comparability across empirical reports and can help improve research practices and knowledge transfer to educational and therapeutic settings.
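The distinction the abstract draws between reliability, agreement and correlation can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the scores are synthetic rather than study data, the ICC form shown (two-way random effects, absolute agreement, single rater, often written ICC(2,1)) is one common choice and not necessarily the one the authors used, and the fixed `tolerance` is a hypothetical stand-in for the agreement criterion that the study derives from the tool's reliability.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per rated child, one column per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between children
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def agreement_rate(x, y, tolerance):
    """Share of rating pairs whose absolute difference is within `tolerance`."""
    return sum(abs(a - b) <= tolerance for a, b in zip(x, y)) / len(x)

# Synthetic example: rater B scores every child exactly 10 points above rater A.
rater_a = [40, 45, 50, 55, 60]
rater_b = [a + 10 for a in rater_a]
pairs = [list(p) for p in zip(rater_a, rater_b)]

r = pearson_r(rater_a, rater_b)
icc = icc_2_1(pairs)
agree = agreement_rate(rater_a, rater_b, tolerance=5)  # hypothetical criterion
print(f"Pearson r = {r:.2f}, ICC(2,1) = {icc:.2f}, agreement = {agree:.0%}")
```

With these synthetic scores the two raters order the children identically, so the Pearson correlation is 1.0, yet the constant 10-point offset lowers the absolute-agreement ICC to about 0.56 and leaves no pair within the 5-point tolerance. This is the kind of divergence between correlation and agreement that the abstract warns about.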

X Demographics

The data shown below were collected from the profiles of 2 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 195 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United Kingdom    1       <1%
Netherlands       1       <1%
Germany           1       <1%
Unknown           192     98%

Demographic breakdown

Readers by professional status          Count   As %
Student > Ph. D. Student                38      19%
Researcher                              29      15%
Student > Doctoral Student              25      13%
Student > Master                        22      11%
Student > Bachelor                      11      6%
Other                                   39      20%
Unknown                                 31      16%

Readers by discipline                   Count   As %
Psychology                              39      20%
Medicine and Dentistry                  19      10%
Social Sciences                         18      9%
Nursing and Health Professions          17      9%
Agricultural and Biological Sciences    8       4%
Other                                   49      25%
Unknown                                 45      23%

Attention Score in Context

This research output has an Altmetric Attention Score of 2. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 04 June 2014.
All research outputs: #15,250,654 of 22,757,090 outputs
Outputs from Frontiers in Psychology: #18,260 of 29,666 outputs
Outputs of similar age: #132,625 of 228,065 outputs
Outputs of similar age from Frontiers in Psychology: #273 of 376 outputs
Altmetric has tracked 22,757,090 research outputs across all sources so far. This one is in the 32nd percentile – i.e., 32% of other outputs scored the same or lower than it.
So far Altmetric has tracked 29,666 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 12.5. This one is in the 37th percentile – i.e., 37% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 228,065 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 41st percentile – i.e., 41% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 376 others from the same source and published within six weeks on either side of this one. This one is in the 25th percentile – i.e., 25% of its contemporaries scored the same or lower than it.