
Intuitive visualizations of pitch and loudness in speech

Overview of attention for article published in Psychonomic Bulletin & Review, September 2015

Mentioned by: 6 X users
Citations: 8 (Dimensions)
Readers: 41 (Mendeley)

Title
Intuitive visualizations of pitch and loudness in speech
Published in
Psychonomic Bulletin & Review, September 2015
DOI 10.3758/s13423-015-0934-0
Pubmed ID
Authors

Rebecca S. Schaefer, Lilian J. Beijer, Wiel Seuskens, Toni C. M. Rietveld, Makiko Sadakata

Abstract

Visualizing acoustic features of speech has proven helpful in speech therapy; however, it is as yet unclear how to create intuitive and fitting visualizations. To better understand the mappings from speech sound aspects to visual space, a large web-based experiment (n = 249) was performed to evaluate spatial parameters that may optimally represent pitch and loudness of speech. To this end, five novel animated visualizations were developed and presented in pairwise comparisons, together with a static visualization. Pitch and loudness of speech were each mapped onto either the vertical (y-axis) or the size (z-axis) dimension, or combined (with size indicating loudness and vertical position indicating pitch height) and visualized as an animation along the horizontal dimension (x-axis) over time. The results indicated that firstly, there is a general preference towards the use of the y-axis for both pitch and loudness, with pitch ranking higher than loudness in terms of fit. Secondly, the data suggest that representing both pitch and loudness combined in a single visualization is preferred over visualization in only one dimension. Finally, the z-axis, although not preferred, was evaluated as corresponding better to loudness than to pitch. This relation between sound and visual space has not been reported previously for speech sounds, and elaborates earlier findings on musical material. In addition to elucidating more general mappings between auditory and visual modalities, the findings provide us with a method of visualizing speech that may be helpful in clinical applications such as computerized speech therapy, or other feedback-based learning paradigms.
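
The combined mapping described in the abstract (time along the x-axis, pitch as vertical position, loudness as size) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `map_frame`, the `Marker` type, the default pitch and loudness ranges, and the choice of a logarithmic pitch scale are all assumptions made for illustration only.

```python
import math
from typing import NamedTuple, Tuple


class Marker(NamedTuple):
    """One animation frame, with every coordinate normalized to 0..1."""
    x: float     # horizontal position: elapsed time
    y: float     # vertical position: pitch height
    size: float  # marker size: loudness


def _clamp01(value: float) -> float:
    """Restrict a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, value))


def map_frame(
    t: float,
    duration: float,
    pitch_hz: float,
    loudness_db: float,
    pitch_range: Tuple[float, float] = (75.0, 500.0),    # assumed range in Hz
    loudness_range: Tuple[float, float] = (40.0, 90.0),  # assumed range in dB
) -> Marker:
    """Map one speech frame onto (x, y, size) for a combined visualization."""
    if duration <= 0 or pitch_hz <= 0:
        raise ValueError("duration and pitch_hz must be positive")
    low_p, high_p = pitch_range
    low_l, high_l = loudness_range
    # Time runs left to right along the x-axis.
    x = _clamp01(t / duration)
    # Assumption: pitch is placed on a log scale, so higher pitch sits higher.
    y = _clamp01(math.log(pitch_hz / low_p) / math.log(high_p / low_p))
    # Assumption: loudness in dB maps linearly onto marker size.
    size = _clamp01((loudness_db - low_l) / (high_l - low_l))
    return Marker(x, y, size)


if __name__ == "__main__":
    # A frame halfway through a 2-second utterance.
    print(map_frame(t=1.0, duration=2.0, pitch_hz=200.0, loudness_db=65.0))
```

Dropping either the `y` or the `size` component from the result would correspond to the single-dimension variants the abstract compares against the combined one.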

X Demographics

The data shown below were collected from the profiles of 6 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 41 Mendeley readers of this research output.

Geographical breakdown

Country          Count   As %
United Kingdom       1     2%
Spain                1     2%
Colombia             1     2%
Unknown             38    93%

Demographic breakdown

Readers by professional status   Count   As %
Student > Master                     9    22%
Student > Ph. D. Student             8    20%
Student > Bachelor                   5    12%
Researcher                           4    10%
Professor                            3     7%
Other                                7    17%
Unknown                              5    12%

Readers by discipline            Count   As %
Psychology                          11    27%
Computer Science                     6    15%
Arts and Humanities                  5    12%
Nursing and Health Professions       4    10%
Medicine and Dentistry               3     7%
Other                                5    12%
Unknown                              7    17%