Zipf’s word frequency law in natural language: A critical review and future directions

Overview of attention for article published in Psychonomic Bulletin & Review, March 2014

Mentioned by

8 news outlets
4 blogs
33 X users
1 patent
1 Wikipedia page
1 Google+ user

Citations

498 Dimensions

Readers on

420 Mendeley
Title
Zipf’s word frequency law in natural language: A critical review and future directions
Published in
Psychonomic Bulletin & Review, March 2014
DOI 10.3758/s13423-014-0585-6
Authors

Steven T. Piantadosi

Abstract

The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data.

X Demographics

The data shown below were collected from the profiles of 33 X users who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 420 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
United States 7 2%
Germany 3 <1%
Italy 2 <1%
United Kingdom 2 <1%
Portugal 1 <1%
Netherlands 1 <1%
Brazil 1 <1%
Colombia 1 <1%
France 1 <1%
Other 4 <1%
Unknown 397 95%

Demographic breakdown

Readers by professional status Count As %
Student > Ph. D. Student 108 26%
Student > Master 58 14%
Researcher 48 11%
Student > Bachelor 35 8%
Student > Doctoral Student 32 8%
Other 72 17%
Unknown 67 16%

Readers by discipline Count As %
Computer Science 72 17%
Linguistics 68 16%
Psychology 56 13%
Social Sciences 21 5%
Engineering 19 5%
Other 94 22%
Unknown 90 21%