Title | Zipf’s word frequency law in natural language: A critical review and future directions |
---|---|
Published in | Psychonomic Bulletin & Review, March 2014 |
DOI | 10.3758/s13423-014-0585-6 |
Pubmed ID | |
Authors | Steven T. Piantadosi |
Abstract | The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data. |
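
For readers unfamiliar with the "simple mathematical form" the abstract mentions: Zipf's law states that the frequency of the r-th most frequent word is roughly proportional to 1/r^α, with α close to 1. The following is a minimal illustrative sketch, not the article's own method. The function names `rank_frequency` and `zipf_exponent` are hypothetical, and the log-log least-squares fit is only a crude way to estimate α.

```python
from collections import Counter
import math


def rank_frequency(text):
    """Return word counts sorted from most to least frequent."""
    counts = Counter(text.lower().split())
    return [count for _, count in counts.most_common()]


def zipf_exponent(freqs):
    """Estimate alpha as the negated least-squares slope of
    log(frequency) against log(rank). Assumes freqs is sorted in
    descending order and has at least two entries."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


# Synthetic frequencies that follow f(r) = 1000 / r exactly,
# so the estimated exponent should be very close to 1.
ideal = [1000 / rank for rank in range(1, 51)]
alpha = zipf_exponent(ideal)
```

On a real corpus, `zipf_exponent(rank_frequency(corpus_text))` would give a rough estimate. A good fit on a log-log plot is weak evidence on its own, which is in line with the abstract's call to seek evidence beyond the law itself.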
X Demographics
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 8 | 24% |
United Kingdom | 3 | 9% |
Mexico | 2 | 6% |
Czechia | 1 | 3% |
Spain | 1 | 3% |
Germany | 1 | 3% |
New Zealand | 1 | 3% |
Taiwan | 1 | 3% |
Unknown | 15 | 45% |
Demographic breakdown
Type | Count | As % |
---|---|---|
Members of the public | 26 | 79% |
Scientists | 5 | 15% |
Science communicators (journalists, bloggers, editors) | 2 | 6% |
Mendeley readers
Geographical breakdown
Country | Count | As % |
---|---|---|
United States | 7 | 2% |
Germany | 3 | <1% |
Italy | 2 | <1% |
United Kingdom | 2 | <1% |
Portugal | 1 | <1% |
Netherlands | 1 | <1% |
Brazil | 1 | <1% |
Colombia | 1 | <1% |
France | 1 | <1% |
Other | 4 | <1% |
Unknown | 397 | 95% |
Demographic breakdown
Readers by professional status | Count | As % |
---|---|---|
Student > Ph. D. Student | 108 | 26% |
Student > Master | 58 | 14% |
Researcher | 48 | 11% |
Student > Bachelor | 35 | 8% |
Student > Doctoral Student | 32 | 8% |
Other | 72 | 17% |
Unknown | 67 | 16% |
Readers by discipline | Count | As % |
---|---|---|
Computer Science | 72 | 17% |
Linguistics | 68 | 16% |
Psychology | 56 | 13% |
Social Sciences | 21 | 5% |
Engineering | 19 | 5% |
Other | 94 | 22% |
Unknown | 90 | 21% |