# Zipf-Mandelbrot law

In probability theory and statistics, the Zipf-Mandelbrot law is a discrete probability distribution. Also known as the Pareto-Zipf law, it is a power-law distribution on ranked data, named after the Harvard linguistics professor George Kingsley Zipf (1902-1950) who suggested a simpler distribution called Zipf's law, and the mathematician Benoit Mandelbrot (born November 20, 1924), who subsequently generalized it.

The probability mass function is given by:

[itex]f_k(N,q,s)=\frac{1/(k+q)^s}{H_{N,q,s}}[itex]

where [itex]H_{N,q,s}[itex] is given by:

[itex]H_{N,q,s}=\sum_{i=1}^N \frac{1}{(i+q)^s}[itex]

which may be thought of as a generalization of a harmonic number. In the limit as [itex]N[itex] approaches infinity, this becomes the Hurwitz zeta function [itex]\zeta(q,s)[itex]. For finite [itex]N[itex] and [itex]q=0[itex] the Zipf-Mandelbrot law becomes Zipf's law. For infinite [itex]N[itex] and [itex]q=0[itex] it becomes a Zeta distribution.

## Applications

The distribution of words ranked by their frequency in a random corpus of text is generally a power-law distribution, known as Zipf's law.

If one plots the frequency rank of words contained in a large corpus of text data versus the number of occurrences or actual frequencies, one obtains a power-law distribution, with exponent close to one (but see Gelbukh and Sidoro 2001).

• Art and Cultures
• Countries of the World (http://www.academickids.com/encyclopedia/index.php/Countries)
• Space and Astronomy