org.apache.commons.math3.distribution
Class ZipfDistribution

java.lang.Object
  extended by org.apache.commons.math3.distribution.AbstractIntegerDistribution
      extended by org.apache.commons.math3.distribution.ZipfDistribution
All Implemented Interfaces:
Serializable, IntegerDistribution

public class ZipfDistribution
extends AbstractIntegerDistribution

Implementation of the Zipf distribution.

Version:
$Id: ZipfDistribution.java 1244107 2012-02-14 16:17:55Z erans $
See Also:
Zipf distribution (MathWorld), Serialized Form

Field Summary
private  double exponent
          Exponent parameter of the distribution.
private  int numberOfElements
          Number of elements.
private  double numericalMean
          Cached numerical mean.
private  boolean numericalMeanIsCalculated
          Whether or not the numerical mean has been calculated.
private  double numericalVariance
          Cached numerical variance.
private  boolean numericalVarianceIsCalculated
          Whether or not the numerical variance has been calculated.
private static long serialVersionUID
          Serializable version identifier.
 
Fields inherited from class org.apache.commons.math3.distribution.AbstractIntegerDistribution
randomData
 
Constructor Summary
ZipfDistribution(int numberOfElements, double exponent)
          Create a new Zipf distribution with the given number of elements and exponent.
 
Method Summary
protected  double calculateNumericalMean()
          Used by getNumericalMean().
protected  double calculateNumericalVariance()
          Used by getNumericalVariance().
 double cumulativeProbability(int x)
          For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x).
private  double generalizedHarmonic(int n, double m)
          Calculates the Nth generalized harmonic number.
 double getExponent()
          Get the exponent characterizing the distribution.
 int getNumberOfElements()
          Get the number of elements (e.g. corpus size) for the distribution.
 double getNumericalMean()
          Use this method to get the numerical value of the mean of this distribution.
 double getNumericalVariance()
          Use this method to get the numerical value of the variance of this distribution.
 int getSupportLowerBound()
          Access the lower bound of the support.
 int getSupportUpperBound()
          Access the upper bound of the support.
 boolean isSupportConnected()
          Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support.
 double probability(int x)
          For a random variable X whose values are distributed according to this distribution, this method returns P(X = x).
 
Methods inherited from class org.apache.commons.math3.distribution.AbstractIntegerDistribution
cumulativeProbability, inverseCumulativeProbability, reseedRandomGenerator, sample, sample, solveInverseCumulativeProbability
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

serialVersionUID

private static final long serialVersionUID
Serializable version identifier.

See Also:
Constant Field Values

numberOfElements

private final int numberOfElements
Number of elements.


exponent

private final double exponent
Exponent parameter of the distribution.


numericalMean

private double numericalMean
Cached numerical mean.


numericalMeanIsCalculated

private boolean numericalMeanIsCalculated
Whether or not the numerical mean has been calculated.


numericalVariance

private double numericalVariance
Cached numerical variance.


numericalVarianceIsCalculated

private boolean numericalVarianceIsCalculated
Whether or not the numerical variance has been calculated.

Constructor Detail

ZipfDistribution

public ZipfDistribution(int numberOfElements,
                        double exponent)
                 throws NotStrictlyPositiveException
Create a new Zipf distribution with the given number of elements and exponent.

Parameters:
numberOfElements - Number of elements.
exponent - Exponent.
Throws:
NotStrictlyPositiveException - if numberOfElements <= 0 or exponent <= 0.
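
The parameter check described above can be sketched without the library. ZipfParams is a hypothetical stand-in class, and IllegalArgumentException is assumed here in place of NotStrictlyPositiveException so that the example has no external dependency.

```java
// Hypothetical stand-in for illustration; not the library's implementation.
class ZipfParams {
    private final int numberOfElements;
    private final double exponent;

    ZipfParams(int numberOfElements, double exponent) {
        // The documented constructor throws NotStrictlyPositiveException here;
        // IllegalArgumentException is an assumption made to stay dependency-free.
        if (numberOfElements <= 0) {
            throw new IllegalArgumentException(
                "numberOfElements must be > 0, got " + numberOfElements);
        }
        if (exponent <= 0) {
            throw new IllegalArgumentException(
                "exponent must be > 0, got " + exponent);
        }
        this.numberOfElements = numberOfElements;
        this.exponent = exponent;
    }

    int getNumberOfElements() {
        return numberOfElements;
    }

    double getExponent() {
        return exponent;
    }
}
```

Both fields are final, matching the field declarations listed for this class, so a successfully constructed instance never changes its parameters.
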
Method Detail

getNumberOfElements

public int getNumberOfElements()
Get the number of elements (e.g. corpus size) for the distribution.

Returns:
the number of elements

getExponent

public double getExponent()
Get the exponent characterizing the distribution.

Returns:
the exponent

probability

public double probability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X = x). In other words, this method represents the probability mass function (PMF) for the distribution.

Parameters:
x - the point at which the PMF is evaluated
Returns:
the value of the probability mass function at x
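
As a rough illustration of the PMF, the standard Zipf mass function is P(X = x) = (1 / x^s) / H(N, s) for 1 <= x <= N, where H(N, s) is the generalized harmonic number. The sketch below is independent of the library; the class name ZipfPmfSketch and its static methods are assumptions made for this example.

```java
// Stand-alone sketch of the Zipf PMF; names are hypothetical.
final class ZipfPmfSketch {
    /** Generalized harmonic number H(n, m) = sum over k = 1..n of 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        // Add the smallest terms first to limit rounding error.
        for (int k = n; k >= 1; k--) {
            sum += 1.0 / Math.pow(k, m);
        }
        return sum;
    }

    /** P(X = x) for n elements and exponent s; 0 outside the support [1, n]. */
    static double probability(int x, int n, double s) {
        if (x < 1 || x > n) {
            return 0.0;
        }
        return (1.0 / Math.pow(x, s)) / generalizedHarmonic(n, s);
    }
}
```

For N = 3 and s = 1, H(3, 1) = 11/6, so the three masses are 6/11, 3/11 and 2/11, which sum to 1.
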

cumulativeProbability

public double cumulativeProbability(int x)
For a random variable X whose values are distributed according to this distribution, this method returns P(X <= x). In other words, this method represents the (cumulative) distribution function (CDF) for this distribution.

Parameters:
x - the point at which the CDF is evaluated
Returns:
the probability that a random variable with this distribution takes a value less than or equal to x
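
One way to evaluate the CDF is as a ratio of generalized harmonic numbers, P(X <= x) = H(x, s) / H(N, s), clamped to 0 below the support and 1 at or above its upper bound. The following is a minimal stand-alone sketch under that assumption; ZipfCdfSketch and its methods are hypothetical names, not the library's API.

```java
// Stand-alone sketch of the Zipf CDF; names are hypothetical.
final class ZipfCdfSketch {
    /** Generalized harmonic number H(n, m) = sum over k = 1..n of 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int k = n; k >= 1; k--) {
            sum += 1.0 / Math.pow(k, m);
        }
        return sum;
    }

    /** P(X <= x) for n elements and exponent s. */
    static double cumulativeProbability(int x, int n, double s) {
        if (x <= 0) {
            return 0.0; // below the support
        }
        if (x >= n) {
            return 1.0; // at or above the upper bound of the support
        }
        return generalizedHarmonic(x, s) / generalizedHarmonic(n, s);
    }
}
```

With N = 3 and s = 1 this gives 6/11 at x = 1, 9/11 at x = 2 and 1 at x = 3.
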

getNumericalMean

public double getNumericalMean()
Use this method to get the numerical value of the mean of this distribution. For number of elements N and exponent s, the mean is Hs1 / Hs, where

Hs1 = generalizedHarmonic(N, s - 1),
Hs = generalizedHarmonic(N, s).

Returns:
the mean or Double.NaN if it is not defined
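
The mean formula can be checked numerically with a small stand-alone sketch. It assumes the generalized harmonic number H(n, m) is the sum of 1 / k^m for k = 1..n; ZipfMeanSketch is a hypothetical name and the code does not depend on the library.

```java
// Stand-alone sketch of the mean as a ratio of harmonic numbers; names are hypothetical.
final class ZipfMeanSketch {
    /** Generalized harmonic number H(n, m) = sum over k = 1..n of 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int k = n; k >= 1; k--) {
            sum += 1.0 / Math.pow(k, m);
        }
        return sum;
    }

    /** Mean for n elements and exponent s: H(n, s - 1) / H(n, s). */
    static double mean(int n, double s) {
        double hs1 = generalizedHarmonic(n, s - 1);
        double hs = generalizedHarmonic(n, s);
        return hs1 / hs;
    }
}
```

For N = 3 and s = 1 the result is 3 / (11/6) = 18/11, which agrees with the direct sum 1 * 6/11 + 2 * 3/11 + 3 * 2/11.
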

calculateNumericalMean

protected double calculateNumericalMean()
Used by getNumericalMean().

Returns:
the mean of this distribution
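
The numericalMean and numericalMeanIsCalculated fields suggest a compute-once caching pattern between the public getter and this protected method. A minimal sketch of that pattern follows; CachedMeanSketch and its calculations counter are hypothetical and exist only to make the caching observable.

```java
// Stand-alone sketch of lazy caching; not the library's source.
class CachedMeanSketch {
    private final int n;
    private final double s;
    private double numericalMean = Double.NaN;
    private boolean numericalMeanIsCalculated = false;
    int calculations = 0; // hypothetical counter, for demonstration only

    CachedMeanSketch(int n, double s) {
        this.n = n;
        this.s = s;
    }

    /** Returns the cached mean, computing it on the first call only. */
    double getNumericalMean() {
        if (!numericalMeanIsCalculated) {
            numericalMean = calculateNumericalMean();
            numericalMeanIsCalculated = true;
        }
        return numericalMean;
    }

    /** Computes H(n, s - 1) / H(n, s) from scratch. */
    protected double calculateNumericalMean() {
        calculations++;
        double hs1 = 0.0;
        double hs = 0.0;
        for (int k = 1; k <= n; k++) {
            hs1 += Math.pow(k, 1.0 - s); // term of H(n, s - 1)
            hs += Math.pow(k, -s);       // term of H(n, s)
        }
        return hs1 / hs;
    }
}
```

Because the parameters are final, the cached value never goes stale, which is what makes a simple boolean flag sufficient.
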

getNumericalVariance

public double getNumericalVariance()
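Use this method to get the numerical value of the variance of this distribution. For number of elements N and exponent s, the variance is (Hs2 / Hs) - (Hs1^2 / Hs^2), where

Hs2 = generalizedHarmonic(N, s - 2),
Hs1 = generalizedHarmonic(N, s - 1),
Hs = generalizedHarmonic(N, s).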

Returns:
the variance (possibly Double.POSITIVE_INFINITY or Double.NaN if it is not defined)
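
The variance formula is E[X^2] - E[X]^2 written in terms of generalized harmonic numbers. The stand-alone sketch below evaluates it directly; ZipfVarianceSketch is a hypothetical name, and H(n, m) is assumed to be the sum of 1 / k^m for k = 1..n.

```java
// Stand-alone sketch of the variance formula; names are hypothetical.
final class ZipfVarianceSketch {
    /** Generalized harmonic number H(n, m) = sum over k = 1..n of 1 / k^m. */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        for (int k = n; k >= 1; k--) {
            sum += 1.0 / Math.pow(k, m);
        }
        return sum;
    }

    /** Variance for n elements and exponent s: Hs2 / Hs - Hs1^2 / Hs^2. */
    static double variance(int n, double s) {
        double hs2 = generalizedHarmonic(n, s - 2); // numerator of E[X^2]
        double hs1 = generalizedHarmonic(n, s - 1); // numerator of E[X]
        double hs = generalizedHarmonic(n, s);      // normalizing constant
        return (hs2 / hs) - ((hs1 * hs1) / (hs * hs));
    }
}
```

For N = 3 and s = 1: Hs2 = 6, Hs1 = 3 and Hs = 11/6, so the variance is 36/11 - 324/121 = 72/121.
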

calculateNumericalVariance

protected double calculateNumericalVariance()
Used by getNumericalVariance().

Returns:
the variance of this distribution

generalizedHarmonic

private double generalizedHarmonic(int n,
                                   double m)
Calculates the Nth generalized harmonic number. See Harmonic Series.

Parameters:
n - Term in the series to calculate (must be at least 1)
m - Exponent (special case m = 1 is the harmonic series).
Returns:
the nth generalized harmonic number.
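
The generalized harmonic number is the finite sum H(n, m) = 1/1^m + 1/2^m + ... + 1/n^m. A minimal stand-alone sketch of that sum follows; HarmonicSketch is a hypothetical name, and summing from the smallest term upward is a design choice of this example, not something the source states.

```java
// Stand-alone sketch of the generalized harmonic number; names are hypothetical.
final class HarmonicSketch {
    /**
     * H(n, m) = sum over k = 1..n of 1 / k^m.
     * With m = 1 this is the n-th partial sum of the harmonic series.
     */
    static double generalizedHarmonic(int n, double m) {
        double sum = 0.0;
        // Accumulate from the smallest term (k = n) to the largest (k = 1)
        // so that small contributions are not lost to rounding.
        for (int k = n; k >= 1; k--) {
            sum += 1.0 / Math.pow(k, m);
        }
        return sum;
    }
}
```

For example, H(3, 1) = 1 + 1/2 + 1/3 = 11/6, and H(n, 0) = n because every term equals 1.
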

getSupportLowerBound

public int getSupportLowerBound()
Access the lower bound of the support. This method must return the same value as inverseCumulativeProbability(0). In other words, this method must return

inf {x in Z | P(X <= x) > 0}.

The lower bound of the support is always 1 no matter the parameters.

Returns:
lower bound of the support (always 1)

getSupportUpperBound

public int getSupportUpperBound()
Access the upper bound of the support. This method must return the same value as inverseCumulativeProbability(1). In other words, this method must return

inf {x in Z | P(X <= x) = 1}.

The upper bound of the support is the number of elements.

Returns:
upper bound of the support
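
The two infimum definitions for the support bounds can be illustrated by scanning a CDF for the first integer where it becomes positive and the first where it reaches 1. The sketch below is stand-alone; ZipfSupportSketch and its methods are hypothetical, and it targets small N (for very large N or s, floating-point rounding could make the computed CDF reach 1.0 early).

```java
// Stand-alone sketch of the support bounds via the CDF; names are hypothetical.
final class ZipfSupportSketch {
    /** P(X <= x) for n elements and exponent s. */
    static double cdf(int x, int n, double s) {
        if (x <= 0) {
            return 0.0;
        }
        if (x >= n) {
            return 1.0;
        }
        double partial = 0.0;
        double total = 0.0;
        for (int k = 1; k <= n; k++) {
            double term = 1.0 / Math.pow(k, s);
            total += term;
            if (k <= x) {
                partial += term;
            }
        }
        return partial / total;
    }

    /** Smallest integer x with P(X <= x) > 0, searched over [-1, n + 1]. */
    static int lowerBound(int n, double s) {
        for (int x = -1; x <= n + 1; x++) {
            if (cdf(x, n, s) > 0.0) {
                return x;
            }
        }
        throw new IllegalStateException("no lower bound found");
    }

    /** Smallest integer x with P(X <= x) = 1, searched over [-1, n + 1]. */
    static int upperBound(int n, double s) {
        for (int x = -1; x <= n + 1; x++) {
            if (cdf(x, n, s) >= 1.0) {
                return x;
            }
        }
        throw new IllegalStateException("no upper bound found");
    }
}
```

For N = 5 and s = 1.2 the scan returns 1 and 5, matching the documented lower bound of 1 and an upper bound equal to the number of elements.
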

isSupportConnected

public boolean isSupportConnected()
Use this method to get information about whether the support is connected, i.e. whether all integers between the lower and upper bound of the support are included in the support. The support of this distribution is connected.

Returns:
true


Copyright (c) 2003-2013 Apache Software Foundation