Statistics.Quantile

Portability: portable
Stability: experimental
Maintainer: bos@serpentine.com

Description
Functions for approximating quantiles, i.e. points taken at regular
intervals from the cumulative distribution function of a random
variable.

The number of quantiles is denoted below by the variable q. For
example, with q=4, a 4-quantile (also known as a quartile) divides
the distribution into 4 intervals delimited by 5 points. The
parameter k selects the desired point, where 0 ≤ k ≤ q.
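
To make the relationship between q and k concrete, the following small sketch lists the fraction k/q for every valid k. The helper name quantileFractions is hypothetical and is not part of this module.

```haskell
-- Hypothetical helper, not part of Statistics.Quantile.
-- Lists the fractions k/q for k = 0..q, i.e. the q+1 points of a q-quantile.
quantileFractions :: Int -> [Double]
quantileFractions q = [fromIntegral k / fromIntegral q | k <- [0 .. q]]
```

For quartiles, quantileFractions 4 yields the five points 0, 0.25, 0.5, 0.75 and 1.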
Quantile estimation functions
:: Vector v Double
=> Int       -- ^ k, the desired quantile.
-> Int       -- ^ q, the number of quantiles.
-> v Double  -- ^ x, the sample data.
-> Double

O(n log n). Estimate the kth q-quantile of a sample,
using the weighted average method.
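
As an illustration only, a weighted average estimate might be sketched on plain lists as follows. This is not the library's code: the name weightedAvgSketch is hypothetical, and the sketch assumes that the fractional position in the sorted sample is (n - 1) * k / q (0-based) and that the estimate interpolates linearly between the two neighbouring values.

```haskell
import Data.List (sort)

-- Hypothetical list-based sketch; the function documented here works on vectors.
-- Assumption: the fractional position is idx = (n - 1) * k / q (0-based) in the
-- sorted sample, and the estimate is a weighted average of the two neighbours.
weightedAvgSketch :: Int -> Int -> [Double] -> Double
weightedAvgSketch k q xs
  | null xs        = error "weightedAvgSketch: empty sample"
  | q < 2          = error "weightedAvgSketch: need at least 2 quantiles"
  | k < 0 || k > q = error "weightedAvgSketch: k out of range"
  | otherwise      = xj + g * (xj1 - xj)
  where
    sorted = sort xs                        -- the O(n log n) step
    n      = length sorted
    idx    = fromIntegral (n - 1) * fromIntegral k / fromIntegral q :: Double
    j      = floor idx :: Int               -- index of the lower neighbour
    g      = idx - fromIntegral j           -- weight given to the upper neighbour
    xj     = sorted !! j
    xj1    = sorted !! min (j + 1) (n - 1)  -- clamp at the last element
```

Under these assumptions, weightedAvgSketch 1 4 [1,2,3,4] lands at position 0.75 and so returns 1.75.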
Parameters a and b to the continuousBy function.
:: Vector v Double
=> ContParam  -- ^ Parameters a and b.
-> Int        -- ^ k, the desired quantile.
-> Int        -- ^ q, the number of quantiles.
-> v Double   -- ^ x, the sample data.
-> Double

O(n log n). Estimate the kth q-quantile of a sample x,
using the continuous sample method with the given parameters. This
is the method used by most statistical software, such as R,
Mathematica, SPSS, and S.
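
The following is a minimal list-based sketch of the continuous sample method, not the library's implementation. The names Params and continuousSketch are hypothetical. The sketch assumes the 1-based fractional position t = a + p * (n + 1 - a - b) with p = k / q, followed by linear interpolation between neighbouring order statistics; it does not guard against floating-point rounding when t is very close to an integer.

```haskell
import Data.List (sort)

-- Hypothetical stand-in for the ContParam type: the parameters a and b.
data Params = Params Double Double

-- List-based sketch of the continuous sample method (not the library code).
-- Assumption: the 1-based fractional position is t = a + p * (n + 1 - a - b),
-- with p = k / q, and the estimate interpolates between order statistics.
continuousSketch :: Params -> Int -> Int -> [Double] -> Double
continuousSketch (Params a b) k q xs
  | null xs        = error "continuousSketch: empty sample"
  | q < 2          = error "continuousSketch: need at least 2 quantiles"
  | k < 0 || k > q = error "continuousSketch: k out of range"
  | otherwise      = (1 - h) * item (j - 1) + h * item j
  where
    sorted = sort xs                          -- the O(n log n) step
    n      = length sorted
    p      = fromIntegral k / fromIntegral q  -- target fraction k/q
    t      = a + p * (fromIntegral n + 1 - a - b)
    j      = floor t :: Int                   -- 1-based lower order statistic
    h      = t - fromIntegral j               -- interpolation weight
    item i = sorted !! min (max i 0) (n - 1)  -- 0-based access, clamped
```

With a=1, b=1, the first quartile of [1,2,3,4,5] gives t = 2 exactly, so continuousSketch (Params 1 1) 1 4 [1,2,3,4,5] returns 2.0.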
:: Vector v Double
=> ContParam  -- ^ Parameters a and b.
-> Int        -- ^ q, the number of quantiles.
-> v Double   -- ^ x, the sample data.
-> Double

O(n log n). Estimate the range between q-quantiles 1 and
q-1 of a sample x, using the continuous sample method with the
given parameters.

For instance, the interquartile range (IQR) can be estimated as
follows:

    midspread medianUnbiased 4 (U.fromList [1,1,2,2,3])
    ==> 1.333333
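
The value 1.333333 can be checked by hand. The working below assumes that the continuous sample method places the kth q-quantile at the 1-based fractional position t_k = a + (k/q)(n + 1 - a - b) and interpolates linearly between the neighbouring order statistics; this formula is an assumption of the sketch, not something stated on this page. With a = b = 1/3, n = 5 and the sorted sample 1, 1, 2, 2, 3:

```latex
\begin{align*}
t_k &= a + \tfrac{k}{q}\,(n + 1 - a - b), \qquad
       j_k = \lfloor t_k \rfloor, \qquad h_k = t_k - j_k \\
Q_k &= (1 - h_k)\,x_{(j_k)} + h_k\,x_{(j_k + 1)} \\[4pt]
t_1 &= \tfrac{1}{3} + \tfrac{1}{4}\cdot\tfrac{16}{3} = \tfrac{5}{3}
       \;\Rightarrow\; Q_1 = \tfrac{1}{3}\,x_{(1)} + \tfrac{2}{3}\,x_{(2)}
       = \tfrac{1}{3}\cdot 1 + \tfrac{2}{3}\cdot 1 = 1 \\
t_3 &= \tfrac{1}{3} + \tfrac{3}{4}\cdot\tfrac{16}{3} = \tfrac{13}{3}
       \;\Rightarrow\; Q_3 = \tfrac{2}{3}\,x_{(4)} + \tfrac{1}{3}\,x_{(5)}
       = \tfrac{2}{3}\cdot 2 + \tfrac{1}{3}\cdot 3 = \tfrac{7}{3} \\
Q_3 - Q_1 &= \tfrac{7}{3} - 1 = \tfrac{4}{3} \approx 1.333333
\end{align*}
```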
Parameters for the continuous sample method
California Department of Public Works definition, a=0, b=1.
Gives a linear interpolation of the empirical CDF. This
corresponds to method 4 in R and Mathematica.
Hazen's definition, a=0.5, b=0.5. This is claimed to be
popular among hydrologists. This corresponds to method 5 in R and
Mathematica.
Definition used by the S statistics application, with a=1,
b=1. The interpolation points divide the sample range into n-1
intervals. This corresponds to method 7 in R and Mathematica.
Definition used by the SPSS statistics application, with a=0,
b=0 (also known as Weibull's definition). This corresponds to
method 6 in R and Mathematica.
Median unbiased definition, a=1/3, b=1/3. The resulting
quantile estimates are approximately median unbiased regardless of
the distribution of x. This corresponds to method 8 in R and
Mathematica.
Normal unbiased definition, a=3/8, b=3/8. An approximately
unbiased estimate if the empirical distribution approximates the
normal distribution. This corresponds to method 9 in R and
Mathematica.
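
For quick reference, the six definitions above can be collected into a lookup table keyed by the R and Mathematica method number. This is only a sketch built from the values listed in this section; the names continuousMethods and paramsForMethod are hypothetical and are not exported by this module.

```haskell
-- Hypothetical lookup table, built only from the definitions listed above:
-- R/Mathematica method number -> (description, a, b).
continuousMethods :: [(Int, (String, Double, Double))]
continuousMethods =
  [ (4, ("California Department of Public Works", 0,     1))
  , (5, ("Hazen",                                 0.5,   0.5))
  , (6, ("SPSS (Weibull)",                        0,     0))
  , (7, ("S",                                     1,     1))
  , (8, ("Median unbiased",                       1 / 3, 1 / 3))
  , (9, ("Normal unbiased",                       3 / 8, 3 / 8))
  ]

-- Look up the (a, b) pair for a given R/Mathematica method number.
paramsForMethod :: Int -> Maybe (Double, Double)
paramsForMethod m = do
  (_, a, b) <- lookup m continuousMethods
  pure (a, b)
```

For example, paramsForMethod 7 gives Just (1.0,1.0), while a method number outside 4 to 9 gives Nothing.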
Produced by Haddock version 2.4.2