org.apache.commons.math3.stat.regression
public class MillerUpdatingRegression extends Object implements UpdatingMultipleLinearRegression
This class is a concrete implementation of the UpdatingMultipleLinearRegression interface.
The algorithm is described in:
Alan J. Miller, "Algorithm AS 274: Least Squares Routines to Supplement Those of Gentleman", Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 41, No. 2 (1992), pp. 458-478. Published by Blackwell Publishing for the Royal Statistical Society. Stable URL: http://www.jstor.org/stable/2347583
This method for multiple regression forms the solution to the OLS problem by updating the QR decomposition as described by Gentleman.
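The updating idea can be illustrated with a small stand-alone sketch of a square-root-free Givens update in the style of Gentleman, followed by back-substitution to recover the coefficients. The class name `GentlemanUpdateSketch`, its field names, and the use of plain floating-point addition are assumptions made for illustration; this is not the library's code, and it omits the singularity checks and guarded addition of the real class.

```java
/** Hypothetical stand-alone sketch; not the library class. */
final class GentlemanUpdateSketch {
    private final int n;        // number of columns, including any constant column
    private final double[] d;   // diagonal scaling factors
    private final double[] r;   // strict upper triangle of R, packed by rows
    private final double[] rhs; // transformed response
    private double sserr;       // residual sum of squares accumulated so far

    GentlemanUpdateSketch(int n) {
        this.n = n;
        this.d = new double[n];
        this.r = new double[n * (n - 1) / 2];
        this.rhs = new double[n];
    }

    /** Folds one row (x, y) with weight w into the factorization. */
    void include(double[] row, double w, double y) {
        double[] x = row.clone(); // the update overwrites its working copy
        int next = 0;             // running position in the packed array r
        for (int i = 0; i < n; i++) {
            if (w == 0.0) {
                return;           // nothing left to contribute
            }
            double xi = x[i];
            if (xi == 0.0) {
                next += n - i - 1; // skip row i of R
                continue;
            }
            double di = d[i];
            double dpi = di + w * xi * xi;
            double cbar = di / dpi;
            double sbar = w * xi / dpi;
            w = cbar * w;
            d[i] = dpi;
            for (int k = i + 1; k < n; k++, next++) {
                double xk = x[k];
                x[k] = xk - xi * r[next];
                r[next] = cbar * r[next] + sbar * xk;
            }
            double yk = y;
            y = yk - xi * rhs[i];
            rhs[i] = cbar * rhs[i] + sbar * yk;
        }
        sserr += w * y * y;
    }

    /** Solves the unit upper-triangular system R * beta = rhs. */
    double[] coefficients() {
        double[] beta = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double b = rhs[i];
            int pos = i * (2 * n - i - 1) / 2; // start of row i in r
            for (int k = i + 1; k < n; k++) {
                b -= r[pos + (k - i - 1)] * beta[k];
            }
            beta[i] = b;
        }
        return beta;
    }

    double residualSumOfSquares() {
        return sserr;
    }
}
```

Each call to `include` costs O(n^2) work and needs no stored observations, which is what makes this kind of regression suitable for streaming data.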
Modifier and Type | Field and Description |
---|---|
private double[] | d: diagonals of cross products matrix |
private double | epsilon: zero tolerance |
private boolean | hasIntercept: boolean flag whether a regression constant is added |
private boolean[] | lindep: flags for variables with linear dependency problems |
private long | nobs: number of observations entered |
private int | nvars: number of variables in regression |
private double[] | r: the off diagonal portion of the R matrix |
private double[] | rhs: the elements of R'Y |
private double[] | rss: residual sum of squares for all nested regressions |
private boolean | rss_set: has rss been called? |
private double | sserr: sum of squared errors of largest regression |
private double | sumsqy: summation of squared Y values |
private double | sumy: summation of Y variable |
private double[] | tol: the tolerance for each of the variables |
private boolean | tol_set: has the tolerance setting method been called |
private int[] | vorder: order of the regressors |
private double[] | work_sing: workspace for singularity method |
private double[] | work_tolset: scratch space for tolerance calc |
private double[] | x_sing: singular x values |
Modifier | Constructor and Description |
---|---|
private | MillerUpdatingRegression(): Set the default constructor to private access to prevent inadvertent instantiation |
public | MillerUpdatingRegression(int numberOfVariables, boolean includeConstant): Primary constructor for the MillerUpdatingRegression. |
public | MillerUpdatingRegression(int numberOfVariables, boolean includeConstant, double errorTolerance): This is the augmented constructor for the MillerUpdatingRegression class. |
Modifier and Type | Method and Description |
---|---|
void | addObservation(double[] x, double y): Adds an observation to the regression model. |
void | addObservations(double[][] x, double[] y): Adds multiple observations to the model. |
void | clear(): As the name suggests, clear wipes the internals and reorders everything in the canonical order. |
private double[] | cov(int nreq): Calculates the covariance matrix assuming only the first nreq variables are included in the calculation. |
double | getDiagonalOfHatMatrix(double[] row_data): Gets the diagonal of the hat matrix, also known as the leverage matrix. |
long | getN(): Gets the number of observations added to the regression model. |
int[] | getOrderOfRegressors(): Gets the order of the regressors, useful if some type of reordering has been called. |
double[] | getPartialCorrelations(int in): In the original algorithm only the partial correlations of the regressors are returned to the user. |
boolean | hasIntercept(): A getter method which determines whether a constant is included. |
private void | include(double[] x, double wi, double yi): The include method is where the QR decomposition occurs. |
private void | inverse(double[] rinv, int nreq): This internal method calculates the inverse of the upper-triangular portion of the R matrix. |
private double[] | regcf(int nreq): The regcf method conducts the linear regression and extracts the parameter vector. |
RegressionResults | regress(): Conducts a regression on the data in the model, using all regressors. |
RegressionResults | regress(int numberOfRegressors): Conducts a regression on the data in the model, using a subset of regressors. |
RegressionResults | regress(int[] variablesToInclude): Conducts a regression on the data in the model, using the regressors in the given array. Calling this method will change the internal order of the regressors, and care is required in interpreting the hat matrix. |
private int | reorderRegressors(int[] list, int pos1): ALGORITHM AS274 APPL. |
private void | singcheck(): The method which checks for singularities and then eliminates the offending columns. |
private double | smartAdd(double a, double b): Adds two numbers a and b such that the contamination due to numerical smallness of one addend does not corrupt the sum. |
private void | ss(): Calculates the sum of squared errors for the full regression and all nested subsets. |
private void | tolset(): This sets up tolerances for singularity testing. |
private void | vmove(int from, int to): ALGORITHM AS274 APPL. |
private final int nvars
private final double[] d
private final double[] rhs
private final double[] r
private final double[] tol
private final double[] rss
private final int[] vorder
private final double[] work_tolset
private long nobs
private double sserr
private boolean rss_set
private boolean tol_set
private final boolean[] lindep
private final double[] x_sing
private final double[] work_sing
private double sumy
private double sumsqy
private boolean hasIntercept
private final double epsilon
private MillerUpdatingRegression()
public MillerUpdatingRegression(int numberOfVariables, boolean includeConstant, double errorTolerance) throws ModelSpecificationException
Parameters:
numberOfVariables - number of regressors to expect, not including constant
includeConstant - include a constant automatically
errorTolerance - zero tolerance, how machine zero is determined
Throws:
ModelSpecificationException - if numberOfVariables is less than 1
public MillerUpdatingRegression(int numberOfVariables, boolean includeConstant) throws ModelSpecificationException
Parameters:
numberOfVariables - maximum number of potential regressors
includeConstant - include a constant automatically
Throws:
ModelSpecificationException - if numberOfVariables is less than 1
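To make the constructor arguments concrete, here is a minimal sketch of how the column count and the size of a packed off-diagonal array could be derived from them. It assumes a row-packed strict upper triangle for the `r` field; the class name `PackedStorageSketch`, its method names, and the use of `IllegalArgumentException` as a stand-in for `ModelSpecificationException` are all hypothetical.

```java
/** Hypothetical helper illustrating storage sizes; not part of the library. */
final class PackedStorageSketch {

    /** Number of regression columns: the regressors plus an optional constant. */
    static int columns(int numberOfVariables, boolean includeConstant) {
        if (numberOfVariables < 1) {
            // stand-in for the ModelSpecificationException documented above
            throw new IllegalArgumentException("need at least one regressor");
        }
        return includeConstant ? numberOfVariables + 1 : numberOfVariables;
    }

    /** Length of a row-packed strict upper triangle for n columns. */
    static int offDiagonalLength(int n) {
        return n * (n - 1) / 2;
    }

    /** Position of element (row, col), with row < col and 0-based indices. */
    static int offDiagonalIndex(int row, int col, int n) {
        return row * (2 * n - row - 1) / 2 + (col - row - 1);
    }
}
```

With three regressors and a constant, `columns` gives 4 and the packed array holds 6 off-diagonal entries.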
public boolean hasIntercept()
Specified by: hasIntercept in interface UpdatingMultipleLinearRegression
public long getN()
Specified by: getN in interface UpdatingMultipleLinearRegression
public void addObservation(double[] x, double y) throws ModelSpecificationException
Specified by: addObservation in interface UpdatingMultipleLinearRegression
Parameters:
x - the array with regressor values
y - the value of the dependent variable given these regressors
Throws:
ModelSpecificationException - if the length of x does not equal the number of independent variables in the model
public void addObservations(double[][] x, double[] y) throws ModelSpecificationException
Specified by: addObservations in interface UpdatingMultipleLinearRegression
Parameters:
x - observations on the regressors
y - observations on the regressand
Throws:
ModelSpecificationException - if x is not rectangular, does not match the length of y, or does not contain sufficient data to estimate the model
private void include(double[] x, double wi, double yi)
Parameters:
x - observations on the regressors
wi - weight of this observation (-1,1)
yi - observation on the regressand
private double smartAdd(double a, double b)
Parameters:
a - an addend
b - an addend
public void clear()
Specified by: clear in interface UpdatingMultipleLinearRegression
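The guarded addition performed by the private smartAdd helper listed above can be sketched as follows. This is an illustration under assumptions, not the library's code: the class name `GuardedAddSketch` is hypothetical, and the threshold of 2^-53 relative to the larger addend is an assumed choice of "numerical smallness".

```java
/** Hypothetical sketch of an addition that ignores negligible addends. */
final class GuardedAddSketch {

    /** Assumed relative threshold: 2^-53, half the spacing of doubles at 1.0. */
    static final double EPS = Math.ulp(1.0) / 2.0;

    /**
     * Returns a + b, unless the smaller addend is negligible relative to the
     * larger one, in which case the larger addend is returned unchanged.
     */
    static double smartAdd(double a, double b) {
        double absA = Math.abs(a);
        double absB = Math.abs(b);
        if (absA > absB) {
            return absB > absA * EPS ? a + b : a;
        }
        return absA > absB * EPS ? a + b : b;
    }
}
```

The point of the guard is that an addend far below the rounding level of the other contributes only noise, so it is dropped explicitly rather than left to rounding.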
private void tolset()
private double[] regcf(int nreq) throws ModelSpecificationException
Parameters:
nreq - how many of the regressors to include (either in canonical order, or in the current reordered state)
Throws:
ModelSpecificationException - if nreq is less than 1 or greater than the number of independent variables
private void singcheck()
private void ss()
rss[] = { ResidualSumOfSquares_allNvars, ResidualSumOfSquares_FirstNvars-1, ResidualSumOfSquares_FirstNvars-2, ..., ResidualSumOfSquares_FirstVariable }
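In an orthogonal reduction, dropping the last column from the model raises the residual sum of squares by that column's diagonal factor times its squared transformed response. The sketch below accumulates the nested values from the full-model error upward. The class name `NestedRssSketch`, the parameter names, and the output ordering (entry k-1 holds the value for the first k columns) are assumptions of this example; it does not claim to reproduce the index order of the class's own rss array.

```java
/** Hypothetical sketch: residual sums of squares for nested models. */
final class NestedRssSketch {

    /**
     * @param d     diagonal factors of the orthogonal reduction
     * @param rhs   transformed response, one entry per column
     * @param sserr residual sum of squares of the full model
     * @return array whose entry k-1 is the residual sum of squares
     *         of the model using only the first k columns
     */
    static double[] nestedRss(double[] d, double[] rhs, double sserr) {
        int n = d.length;
        double[] rssBySize = new double[n];
        double total = sserr;
        rssBySize[n - 1] = total; // full model
        for (int i = n - 1; i > 0; i--) {
            total += d[i] * rhs[i] * rhs[i]; // contribution lost by dropping column i
            rssBySize[i - 1] = total;
        }
        return rssBySize;
    }
}
```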
private double[] cov(int nreq)
cov = { cov_00, cov_10, cov_11, cov_20, cov_21, cov_22, ... }
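The layout shown above is a lower triangle packed row by row. A minimal sketch of the index arithmetic for reading such an array is given here; the class name `PackedLowerSketch` and its methods are hypothetical helpers, not part of the library.

```java
/** Hypothetical helper for a row-packed lower-triangular matrix. */
final class PackedLowerSketch {

    /** Position of element (i, j) with j <= i, using 0-based indices. */
    static int index(int i, int j) {
        return i * (i + 1) / 2 + j;
    }

    /** Reads element (i, j) of the symmetric matrix stored in packed form. */
    static double get(double[] packed, int i, int j) {
        return i >= j ? packed[index(i, j)] : packed[index(j, i)];
    }
}
```

For example, `index(2, 1)` is 4, which matches the position of cov_21 in the listing.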
Parameters:
nreq - how many of the regressors to include (either in canonical order, or in the current reordered state)
private void inverse(double[] rinv, int nreq)
Parameters:
rinv - the storage for the inverse of r
nreq - how many of the regressors to include (either in canonical order, or in the current reordered state)
public double[] getPartialCorrelations(int in)
corr = { corrxx - lower triangular, corrxy - bottom row of the matrix }
Replaces subroutines PCORR and COR of: ALGORITHM AS274 APPL. STATIST. (1992) VOL.41, NO. 2
Calculate partial correlations after the variables in rows 1, 2, ..., IN have been forced into the regression. If IN = 1, and the first row of R represents a constant in the model, then the usual simple correlations are returned.
If IN = 0, the value returned in array CORMAT for the correlation of variables Xi & Xj is:
sum ( Xi.Xj ) / Sqrt ( sum (Xi^2) . sum (Xj^2) )
On return, array CORMAT contains the upper triangle of the matrix of partial correlations stored by rows, excluding the 1's on the diagonal. e.g. if IN = 2, the consecutive elements returned are: (3,4) (3,5) ... (3,ncol), (4,5) (4,6) ... (4,ncol), etc. Array YCORR stores the partial correlations with the Y-variable starting with YCORR(IN+1) = partial correlation with the variable in position (IN+1).
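For the IN = 0 case, the formula above is an uncentered correlation between two columns. A minimal sketch computing it directly from raw data follows; the class name `UncenteredCorrelationSketch` and its method are hypothetical, and the library works from its factorization rather than from stored columns.

```java
/** Hypothetical sketch of the IN = 0 formula applied to raw columns. */
final class UncenteredCorrelationSketch {

    /** Returns sum(xi * xj) / sqrt(sum(xi^2) * sum(xj^2)). */
    static double correlation(double[] xi, double[] xj) {
        if (xi.length != xj.length) {
            throw new IllegalArgumentException("columns must have equal length");
        }
        double sij = 0.0;
        double sii = 0.0;
        double sjj = 0.0;
        for (int k = 0; k < xi.length; k++) {
            sij += xi[k] * xj[k];
            sii += xi[k] * xi[k];
            sjj += xj[k] * xj[k];
        }
        return sij / Math.sqrt(sii * sjj);
    }
}
```

Because no mean is subtracted, this equals the usual Pearson correlation only once a constant has been forced into the model, which is the IN = 1 case described above.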
Parameters:
in - how many of the regressors to include (either in canonical order, or in the current reordered state)
private void vmove(int from, int to)
Parameters:
from - initial position
to - destination
private int reorderRegressors(int[] list, int pos1)
Re-order the variables in an orthogonal reduction produced by AS75.1 so that the N variables in LIST start at position POS1, though will not necessarily be in the same order as in LIST. Any variables in VORDER before position POS1 are not moved. Auxiliary routine called: VMOVE.
This internal method reorders the regressors.
Parameters:
list - the regressors to move
pos1 - where the list will be placed
public double getDiagonalOfHatMatrix(double[] row_data)
Parameters:
row_data - the observation (regressor values) for which the diagonal of the hat matrix is returned
public int[] getOrderOfRegressors()
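As an illustration of what getDiagonalOfHatMatrix (documented above) measures, the leverage of a point has a well-known closed form in the special case of one regressor plus a constant. The sketch below uses that closed form on stored data; it is not how the class computes the value, and the class name `LeverageSketch` and its method are hypothetical.

```java
/** Hypothetical sketch: leverage in simple regression with an intercept. */
final class LeverageSketch {

    /**
     * Leverage of the point x0 given the observed regressor values xs:
     * 1/n + (x0 - mean)^2 / sum((x - mean)^2).
     */
    static double leverage(double[] xs, double x0) {
        int n = xs.length;
        double mean = 0.0;
        for (double x : xs) {
            mean += x;
        }
        mean /= n;
        double sxx = 0.0;
        for (double x : xs) {
            sxx += (x - mean) * (x - mean);
        }
        return 1.0 / n + (x0 - mean) * (x0 - mean) / sxx;
    }
}
```

Summed over the observed points, the leverages equal the number of fitted parameters (two here), which is a quick sanity check on any hat-matrix diagonal.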
public RegressionResults regress() throws ModelSpecificationException
Specified by: regress in interface UpdatingMultipleLinearRegression
Throws:
ModelSpecificationException - thrown if the number of observations is less than the number of variables
public RegressionResults regress(int numberOfRegressors) throws ModelSpecificationException
Parameters:
numberOfRegressors - how many of the regressors to include (either in canonical order, or in the current reordered state)
Throws:
ModelSpecificationException - thrown if the number of observations is less than the number of variables, or the number of regressors requested is greater than the number of regressors in the model
public RegressionResults regress(int[] variablesToInclude) throws ModelSpecificationException
Specified by: regress in interface UpdatingMultipleLinearRegression
Parameters:
variablesToInclude - array of variables to include in regression
Throws:
ModelSpecificationException - thrown if the number of observations is less than the number of variables, the number of regressors requested is greater than the number of regressors in the model, or a regressor index in the regressor array does not exist
Copyright (c) 2003-2013 Apache Software Foundation