org.apache.commons.math.stat.regression
Class OLSMultipleLinearRegression

java.lang.Object
  extended by org.apache.commons.math.stat.regression.AbstractMultipleLinearRegression
      extended by org.apache.commons.math.stat.regression.OLSMultipleLinearRegression
All Implemented Interfaces:
MultipleLinearRegression

public class OLSMultipleLinearRegression
extends AbstractMultipleLinearRegression

Implements ordinary least squares (OLS) to estimate the parameters of a multiple linear regression model.

OLS assumes that the errors are uncorrelated and have equal variance, so the covariance matrix of the error vector u is diagonal with identical diagonal entries.

u ~ N(0, σ^2 I)

The regression coefficients, b, satisfy the normal equations:

X^T X b = X^T y

To solve the normal equations, this implementation uses QR decomposition of the X matrix. (See QRDecompositionImpl for details on the decomposition algorithm.)

X^T X b = X^T y
(QR)^T (QR) b = (QR)^T y
R^T (Q^T Q) R b = R^T Q^T y
R^T R b = R^T Q^T y
(R^T)^{-1} R^T R b = (R^T)^{-1} R^T Q^T y
R b = Q^T y

Given Q and R, the last equation is solved by back-substitution.
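
The back-substitution step can be illustrated with a minimal sketch in plain Java. It assumes R has already been reduced to its square, upper-triangular top block (k by k, non-zero diagonal) and that c holds the first k entries of Q^T y. The class and method names are hypothetical and are not part of Commons Math; the library's own solver may differ in detail.

```java
final class BackSubstitutionSketch {

    /**
     * Solves R b = c for b, where r is a k-by-k upper-triangular matrix
     * with non-zero diagonal entries and c has length k.
     * Hypothetical helper for illustration only.
     */
    static double[] solveUpperTriangular(double[][] r, double[] c) {
        int k = c.length;
        double[] b = new double[k];
        // Work from the last row upwards: each row introduces one new unknown.
        for (int i = k - 1; i >= 0; i--) {
            if (r[i][i] == 0.0) {
                throw new IllegalArgumentException("R is singular at row " + i);
            }
            double sum = c[i];
            for (int j = i + 1; j < k; j++) {
                sum -= r[i][j] * b[j]; // subtract the already-solved terms
            }
            b[i] = sum / r[i][i];
        }
        return b;
    }

    public static void main(String[] args) {
        double[][] r = { { 2.0, 1.0 }, { 0.0, 4.0 } };
        double[] c = { 4.0, 8.0 };
        double[] b = solveUpperTriangular(r, c);
        System.out.println(b[0] + ", " + b[1]); // prints 1.0, 2.0
    }
}
```

Because R is triangular, the solve costs on the order of k^2 operations and never forms X^T X explicitly.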

Since:
2.0
Version:
$Revision: 825925 $ $Date: 2009-10-16 11:11:47 -0400 (Fri, 16 Oct 2009) $

Field Summary
private  QRDecomposition qr
          Cached QR decomposition of X matrix
 
Fields inherited from class org.apache.commons.math.stat.regression.AbstractMultipleLinearRegression
X, Y
 
Constructor Summary
OLSMultipleLinearRegression()
           
 
Method Summary
protected  RealVector calculateBeta()
          Calculates regression coefficients using OLS.
protected  RealMatrix calculateBetaVariance()
          Calculates the variance on the beta by OLS.
 RealMatrix calculateHat()
          Compute the "hat" matrix.
protected  double calculateYVariance()
          Calculates the variance on the Y by OLS.
 void newSampleData(double[] y, double[][] x)
          Loads model x and y sample data, overriding any previous sample.
 void newSampleData(double[] data, int nobs, int nvars)
          Loads model x and y sample data from a flat array of data, overriding any previous sample.
protected  void newXSampleData(double[][] x)
          Loads new x sample data, overriding any previous sample.
 
Methods inherited from class org.apache.commons.math.stat.regression.AbstractMultipleLinearRegression
calculateResiduals, estimateRegressandVariance, estimateRegressionParameters, estimateRegressionParametersStandardErrors, estimateRegressionParametersVariance, estimateResiduals, newYSampleData, validateCovarianceData, validateSampleData
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

qr

private QRDecomposition qr
Cached QR decomposition of X matrix

Constructor Detail

OLSMultipleLinearRegression

public OLSMultipleLinearRegression()
Method Detail

newSampleData

public void newSampleData(double[] y,
                          double[][] x)
Loads model x and y sample data, overriding any previous sample. Computes and caches QR decomposition of the X matrix.

Parameters:
y - the [n,1] array representing the y sample
x - the [n,k] array representing the x sample
Throws:
java.lang.IllegalArgumentException - if the x and y array data are not compatible for the regression

newSampleData

public void newSampleData(double[] data,
                          int nobs,
                          int nvars)
Loads model x and y sample data from a flat array of data, overriding any previous sample. Assumes that rows are concatenated with y values first in each row. Computes and caches QR decomposition of the X matrix.

Overrides:
newSampleData in class AbstractMultipleLinearRegression
Parameters:
data - input data array
nobs - number of observations (rows)
nvars - number of independent variables (columns, not counting y)
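
The flat layout described above (each row stored as y followed by its nvars x values) can be illustrated with a small sketch in plain Java. The class and method names are hypothetical; this only shows the indexing implied by the description, not the library's actual code.

```java
final class FlatSampleSketch {

    /** Extracts the y column from a flat array laid out row by row as y, x1, ..., x_nvars. */
    static double[] extractY(double[] data, int nobs, int nvars) {
        int cols = nvars + 1; // one y value plus nvars regressors per row
        double[] y = new double[nobs];
        for (int i = 0; i < nobs; i++) {
            y[i] = data[i * cols];
        }
        return y;
    }

    /** Extracts the nobs-by-nvars x matrix from the same flat layout. */
    static double[][] extractX(double[] data, int nobs, int nvars) {
        int cols = nvars + 1;
        double[][] x = new double[nobs][nvars];
        for (int i = 0; i < nobs; i++) {
            for (int j = 0; j < nvars; j++) {
                x[i][j] = data[i * cols + 1 + j]; // skip the leading y value
            }
        }
        return x;
    }

    public static void main(String[] args) {
        // Three observations, two regressors: rows are (y, x1, x2).
        double[] data = { 1, 10, 11, 2, 20, 21, 3, 30, 31 };
        double[] y = extractY(data, 3, 2);
        double[][] x = extractX(data, 3, 2);
        System.out.println(y[1] + " " + x[1][0] + " " + x[1][1]); // prints 2.0 20.0 21.0
    }
}
```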

calculateHat

public RealMatrix calculateHat()

Compute the "hat" matrix.

The hat matrix is defined in terms of the design matrix X by H = X (X^T X)^{-1} X^T.

The implementation here uses the QR decomposition to compute the hat matrix as Q I_p Q^T, where I_p is the p-dimensional identity matrix augmented by 0's. This computational formula is from "The Hat Matrix in Regression and ANOVA", David C. Hoaglin and Roy E. Welsch, The American Statistician, Vol. 32, No. 1 (Feb., 1978), pp. 17-22.

Returns:
the hat matrix
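
The product Q I_p Q^T keeps only the first p columns of Q, so it equals Q1 Q1^T, where Q1 is the n-by-p matrix of those columns. The sketch below illustrates that identity in plain Java. As an assumption made for brevity, it builds Q1 with modified Gram-Schmidt rather than the decomposition used by this class, and it requires the columns of x to be linearly independent. All names are hypothetical.

```java
final class HatMatrixSketch {

    /**
     * Returns H = Q1 Q1^T, where Q1 (n-by-p) is an orthonormal basis of the
     * column space of x, built here by modified Gram-Schmidt.
     * Assumes x is n-by-p with linearly independent columns.
     */
    static double[][] hat(double[][] x) {
        int n = x.length;
        int p = x[0].length;
        double[][] q = new double[n][];
        for (int i = 0; i < n; i++) {
            q[i] = x[i].clone(); // work on a copy of x
        }
        for (int j = 0; j < p; j++) {
            // Normalize column j.
            double norm = 0.0;
            for (int i = 0; i < n; i++) {
                norm += q[i][j] * q[i][j];
            }
            norm = Math.sqrt(norm);
            for (int i = 0; i < n; i++) {
                q[i][j] /= norm;
            }
            // Remove the component along column j from every later column.
            for (int c = j + 1; c < p; c++) {
                double dot = 0.0;
                for (int i = 0; i < n; i++) {
                    dot += q[i][j] * q[i][c];
                }
                for (int i = 0; i < n; i++) {
                    q[i][c] -= dot * q[i][j];
                }
            }
        }
        // H = Q1 Q1^T is n-by-n.
        double[][] h = new double[n][n];
        for (int a = 0; a < n; a++) {
            for (int b = 0; b < n; b++) {
                double s = 0.0;
                for (int j = 0; j < p; j++) {
                    s += q[a][j] * q[b][j];
                }
                h[a][b] = s;
            }
        }
        return h;
    }

    public static void main(String[] args) {
        // Intercept column plus one regressor.
        double[][] x = { { 1, 1 }, { 1, 2 }, { 1, 3 } };
        double[][] h = hat(x);
        double trace = h[0][0] + h[1][1] + h[2][2];
        System.out.println("trace(H) = " + trace); // close to p = 2, up to rounding
    }
}
```

The trace of H equals the number of columns p, and the diagonal entries are the leverages of the individual observations.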

newXSampleData

protected void newXSampleData(double[][] x)
Loads new x sample data, overriding any previous sample.

Overrides:
newXSampleData in class AbstractMultipleLinearRegression
Parameters:
x - the [n,k] array representing the x sample

calculateBeta

protected RealVector calculateBeta()
Calculates regression coefficients using OLS.

Specified by:
calculateBeta in class AbstractMultipleLinearRegression
Returns:
beta

calculateBetaVariance

protected RealMatrix calculateBetaVariance()

Calculates the variance on the beta by OLS.

Var(b) = (X^T X)^{-1}

Uses QR decomposition to reduce (X^T X)^{-1} to (R^T R)^{-1}, with only the top p rows of R included, where p = the length of the beta vector.

Specified by:
calculateBetaVariance in class AbstractMultipleLinearRegression
Returns:
The beta variance
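
Since Q has orthonormal columns, X^T X = R^T Q^T Q R = R^T R, and (R^T R)^{-1} = R^{-1} (R^{-1})^T. The following plain-Java sketch illustrates that reduction for a p-by-p upper-triangular R with non-zero diagonal. The names are hypothetical, and the library may compute the inverse differently.

```java
final class BetaVarianceSketch {

    /** Inverts a square upper-triangular matrix column by column using back-substitution. */
    static double[][] invertUpperTriangular(double[][] r) {
        int k = r.length;
        double[][] inv = new double[k][k];
        for (int col = 0; col < k; col++) {
            // Solve R * inv[:, col] = e_col, where e_col is a unit vector.
            for (int i = k - 1; i >= 0; i--) {
                double sum = (i == col) ? 1.0 : 0.0;
                for (int j = i + 1; j < k; j++) {
                    sum -= r[i][j] * inv[j][col];
                }
                inv[i][col] = sum / r[i][i];
            }
        }
        return inv;
    }

    /** Returns (R^T R)^{-1}, computed as R^{-1} (R^{-1})^T. */
    static double[][] inverseOfRtR(double[][] r) {
        int k = r.length;
        double[][] inv = invertUpperTriangular(r);
        double[][] v = new double[k][k];
        for (int a = 0; a < k; a++) {
            for (int b = 0; b < k; b++) {
                double s = 0.0;
                for (int j = 0; j < k; j++) {
                    s += inv[a][j] * inv[b][j];
                }
                v[a][b] = s;
            }
        }
        return v;
    }

    public static void main(String[] args) {
        double[][] r = { { 2.0, 1.0 }, { 0.0, 4.0 } };
        double[][] v = inverseOfRtR(r);
        // Here R^T R = [[4, 2], [2, 17]], so v should be its inverse.
        System.out.println(v[0][0] + " " + v[0][1] + " " + v[1][1]);
    }
}
```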

calculateYVariance

protected double calculateYVariance()

Calculates the variance on the Y by OLS.

Var(y) = Tr(u^T u) / (n - k)

Specified by:
calculateYVariance in class AbstractMultipleLinearRegression
Returns:
The Y variance
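
As an illustration of the formula above, the sketch below computes u^T u / (n - k) in plain Java, where u = y - X b is the residual vector, n is the number of observations and k is the number of coefficients. The coefficient vector b is taken as given, and the class and method names are hypothetical.

```java
final class ResidualVarianceSketch {

    /**
     * Returns the sum of squared residuals divided by (n - k),
     * where the residuals are u = y - X b.
     */
    static double residualVariance(double[][] x, double[] y, double[] b) {
        int n = y.length;
        int k = b.length;
        if (n <= k) {
            throw new IllegalArgumentException("need more observations than coefficients");
        }
        double sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double fitted = 0.0;
            for (int j = 0; j < k; j++) {
                fitted += x[i][j] * b[j];
            }
            double u = y[i] - fitted; // residual for observation i
            sumSq += u * u;
        }
        return sumSq / (n - k);
    }

    public static void main(String[] args) {
        // Intercept column plus one regressor; b holds the OLS fit for this data.
        double[][] x = { { 1, 1 }, { 1, 2 }, { 1, 3 } };
        double[] y = { 1, 2, 4 };
        double[] b = { -2.0 / 3.0, 1.5 };
        System.out.println(residualVariance(x, y, b));
    }
}
```

Dividing by n - k rather than n accounts for the k degrees of freedom used in estimating the coefficients.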


Copyright (c) 2003-2011 Apache Software Foundation