Linear and multiple linear least squares regression

by Trent Guidry, 19 July 2009 10:59
In this article I will set up the framework for doing linear and multiple linear least squares regression. Linear regression is a fairly simple way of fitting equation parameters to a set of observed data.
A multiple linear regression can be expressed by the equation below.
y = p0x0 + p1x1 + p2x2 + … + pn-1xn-1 + e
Where y is the output of the regressed function, x0 through xn-1 are the different variables or functions, p0 through pn-1 are the parameters being fitted, and e is the residual error of the regression equation at that point.
This can be written in matrix form as
Y = JP + E
Where Y is the m x 1 vector of observed y values, P is the n x 1 vector of coefficients, E is the m x 1 vector of residual errors, and J is the m x n Jacobian matrix of functional values. Here n is the number of parameters and m is the number of data points. Row i of J holds the values of x0 through xn-1 evaluated at data point i.
The sum of the squared residual errors, Sr, is calculated by rearranging the equation above and forming E^T E as shown below.
Sr = E^T E = (Y - JP)^T (Y - JP) = Y^T Y + P^T (J^T J) P - 2Y^T J P
The two cross terms combine because Y^T J P is a scalar and therefore equals its transpose, P^T J^T Y.
At the minimum square residual error, the derivatives of Sr with respect to the unknown parameters P are equal to zero.
dSr/dP = 2J^T J P - 2J^T Y = 0
This gives the equation shown below.
J^T J P = J^T Y
To solve this equation, I will add a static function to the previously developed Polynomial class called Regress. The code for this function is shown below.
        public static double[] Regress(double[,] z, double[] y)
        {
            // y = a0 z0 + a1 z1 + a2 z2 + a3 z3 + ...
            // z holds the functional values: index 0 is the data point (row),
            // index 1 is the variable (column).
            // y holds the observed values.
            // Returns the regressed coefficients.
            Debug.Assert(z != null && y != null);
            Debug.Assert(z.GetLength(0) == y.GetLength(0));
            Matrix zMatrix = z;
            Matrix zTransposeMatrix = zMatrix.Transpose();
            Matrix leftHandSide = zTransposeMatrix * zMatrix;
            Matrix rightHandSide = zTransposeMatrix * y;
            Matrix coefsMatrix = leftHandSide.SolveFor(rightHandSide);
            return coefsMatrix;
        }
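The Regress function relies on a Matrix class that is not shown in this post. As a rough sketch of what that solve amounts to, the normal equations J^T J P = J^T Y can also be formed with plain arrays and solved by Gaussian elimination. The RegressSketch method below is only an illustration of the math (the name is made up for this sketch, and it assumes J^T J is nonsingular); it is not part of the Polynomial class.

        // Sketch only: forms the normal equations JᵀJ p = JᵀY with plain arrays
        // and solves them by Gaussian elimination with partial pivoting.
        public static double[] RegressSketch(double[,] z, double[] y)
        {
            int m = z.GetLength(0);          // number of data points
            int n = z.GetLength(1);          // number of parameters
            double[,] a = new double[n, n];  // JᵀJ
            double[] b = new double[n];      // JᵀY

            // Form JᵀJ and JᵀY.
            for (int j = 0; j < n; j++)
            {
                for (int k = 0; k < n; k++)
                    for (int i = 0; i < m; i++)
                        a[j, k] += z[i, j] * z[i, k];
                for (int i = 0; i < m; i++)
                    b[j] += z[i, j] * y[i];
            }

            // Gaussian elimination with partial pivoting on the n x n system.
            for (int col = 0; col < n; col++)
            {
                int pivot = col;
                for (int row = col + 1; row < n; row++)
                    if (Math.Abs(a[row, col]) > Math.Abs(a[pivot, col])) pivot = row;
                for (int k = 0; k < n; k++)
                {
                    double tmp = a[col, k]; a[col, k] = a[pivot, k]; a[pivot, k] = tmp;
                }
                double tb = b[col]; b[col] = b[pivot]; b[pivot] = tb;

                for (int row = col + 1; row < n; row++)
                {
                    double factor = a[row, col] / a[col, col];
                    for (int k = col; k < n; k++) a[row, k] -= factor * a[col, k];
                    b[row] -= factor * b[col];
                }
            }

            // Back substitution gives the coefficients.
            double[] p = new double[n];
            for (int row = n - 1; row >= 0; row--)
            {
                double sum = b[row];
                for (int k = row + 1; k < n; k++) sum -= a[row, k] * p[k];
                p[row] = sum / a[row, row];
            }
            return p;
        }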
To give an example of using the Regress function, I will regress the parameters of the Clausius-Clapeyron equation to vapor pressure data for methanol that I got from the 6th edition of Perry's Chemical Engineers' Handbook.
The Clausius-Clapeyron equation is shown below.
ln Psat = A – B/T
The data that I will use to regress A and B is shown below. 
T (K)      P (Pa)
229.15     133.32249
247.85     666.61245
256.95     1333.2249
267.15     2666.4498
278.15     5332.8996
285.25     7999.3494
294.35     13332.249
307.95     26664.498
323.05     53328.996
337.85     101325.09
Where T is the temperature in Kelvin and P is the pressure in Pascals.
To regress these parameters, I will need to set up the Z matrix. The z functions are the functions that multiply the regression parameters. For this case, z0 is simply 1 and z1 is -1/T. I also need y, which is the natural log of the pressure.
This regression is solved with the code shown below.
            double[] temperatures = new double[] { 229.15, 247.85, 256.95, 267.15, 278.15, 285.25, 294.35, 307.95, 323.05, 337.85 };
            double[] pressures = new double[] { 133.3224898, 666.6124489, 1333.224898, 2666.449795, 5332.899591, 7999.349386, 13332.24898, 26664.49795, 53328.99591, 101325.0922 };

            // z0 = 1 and z1 = -1/T, so that ln(P) = A*z0 + B*z1.
            Func<double, double>[] zFunctions = new Func<double, double>[2];
            zFunctions[0] = temperature => 1.0;
            zFunctions[1] = temperature => -1.0 / temperature;
            Func<double, double> yFunction = pressure => Math.Log(pressure);

            double[] yValues = new double[temperatures.Length];
            double[,] zValues = new double[temperatures.Length, zFunctions.Length];

            Parallel.For(0, temperatures.Length, i =>
            {
                yValues[i] = yFunction(pressures[i]);
                zValues[i, 0] = zFunctions[0](temperatures[i]);
                zValues[i, 1] = zFunctions[1](temperatures[i]);
            });

            double[] dCoefs = Polynomial.Regress(zValues, yValues);
Running this gives values of 25.46 for A and 4701.0 for B.
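To check the fit, the regressed parameters can be plugged back into the Clausius-Clapeyron equation to back-calculate the pressure at each temperature. A minimal sketch of that calculation, continuing from the arrays above (the variable names are only illustrative), is shown below.

            // Back-calculate the pressures from the regressed parameters: P = exp(A - B/T).
            double a = dCoefs[0]; // regressed A, about 25.46
            double b = dCoefs[1]; // regressed B, about 4701.0
            double[] calculatedPressures = new double[temperatures.Length];
            for (int i = 0; i < temperatures.Length; i++)
            {
                calculatedPressures[i] = Math.Exp(a - b / temperatures[i]);
            }
            // For example, at 229.15 K this gives about exp(25.46 - 4701.0 / 229.15) ≈ 140.5 Pa.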
A comparison of the values calculated by the Clausius-Clapeyron equation using the regressed parameters versus the actual data values is shown in the table below.
T (K)      P (Pa)         Calculated P (Pa)
229.15     133.32249      140.5453202
247.85     666.61245      660.7495631
256.95     1333.2249      1293.504712
267.15     2666.4498      2600.997163
278.15     5332.8996      5216.386216
285.25     7999.3494      7944.511893
294.35     13332.249      13223.3454
307.95     26664.498      26770.72276
323.05     53328.996      54644.8107
337.85     101325.09      103371.4208
[Figure: regression error]

Comments (4)

xyz (United States), 7/28/2009 7:25:26 PM:
Can we use the above mentioned function for a 2nd order polynomial fit?

Trent (United States), 7/29/2009 10:01:07 AM:
Yes.

That's a good question. I will do a post on regressing polynomials. Hopefully I will have time to get it out in the next couple of days.

Trent (United States), 7/29/2009 5:50:53 PM:
I finished the post. It is at www.trentfguidry.net/.../...mial-Coefficients.aspx

It gives examples of fitting 1st, 2nd, 3rd, and 4th order polynomials.

xyz (United States), 7/30/2009 11:55:08 PM:
Thanks for the article, it was great.