In this article I will set up the framework for doing linear and multiple linear least squares regression. Linear regression is a fairly simple way of fitting equation parameters to a set of observed data.
A multiple linear regression can be expressed by the equation below.
y = p_{0}x_{0} + p_{1}x_{1} + p_{2}x_{2} + … + p_{m}x_{m} + e
Where y is the output of the regressed function, x_{0} to x_{m} are different variables or functions, and e is the residual error of the regression equation at that point.
This can be written in matrix form as
Y = JP + E
n is the number of parameters. m is the number of data points.
Where Y is the vector of observed y points, A is the vector of coefficients, E is the vector of errors and Z is the matrix of functional values.
The square residual error S_{r} is calculated by rearranging and then squaring E as shown below.
S_{r} = E^{2} = (YJP)^{2} = (YJP)^{T}(YJP) = Y^{T}Y + P^{T}(J^{T}J)P – 2Y^{T}JP
At the minimum square residual error, the derivatives of S_{R} with respect to the unknowns A are equal to zero.
dS_{R}/dP = 2J^{T}JP – 2J^{T}Y = 0
This gives the equation shown below.
J^{T}JP = J^{T}Y
To solve this equation, I will add a static function to the previously developed Polynomial class called Regress. The code for this function is shown below.
public static double[] Regress(double[,] z, double[] y){//y=a0 z1 + a1 z1 +a2 z2 + a3 z3 +...//Z is the functional values.//Z index 0 is a row, the variables go across index 1.//Y is the summed value.//returns the coefficients.Debug.Assert(z != null && y != null);Debug.Assert(z.GetLength(0) == y.GetLength(0));Matrix zMatrix = z;Matrix zTransposeMatrix = zMatrix.Transpose();Matrix leftHandSide = zTransposeMatrix * zMatrix;Matrix rightHandSide = zTransposeMatrix * y;Matrix coefsMatrix = leftHandSide.SolveFor(rightHandSide);return coefsMatrix;}
The Clausisus Clapeyron equation is given as shown below.
ln P^{sat} = A – B/T
The data that I will use to regress A and B is shown below.
T K

P Pa

229.15

133.32249

247.85

666.61245

256.95

1333.2249

267.15

2666.4498

278.15

5332.8996

285.25

7999.3494

294.35

13332.249

307.95

26664.498

323.05

53328.996

337.85

101325.09

Where T is the temperature in Kelvin and P is the pressure in Pascals.
To regress these parameters, I will need to set up the Z matrix. The z functions are the functions that follow the regression parameters. For this case, z_{0} is simply 1 and z_{1} is 1/T. I also need y, which is the natural log of the pressure.
This regression is solved with the code shown below.
double[] temperatures = new double[] { 229.15, 247.85, 256.95, 267.15, 278.15, 285.25, 294.35, 307.95, 323.05, 337.85 };double[] pressures = new double[] { 133.3224898, 666.6124489, 1333.224898, 2666.449795, 5332.899591, 7999.349386, 13332.24898, 26664.49795, 53328.99591, 101325.0922 };Func<double, double>[] zFunctions = new Func<double, double>[2];zFunctions[0] = tempeature => 1.0;zFunctions[1] = tempeature => 1.0 / tempeature;Func<double, double> yFunction = pressure => Math.Log(pressure);double[] yValues = new double[temperatures.Length];double[,] zValues = new double[temperatures.Length, zFunctions.Length];Parallel.For(0, temperatures.Length, i =>{yValues[i] = yFunction(pressures[i]);zValues[i, 0] = zFunctions[0](temperatures[i]);zValues[i, 1] = zFunctions[1](temperatures[i]);});double[] dCoefs = Polynomial.Regress(zValues, yValues);
Running this gives values of 25.46 for A and 4701.0 for B.
A comparison of the values calculated by the Clausisus Clapeyron equation using the regressed parameters verse the actual data values are shown in the table below.
T K

P Pa

Calculated

229.15

133.32249

140.5453202

247.85

666.61245

660.7495631

256.95

1333.2249

1293.504712

267.15

2666.4498

2600.997163

278.15

5332.8996

5216.386216

285.25

7999.3494

7944.511893

294.35

13332.249

13223.3454

307.95

26664.498

26770.72276

323.05

53328.996

54644.8107

337.85

101325.09

103371.4208
