This article expands on the previous post about Linear and multiple linear least squares regression in C# and gives examples of linear least squares regression of polynomial coefficients.
For this example, I will use the data shown below. I intentionally picked data that is difficult to regress a polynomial to because it would be kind of boring to regress a second order or higher polynomial to data that is pretty much linear.
|
2.3601
|
133.32
|
|
2.3942
|
666.61
|
|
2.4098
|
1333.22
|
|
2.4268
|
2666.45
|
|
2.4443
|
5332.90
|
|
2.4552
|
7999.35
|
|
2.4689
|
13332.25
|
|
2.4885
|
26664.50
|
|
2.5093
|
53329.00
|
|
2.5287
|
101325.09
|
For a first order polynomial (a line), the equation is:
Y = A + BX
For this equation, z0 is 1 and z1 is X, and y is Y.
This is done in the code shown below.
double[] x = new double[] { 2.3601, 2.3942, 2.4098, 2.4268, 2.4443, 2.4552, 2.4689, 2.4885, 2.5093, 2.5287 };
double[] y = new double[] { 133.322, 666.612, 1333.22, 2666.45, 5332.9, 7999.35, 13332.2, 26664.5, 53329, 101325 };
int nPolyOrder = 1;
double[,] z = new double[y.Length, nPolyOrder + 1];
for (int i = 0; i < y.Length; i++)
{
z[i, 0] = 1.0;
z[i, 1] = x[i];
}
double[] coefs = Polynomial.Regress(z, y);
This gives A = -1212289, B = 503788.92 and the data below.
|
X
|
Y
|
Fitted
|
|
2.3601
|
133.32
|
-23286.98
|
|
2.3942
|
666.61
|
-6123.37
|
|
2.4098
|
1333.22
|
1765.81
|
|
2.4268
|
2666.45
|
10283.15
|
|
2.4443
|
5332.90
|
19111.49
|
|
2.4552
|
7999.35
|
24626.26
|
|
2.4689
|
13332.25
|
31497.13
|
|
2.4885
|
26664.50
|
41379.54
|
|
2.5093
|
53329.00
|
51853.08
|
|
2.5287
|
101325.09
|
61653.87
|

For a second order polynomial, the equation is:
Y = A + BX + CX2
For this equation, z0 is 1, z1 is X, z2 is X2, and y is Y.
This is done in the code shown below.
double[] x = new double[] { 2.3601, 2.3942, 2.4098, 2.4268, 2.4443, 2.4552, 2.4689, 2.4885, 2.5093, 2.5287 };
double[] y = new double[] { 133.322, 666.612, 1333.22, 2666.45, 5332.9, 7999.35, 13332.2, 26664.5, 53329, 101325 };
int nPolyOrder = 2;
double[,] z = new double[y.Length, nPolyOrder + 1];
for (int i = 0; i < y.Length; i++)
{
z[i, 0] = 1.0;
z[i, 1] = x[i];
z[i, 2] = x[i] * x[i];
}
double[] coefs = Polynomial.Regress(z, y);
This gives A = 36759078.32, B = -30552595.42, C = 6347521.4 and the data below.
|
X
|
Y
|
Fitted
|
|
2.3601
|
133.32
|
8037.42
|
|
2.3942
|
666.61
|
-4722.08
|
|
2.4098
|
1333.22
|
-5643.88
|
|
2.4268
|
2666.45
|
-3144.23
|
|
2.4443
|
5332.90
|
3276.49
|
|
2.4552
|
7999.35
|
9265.54
|
|
2.4689
|
13332.25
|
18855.64
|
|
2.4885
|
26664.50
|
36789.79
|
|
2.5093
|
53329.00
|
61128.71
|
|
2.5287
|
101325.09
|
88873.81
|

For a third order polynomial, the equation is:
Y = A + BX + CX2 + DX3
For this equation, z0 is 1, z1 is X, z2 is X2, z3 is X3, and y is Y.
This is done in the code shown below.
double[] x = new double[] { 2.3601, 2.3942, 2.4098, 2.4268, 2.4443, 2.4552, 2.4689, 2.4885, 2.5093, 2.5287 };
double[] y = new double[] { 133.322, 666.612, 1333.22, 2666.45, 5332.9, 7999.35, 13332.2, 26664.5, 53329, 101325 };
int nPolyOrder = 3;
double[,] z = new double[y.Length, nPolyOrder + 1];
for (int i = 0; i < y.Length; i++)
{
z[i, 0] = 1.0;
z[i, 1] = x[i];
z[i, 2] = x[i] * x[i];
z[i, 3] = x[i] * x[i] * x[i];
}
double[] coefs = Polynomial.Regress(z, y);
This gives A = -830167848.92, B = 1034157557.04, C = -429397609.64, D = 59427310.55, and the data below.
|
X
|
Y
|
Fitted
|
|
2.3601
|
133.32
|
-1114.68
|
|
2.3942
|
666.61
|
3407.84
|
|
2.4098
|
1333.22
|
2643.43
|
|
2.4268
|
2666.45
|
1988.25
|
|
2.4443
|
5332.90
|
3291.59
|
|
2.4552
|
7999.35
|
5970.15
|
|
2.4689
|
13332.25
|
12152.31
|
|
2.4885
|
26664.50
|
28292.46
|
|
2.5093
|
53329.00
|
57429.76
|
|
2.5287
|
101325.09
|
98694.28
|

For a fourth order polynomial, the equation is:
Y = A + BX + CX2 + DX3 + EX4
For this equation, z0 is 1, z1 is X, z2 is X2, z3 is X3, z4 is X4, and y is Y.
This is done in the code shown below.
double[] x = new double[] { 2.3601, 2.3942, 2.4098, 2.4268, 2.4443, 2.4552, 2.4689, 2.4885, 2.5093, 2.5287 };
double[] y = new double[] { 133.322, 666.612, 1333.22, 2666.45, 5332.9, 7999.35, 13332.2, 26664.5, 53329, 101325 };
int nPolyOrder = 4;
double[,] z = new double[y.Length, nPolyOrder + 1];
for (int i = 0; i < y.Length; i++)
{
z[i, 0] = 1.0;
z[i, 1] = x[i];
z[i, 2] = x[i] * x[i];
z[i, 3] = x[i] * x[i] * x[i];
z[i, 4] = x[i] * x[i] * x[i] * x[i];
}
double[] coefs = Polynomial.Regress(z, y);
This gives A = 9470307327.9, B = -15832920218.37, C = 9925973640.82, D = -2765589751.22, E = 288948184.43, and the data below.
|
X
|
Y
|
Fitted
|
|
2.3601
|
133.32
|
-255.30
|
|
2.3942
|
666.61
|
1289.77
|
|
2.4098
|
1333.22
|
1932.21
|
|
2.4268
|
2666.45
|
2786.55
|
|
2.4443
|
5332.90
|
4755.05
|
|
2.4552
|
7999.35
|
7217.83
|
|
2.4689
|
13332.25
|
12526.31
|
|
2.4885
|
26664.50
|
26899.26
|
|
2.5093
|
53329.00
|
55407.42
|
|
2.5287
|
101325.09
|
100200.45
|

Polynomial regression Jacobian.

Subscribe

Hi one more question
when i modified your code
for (int i = 0; i < dY.Length; i++)
{
dZ[i, 0] = 1.0;
dZ[i, 1] = dX[i];
dZ[i, 2] = dX[i] * dX[i];
dZ[i, 3] = dX[i] * dX[i] * dX[i];
}
to
for (int i = 0; i < dY.Length; i++)
{
for (int j = 0; i <= nPolyOrder ; i++)
{
dZ[i, j] = dX[i]^j
} }
I get the same results until poly order =3 it dosent match your results when order = 4 or higher i have tested the same with excel and the results are same. When order of poly is greater 3 the results wont match. could you please shed some light on this. I was also wondering if I could get the R2 value.
Thanks
For
dX[i]^j
I am not really sure how that is working. I am guessing that you are actually using Math.Pow.
I ran this though and I can reproduce the differences.
for (int i = 0; i < dY.Length; i++)
{
for (int j = 0; j < nPolyOrder + 1; j++)
{
dZ[i, j] = Math.Pow(dX[i], (double)j);
}
}
The Math.Pow function isn’t returning exactly the same results as multiplying the variables out. For the case of 4th order, I am getting.
i, j, multiply them out, Math.Pow
3,2, 5.88935824,5.8893582400000009
3,4, 34.6845404790559,34.684540479055904
5,3,14.799962884608002,14.799962884608
Even though these differences are small, it appears that with higher orders, floating point round off error starts to become a serious problem.
This may also be a function of the data that I used to regress the functions. That data really does not regress well to polynomials. I picked it so that I could see a visible difference between the third and fourth order polynomials when I graphed them.
I would not be surprised if Excel is also running into floating point issues with this dataset as well.
yes thats Math.Pow i was using ^ in VB.Net. I agree with your comments. Thanks for the articles they were great.
How could I get the quality of the regression?
Awesome! Thanks just what the Dr. ordered!
Great post! Thanks
Hi,
I'm trying to fit a fourth order polynoimial to the NASDQ Composite Index. I have differentiated the polynomial to obtain the slope of the curve. My code is:
int nPolyOrder = 4;
double[,] z = new double[y.Length, nPolyOrder + 1];
for (int k=0; k<ntrades; k++) {
x[k] = tradeTime[k]/1000 - MarketHours.MarketOpenTime();
y[k] = (double) last[k];
z[k, 0] = 1.0;
z[k, 1] = x[k];
z[k, 2] = x[k]*x[k];
z[k, 3] = x[k]*x[k]*x[k];
z[k, 4] = x[k]*x[k]*x[k]*x[k];
}
double[] c = Polynomial.Regress(z, y);
// Y = A + BX + C*X*X + D*X*X*X + E*X*X*X*X
for (int k=0; k<ntrades; k++) {
double X = x[k];
x_fit[k] = X;
y_fit[k] = c[0] + c[1]*X + c[2]*X*X + c[3]*X*X*X + c[4]*X*X*X*X;
y_slope[k] = c[1] + 2*c[2]*X + 3*c[3]*X*X + 4*c[4]*X*X*X;
}
My question is: Is this the correct way to obtain the slope of the curve?
Any suggestion will be greatly appreciated.
Charles