Gauss Newton for multivariable nonlinear least squares regression

by Trent Guidry 9. August 2009 12:38
The Gauss Newton algorithm is a numerical method that in some cases can be used to solve nonlinear least squares regression problems.
 
Functions used in a nonlinear regression have three types of variables. These three variables are the dependent variables, the independent variables, and the regression parameters.
The independent variables are usually the variables which are used as inputs while measuring the effect of them on other variables. For example, with an equation of state, if the pressure is measured as a function of temperature and volume, then the temperature and volume are independent variables. The independent variables are denoted as x in this post.
The dependent variables are the observed outcomes of varying the independent variables. For the equation of state example given above, the pressure would be a dependent variable. The dependent variables are denoted as y in this post.
Both the independent variables and the dependent variables are model independent. They both can be determined experimentally without using a model.
A model is one or more equations that are used to correlate the dependent variables with respect to the independent variables.  For the equation of state example given above, the model would be the equation of state used to correlate the dependent variable (pressure) with respect to the independent variables (temperature and volume).  For this case, example model equations could include models such as an ideal gas model, a Van der Waals model, a Peng Robinson model, a Soave Redlich Kwon model, etc.    The model functions are denoted as f in this post.
The regression parameters are parameters of the model used to correlate the dependent and independent variables and are model dependent. The regression parameters are the unknown variables to be solved for in a nonlinear regression. Regression parameters are denoted as p in this post.
In a model, the dependent variables are functions of the independent variables and the model parameters. The difference between the values of the dependent variables predicted by the model and the values of the dependent variables actually observed is the error of the model, which is donated as e in this post. This is shown in the equation below.
y = f(x,p) + e
This can be written in matrix form as shown below.
Y = F + E
This can be rearranged to the equation shown below.
E = Y - F
A Taylor series for E written about a set of regression parameters while holding the dependent and independent variables constant gives.
Regression Error Jacobian
 
Ep1 = Ep0 + JE(P1 – P0)
Where JE is the Jacobian of the error functions with respect to the parameters.
Using E = Y – F and noting that the Y column matrix is constant with respect to the parameters gives
JE = -J
Where J is the Jacobian of the model functions with respect to the parameters. Substituting in gives.
Ep1 = Ep0 - J(P1 – P0)
The square residual error S­r about P1 can be written as shown below.
Sr = E p12 = (Ep0- J(P1 – P0))2 = (Ep0- J(P1 – P0))T(Ep0- J(P1 – P0)) = Ep0T Ep0 + (P1 – P0)T(J T J) (P1 – P0) – 2 Ep0T J(P1 – P0)
At the minimum square residual error, the derivatives of Sr with respect to P1-P0 are equal to zero.
dSR/d(P1-P0) = 2 J TJ(P1 – P0) – 2 J T Ep0 = 0
Which gives the Gauss Newton equation that is shown below.
Gauss Newton
The Gauss Newton algorithm works by starting with an initial guess of the parameters. These are used as P0. The residual error and the Jacobian are then solved for. The equation above is then used to solve for P1, which becomes the new P0 and the algorithm is repeated until it either converges or goes unstable.
The Gauss Newton algorithm is implemented in the GaussNewton class, which is shown at the bottom of the post.
There are two constructors for the GaussNewton class. The first takes an array of parameters to be regressed, a parameter array of the independent variables, a double array of the data, and an integer that indicates the number of points to use when numerically computing the derivatives in the Jacobian. The second constructor is similar to the first except that it defaults the number of parameters to use when computing the derivatives to 3. The constructor for this class allocates memory for the Jacobian, residual error, and initial regression parameter values that are stored in Matrix fields. It also sets the IsSolvedFor property of the independent variables to false, since a variable cannot be both an independent variable and a regression parameter at the same time. It also creates the derivatives class instance that will be used to compute the Jacobian.
The Iterate function invokes an iteration of the Gauss Newton algorithm. This function first iterates through the data points and computes the Jacobian matrix using numerical partial derivatives. The Gauss Newton equation is then used to solve for the next set of parameters and the model parameter values are then set to these values.
 
namespace NumericalMethods
{
    public class GaussNewton
    {
        private Matrix _jacobian;
        private Matrix _residuals;
        private Matrix _regressionParameters0;
        private Derivatives _derivatives;
        private Parameter[] _regressionParameters;
        private Parameter[] _observedParameters;
        private Func<double> _regressionFunction;
        private double[,] _data;
 
        public GaussNewton(Func<double> regressionFunction, Parameter[] regressionParameters, Parameter[] observedParameters, double[,] data, int numberOfDerivativePoints)
        {
            Debug.Assert(data.GetLength(0) == observedParameters.Length + 1);
            _data = data;
            _observedParameters = observedParameters;
            _regressionParameters = regressionParameters;
            _regressionFunction = regressionFunction;
            int numberOfParameters = _regressionParameters.Length;
            int numberOfPoints = data.GetLength(1);
 
            _derivatives = new Derivatives(numberOfDerivativePoints);
 
            _jacobian = new Matrix(numberOfPoints, numberOfParameters);
            _residuals = new Matrix(numberOfPoints, 1);
            _regressionParameters0 = new Matrix(numberOfParameters, 1);
        }
 
        public GaussNewton(Func<double> function, Parameter[] regressionParameters, Parameter[] observedParameters, double[,] data) :
            this(function, regressionParameters, observedParameters, data, 3)
        {
        }
 
        public void Iterate()
        {
            int numberOfPoints = _data.GetLength(1);
            int numberOfParameters = _regressionParameters.Length;
 
            double currentResidual = 0.0;
            for (int i = 0; i < numberOfPoints; i++)
            {
                for (int j = 0; j < _observedParameters.Length; j++)
                {
                    _observedParameters[j].Value = _data[j, i];
                }
                double functionValue = _regressionFunction();
                double residual = _data[_observedParameters.Length, i] - functionValue;
                _residuals[i, 0] = residual;
                currentResidual += residual * residual;
                for (int j = 0; j < numberOfParameters; j++)
                {
                    _jacobian[i, j] = _derivatives.ComputePartialDerivative(_regressionFunction, _regressionParameters[j], 1, functionValue);
                }
            }
 
            for (int i = 0; i < numberOfParameters; i++)
            {
                _regressionParameters0[i, 0] = _regressionParameters[i];
            }
 
            Matrix matJacobianTrans = _jacobian.Transpose();
            Matrix matJTranResid = matJacobianTrans * _residuals;
            Matrix matJTranJ = matJacobianTrans * _jacobian;
 
            Matrix matDelta = matJTranJ.SolveFor(matJTranResid); ;
 
            Matrix matNewRegPars = _regressionParameters0 + matDelta;
 
            for (int i = 0; i < numberOfParameters; i++)
            {
                _regressionParameters[i].Value = matNewRegPars[i, 0];
            }
        }
    }
}

Comments (10) -

jan mojzis Slovakia
4/24/2010 7:39:20 AM #

Hi, thanx for the algorithm. I was looking for it quite a while. However, I would be very grateful, if you can post/upload all neccessary classes (eg. MatrixNumeric , ...). Please can you post/upload all required? Or the algo is rather pseudo, and there doesn't exist these?

Thanx,
Jan

Reply

Trent United States
4/25/2010 2:02:34 PM #

I posted the source code for the numeric methods at www.trentfguidry.net/.../...gression-Download.aspx

Reply

vinod India
10/10/2010 8:34:07 PM #

Very good post. I was looking for this only.

you have given good example for Newton Raphson method.if you can provide one for Gauss Newton that will be great.

Reply

vinod India
10/10/2010 9:14:30 PM #

I have used example from wiki link:
en.wikipedia.org/.../Gauss%E2%80%93Newton_algorithm

Here is the code:
////RATE = ((B1 * X)/(B2 + X)) Where X is substrate concentration
            double[] x = new double[] {  0.038, 0.194, 0.425, 0.626, 1.253, 2.500, 3.740};
            double[] y = new double[] { 0.050 ,0.127 ,0.094 ,0.2122, 0.2729, 0.2665, 0.3317 };

            double[,] z = new double[2, x.Length];

            for (int i = 0; i < x.Length; i++)
            {
                z[0, i] = x[i];
                z[1, i] = y[i];
            }

            Parameter A = new Parameter(20.0);
            Parameter B = new Parameter(5000.0);
            Parameter C = new Parameter(0.0);
            Parameter T = new Parameter(1.0);

            Parameter B1 = new Parameter(0.09);
            Parameter B2 = new Parameter(0.02);
            Parameter X = new Parameter(1.0);
            
            Func<double> regressionFunction = () =>  ((B1 * X)/(B2 + X));

            Parameter[] regressionParameters = new Parameter[] { B1,B2 };
            Parameter[] observedParameters = new Parameter[] { X };


            GaussNewton gn = new GaussNewton(regressionFunction, regressionParameters, observedParameters, z);

            for (int i = 0; i < 5; i++)
            {
                gn.Iterate();
            }      

Reply

Vlad Bezden United States
12/30/2010 10:17:16 AM #

How can I find out if my iteration is converged?  Each parameter has IsSolvedFor property, which is always true, so I think it is not something that tells me if my numbers are converged.  

Please advise.

BTW, library works like a charm.  Great work!!!

Thank you very much.

Sincerely,
Vlad.

Reply

Trent Guidry United States
1/4/2011 7:50:50 AM #

When it converges, each iteration gives a result that is more accurate than the previous value.

At some point though, more iterations don't really help very much, especially when the remaining error is around the precision of a double.  

For example, if say 15 iterations give you 5 or more digits of accuracy and you only need 4 digits of accuracy, then 15 iterations is enough.  If you wanted 7 digits of accuracy, then in that case, you would need to do more iterations.

Reply

Vlad Bezden United States
12/29/2010 10:20:33 AM #

Thank you very much for such a great post.

I downloaded your library and I am using GaussNewton method to calculate regression parameters.  My model (regressionFunction) is very complex and it uses Math.Exp(a + b*x + c*y), and my problem is that my values inside of the Math.Exp sometimes provide values bigger than 709 and Math.Exp() function returns double.Infinite value if Math.Exp() value > 709.  If I get any Infinite values library fails, because as I understand it doesn't know how to deal with such a huge numbers.

Could you please advise what can be done to make it run.  Maybe change derivativeStep or DerivativeStepType for Parameter?

Thank you very much for such a great work.

Reply

Trent Guidry United States
12/30/2010 7:24:46 AM #

If the function is finite at the starting guesses and during the iterations, it becomes inifinite, then most likely the iterations diverged.  

One way around it is to use better starting guesses for the initial values.  That might get it to converge.  Alternatively, you could use something like the Levenberg-Marquardt algorithm, which generally converges better than Gauss Netwon.

Reply

Vlad Bezden United States
12/30/2010 9:58:39 AM #

Hi Trent,

I used Levenberg-Marquardt algorithm and it worked!!!  Thank you very much!!!  I really appreciate your help.

Thank you very much for such a great library!!!

Best wishes and Happy New Year.

Reply

Vlad Bezden United States
6/6/2011 5:46:37 AM #

Hi All,

Quesiton, does order of the records important for calculation?  I am using PLINQ and my records every time are sorted in unpredictable order.  Seems to me it is not important, since every time I am getting close results, but I am not mathematician and wanted to make sure I am doing right things.

Thanks for your help and great work.

Reply

Add comment




  Country flag
biuquote
  • Comment
  • Preview
Loading