Solving Equation 1 with Secant Method

In summary, the thread explains how to fit a curve to data points by solving a least squares problem for estimates of a and b. Equation 1 is the function fitted to the data points, and equation 2 is the least squares problem that is minimized to find those estimates.
  • #1
a.mlw.walker
So I have studied some of the numerical methods such as Newton-Raphson, but I want to try to solve this particular problem using the secant method. The problem is something I am reading in an article, and I am trying to follow the maths through.

I am wondering, though, how it is done.

Equation 1 (attached) is the equation I start with. The data set I have is 50 values of [itex]t(k \cdot 2\pi)[/itex]. I also know [itex]C_0[/itex], but I am trying to find estimates for a and b.

Please can someone explain how this may be done.

The article explains fitting the equation to the data, and states that a, b and [itex]C_0[/itex] in equation 2 (attached) appear nonlinearly.

It then goes on to say that, once the curve has been fitted by minimizing equation 2, the problem becomes equation 3 (attached).

Does anyone understand what this means, and what I would need to do to solve equation 1 with the secant method for a and b?

Thank you
 

Attachments

  • equation1.bmp
  • equation2.bmp
  • equation3.bmp
  • #2
I can explain equation 2. If I have a series of data points [itex](x_i, y_i)[/itex] that I want to fit with a function y = f(x,a,b) in a least squares sense, the task is to minimize the sum of squares of the vertical distances between each data point and the curve (i.e., [itex]y_i - f[/itex]). Thus, I'm trying to minimize:

[tex] S(a,b) = \sum \big( y_i - f(x_i,a,b) \big)^2 [/tex]
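
For a concrete (if contrived) illustration, here is a short Python sketch of that objective; the model [itex]f(x,a,b) = a e^{bx}[/itex] below is just an assumed stand-in, not your equation 1:

[code]
import numpy as np

def S(a, b, x, y, f):
    """Sum of squared residuals between the data y and the model f(x, a, b)."""
    return np.sum((y - f(x, a, b)) ** 2)

# Assumed toy model (stand-in for equation 1): f(x, a, b) = a * exp(b * x)
f = lambda x, a, b: a * np.exp(b * x)
x = np.linspace(0.0, 1.0, 50)
y = f(x, 2.0, -1.0) + 0.01 * np.random.default_rng(0).normal(size=x.size)

print(S(2.0, -1.0, x, y, f))  # small: parameters close to the truth
print(S(5.0, 3.0, x, y, f))   # large: parameters far from the truth
[/code]

Minimizing S over a and b is exactly the curve-fitting task.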

I have no idea on equation 3. Question - can you post the article or the relevant section?
 
  • #3
If you want to know the steps for solving nonlinear least squares problems, they are summarized below:

1. Linearize the function [itex] f(x_i,a,b) [/itex] at initial guesses [itex]a = a_0[/itex] and [itex]b = b_0[/itex] using the first two terms of the Taylor series:

[tex] f \approx f_0 + f_a \delta a + f_b \delta b [/tex]

where [itex] f_a = \partial f / \partial a [/itex] and [itex] f_b = \partial f / \partial b [/itex]

2. Form the least squares problem using the linearized function:

[tex] F =\sum ( f_0 + f_a \delta a + f_b \delta b - y_i )^2 [/tex]

3. To minimize F, set partial derivatives of F with respect to [itex] \delta a [/itex] and [itex] \delta b [/itex] equal to zero:

[tex] \begin{align*}
F_{\delta a} &= 0 = 2 \sum ( f_0 + f_a \delta a + f_b \delta b - y_i ) f_a \\
F_{\delta b} &= 0 = 2 \sum ( f_0 + f_a \delta a + f_b \delta b - y_i ) f_b
\end{align*} [/tex]

The net result is the following normal equation for nonlinear least squares analysis:

[tex] J^T J z = J^T (y - f) [/tex]

where [itex] z = \begin{bmatrix} \delta a & \delta b \end{bmatrix}^T [/itex] and J is the Jacobian of first partial derivatives of f.

4. Solve the linear equation from step 3 for z and update guesses for a & b:

[tex] \begin{align*}
a_1 &= a_0 + \delta a \\
b_1 &= b_0 + \delta b
\end{align*} [/tex]

5. Update J and f based on the updated values for a and b

6. Repeat steps 4 & 5 until [itex] \delta a [/itex] and [itex] \delta b [/itex] are sufficiently small.
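
In case a worked example helps, here is a rough numpy sketch of steps 1-6; the exponential model and its analytic Jacobian are assumed stand-ins for equation 1:

[code]
import numpy as np

def gauss_newton(f, jac, x, y, p0, tol=1e-8, max_iter=50):
    """Gauss-Newton: linearize f, solve J^T J z = J^T (y - f), update, repeat."""
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        r = y - f(x, *p)                        # residuals y_i - f(x_i, a, b)
        J = jac(x, *p)                          # n-by-2 Jacobian [f_a, f_b]
        z = np.linalg.solve(J.T @ J, J.T @ r)   # step z = (delta a, delta b)
        p += z                                  # step 4: update the guesses
        if np.all(np.abs(z) < tol):             # step 6: deltas are small
            break
    return p

# Assumed toy model f = a * exp(b * x) and its analytic Jacobian
f = lambda x, a, b: a * np.exp(b * x)
jac = lambda x, a, b: np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
x = np.linspace(0.0, 1.0, 50)
y = f(x, 2.0, -1.0)
print(gauss_newton(f, jac, x, y, [1.0, 0.0]))   # -> approximately [2.0, -1.0]
[/code]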
 
  • #4
The tie to Levenberg-Marquardt is as follows. The (square) matrix [itex] J^T J [/itex] can often be ill-conditioned at initial guesses of the function coefficients, making the equation [itex] J^T J z = J^T (y-f) [/itex] difficult if not impossible to solve. The (simpler) version of Levenberg-Marquardt modifies the problem by adding a small number [itex] \lambda [/itex] to the diagonals of [itex] J^T J [/itex], resulting in the following modified equation:

[tex] (J^T J + \lambda I) z = J^T (y-f) [/tex]

which is more numerically stable the larger [itex] \lambda [/itex] is. The trick is to make [itex] \lambda [/itex] as small as possible while still being able to solve the problem, then gradually reduce its magnitude to zero as the solution progresses.

One other note. The matrix [itex] J^T J [/itex] doesn't need to be computed at all if QR matrix factorization is used. Then, all that's needed is to factor [itex] J = QR [/itex] then solve the triangular system [itex] R z = Q^T (y-f) [/itex]. See link below for more explanation.

http://www.alkires.com/teaching/ee103/Rec8_LLSAndQRFactorization.htm

The next question might be how to add [itex] \lambda [/itex] to the diagonals of [itex]J^T J[/itex] if [itex]J^T J[/itex] is never computed? Actually, it is done by augmenting J and (y-f) as follows:

[tex] J^* = \begin{bmatrix} J \\ \sqrt{\lambda} I \end{bmatrix}
\qquad \qquad
(y-f)^* = \begin{bmatrix} (y-f) \\ 0 \end{bmatrix} [/tex]

Then factor [itex] J^* = QR [/itex] and solve [itex] R z = Q^T (y-f)^* [/itex].
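
A rough numpy sketch of that augmented step (using the same assumed exponential model as before; the schedule for shrinking [itex]\lambda[/itex] is omitted):

[code]
import numpy as np

def lm_step(J, r, lam):
    """One Levenberg-Marquardt step via the augmented QR factorization,
    solving (J^T J + lam I) z = J^T r without ever forming J^T J."""
    n = J.shape[1]
    J_aug = np.vstack([J, np.sqrt(lam) * np.eye(n)])  # J* = [J; sqrt(lam) I]
    r_aug = np.concatenate([r, np.zeros(n)])          # (y-f)* = [(y-f); 0]
    Q, R = np.linalg.qr(J_aug)                        # thin QR of J*
    return np.linalg.solve(R, Q.T @ r_aug)            # solve R z = Q^T (y-f)*

# Assumed toy data and Jacobian at initial guesses a = 1, b = 0
a, b = 1.0, 0.0
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-1.0 * x)
r = y - a * np.exp(b * x)
J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
print(lm_step(J, r, lam=1e-3))                        # (delta a, delta b)
[/code]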

This is probably more than you wanted to know...
 
  • #5
So in layman's terms, can you see how I can use Levenberg-Marquardt to solve equation 1 to find approximations of a and b, if I have lots of data for [itex]t_k[/itex]?
 
  • #6
Yes, but I presume you mean ordered pairs [itex]\big( k_i, t(k_i) \big) [/itex].
 
  • #7
Well, if [itex]t_k[/itex] = equation 1, I want to solve equation 2 for a and b. Is that the same?
 
  • #8
Sort of. Need to be careful of terminology: [itex]t_k[/itex] is an experimentally determined data point. Equation 1 is the idealized function that is supposed to represent the data, but doesn't precisely do so because of experimental error associated with the data points. The method to estimate the parameters a & b is least squares. Equation 2 is the least squares problem to solve based on the desired fitting function (i.e. equation 1).

Thus, if you have a bunch of experimentally determined data points [itex]t_k[/itex], estimates of the parameters a & b can be found by minimizing equation 2. I hope this helps explain the situation.

One thing that seems odd is the use of k in equation 2. The index k is meant to represent the index of data points, but k is also used within the fitting function itself (i.e. equation 1). Either that's a typo or the "x" values associated with the data points are really 0,1,2...n.
 
  • #9
OK, I will check the k issue, thanks for that. So what you are saying is that once I have the data points for [itex]t_k[/itex], I can use the least squares method to find nonlinear estimates for a and b.

So it may just be syntax, but the [itex]S(a,b,C_0)[/itex] confuses me a bit. Does that mean I am solving for a, b and [itex]C_0[/itex]? Does the normal least squares method work for finding estimates of two unknown variables?
 
  • #10
So what you are saying is that once I have the data points for [itex]t_k[/itex], I can use the least squares method to find nonlinear estimates for a and b.

Correct. Least squares can be used to find estimates for a and b. But, since Equation 1 is nonlinear in a & b, the nonlinear least squares algorithm needs to be used, which means an iterative solution process based on initial guesses for a & b.

So it may just be syntax, but the [itex]S(a,b,C_0)[/itex] confuses me a bit. Does that mean I am solving for a, b and [itex]C_0[/itex]?

Hard to say without seeing the article, but it may be subtle terminology. Out of context I would read equation 2 as involving three unknowns. But you said in an earlier post that you already knew [itex]C_0[/itex]. If that's true, I would write equation 2 as S(a,b) = ...

Does the normal least squares method work for finding estimates of two unknown variables?

Least squares can be used for finding estimates for as many unknowns as you have (and the computational resources to calculate). I've heard of problems with thousands of unknowns or more.
 
  • #11
OK, so will I end up with a couple of simultaneous equations to get each approximation for a and b?
 
  • #12
Correct, but it is an iterative process because the problem is nonlinear. You start with initial guesses for a & b. Then solve the least squares problem for [itex]\delta a[/itex] and [itex]\delta b[/itex] based on a linearized version of equation 1, update the values of a & b, and repeat until you have converged values for a & b. That's the way nonlinear least squares works.

I think you are at a disadvantage because you are jumping into a nonlinear least squares problem before understanding the basics of linear least squares.
 
  • #13
Hi again, sorry for the long delay.

I have been looking over what I have on least squares curve fitting, and I was wondering how to tell what order my curve fit will have to be to give an accurate representation of the data. The article I got it from used that Levenberg-Marquardt method; did that mean they didn't need to worry about the order, that it just found a good result for a, b and [itex]C_0[/itex]? My notes on least squares say that the final equation representing the data will be a quadratic?
 
  • #14
This is confusing. Please explain what you mean by "order" and "quadratic". It almost sounds like you are talking about polynomial fitting functions.

Can't you post relevant pages of the article? You are asking how to interpret an article we can't see.
 
  • #15
Sorry, I was confusing both of us when I tried to match your post from Sep 21, 2009, 02:56 PM against my notes on least squares. I will have to find a good tutorial on this Levenberg-Marquardt thing now. I understand that it does a linear regression and then the Gauss-Newton method to minimize, but I am not too sure about the Gauss-Newton bit at the moment.
 
  • #16
Wow, I never saw that tutorial at the bottom of all your posts:

3. Linear Least Squares, Non-Linear Least Squares, Optimization

Better have a read.
 

FAQ: Solving Equation 1 with Secant Method

What is the Secant Method?

The Secant Method is a numerical method used to find a root of a function. It is an iterative process that approximates the root by starting from two initial guesses and repeatedly computing a new estimate from the slope of the secant line through the two most recent points.
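
For concreteness, a minimal Python sketch of the iteration (the function g, starting guesses, and tolerance below are illustrative):

[code]
def secant(g, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: approximate a root of g using the slope of the
    line through the two most recent iterates."""
    for _ in range(max_iter):
        g0, g1 = g(x0), g(x1)
        x2 = x1 - g1 * (x1 - x0) / (g1 - g0)  # secant update
        if abs(x2 - x1) < tol:                # stop when the step is tiny
            return x2
        x0, x1 = x1, x2
    return x1

# Example: the positive root of g(x) = x**2 - 2 is sqrt(2)
print(secant(lambda x: x * x - 2.0, 1.0, 2.0))  # -> 1.41421356...
[/code]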

How does the Secant Method differ from other root-finding methods?

Unlike Newton's Method, the Secant Method does not require evaluating the derivative of the function; it approximates the derivative from the two most recent iterates. Unlike the Bisection Method, it does not need two initial guesses that bracket the root. However, it is not guaranteed to converge and can be sensitive to the initial guesses.

When is the Secant Method most useful?

The Secant Method is most useful when the derivative of the function is unavailable, or is difficult or expensive to compute, since it needs only function evaluations. It is also useful when the equation cannot easily be solved by hand.

What is the convergence rate of the Secant Method?

The convergence rate of the Secant Method is superlinear, of order approximately 1.618 (the golden ratio). This means that, near the root, the number of correct digits grows by a factor of about 1.618 per iteration: faster than linear convergence, but slower than the quadratic convergence of Newton's Method.

How do I know if the Secant Method has converged to the root?

To determine if the Secant Method has converged to the root, you can check if the current estimate is close enough to the previous estimate. If the difference between the two estimates is smaller than a predetermined tolerance level, then the method can be considered to have converged. It is also important to check if the function value at the estimate is close to zero.
