Error bounds for Newton's Method

In summary, the theorem shows that if the initial guess is sufficiently close to a root of the equation, then the Newton iterates converge quickly to that root.
  • #1
mcastillo356
TL;DR Summary
Would like to understand this theorem, or at least the statement. The textbook says demonstration lies on Mean-Value Theorem, but doesn´t provide, because "is of little practical use". The question I make is to just understand why, if ##K/(2L)<1##, the theorem shows that ##x_n## converges quickly to ##r## once ##n## becomes large enough that ##|x_n-r|<1##
Hi, PF

Sometimes it is not easy to find roots of functions. Newton gave a nice clue: the Newton's Method formula ##x_{n+1}=x_n-\dfrac{f(x_n)}{f'(x_n)}##. Now that I have understood and practiced it, my concern is to comprehend what I've sketched in the summary. This is all taken from "Calculus, 9th edition, by Robert A. Adams and Christopher Essex".

Theorem 2 Error bounds for Newton's Method
The following theorem gives sufficient conditions for the Newton approximations to converge to a root ##r## of the equation ##f(x)=0## if the initial guess ##x_0## is sufficiently close to ##r##.
Suppose that ##f##, ##f'##, and ##f''## are continuous on an interval ##I## containing ##x_n##, ##x_{n+1}##, and a root ##x=r## of ##f(x)=0##. Suppose also that there exist ##K>0## and ##L>0## such that for all ##x## in ##I## we have

(i) ##|f''(x)|\leq{K}## and

(ii) ##|f'(x)|\geq{L}##

Then

(a) ##|x_{n+1}-r|\leq{\dfrac{K}{2L}|x_{n+1}-x_n|^2}## and

(b) ##|x_{n+1}-r|\leq{\dfrac{K}{2L}|x_n-r|^2}##

Conditions (i) and (ii) assert that near ##r## the slope of ##y=f(x)## is not too small in size and does not change too rapidly. If ##K/(2L)<1##, the theorem shows that ##x_n## converges quickly to ##r## once ##n## becomes large enough that ##|x_n-r|<1##.
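For concreteness, here is a minimal sketch (not from the textbook; the function and interval are my own choices) that runs the iteration for ##f(x)=x^2-7## on ##I=[2,3]##, where ##K=2## and ##L=4##, so ##K/(2L)=0.25<1##, and checks bound (b) at each step:

```python
import math

def newton(f, fprime, x0, steps):
    """Plain Newton iteration: x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - f(xs[-1]) / fprime(xs[-1]))
    return xs

# f(x) = x^2 - 7 on I = [2, 3]: |f''(x)| = 2 = K and |f'(x)| = 2x >= 4 = L
f = lambda x: x * x - 7
fp = lambda x: 2 * x
K, L = 2.0, 4.0
r = math.sqrt(7)

xs = newton(f, fp, x0=3.0, steps=5)
for xn, xn1 in zip(xs, xs[1:]):
    err, bound = abs(xn1 - r), (K / (2 * L)) * (xn - r) ** 2
    print(f"|x_(n+1) - r| = {err:.2e}  <=  (K/2L)|x_n - r|^2 = {bound:.2e}")
```

The printed errors shrink roughly quadratically (each error is about the square of the previous one, times ##K/(2L)##), which is the "converges quickly" in the theorem's remark.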

Thanks in advance!
 
  • #2
Hi,

No quick responders so I'll try to give it a kick-off:

Is the proof of the theorem as presented here clear to you ?

##\ ##
 
  • #3
Thanks, @BvU. It's 2:13 AM now, but I will remain sleepless. I was stuck; now I'm hopeful.
 
  • #4
Hi, @BvU, I'm still not familiar with Taylor's Theorem; I just found a proof that relies on the Mean-Value Theorem. Could you help me work out the aspects I find difficult?
file:///C:/Users/usuario/Desktop/Newton-MVT.pdf
But first of all, is this link ok?
 
  • #6
Link looks OK to me. Can't say I find the top half of page 2 easy to follow, though...

##\ ##
 
  • #7
@BvU, the same feeling. The second page is difficult for me:

Since ##T'(\hat{x})=0##, in the neighborhood of the root we will be able to select a decay constant ##c<1##, so the root is an attractive fixed point of ##T##. In particular let ##D=[\hat{x}-r,\hat{x}+r]## where ##|T'(x)|\leq{c<1}## for all ##x\in{D}##. (If ##f'(\hat{x})\neq{0}## we can always find an ##r## such that the inequality holds for the given constant ##c##.) Then if ##u,v\in{D}##, the Mean Value Theorem gives ##T(u)-T(v)=T'(\tilde{u})(u-v)## for some ##\tilde{u}\in{D}## between ##u## and ##v##. Thus ##|T(u)-T(v)|\leq{c|u-v|}## for all ##u,v\in{D}##. In particular, ##|x_n-\hat{x}|\leq{c\;|x_{n-1}-\hat{x}|}\leq{...}\leq{c^{n}|x_0-\hat{x}|}##

How can I get to this last expression from the previous one?
 
  • #8
1- ##|T(u)-T(v)|\leq{c\;|u-v|}##, ##u,v\in{D_T}##

2- Domain of T: ##D=[\hat{x}-r,\hat{x}+r]##, where ##|T'(x)|\leq{c<1},\;\forall{x\in{D}}##

3- Recall the MVT: there exists ##\tilde{u}## with ##u<\tilde{u}<v## such that ##T'(\tilde{u})=\dfrac{T(u)-T(v)}{u-v}##

4- Take absolute values on 3: ##|T'(\tilde{u})|=\dfrac{|T(u)-T(v)|}{|u-v|}##

5- Set ##u=x_{n-1}## and ##v=\hat{x}##. Remember we are looking for ##T(\hat{x})=\hat{x}##

6- ##|x_n-\hat{x}|=|T(x_{n-1})-T(\hat{x})|\leq{\;c\,|x_{n-1}-\hat{x}|}##, since ##T(x_{n-1})=x_n## and ##T(\hat{x})=\hat{x}##

7- Recursively, ##|x_n-\hat{x}|\leq\;c\;|x_{n-1}-\hat{x}|\leq...\leq\;c^n\;|x_0-\hat{x}|##

Right? Take it as a question.
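The recursion in step 7 can also be checked numerically. A small sketch (my own, not from the pdf), with ##T## the Newton map for ##f(x)=x^2-7## and a hand-checked ##c=0.06## on ##D=[2.5,2.8]##:

```python
import math

# Newton's fixed-point map for f(x) = x^2 - 7 simplifies to the
# Babylonian map T(x) = (x + 7/x)/2, with fixed point sqrt(7).
T = lambda x: 0.5 * (x + 7.0 / x)
xhat = math.sqrt(7)

# On D = [2.5, 2.8] one checks |T'(x)| = |1 - 7/x^2|/2 <= 0.06 =: c < 1
# (|T'| is largest at the left endpoint x = 2.5, where it equals 0.06).
c = 0.06

x = x0 = 2.8
for n in range(1, 5):
    x = T(x)
    # the geometric bound from step 7: |x_n - xhat| <= c^n |x_0 - xhat|
    assert abs(x - xhat) <= c**n * abs(x0 - xhat)
    print(n, abs(x - xhat), c**n * abs(x0 - xhat))
```

In fact the actual errors fall well below the geometric bound ##c^n|x_0-\hat{x}|##, because Newton converges quadratically; the bound is what the MVT argument alone guarantees.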
 
  • #9
mcastillo356 said:
How can I get to this late expression from the previous one?
Yeah, sorry I didn't understand what the question was here. I'm sure you can see the unrolling from ##n## via ##n-1## all the way to ##0##, adding a factor ##c## at each step.

And going from ##T(u)-T(v)## to ##x_{n} - \hat x## doesn't seem nonsensical either...

mcastillo356 said:
Right? Take it as a question.
I'm not very good at rubber stamping approvals :wink: . All I can say is I can't find fault, but that doesn't really help you.

Thought I'd posted a snippet from Burden & Faires: Numerical analysis 9th ed, but perhaps it was removed by a mentor for copyright reasons ? Anyway, they follow the same thread of reasoning to prove a theorem 2.6 (using a fixed point theorem 2.4).

More important is a concluding remark there: the proof is all well and good, but it doesn't tell us how big ##D## is. So its practical value is minimal. That means that in real life one starts up Newton with the best available initial guess and, if it doesn't converge, one retreats to a plan B to improve the starting point.

This all sounds corny for a 1D case, but for applications like flowsheeting you can have over ##10^5## equations with as many unknowns and each and every one of those ##10^5## can throw you off a convergent track. Because there are minute numerical discontinuities (e.g. caused by if-statements in physical properties), numerical singularities, numerical differentiation, ##0=0## equations, and so on.

##\ ##
 
  • #10
BvU said:
Yeah, sorry I didn't understand what the question was here. I'm sure you can see the unrolling from ##n## via ##n-1## all the way to ##0##, adding a factor ##c## at each step.
[...]
@BvU, thanks. My intention now is to turn to a maths-focused forum, and keep this thread on standby until everything is clarified, i.e., until I understand the quadratic convergence proof provided, and Example 1, where it approximates ##\sqrt{7}##. I don't want to bore this site.
##D## will be one of the questions, but for now the target is achieved:
If ##x_0\in{D}##, then so are all of the ##x_n\in{D}## and ##x_n\rightarrow{\hat{x}}## as ##n\rightarrow{\infty}##
Two personal remarks: I hope PF checks this thread at all times; and I hope it is all consistent, coherent, and useful to someone else. Ultimately that's what my vanity asks for.
 
  • #11
BvU said:
Is the proof of the theorem as presented here clear to you ?

BvU said:
Thought I'd posted a snippet from Burden & Faires: Numerical analysis 9th ed, but perhaps it was removed by a mentor for copyright reasons ?
I don't see any history of your post being edited. Are you talking about the link in the first quote of yours that I copied?
 
  • #12
In that case I must have lost my draft.

[attached image: 1642797193530.png]

treated in the same way as in the link in #5, showing all the hypotheses in

[attached image: 1642797376700.png]

are satisfied. Then for the error estimate

[attached image: 1642797458800.png]

---

Coming back to #4: the entire Newton method is nevertheless based on the Taylor expansion of ##f## around ##p## (with ##\, f(p)=0\,##). So getting familiar with Taylor (Maclaurin) is more or less required (and very useful for many other applications as well).
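For the record, the second-order Taylor expansion makes the connection to the bounds in #1 explicit. Expanding ##f## about ##x_n## with Lagrange remainder and evaluating at the root ##r## (some ##\xi## between ##x_n## and ##r##):
$$f(r) = f(x_n) + f'(x_n)(r - x_n) + \tfrac{1}{2} f''(\xi)(r - x_n)^2 = 0 .$$
Dividing by ##f'(x_n)## and using ##x_{n+1} = x_n - f(x_n)/f'(x_n)## gives
$$x_{n+1} - r = \frac{f''(\xi)}{2 f'(x_n)}\,(x_n - r)^2 ,
\qquad\text{so}\qquad
|x_{n+1} - r| \le \frac{K}{2L}\,|x_n - r|^2 ,$$
which is exactly bound (b) of the theorem quoted in #1.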

##\ ##
 
  • #13
BvU said:
the entire Newton method is nevertheless based on the Taylor expansion of ##f## around ##p## (with ##\, f(p)=0\,##). So getting familiar with Taylor (Maclaurin) is more or less required (and very useful for many other applications as well).
Sure, but I haven't got in touch with it yet, though I'll soon have to face it.

I've been thinking about two different quotes. The one in the first post:
The casual expressions "not too small" and "too rapidly" kept me wondering. (i) and (ii) are just bounds.

Regarding the pdf, there is a typo in Example 1: it should be ##\sqrt{7}## instead of ##\sqrt{2}##. And for me it would be more comprehensible with ##|f''(x)|\leq B## instead of ##|f''(x)|< B##, and ##|f'(x)|\geq A## instead of ##|f'(x)|> A##, when it comes to introducing ##A## and ##B##.

Thanks, @BvU, PF!
 
  • #14
mcastillo356 said:
Regarding the pdf, there is a typo in Example 1. It should be ##\sqrt{7}## instead of ##\sqrt{2}##.
Yes, I see that the author inadvertently wrote ##\sqrt 2## in one place.
mcastillo356 said:
And for me it's more comprehensible if it would be ##|f''(x)|\leq B## instead of ##|f''(x)|< B##, and ##|f'(x)|\geq A## instead of ##|f'(x)|> A##, when it comes to introduce ##A## and ##B##.
This is of little consequence. All it says is that ##|f''(x)|## is bounded above and that ##|f'(x)|## is bounded below. For the bound on ##f''## it makes no difference whether you write ##<## or ##\le##, and similarly for the bound on ##f'##.
 

FAQ: Error bounds for Newton's Method

What is Newton's Method and why is it important in scientific research?

Newton's Method is an iterative algorithm used to find the roots of a function. It is important in scientific research because it allows for efficient and accurate approximation of solutions to complex equations, which are often encountered in various fields of study.

How does Newton's Method work?

Newton's Method takes an initial guess for a root of a function and uses the function's derivative to refine that guess iteratively until a desired level of accuracy is achieved.

What are error bounds and why are they important in relation to Newton's Method?

Error bounds refer to the maximum possible error between the actual solution and the approximation obtained through Newton's Method. They are important because they provide a measure of the accuracy of the solution and help determine when the algorithm has converged to the desired solution.

How can one calculate error bounds for Newton's Method?

Error bounds can be calculated using the Taylor series expansion of the function and its derivatives. The error bound is then determined by evaluating the remainder term of the series at the current approximation.

What are some common sources of error in Newton's Method and how can they be minimized?

Some common sources of error in Newton's Method include choosing a poor initial guess, encountering singular points or multiple roots, and using a limited number of iterations. These errors can be minimized by carefully selecting an initial guess, ensuring the function is well-behaved, and increasing the number of iterations or using more advanced convergence criteria.
