"Don't panic!"
I'm currently reviewing my knowledge of calculus and trying to include rigorous(ish) proofs in my personal notes, as I don't like accepting things in maths at face value. I've constructed a proof of the chain rule and was wondering if people wouldn't mind checking it and letting me know whether it is correct (and what improvements may need to be made). Thanks for your time.
From the definition of the derivative of a differentiable function [itex]f:\mathbb{R}\rightarrow\mathbb{R}[/itex] (one-dimensional case), we have that [tex]f'(x)=\frac{df}{dx}=\lim_{\Delta x\rightarrow 0}\frac{f(x+\Delta x)-f(x)}{\Delta x}= \lim_{\Delta x\rightarrow 0}\frac{\Delta f}{\Delta x}[/tex]
This implies that [tex]\frac{\Delta f}{\Delta x}= f'(x) +\varepsilon (x)\quad\Rightarrow\quad\Delta f = \left(f'(x)+\varepsilon (x)\right)\Delta x [/tex]
where [itex]\varepsilon (x)[/itex] is some error function which accounts for the difference between the actual (finite) change in [itex]f[/itex] and its linear approximation [itex]f'(x)\Delta x[/itex]. Furthermore, [itex]\varepsilon (x)[/itex] satisfies the property [tex]\lim_{\Delta x\rightarrow 0}\varepsilon (x)=0[/tex] such that as [itex]\Delta x \rightarrow 0,\quad\frac{\Delta f}{\Delta x}\rightarrow f'(x)[/itex].
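To make the error function concrete, here is a minimal numerical sketch. The choice [itex]f(x)=x^{2}[/itex], the point [itex]x=3[/itex], and the helper name `error_term` are illustrative assumptions, not part of the argument. For this [itex]f[/itex] the algebra gives [itex]\varepsilon=\Delta x[/itex] exactly, so the printed values should shrink in step with [itex]\Delta x[/itex] (up to floating-point rounding).

```python
def error_term(f, fprime, x, dx):
    """Return epsilon = (f(x + dx) - f(x)) / dx - f'(x).

    `error_term` is a hypothetical helper name used only for this sketch.
    """
    return (f(x + dx) - f(x)) / dx - fprime(x)


# Illustrative choice (an assumption): f(x) = x^2, so f'(x) = 2x
# and epsilon works out algebraically to exactly dx.
def f(x):
    return x ** 2


def fprime(x):
    return 2 * x


if __name__ == "__main__":
    for dx in (1e-1, 1e-2, 1e-3):
        print(dx, error_term(f, fprime, 3.0, dx))
```

Note that the error depends on [itex]\Delta x[/itex] as well as on [itex]x[/itex], which is why the helper takes both as arguments.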
Now, consider a function [itex]y=f\circ g(x)=f(g(x))[/itex] and let [itex]u=g(x)[/itex]. We have then, that [tex]\Delta u = g(x+\Delta x)-g(x)=\left(g'(x)+\varepsilon_{1}(x)\right)\Delta x[/tex] [tex]\Delta y = f(u+\Delta u)-f(u)=\left(f'(u)+\varepsilon_{2}(u)\right)\Delta u[/tex]
Note that [itex]\varepsilon_{2}(u)\rightarrow 0[/itex] as [itex]\Delta u\rightarrow 0[/itex]. However, since [itex]\Delta u\rightarrow 0[/itex] as [itex]\Delta x\rightarrow 0[/itex], this implies that [itex]\varepsilon_{2}(u)\rightarrow 0[/itex] as [itex]\Delta x\rightarrow 0[/itex].
And so,
[tex]f(u+\Delta u)-f(u)=\left(f'(u)+\varepsilon_{2}(u)\right)\Delta u[/tex] [tex]\Rightarrow f(g(x+\Delta x))-f(g(x))=f\circ g(x+\Delta x)-f\circ g(x)\\ \qquad\qquad\qquad\qquad\qquad\quad\;\;=\left(f'(g(x))+\varepsilon_{2}(g(x))\right)\cdot\left(g'(x)+\varepsilon_{1}(x)\right)\Delta x\\ \qquad\qquad\qquad\qquad\qquad\quad\;\;=f'(g(x))g'(x)\Delta x +\left(f'(g(x)) \varepsilon_{1}+g'(x)\varepsilon_{2}+\varepsilon_{1}\varepsilon_{2}\right)\Delta x\\ \qquad\qquad\qquad\qquad\qquad\quad\;\;=f'(g(x))g'(x)\Delta x +\varepsilon_{3}\Delta x[/tex]
where [itex]\varepsilon_{3}\equiv f'(g(x)) \varepsilon_{1}+g'(x)\varepsilon_{2}+\varepsilon_{1}\varepsilon_{2}[/itex]. We see from this that as [itex]\Delta x\rightarrow 0,\quad\varepsilon_{3}\rightarrow 0[/itex]. Hence,
[tex]\lim_{\Delta x\rightarrow 0}\frac{f\circ g(x+\Delta x)-f\circ g(x)}{\Delta x}= (f\circ g)'(x)=f'(g(x))g'(x)[/tex]
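As a sanity check on the decomposition (not a substitute for the proof), the sketch below evaluates [itex]\varepsilon_{1}[/itex], [itex]\varepsilon_{2}[/itex] and [itex]\varepsilon_{3}[/itex] numerically. The test functions [itex]f=\sin[/itex], [itex]g(x)=x^{2}[/itex], the point [itex]x=1[/itex], and the name `chain_rule_errors` are assumptions chosen for illustration. The sketch also assumes [itex]\varepsilon_{2}[/itex] is set to 0 when [itex]\Delta u=0[/itex], a case in which the quotient defining it does not exist.

```python
import math


def chain_rule_errors(f, fp, g, gp, x, dx):
    """Return (difference quotient of f∘g at x, epsilon_3) for step dx.

    `chain_rule_errors` is a hypothetical helper name for this sketch.
    """
    u = g(x)
    du = g(x + dx) - u
    eps1 = du / dx - gp(x)
    # Assumption: define eps2 = 0 when du == 0, since the quotient
    # (f(u + du) - f(u)) / du is undefined there.
    eps2 = (f(u + du) - f(u)) / du - fp(u) if du != 0 else 0.0
    eps3 = fp(u) * eps1 + gp(x) * eps2 + eps1 * eps2
    quotient = (f(g(x + dx)) - f(g(x))) / dx
    return quotient, eps3


if __name__ == "__main__":
    # Illustrative test functions (assumptions): f = sin, g(x) = x^2.
    g = lambda x: x ** 2
    gp = lambda x: 2 * x
    exact = math.cos(g(1.0)) * gp(1.0)  # f'(g(x)) g'(x) at x = 1
    for dx in (1e-2, 1e-3, 1e-4):
        quotient, eps3 = chain_rule_errors(math.sin, math.cos, g, gp, 1.0, dx)
        # quotient - exact should agree with eps3 up to rounding,
        # and both should shrink as dx does.
        print(dx, quotient - exact, eps3)
```

The two printed columns should match closely for each step size, reflecting the identity [itex]\Delta y/\Delta x=f'(g(x))g'(x)+\varepsilon_{3}[/itex].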