Is my proof for the Chain Rule invalid?

In summary, the conversation revolves around the validity of a proof for the Chain Rule that involves manipulating notation and infinitesimals. One person suggests a simpler proof involving the definition of a derivative, while another questions the arbitrary use of new definitions. The conversation ends with a discussion about the limit of the ratio used in the proof.
  • #1
prasannapakkiam
All the proofs I have found for the Chain Rule involve limits and the fundamental theorem of Algebra...

So I came up with a PROOF, not a derivation. But my teacher claims that my proof is invalid. Is it? If so, why?


let:
u=z(x)
y=f(u)=f(z(x))

therefore: dy/du = f ' (u)
therefore: dy = (f ' (u)) * du -->

therefore: du/dx = z ' (x)
therefore: 1/dx = (z ' (x))/du -->


therefore dy/dx = dy*(1/dx)
substituting...
therefore: dy/dx = ((f ' (u))*du)*(z ' (x))/du
which simplifies to:
dy/dx=(f ' (u))*(z ' (x))=(f ' (z(x)))*(z ' (x)) ==>
or alternatively substituting...
dy/dx=dy/du*du/dx ==>
 
  • #2
therefore: dy = (f ' (u)) * du -->

Uhh... is this rigorous? You're just manipulating notation. While extremely convenient, the use in a true proof is questionable.

Of course a proof of the chain rule will involve limits... because the definition of a derivative is based on a limit, and the chain rule is a statement about a derivative. You can often expect proofs to rely on definitions.
 
  • #3
Your proof is valid. Though there is a simpler proof.

du / dx = z'(x)

df / du = f'(u)

If we multiply both, we get df/dx, which is what we are looking for. Hence df / dx = f'(u) * z'(x)
 
  • #4
Well (to "Office Shredder"), the reason I used arrows was that my teacher could not follow my proof. Anyway, thanks. I just wanted to check whether my proof is valid. I insist that it is not a derivation. I see it like induction: that concept involves proving a statement/equation by proving that it works for all the numbers in the set for which it is claimed to work... Although this is a simplistic thought, it underlies the reason why I think my proof is valid. If I am "manipulating notation", then can you tell me why this is invalid in a 'Proof'?
 
  • #5
Office_Shredder said:
Uhh... is this rigorous? You're just manipulating notation. While extremely convenient, the use in a true proof is questionable.

Of course a proof of the chain rule will involve limits... because the definition of a derivative is based on a limit, and the chain rule is a statement about a derivative. You can often expect proofs to rely on definitions.

Manipulation of infinitesimals as such is valid because the meaning of the limit they represent is not lost.

Edit: e.g. With the expression 1/dx = (z ' (x))/du, the OP substituted to get dy/dx = ((f ' (u))*du)*(z ' (x))/du. This expression, interpreted as a limit, is still valid. Since du is a function of dx, i.e. du = du(dx), if we let dx go to a very small value, so does du, and the expression (z ' (x))/du becomes closer and closer to 1/dx. Since it gets closer and closer to 1/dx, ((f ' (u))*du)*(z ' (x))/du gets closer and closer to (f ' (u))*du / dx. Hence, if (f ' (u))*du / dx has a limit, the other expression will have the same limit because the two expressions become increasingly close as dx goes to 0.
 
  • #6
Werg22 said:
Your proof is valid. Though there is a simpler proof.

du / dx = z'(x)

df / du = f'(u)

If we multiply both, we get df/dx, which is what we are looking for. Hence df / dx = f'(u) * z'(x)

How exactly is this a proof? du/dx and df/du are not fractions.
 
  • #7
d_leet said:
How exactly is this a proof? du/dx and df/du are not fractions.

As said before, infinitesimals can be treated like numbers because the end result always represents the limit we are looking for once we "convert" it to a limit. This said, the simpler proof can be understood in another way: as in limits, du, df and dx are not 0. The expressions du/dx and df/du are thus fractions. Their product, df/dx, is also a fraction. However, it is more convenient to look at df/dx as a product. The definition of the derivative is dy/dx = f'(x) + k, where k becomes increasingly small as dx goes to 0. Hence we would have

df/dx = (z'(x) + k )*(f'(u) + l ) = z'(x)*f'(u) + lk + l(...) + k(...)

You can see that as dx becomes closer to 0, the values l and k become very small, and so do the terms lk + l(...) + k(...). If we introduce a variable m, such as m = lk + l(...) + k(...), we obtain

df/dx = z'(x)*f'(u) + m

Since m can be made as small as we wish, this new expression fits the definition of the derivative and hence z'(x)*f'(u) is the derivative.
 
  • #8
Thanks. Your Proof to my proof is exactly what I asked for...
 
  • #9
werg, your definition of dy/dx doesn't mean anything as far as I can tell; can you clarify it?
 
  • #10
my stomach hurts. this is horse****
 
  • #11
Office_Shredder said:
werg, your definition of dy/dx doesn't mean anything as far as I can tell; can you clarify it?

I have no definition of dy/dx: I just assigned dy/dx to a fraction for the sake of being practical. dy/dx in my explanation really means (f(x+h) - f(x))/h. Mathwonk, please elaborate...
 
  • #12
Werg22 said:
I have no definition of dy/dx: I just assigned dy/dx to a fraction for the sake of being practical.

It doesn't seem like such a great idea, at least for rigorous proofs, to arbitrarily assign new definitions to things that are already well defined.

Werg22 said:
dy/dx in my explanation really means (f(x+h) - f(x))/h.

But that is not what dy/dx is, it is the limit of this ratio.
 
  • #13
Werg22 said:
I have no definition of dy/dx: I just assigned dy/dx to a fraction for the sake of being practical. dy/dx in my explanation really means (f(x+h) - f(x))/h. Mathwonk, please elaborate...
Then you hit the wall that

[tex]\lim_{h\to 0}\frac{f(x+h)-f(x)}{h} \neq \frac{\lim_{h\to 0}\left(f(x+h)-f(x)\right)}{\lim_{h\to 0}h}[/tex]
 
  • #14
mathwonk said:
my stomach hurts. this is horse****

While I agree with the sentiment... not your most constructive post ever.

(Nor mine :rolleyes: )
 
  • #15
Werg22, I am all for your treatment of the differentials in such a manner, but as a nice trick to use in calculations, not as a *proof*. If you wish to justify your treatment rigorously, please prove that in every case the differentials can be treated as such, retaining their original definition.
 
  • #16
Okay, the arguments seem to have swerved. But nobody has explained whether my PROOF (NOT derivation) is valid. If not, nobody has yet stated why...
 
  • #17
In the strictest sense, it is not valid, as you have treated notation which just appears to look like a fraction as a fraction. We have discussed the pros and cons, and if you had read our responses you would have realized we have already stated why it is not valid.
 
  • #18
d_leet said:
It doesn't seem like such a great idea, at least for rigorous proofs, to arbitrarily assign new definitions to things that are already well defined.
But that is not what dy/dx is, it is the limit of this ratio.

I don't think you got the whole principle. Forget dy/dx. Pretend we are talking about (f(u+h) - f(u)) / i

where h is a small change in u that is itself a function of the change in x, which we shall denote i.

If we multiply the expression by the change h top and bottom, we get the fraction

(f(u+h) - f(u)) / h * h/i

Note that this is a fraction and no limit has been evaluated. Now as i -> 0, so does h. But however small i and h, the expression simplifies to (f(u+h) - f(u))/i.
Hence, the limit as i goes to 0 of THAT expression is the same as the limit of (f(u+h) - f(u))/i as i goes to 0. We have (definition of a limit):

(f(u+h) - f(u)) / h = f'(u) + k, h/i = z'(x) + l

where k and l are increasingly small for smaller and smaller i and h. This gives us the following expression for (f(u+h) - f(u)) / i:

(f(u+h) - f(u)) / i = (f(u+h) - f(u)) / h * h/i = f'(u)*z'(x) + kl + k(...) + l(...)

Here again, I have only dealt with fractions. Now look at what happens: if we let i go to 0, so does h. Hence, for a very small i, k and l will be very small. Now you can see that the expression kl + k(...) + l(...) approaches 0 and is, from a certain point, constantly approaching 0 without being bounded to a value close to it. Hence we can write

(f(u+h) - f(u)) / i = f'(u)*z'(x) + m(i)

where m(i) is a function of i and is equal to kl + k(...) + l(...). Now since we have deduced that m(i) is increasingly small (in absolute value) from a certain point and approaches 0, this new expression fits exactly what we mean by a limit. Hence f'(u)*z'(x) is defined as the limit. This is exactly what I did with my explanation that made dy/dx a real fraction: it was for the sake of being pragmatic. In the explanation, dy and dx were no longer infinitesimals, but just values such as f(u+h) - f(u), i and u. Now that I didn't use any of this, I hope it's clearer. Anyway, the real use of dy/dx spares us a lot of time as it spares us from considering everything with limits. And as for being rigorous, the explanation itself is plenty rigorous, just not the presentation - a ridiculous expectation on a forum.
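
As a rough numerical illustration of the decomposition above (a sketch, not a proof), the snippet below prints the remainder m(i) for shrinking i. The functions f = sin and z(x) = x^2, the point x0 = 1 and the step sizes are arbitrary choices made here for illustration; they are not part of the argument.

```python
import math

# Example functions chosen only for illustration (assumptions, not from the thread).
def z(x):
    return x * x

def f(u):
    return math.sin(u)

def remainder(x, i):
    """Return m(i) = (f(z(x+i)) - f(z(x))) / i - f'(z(x)) * z'(x)."""
    quotient = (f(z(x + i)) - f(z(x))) / i
    chain_rule_value = math.cos(z(x)) * (2 * x)  # f'(u) = cos(u), z'(x) = 2x
    return quotient - chain_rule_value

x0 = 1.0
steps = [10.0 ** -k for k in range(1, 7)]
remainders = [abs(remainder(x0, i)) for i in steps]
for i, m in zip(steps, remainders):
    print(f"i = {i:.0e}   |m(i)| = {m:.3e}")
```

For this particular z, the change z(x0 + i) - z(x0) is nonzero for small nonzero i, so splitting the quotient into two ratios causes no trouble here; the shrinking values of |m(i)| are only evidence for the formula, not a substitute for a limit argument.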
 
  • #19
Office_Shredder said:
Then you hit the wall that

[tex]\lim_{h\to 0}\frac{f(x+h)-f(x)}{h} \neq \frac{\lim_{h\to 0}\left(f(x+h)-f(x)\right)}{\lim_{h\to 0}h}[/tex]

No wall at all, h is left to be the same on top and bottom so it's the same expression.
 
  • #20
Except the RHS of that inequality doesn't even exist.

So you've proven no function has a derivative. Congratulations

Notice how, when being rigorous, one can prove lim(a/b) = lim(a)/lim(b) only if lim(a), lim(b) both exist and lim(b) =/= 0 (assuming lim(a/b) exists here)
 
  • #21
In general

[tex]\lim_{x\to c} \frac{f(x)}{g(x)} \not= \frac{\lim_{x\to c} f(x)}{\lim_{x\to c} g(x)}[/tex]
 
  • #22
Office_Shredder said:
Except the RHS of that inequality doesn't even exist.

So you've proven no function has a derivative. Congratulations

Notice how, when being rigorous, one can prove lim(a/b) = lim(a)/lim(b) only if lim(a), lim(b) both exist and lim(b) =/= 0 (assuming lim(a/b) exists here)

RHS? Meaning? And the condition lim (b) = 0 is the whole foundation of differential calculus, so you're going to have to explain what you mean here. And as I clarified before, h on the top and h on the bottom are kept equal, which does not change the limit.
 
  • #23
Gib Z said:
In general

[tex]\lim_{x\to c} \frac{f(x)}{g(x)} \not= \frac{\lim_{x\to c} f(x)}{\lim_{x\to c} g(x)}[/tex]

If the change h is kept equal on the denominator and numerator, the limit IS the same! And what do you mean in general? This inequality is only true if the limit of the denominator is 0.
 
  • #24
RHS means right hand side (of the equation).

And the condition lim (b) = 0 is the whole foundation of differential calculus

No, the foundation of differential calculus is that limits of the form lim(a/b) where lim(a) and lim(b) go to zero can converge. This doesn't mean it converges to the expression lim(a)/lim(b) (since then it wouldn't, because lim(a)/lim(b) doesn't even exist).

Do you even know what the definition of a limit is? Because this should be pretty obvious

EDIT

If the change h is kept equal on the denominator and numerator, the limit IS the same! And what do you mean in general? This inequality is only true if the limit of the denominator is 0.

That's the point, if the denominator is 0 or if either limit doesn't exist, then it's not true.

For example, if c=0, f(x)=1/x, g(x)=1/x^2, then lim f(x)/lim g(x) doesn't exist, but lim[f(x)/g(x)] = 0
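
A small numerical sketch of this example (the sample points are chosen here purely for illustration): near 0 the values of f and g individually blow up, so neither limit exists, yet the ratio f(x)/g(x) = x shrinks to 0.

```python
def f(x):
    return 1.0 / x

def g(x):
    return 1.0 / (x * x)

# Sample points approaching c = 0 from the right.
points = [10.0 ** -k for k in range(1, 6)]
rows = [(x, f(x), g(x), f(x) / g(x)) for x in points]
for x, fx, gx, ratio in rows:
    print(f"x = {x:.0e}  f(x) = {fx:.1e}  g(x) = {gx:.1e}  f(x)/g(x) = {ratio:.1e}")
```

The table only samples a few points, so it illustrates the behaviour rather than proving it; the actual statement follows from f(x)/g(x) = x for every x other than 0.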
 
  • #25
Office_Shredder said:
RHS means right hand side (of the equation).
No, the foundation of differential calculus is that limits of the form lim(a/b) where lim(a) and lim(b) go to zero can converge. This doesn't mean it converges to the expression lim(a)/lim(b) (since then it wouldn't, because lim(a)/lim(b) doesn't even exist).

Do you even know what the definition of a limit is? Because this should be pretty obvious

This is very frustrating. I will repeat it one last time. It converges towards the same value if the value of h in f(x+h) - f(x) and h itself are kept the same! If h = 0.001 on top, it's equal to 0.001 on the bottom!

f(x)=1/x, g(x)=1/x^2:

f(0+ h) = 1/h , g(0+h) = 1/h^2

now the expression f(h)/g(h) = h. And to what value does that expression converge, provided that the value of h is kept the same in both functions, as h goes to 0? 0.
 
  • #26
Noo... once you split it into two limits, there's no reason to think the h's go to zero just as fast or anything like that (again, definition of a limit?).

The h is just a dummy variable and could as easily be called k and j, or whatever.

I'm not responding anymore until you post the definition of a limit, because otherwise the conversation is meaningless, because you're not actually arguing from a logical basis
 
  • #27
Office_Shredder said:
Noo... once you split it into two limits, there's no reason to think the h's go to zero just as fast or anything like that (again, definition of a limit?).

The h is just a dummy variable and could as easily be called k and j, or whatever.

I'm not responding anymore until you post the definition of a limit, because otherwise the conversation is meaningless, because you're not actually arguing from a logical basis

The irony!
 
  • #28
a proof is a logical deduction starting from meaningful definitions.

in your argument you have made assertions without justification, like dy/du times du/dx equals dy/dx.

this assertion is in fact the statement of the chain rule that you are trying to prove.

this is not trivial, at least not with the usual limit definitions of dy/du and du/dx. any other non-standard definitions, involving say "infinitesimals", need to be accompanied by, and justified in terms of, appropriate definitions of infinitesimals.
 
  • #29
here is a fairly complete proof of the chain rule.

f is differentiable at z0 if and only if the limit of [f(z)-f(z0)]/(z-z0) exists and is finite. if so this limit is denoted f'(z0).

let u(x) be differentiable at x0 and also y(u) be differentiable at u0 = u(x0).

then we claim y(u(x)) is differentiable at x0 with derivative y'(u0).u'(x0).

lemma: u is also continuous at x0.
proof: since [u(x)-u(x0)]/(x-x0) -->u'(x0), as x-x0-->0, then

[u(x)-u(x0)] = [u(x)-u(x0)]/(x-x0) . [x-x0]-->0, as x-x0-->0. QED


Now we must show that [y(u(x))-y(u(x0))]/(x-x0)-->y'(u0).u'(x0),

as x-->x0.

case 1) assume u(x) is never equal to u0 on some punctured nbhd of x0.

then the fraction [y(u(x))-y(u(x0))]/(x-x0), equals the product

{[y(u(x))-y(u(x0))]/(u(x)-u0)}.{[u(x)-u(x0)]/(x-x0)}.

since in the product, the factors approach y'(u0) and u'(x0) respectively, the product rule for limits finishes it.

case 2) u(x) equals u0 for some x on every punctured nbhd of x0. now we cannot consider the denominator u(x)-u0 as x-->x0, for such points x but no matter.

as x-->x0 through other values, the previous argument still works, and now we know that the limit u'(x0)= 0, since it exists and the difference quotient [u(x)-u(x0)]/(x-x0) equals zero at some point of every punctured nbhd of x0.

thus as x-->x0 through values where u(x) differs from u0, the limit of

[y(u(x))-y(u(x0))]/(x-x0) is zero.

But also at the other points x where u(x) = u(x0), now
the numerator of [y(u(x))-y(u(x0))]/(x-x0) equals zero, so the limit is also zero as x-->x0 through points where u(x) = u(x0).
 
  • #30
I don't see what's the point of going through all of that. If we have (f(z(x + h)) - f(z(x))) / (z(x + h) - z(x)), certainly the limit of that expression as z(x + h) - z(x) goes to 0 is f'(z(x)). If we multiply the expression (f(z(x + h)) - f(z(x))) / h by [z(x + h) - z(x)] / [z(x + h) - z(x)], the limit doesn't change since this is equal to 1. So:

lim h -> 0 (f(z(x + h)) - f(z(x))) / h = lim h -> 0 [(f(z(x + h)) - f(z(x))) / (z(x + h) - z(x))] * [(z(x + h) - z(x)) / h].

The limit of the last expression is f'(z(x))*z'(x), and so is the limit of the original expression. If this is not rigorous, then I might as well just go to a nuthouse.
 
  • #31
Werg22 said:
If the change h is kept equal on the denominator and numerator, the limit IS the same! And what do you mean in general? This inequality is only true if the limit of the denominator is 0.

please explain what you mean by "keeping the change equal" in a limit using the formal definition of a limit...
 
  • #32
What I mean by this is: Suppose we have (f(x+h) - f(x))/h. Now we choose a very small value of h that is the same on top and bottom. For example h = 0.000001. We get (f(x+0.000001) - f(x))/0.000001. If we let h approach 0 whilst respecting that condition, then the expression (f(x+h) - f(x))/h tends towards the limit. In the expression of the derivative such as (f(z(x+i)) - f(z(x)))/i, the same concept applies. If we multiply top and bottom by h = z(x+i) - z(x), then we can rewrite the expression as the product of two quotients. Now, if we let i go to very small values, so does h. The expression h/i thus approaches the limit z'(x), and the expression (f(z(x+i)) - f(z(x)))/h approaches f'(z(x)). Expressing the expression in such a way shows us that

(f(z(x+i)) - f(z(x)))/i = f'(z(x))*z'(x) + kl + l(...) + k(...)

Where k and l are values that get smaller and smaller in scale as i approaches 0. You can see that kl + l(...) + k(...) doesn't have any lower bound: we can make it as small as we wish by conveniently reducing i. This is the very definition of a limit: the limit of a function f(x) at c is a constant C such that we have an equality

[tex] f(c + h) = C + \epsilon (h) [/tex]

in which epsilon can be made as small in scale as we wish by making h conveniently smaller. THIS is the definition of a limit, or at least the definition I have always believed in.
 
  • #33
werg, you do not seem to realize that it is possible the fraction you wrote has a zero denominator for infinitely many values of x on every nbhd of x0.

then it is not at all clear that multiplying by 0/0 is multiplying by 1.

i.e. you say "certainly the limit of that expression as z(x + h) - z(x) goes to 0 is f'(z(x))"

but the point is that the denominator has to go to zero through non zero values for that statement to be true.

and when you are just letting x go to x0, that may not be the case.
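
To make the concern concrete, here is a minimal sketch using a constant inner function (an example chosen here for illustration only, along with f = sin): z(x + h) - z(x) is exactly 0 for every h, so the factor [z(x + h) - z(x)] / [z(x + h) - z(x)] is 0/0 rather than 1, even though the original difference quotient is perfectly well defined.

```python
import math

def z(x):
    return 3.0  # constant inner function: z(x + h) - z(x) == 0 for every h

def f(u):
    return math.sin(u)

def direct_quotient(x, h):
    """(f(z(x+h)) - f(z(x))) / h, the quotient whose limit defines the derivative."""
    return (f(z(x + h)) - f(z(x))) / h

def split_quotient(x, h):
    """Same quotient written as a product of two ratios; needs z(x+h) != z(x)."""
    dz = z(x + h) - z(x)
    return (f(z(x + h)) - f(z(x))) / dz * (dz / h)

x0, h = 1.0, 1e-3
print(direct_quotient(x0, h))  # 0.0, matching f'(z(x)) * z'(x) = cos(3) * 0

try:
    split_quotient(x0, h)
    split_failed = False
except ZeroDivisionError:
    split_failed = True
print("split form divides by zero:", split_failed)
```

A constant z is the most extreme case. A non-constant function such as x^2 sin(1/x) (set to 0 at x = 0) is differentiable at 0 and still returns to its value at 0 at the points x = 1/(n pi), which lie in every punctured neighbourhood of 0.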
 
  • #34
I don't really understand the meaning of this... z(x + i) - z(x) is never 0, it's a function of i, which is itself never 0. z(x + i) - z(x) goes through strictly positive or negative values past a certain lower bound on h, this depending on the monotonicity of the function z at the point at which we are evaluating the derivative. This said, z(x + i) - z(x) tends towards 0, meaning it can be made as small as we wish.

but the point is that the denominator has to go to zero through non zero values for that statement to be true.

To repeat myself, z(x + i) - z(x) is never 0, as i is never 0. This is best shown by assigning the value z(x + i) - z(x) to h. We write in simplified terms:

(f(z(x+i)) - f(z(x))) / (z(x + i) - z(x)) = (f(z(x) + h) - f(z(x))) / h.

Since h tends towards 0, this expression, here again, tends towards the derivative at point z(x). The same goes with h/i. And since the product of the two ratios is always (f(z(x+i)) - f(z(x))) / i, here again reasserting the fact that neither h nor i is 0, we get the expression

(f(z(x+i)) - f(z(x))) / i = f'(z(x))*z'(x) + m

Where m gets increasingly small as the other two ratios get closer to the actual value of their respective derivatives. The above expression is exactly what we are looking for: f'(z(x))*z'(x) is the limit as i goes to 0, because m is a function of i and can be made as small as we wish.
 
  • #35
Sorry about this late reply. But I asked another teacher. He showed that my proof is valid:

BUT TO FULLY PROVE IT, I must take the limit of x-->0 in the END.

This creates:

lim(x-->0) (dy/du*du/dx)
= lim(x-->0) (dy/du) * lim(x-->0) (du/dx)

This proves it. But there is a problem with: lim(x-->0) (dy/du)

But as x-->0, u-->0
so this proves it. Agreed?
 
