The Chain Rule, death to anyone that breaks the rule

Cyrus · May 26, 2005

Ok gotcha! STUPID ME, I looked at a second calc book I have, Swokowski, which explains this VERY problem! Oh well, I wouldn't have REALLY understood it if I had just read it. It was worth the three days of pitiful thinking to come up with such a stupidly simple answer. One problem was that I was trying to take the limit using the differential equation, i.e. with dz and not delta z! And that made me say to myself, what the hell is a limit of dz/dx, and how does that turn into partial z / partial x! Aye, stupid mistake.

Ok, so now things are looking A LOT better. The only thing left now is to show how THIS final equation, which we both hopefully agree on, is equal to

[tex] \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} [/tex]

Cyrus · May 26, 2005

Ahhhhhhhhh yes, the only thing left to do now is make x'=X(x,y)=x equal to just the variable x, and likewise for y. Then partial x' / partial x is just the partial of x'=X(x,y)=x with respect to x (these two x's have different meanings: one is the x slot in the function Q(x,y,z), and the other is the x inside the parentheses (x,y). Confusing, ain't it :-) ), which is 1! YIPPIE! Also, y'=Y(x,y)=y, so partial y' / partial x is zerooooooooo, double YIPPIE! So now we get galileo's answer via a much more rigorously correct proof. Finally, this chain rule stuff makes sense. Whew, I guess that means I get to live for another day!

mathwonk · May 26, 2005

try and get on the freeway, and off the feeder roads, at some point, and understand what i am telling you.

Cyrus · May 26, 2005

I tried to follow your proof of the chain rule but don't understand some of the things you presented. What is this little o, big O deal? Does it have anything to do with the epsilon, or is it another concept? Sorry, but I have to plead ignorance here.

P.S. Now that I have figured out how to prove this using the more confusing and tedious way, and I am sure your way is much better, I want to make sure I fully understand BOTH ways. So now I will go back and read YOUR posts. Please don't feel that I have not appreciated the help you have provided, even though I haven't looked through it yet.

Oh, and to arildno; did I do the right thing by making x'=x, and y'=y and z'=f(x,y), so that the partials work out, because both you and galileo stated it as x(x,y), but I don't see how you can reduce x(x,y) so that partial x / partial x is equal to one, unless x(x,y) is only a function of the inner x only.

arildno · May 27, 2005

cyrusabdollahi said:

Ahhhhhhhhh yes, the only thing left to do now is make x'=X(x,y)=x equal to just the variable x, and likewise for y. Then partial x' / partial x is just the partial of x'=X(x,y)=x with respect to x (these two x's have different meanings: one is the x slot in the function Q(x,y,z), and the other is the x inside the parentheses (x,y). Confusing, ain't it :-) ), which is 1! YIPPIE! Also, y'=Y(x,y)=y, so partial y' / partial x is zerooooooooo, double YIPPIE! So now we get galileo's answer via a much more rigorously correct proof. Finally, this chain rule stuff makes sense. Whew, I guess that means I get to live for another day!

You've got it.
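
A quick numerical sanity check of the result, as a rough sketch only: the functions Q and f below are made-up examples (they do not come from the thread), and the partial derivatives are estimated by central differences rather than computed exactly. With x'=x, y'=y, z'=f(x,y), the x-derivative of Q(x, y, f(x,y)) should agree with partial Q / partial x + (partial Q / partial z)(partial z / partial x).

```python
import math


def Q(x, y, z):
    # Hypothetical example function, chosen only for illustration.
    return x * y + z * z * math.sin(x)


def f(x, y):
    # Hypothetical z = f(x, y).
    return x * x + 3.0 * y


def partial(g, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative of g in slot i."""
    up = list(args)
    dn = list(args)
    up[i] += h
    dn[i] -= h
    return (g(*up) - g(*dn)) / (2.0 * h)


def composite(x, y):
    # Q evaluated with x' = x, y' = y, z' = f(x, y).
    return Q(x, y, f(x, y))


def chain_rule_lhs(x, y):
    # Differentiate the composite directly with respect to x.
    return partial(composite, (x, y), 0)


def chain_rule_rhs(x, y):
    z = f(x, y)
    dQ_dx = partial(Q, (x, y, z), 0)
    dQ_dz = partial(Q, (x, y, z), 2)
    dz_dx = partial(f, (x, y), 0)
    # dx'/dx = 1 and dy'/dx = 0, so the y term drops out entirely.
    return dQ_dx * 1.0 + dQ_dz * dz_dx


if __name__ == "__main__":
    x0, y0 = 0.7, -1.2
    print(chain_rule_lhs(x0, y0), chain_rule_rhs(x0, y0))
```

The two printed numbers should match to several decimal places; any small gap comes from the finite-difference step h, not from the chain rule.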

Cyrus · May 27, 2005

Thanks arildno, you're a lifesaver, and you've saved me twice already. Once here, and once before in surface integrals!

Now for another side question.

The definition of linearization for a curve is based on a visual proof. Similarly, the definition of linearization of a surface is based on a visual proof using the tangent plane. But for higher dimensions, like in our case, the linearization is a (tangent surface?). But there is seemingly NO way to visualize how this formula is derived. Do we just define it to be written the way it is (using our knowledge of symmetry in going from a curve to a surface, we extend that symmetry in the hope that it still holds true for higher n-dimensional spaces)?

mathwonk · May 27, 2005

a derivative is a linear map that approximates a given non linear map locally at some point. the whole point in defining a derivative is to say precisely how good an approximation the linear map must be to the given non linear map, in order to be considered its derivative. in the usual one variable case we say the derivative of f at a equals f'(a) provided [f(x)-f(a)]/(x-a) converges to f'(a) as x goes to a.

this treats the derivative f'(a) as a number. of course the derivative is really the linear function f'(a)(x-a) which approximates to the function f(x)-f(a) in the sense that their graphs are tangent to each other at x=a.

so how close is this linear function f'(a)(x-a) to the non linear function f(x)-f(a)?

well, when you subtract them you get an error term, [f(x)-f(a) - f'(a)(x-a)] which not only goes to zero as x does, but even the ratio
[f(x)-f(a) - f'(a)(x-a)] /(x-a) also still goes to zero as x-a does.

We say a function with this property, that it not only goes to zero, but also the ratio after dividing it by |x-a|, goes to zero with x-a, i.e. a function whose graph is tangent to the x-axis at x=a, is "little oh" of x-a.

then the definition of the derivative of f at a, is that it is a linear function of x-a whose graph is tangent to the graph of f(x)-f(a) at x=a, i.e. the graph of their difference is tangent to the x-axis at a, i.e. they differ by a function which is "little oh", i.e. very small.

This definition works in all dimensions and even in infinite dimensions. I.e. a function is little oh if its graph is tangent to the source space axis, i.e. if o(v)/|v| goes to zero as v does.

then a derivative of f at a, is a linear function L(x-a) of x-a such that the difference

f(x)-f(a)-L(x-a), is little oh at x=a,

i.e. a linear function L is the derivative of f at a, if and only if |f(x)-f(a)-L(x-a)|/|x-a| goes to zero as x-a does.

so the visual intuition of linearity in terms of flatness of the graph, is replaced by the algebraic notion of linearity of the function, i.e. L(v+w) = L(v)+L(w), etc...

and the geometric notion of tangency is replaced by the analytic description of tangency of the graphs, i.e. the slope of the distance between the graphs goes to zero, i.e. not only does [f(x)-f(a) - L(x-a)] approach zero, but the slope of this error term, i.e. [f(x)-f(a) - L(x-a)]/|x-a| also goes to zero, as x-a does.

if you think about it this says the graph of the difference looks kind of like an upside down umbrella, and is tangent to the "x axis" at x=a.

so to repeat: the first thing to define is what it means to have derivative zero at a, and we say o(v) has derivative zero (at v=0), iff o(v)/|v| approaches zero as v does.

then we say f has derivative L at a, iff L is a linear function and

f(a+v)-f(a)-L(v) has derivative zero at v=0.

I promise you this is worth learning.
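
To see the "little oh" condition in numbers, here is a minimal sketch. The choice f(x) = x^3 at a = 2 and the helper names are assumptions made purely for illustration. It computes the error term f(x) - f(a) - m(x-a) divided by |x-a|, once with the true slope m = f'(a) = 12 and once with a deliberately wrong slope m = 11.

```python
def f(x):
    # Hypothetical example function for illustration.
    return x ** 3


def error_ratio(x, a, slope):
    """Return [f(x) - f(a) - slope*(x - a)] / |x - a| for x != a."""
    return (f(x) - f(a) - slope * (x - a)) / abs(x - a)


a = 2.0
true_slope = 12.0   # f'(2) = 3 * 2**2
wrong_slope = 11.0  # a linear map that is close, but not the derivative

steps = [10.0 ** (-k) for k in range(1, 6)]

# With the true slope the ratio shrinks toward zero along with the step.
good_ratios = [error_ratio(a + h, a, true_slope) for h in steps]

# With the wrong slope the error still goes to zero, but the ratio does not.
bad_ratios = [error_ratio(a + h, a, wrong_slope) for h in steps]

if __name__ == "__main__":
    for h, good, bad in zip(steps, good_ratios, bad_ratios):
        print(h, good, bad)
```

With the wrong slope the error itself still shrinks, but the ratio settles near 1 instead of 0, which is the sense in which only one linear function is "tangent" to f(x)-f(a).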

Cyrus · May 28, 2005

I'm trying to make sense of your post, mathwonk. We'll take it one line at a time.

well, when you subtract them you get an error term, [f(x)-f(a) - f'(a)(x-a)] which not only goes to zero as x does, but even the ratio
[f(x)-f(a) - f'(a)(x-a)] /(x-a) also still goes to zero as x-a does.

I'm not seeing what you mean by "it goes to zero as x does, but even the ratio..." Does it not go to zero as x goes to a, not as x goes to zero?

mathwonk · May 28, 2005

right you are. my error.

the whole point is to define "tangent to zero."

a function o is "tangent to zero" if its graph is tangent to the graph of the zero function, i.e. the slope uniformly in every direction is zero, i.e. o(v)/|v| goes to zero as v does.

then two functions that both vanish at zero are tangent to each other (at zero) if their difference is tangent to zero.

then f is differentiable at a if and only if f(a+v)-f(a), as a function of v, is tangent to some linear function of v.

i.e. iff there exists some linear function L such that the difference

f(a+v)-f(a)-L(v) is tangent to zero,

iff [f(a+v)-f(a)-L(v)]/|v| goes to zero as v does.

this is the universally agreed upon correct definition of a derivative, in use at least since G. H. Hardy's "A Course of Pure Mathematics" in 1910 or so.

It is this definition that makes the proof of the chain rule most natural, in all dimensions at once, as I have outlined above.

I learned it from Lynn Loomis; see his Advanced Calculus, joint with Shlomo Sternberg, or Jean Dieudonne's Foundations of Modern Analysis.
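
The condition that [f(a+v)-f(a)-L(v)]/|v| goes to zero as v does can be checked numerically in two variables. This is only a sketch under assumptions: the function f(x,y) = x^2 y + sin(y), the point a = (1, 2), and all helper names are invented for illustration, with L taken to be the dot product of v with the gradient of f at a. Because "tangent to zero" means the slope is zero uniformly in every direction, the sketch takes the worst ratio over a ring of directions.

```python
import math


def f(x, y):
    # Hypothetical two-variable example function.
    return x * x * y + math.sin(y)


def L(a, v):
    """Candidate linear map: gradient of f at a, dotted with v."""
    ax, ay = a
    grad_x = 2.0 * ax * ay
    grad_y = ax * ax + math.cos(ay)
    return grad_x * v[0] + grad_y * v[1]


def tangency_ratio(a, v):
    """Return |f(a+v) - f(a) - L(v)| / |v| for a nonzero vector v."""
    norm = math.hypot(v[0], v[1])
    err = f(a[0] + v[0], a[1] + v[1]) - f(a[0], a[1]) - L(a, v)
    return abs(err) / norm


def worst_ratio(a, t, n_dirs=16):
    """Largest ratio over n_dirs evenly spaced directions with |v| = t."""
    worst = 0.0
    for k in range(n_dirs):
        angle = 2.0 * math.pi * k / n_dirs
        v = (t * math.cos(angle), t * math.sin(angle))
        worst = max(worst, tangency_ratio(a, v))
    return worst


if __name__ == "__main__":
    a = (1.0, 2.0)
    for t in (1e-1, 1e-2, 1e-3, 1e-4):
        print(t, worst_ratio(a, t))
```

The worst ratio should shrink roughly in proportion to |v|. Sampling 16 directions is of course not a proof of uniformity, only a plausibility check.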

Cyrus · May 29, 2005

[f(x)-f(a) - f'(a)(x-a)] /(x-a) also still goes to zero as x-a does.

When x goes to a, the denominator goes to zero. Could you expand on what you mean by that, please? Are you assuming this based on the use of L'Hopital's rule, and defining this function such that it will go to zero when L'Hopital's rule is applied?

mathwonk · May 29, 2005

this is just the definition of a derivative, i.e. the usual definition is that

f'(a) is the derivative of f at a if and only if [f(x)-f(a)]/(x-a) approaches f'(a) as x goes to a.

i.e. if and only if [f(x)-f(a)]/(x-a) - f'(a) approaches zero as x goes to a,

if and only if [f(x)-f(a)]/(x-a) - [f'(a)(x-a)/(x-a)] approaches zero as x goes to a,

if and only if [f(x)-f(a) - f'(a)(x-a)]/(x-a) approaches zero as x goes to a.

Cyrus · May 29, 2005

We say a function with this property, that it not only goes to zero, but also the ratio after dividing it by |x-a|, goes to zero with x-a, i.e. a function whose graph is tangent to the x-axis at x=a, is "little oh" of x-a.

I don't see what you mean by tangent to the x-axis at x=a. It is not tangent to the curve f(x)?

Cyrus · May 30, 2005

Ahhhhhhhhhhh, I did not read your post VERY carefully, mathwonk:

this is just the definition of a derivative, i.e. the usual definition is that

f'(a) is the derivative of f at a if and only if [f(x)-f(a)]/(x-a) approaches f'(a) as x goes to a.

i.e. if and only if [f(x)-f(a)]/(x-a) - f'(a) approaches zero as x goes to a,

if and only if [f(x)-f(a)]/(x-a) - [f'(a)(x-a)/(x-a)] approaches zero as x goes to a,

if and only if [f(x)-f(a) - f'(a)(x-a)]/(x-a) approaches zero as x goes to a.

I read you loud and clear now on that issue.

mathwonk · May 30, 2005

great! may i say you are one of the most patient and excellent students i have encountered on this thread.

Cyrus · May 30, 2005

We say a function with this property, that it not only goes to zero, but also the ratio after dividing it by |x-a|, goes to zero with x-a, i.e. a function whose graph is tangent to the x-axis at x=a, is "little oh" of x-a.

Could you please help me out on this line? I am not seeing what you mean by tangent to the x-axis at x=a. Also, is "little oh" of x-a what you're calling the function f(x)?

Also, a stupid question, but I'll ask it anyway (I should know this by now). How come you do not run into trouble when you divide by (x-a)? As x approaches a, you approach division by zero. This is also true in the very definition of a derivative, because you're dividing by h as h approaches zero.

mathwonk · May 30, 2005

the graph is tangent to the x axis.

as to your second question, do you know about limits?

you seem to need some review of this basic topic. the fundamental idea is that a limit has a priori nothing at all to do with the value at the point. so the fact that (f(x)-f(a))/(x-a) is undefined at x=a has nothing at all to do with the limit at x=a.

for example, (x-a)/(x-a) has a limit as x goes to a, even though the bottom goes to zero.
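
The (x-a)/(x-a) example can be tried directly, as a small sketch; the value a = 3 and the names below are arbitrary choices made for illustration. Evaluating at x = a fails, yet every nearby value is 1, and only the nearby values matter for the limit.

```python
def g(x, a):
    # The quotient (x - a)/(x - a); undefined when x == a.
    return (x - a) / (x - a)


a = 3.0

# The value AT x = a does not exist: Python raises ZeroDivisionError.
try:
    g(a, a)
    defined_at_a = True
except ZeroDivisionError:
    defined_at_a = False

# The values NEAR x = a are all exactly 1, so the limit as x -> a is 1.
nearby = [g(a + 10.0 ** (-k), a) for k in range(1, 8)]

if __name__ == "__main__":
    print(defined_at_a, nearby)
```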

Cyrus · May 30, 2005

Right, that's why I asked earlier about having to use L'Hopital's rule for the limit to be equal to a defined value as the denominator and numerator go to zero. Oh, oops, READ THE FINE PRINT! It says in my calc 1 book: as x approaches a, but NOT EQUAL TO a! Ok, I get it! I'll just blame the TV with the Miss Universe pageant in the background for the sudden lapse of judgement! Aye caramba, I picked the wrong major!

mathwonk · May 30, 2005

it makes no sense to speak of using l'hopitals rule to evaluate derivatives since you must know what a derivative is before even stating l'hopitals rule.

you may need to go back pretty much to the beginning, and learn about limits and derivatives properly. i recommend spivak's calculus book.

Cyrus · May 30, 2005

It's been a while since I dealt with limits. But I take back what I said about L'Hopital's rule. The limit is as x approaches a, but x does not EQUAL it. I forgot that minor but important detail. Hopefully we're both on the same page on that issue now.

Cyrus · May 30, 2005

See here's what I mean though,

If we have, as you state, [f(x)-f(a) - f'(a)(x-a)]/(x-a), then we can call the numerator [f(x)-f(a) - f'(a)(x-a)] = F(x) and the denominator (x-a) = G(x). Then we can rewrite this limit as F(x)/G(x) as x approaches a. The consequence is that it has the form zero over zero, and thus my earlier statement about L'Hopital's rule. The derivative of the bottom function would be d/dx (x-a), which is just one, and the derivative of the numerator would be f'(x) - 0 - f'(a), which at x = a is just f'(a) - 0 - f'(a) = 0.

Thus, you get zero over 1, and the limit converges to zero. I am a little rusty, I may be wrong though. My whole issue is that you are talking about a limit that approaches zero in the denominator and numerator, and that's a problem.

Cyrus · May 30, 2005

I have to go to bed now, but I will put up a simple example to better show my question. For instance, the definition of a derivative says to take the limit as x-->a of
[f(x)-f(a)]/(x-a)

If we have something like f(x) = x^2 and a = 1, then the quotient simplifies: you can factor the numerator and cancel the x-a term in the denominator, thus eliminating the division by zero problem. But this works only for nice functions like this one, and does not work for all functions in general, i.e. it's not always possible to eliminate the division by zero.
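
The f(x) = x^2, a = 1 example can be tabulated numerically, as a rough sketch. The helper names are made up, and the second function, sin(x) at a = 0, is an added illustration that is not part of the original question. For x^2 the quotient equals x + 1 away from x = 1, so the values head toward 2. For sin(x) there is no algebraic cancellation to perform, yet the quotient values still settle toward a single number.

```python
import math


def difference_quotient(f, a, x):
    """Return [f(x) - f(a)] / (x - a); only meaningful for x != a."""
    if x == a:
        raise ValueError("the quotient is not defined at x == a")
    return (f(x) - f(a)) / (x - a)


def square(x):
    return x * x


def approach(f, a, n=6):
    """Quotient values at x = a + 10**-k for k = 1..n, never at x == a."""
    return [difference_quotient(f, a, a + 10.0 ** (-k)) for k in range(1, n + 1)]


# f(x) = x^2 at a = 1: the quotient equals x + 1 for x != 1.
square_values = approach(square, 1.0)

# sin(x) at a = 0: nothing cancels algebraically, but the values still converge.
sine_values = approach(math.sin, 0.0)

if __name__ == "__main__":
    print(square_values)
    print(sine_values)
```

A table like this only suggests the limit; it does not prove it. The point is just that the quotient is never evaluated at x = a itself.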
