What is the Connection Between the Chain Rule and Differentials?

In summary: The author asks why the chain rule cannot be proved simply by treating dy, du, and dx as real numbers and cancelling. The replies explain that dy/dx is defined as a limit of ratios rather than a ratio of two real numbers, so a valid proof has to work inside the limit. The author also asks for books that treat differentials in more depth than a standard calculus text.
  • #1
MathStudent
Hi,
I've seen a couple of proofs for the chain rule, and I know this probably sounds stupid, but I'm wondering why it can't be proved as follows:

given the real valued functions [itex]y=f(u), u=g(x) [/itex]
since [itex]dy, du, dx[/itex] are all real valued functions as well
can't you just state:
[tex]\frac{dy}{dx}=\frac{dy}{du}\frac{du}{dx} [/tex]
by the properties of real numbers?

can someone explain why this isn't an acceptable proof?

Also, since I'm on the subject of differentials, does anyone know of any good books on the theory of differentials? I've spent a lot of time thinking about this concept, and it seems to have a different meaning in different applications. I've heard that there are plenty of theories that explain what a differential is and say more about its uses (beyond the scope of a Calc 1-3 book). Any info would be greatly appreciated.

Thanks in advance!
 
  • #2
MathStudent said:
by the properties of real numbers?

can someone explain why this isn't an acceptable proof?

And what property would that be? Remember that, for instance, dy is not a real number, and dy/dx is not a ratio of two real numbers.

To make an acceptable proof, you have to apply your idea to the correct definitions of the terms involved.
 
  • #3
Thanks for your reply, Hurkyl, but I don't understand why dy is not a real number. From what I know,

[itex]dx = \Delta x[/itex] is an increment of x, i.e. the difference of two real numbers, which is itself a real number,

and since [itex]dy = f'(x)dx [/itex] and [itex]f'(x)[/itex] evaluated at some x is a real number,

it seems to me that both dx and dy take on real values and thus can be treated as real numbers, and so dy/dx could be treated as the ratio of two real numbers.

Please pardon my ignorance here; I realize I must be missing something important.
 
  • #4
That is certainly the inspiration for differentials, but it's not that easy. And while the notation is such that you can usually manipulate them as if they were real numbers, that's not always the case.

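Matt grime likes to give this identity involving three variables tied together by a single relation, where naive cancellation would suggest +1:

[tex]
\left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1
[/tex]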
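
The identity above is the triple product rule, which holds when each derivative is taken with the third variable held fixed. As a rough numerical sanity check, here is a minimal sketch that assumes the illustrative relation z = x*y (chosen only for this example, not taken from the thread) and estimates each factor with a finite difference:

```python
def central_diff(func, t, h=1e-6):
    """Symmetric finite-difference estimate of func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)


def triple_product(x0, y0):
    """Product (dx/dy)_z * (dy/dz)_x * (dz/dx)_y for the assumed relation z = x*y."""
    z0 = x0 * y0
    dx_dy = central_diff(lambda y: z0 / y, y0)  # x as a function of y, z held fixed
    dy_dz = central_diff(lambda z: z / x0, z0)  # y as a function of z, x held fixed
    dz_dx = central_diff(lambda x: x * y0, x0)  # z as a function of x, y held fixed
    return dx_dy * dy_dz * dz_dx


print(triple_product(2.0, 3.0))  # close to -1, not the +1 that cancelling would suggest
```

Cancelling the symbols as if they were fractions would predict +1, so this is a case where the fraction-like notation misleads.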

The actual definition of the derivative is:

[tex]
\frac{dy}{dx} := \lim_{\Delta x \rightarrow 0} \frac{\Delta y}{\Delta x}
[/tex]

So it's not simply the ratio of two real numbers, but the limit of such ratios. For the proof to be valid, you have to factor in the limit. For the case of the chain rule, it just means that you are cancelling real numbers inside the limit.
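
As a sketch of what "cancelling inside the limit" looks like for y = f(u), u = g(x) (the notation of the first post), assuming that Δu ≠ 0 for all sufficiently small nonzero Δx:

[tex]
\frac{dy}{dx} = \lim_{\Delta x \rightarrow 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \rightarrow 0} \left( \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x} \right) = \left( \lim_{\Delta u \rightarrow 0} \frac{\Delta y}{\Delta u} \right) \left( \lim_{\Delta x \rightarrow 0} \frac{\Delta u}{\Delta x} \right) = \frac{dy}{du} \frac{du}{dx}
[/tex]

The second equality is ordinary cancellation of real numbers. The third uses the product rule for limits together with the fact that Δu → 0 as Δx → 0, since g is continuous wherever it is differentiable. If Δu can vanish for arbitrarily small Δx, the ratio Δy/Δu is undefined there and the argument needs a separate patch, which is why textbook proofs are longer than this one line.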
 
  • #5
Hurkyl said:
For the case of the chain rule, it just means that you are cancelling real numbers inside the limit.

Just to clarify, you are stating that we must cancel real numbers inside the limit before evaluating the limit in order for a proof of the chain rule to be valid?

Also, I apologize: I didn't realize there was already a thread on this subject, which I found helpful. For anyone else with similar questions:
https://www.physicsforums.com/showthread.php?t=57419

It seems that there are subjects that go deeper into the theory of differentials and infinitesimals than what can be found in a standard calculus book.
Are there courses that go deeper into this theory?

Thanks by the way for all your help!
(I'm very impressed with this site!)
 
  • #6
Look closely at your calculus book! The one I am looking at (Calculus by Salas, Hille and Etgen, ninth edition) says "If Δx is small then df is approximately f'Δx". The statement "dx= Δx" is not true: it is an approximation.

[itex]\frac{dy}{dx}[/itex] is approximately equal to [itex]\frac{\Delta y}{\Delta x}[/itex].
 
  • #7
MathStudent said:
you are stating that we must cancel real numbers inside the limit before evaluating the limit in order for a proof of the chain rule to be valid?

I don't know all possible proofs of the chain rule -- I was just referring to the one I think is most straightforward, and highlighting the key difference between it and your invalid argument.


Your calc 1 book says exactly what df/dx means; you don't need to appeal to anything else.


Differentials are something else (but similar), but you wouldn't use them until you start doing differential geometry, or the like.


Infinitesimals are a different subject entirely. In standard analysis, the only infinitesimal is 0, so it's not a particularly useful concept.
 
  • #8
Actually you can prove the chain rule by just asserting
[tex]\frac{dy}{dt}= \frac{dy}{dx}\frac{dx}{dt}[/tex]
provided you have defined dy, dx, and dt as "infinitesimals". In order to define infinitesimals themselves, you have to go to "non-standard" analysis, which requires sophisticated notions from logic (specifically, the "compactness theorem": if every finite subset of a set of axioms has a model, then the entire set of axioms has a model).
 
  • #9
Almost, but not quite. In nonstandard analysis, dy/dx is defined to be equal to the standard part of Δy/Δx, provided that this exists and is the same for all choices of the nonzero infinitesimal Δx.

Taking the standard part of a number means to round to the nearest standard (i.e. real) number.

So, for the proof to be accurate, you need a theorem about how multiplication interacts with the standard part operation -- that std (xy) = (std x)(std y), given the appropriate hypothesis.
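
A sketch of how that theorem would enter, for y = f(u), u = g(x), assuming Δx is a nonzero infinitesimal, Δu ≠ 0, and both quotients are finite (so that std of a product is the product of the stds):

[tex]
\frac{dy}{dx} = \mathrm{std}\left( \frac{\Delta y}{\Delta x} \right) = \mathrm{std}\left( \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x} \right) = \mathrm{std}\left( \frac{\Delta y}{\Delta u} \right) \mathrm{std}\left( \frac{\Delta u}{\Delta x} \right) = \frac{dy}{du} \frac{du}{dx}
[/tex]

The last step also needs Δu to be infinitesimal, which follows from the continuity of g. The case Δu = 0 has to be handled separately, just as in the limit-based proof.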
 
  • #10
HallsofIvy said:
Look closely at your calculus book! The one I am looking at (Calculus by Salas, Hille and Etgen, ninth edition) says "If Δx is small then df is approximately f'Δx". The statement "dx= Δx" is not true: it is an approximation.

[itex]\frac{dy}{dx}[/itex] is approximately equal to [itex]\frac{\Delta y}{\Delta x}[/itex].
hmm ...that's interesting
I have never looked in that book, but that is something I have never heard before. Everywhere I've looked gives a slightly different definition.
They let
[tex]dx = \Delta x[/tex]
and then define
[tex]dy = f'(x)dx[/tex]

so if [itex]\Delta x[/itex] is small then
[itex]\Delta y[/itex] is approximately dy
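
To make the last statement concrete, here is a minimal numerical sketch. It assumes the illustrative function f(x) = x**2 at x = 1 (not a function from the thread) and compares the actual increment Δy with the differential dy = f'(x)dx:

```python
def f(x):
    return x ** 2


def f_prime(x):
    return 2 * x


def increment_and_differential(x, dx):
    """Return (delta_y, dy) for f at x with increment dx."""
    delta_y = f(x + dx) - f(x)  # actual change in y
    dy = f_prime(x) * dx        # change measured along the tangent line
    return delta_y, dy


for dx in (0.1, 0.01, 0.001):
    delta_y, dy = increment_and_differential(1.0, dx)
    print(dx, delta_y, dy, delta_y - dy)
```

For this particular f the gap Δy - dy is exactly dx², so it shrinks much faster than dx itself, which is the sense in which Δy is approximately dy.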
 
  • #11
In general, a proof first requires a definition. So it is clearly true that if you define dx to be Δx, and define dy to be f'(x)Δx, then obviously f'(x) = dy/dx.

The modern differential geometry definition of df, for any differentiable function f, is that it is a function on tangent vectors to the real line: given a point p on the real line and a tangent vector v at that point, df_p(v) is the derivative of f at p in the direction v. The standard tangent vector is the unit vector e in the positive x direction, and the derivative of f in that direction is the usual derivative, f'(p) = df_p(e).

If v is any tangent vector, one can always write it as a scalar multiple of the standard unit vector, v = ce, and then df_p(v) = c df_p(e) = c f'(p). So df_p is a linear function on the tangent space at p.

Now x is a function on the x axis, namely the identity function, and as such it has a differential dx, whose value at any point p and any vector v = ce is simply dx_p(v) = dx_p(ce) = c dx_p(e) = c·1 = c. Since a tangent vector v at p is merely the vector from p to p+v, we also call v = Δx. Thus, in this sense, dx_p(v) does equal Δx, i.e. it equals the difference v between p and p+v.

Now if v = ce, since df_p(v) = c df_p(e) = c f'(p) and dx_p(v) = c dx_p(e) = c, it follows that df_p is a function which, on every tangent vector at p, equals exactly f'(p) times what dx_p equals. Thus the quotient of the two linear functions df_p and dx_p is a constant function with value f'(p).

In this sense df_p/dx_p = f'(p) as a quotient of linear functions, for all p, and hence df/dx = f' is true as a quotient.

The definition of dx as Δx, while well meaning, is misleading, since it should say that for all p, the function dx_p on tangent vectors v at p equals the function Δx_p. Namely, both of them, acting on the vector from p to p+h, yield the number h.


In geometric terms, df is the family of linear functions whose graphs are the tangent lines to the graph of f. Thus dx is the family of tangent lines to the graph of y = x, namely the family of lines of slope 1, one copy for each point p on the x axis. Dividing df_p by dx_p, for a given p, means dividing these two linear functions, which amounts to dividing their slopes. This gives f'(p)/1 = f'(p); i.e. the function taking x to f'(p)x, divided by the function taking x to x, can be said to equal the constant function taking x to f'(p), i.e. the number f'(p).
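
In this picture the chain rule says that the differential of a composite is the composition of the differentials. The following is a minimal sketch of that reading, not a construction from the thread. The helper `differential` and the example functions g(x) = x**2 and f(u) = sin(u) are assumptions made for illustration.

```python
import math


def differential(f_prime, p):
    """Return df_p: the linear map v -> f'(p) * v on tangent vectors at p."""
    return lambda v: f_prime(p) * v


# Hypothetical example functions: u = g(x) = x**2 and y = f(u) = sin(u).
g = lambda x: x ** 2
g_prime = lambda x: 2 * x
f_prime = math.cos

p = 1.5
dx_p = differential(lambda x: 1.0, p)  # differential of the identity: dx_p(v) = v
dg_p = differential(g_prime, p)        # du at the point p
df_u = differential(f_prime, g(p))     # dy at the point u = g(p)


def d_composite(v):
    """d(f o g)_p written as the composition of df at g(p) with dg at p."""
    return df_u(dg_p(v))


for v in (1.0, -2.0, 0.25):
    # Quotient of two linear maps: the same number for every nonzero v.
    print(v, d_composite(v) / dx_p(v))
```

Every nonzero v gives the same quotient, f'(g(p)) g'(p), which is the chain rule stated as a quotient of linear functions.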
 

FAQ: What is the Connection Between the Chain Rule and Differentials?

What is the chain rule in calculus?

The chain rule is a calculus formula for differentiating composite functions. It states that the derivative of f(g(x)) is the derivative of the outer function f, evaluated at the inner function g(x), multiplied by the derivative of the inner function: (f(g(x)))' = f'(g(x)) g'(x).

How do you use the chain rule to find the derivative of a function?

To use the chain rule, first identify the outer function and the inner function. Then take the derivative of the outer function, evaluate it at the inner function, and multiply by the derivative of the inner function. Apply the chain rule again if the inner function is itself a composite function.
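
The steps above can be sketched for an illustrative composite, h(x) = sin(x**2), chosen only as an example. The chain-rule result is compared with a finite-difference estimate as a sanity check:

```python
import math


def h(x):
    """Composite function h(x) = sin(x**2): outer function sin, inner function x**2."""
    return math.sin(x ** 2)


def h_prime(x):
    """Chain rule: outer derivative at the inner value, times the inner derivative."""
    return math.cos(x ** 2) * (2 * x)


def central_diff(func, x, step=1e-6):
    """Symmetric finite-difference estimate of func'(x)."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 0.7
print(h_prime(x0), central_diff(h, x0))  # the two values should agree closely
```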

What is the purpose of the chain rule in real-world applications?

The chain rule is used to find the rate of change of a dependent variable with respect to an independent variable when the dependence passes through one or more intermediate variables. It is commonly used in physics, engineering, and economics to model and analyze real-world phenomena.

What is a differential in calculus?

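In elementary calculus, the differential of y = f(x) is dy = f'(x) dx, where dx is an independent increment in x. It gives the change in y measured along the tangent line at x, and it approximates the actual change Δy when dx is small. In more advanced settings, such as differential geometry, a differential is defined as a linear function on tangent vectors.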

How do differentials relate to the chain rule?

The chain rule can be used to find the differential of a composite function. For y = f(u) with u = g(x), the differentials are dy = f'(u) du and du = g'(x) dx, so dy = f'(g(x)) g'(x) dx: the derivative of the composite function multiplied by the differential of the independent variable.
