Why does MTW keep calling the "product rule" the "chain rule"?

In summary, the chain rule is a new term that is used in place of the "product rule" in the exercises in MTW p 257.
  • #36
FreeThinking said:
Also, starting with the last two paragraphs at the bottom of page 208, we establish that ## \boldsymbol {e}_\beta ## and ## \boldsymbol {\omega}^\alpha ## are general bases dual to each other. Continuing onto page 209, equation 8.19a says that ## {\boldsymbol \nabla}_\gamma \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ## . Then further down the page, equation 8.20 defines ## T^\beta_{\alpha,\gamma} \equiv {\boldsymbol \nabla}_\gamma T^\beta_\alpha \equiv \partial_{{\boldsymbol e}_\gamma} T^\beta_\alpha \equiv \partial_\gamma T^\beta_\alpha ##. With a general basis, not a local Lorentz frame, why are we defining the directional derivative ## {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} \equiv {\boldsymbol \nabla}_\gamma ## to be a partial derivative ## T^\beta_{\alpha,\gamma} \equiv \partial_\gamma T^\beta_\alpha ##? If we were using a coordinate basis, say ## \left \lbrace {\boldsymbol {\xi}_\gamma} \right \rbrace ##, it would make sense since ## {\boldsymbol {\xi}_\gamma} \equiv {\boldsymbol \nabla}_{{\boldsymbol e}_\gamma} ##, the directional derivative operator along the coordinate curve ## {\boldsymbol {\xi}_\gamma} ##. Perhaps if we stare at this section long enough, it might dawn on us what they actually mean. ... I tried to work through equation 8.19a & 8.20, but I'm still not getting the same result they seem to get.

Ok, I've stared at it a while longer, and here's what I'm seeing:

Based on how MTW defines things, as described above, I get ## \boldsymbol \nabla_\gamma T^\beta_\alpha = \Lambda^\mu_\gamma T^\beta_{\alpha,\mu} ##, using ## \boldsymbol e_\gamma = \Lambda^\sigma_\gamma \boldsymbol \xi_\sigma ## where ## \boldsymbol \xi_\sigma ## are the coordinate basis vectors. But if I replace the ## \boldsymbol e_\gamma ## with ## \boldsymbol \xi_\gamma ##, I get ## \boldsymbol \nabla_\gamma T^\beta_\alpha = T^\beta_{\alpha,\gamma} ## which seems to be what MTW says it should be.

But, I see several problems with this:
  • MTW has used ## \boldsymbol \nabla ## in such a way that it generates gamma correction terms when applied to a general tensor. But applying it to just the components of a tensor does not generate those components unless we interpret it as the semicolon operator, which they do not seem to do in (8.20).
  • MTW just defined ## \boldsymbol e_\beta ## to be a general basis, not necessarily a coordinate basis. Yet in (8.20) the ## \Lambda^\sigma_\gamma ## needed to define the general basis is nowhere to be found. It is as if MTW has suddenly changed ## \boldsymbol e_\beta ## to be a coordinate basis.

This is a case where the math itself confuses me even if we ignore the text. Which is why, when I arrive at other places in MTW that use the nabla symbol, I'm never sure what they mean at that particular point. I have to work the problem multiple ways until I stumble on the same result.

So, this is a question I would like to have answered: How is one to think about this? Is it a typo? Have they just switched back to using e as a coordinate basis? Is ## \boldsymbol \nabla_\gamma ## intended to be just the simple, elementary, partial derivative at this particular point in the text? Or, which I always consider to be the most likely case, what am I not understanding?
 
  • #37
FreeThinking said:
Based on how MTW defines things, as described above, I get ##\boldsymbol \nabla_\gamma T^\beta_\alpha = \Lambda^\mu_\gamma T^\beta_{\alpha,\mu}##, using ##\boldsymbol e_\gamma = \Lambda^\sigma_\gamma \boldsymbol \xi_\sigma## where ##\boldsymbol \xi_\sigma## are the coordinate basis vectors.

I don't understand what you're doing here. The Lorentz transformation ##\Lambda## doesn't appear anywhere in the section of MTW you're referring to, and anyway you don't use a Lorentz transformation to transform from local inertial coordinates to general curvilinear coordinates.

FreeThinking said:
but if I replace the ##\boldsymbol e_\gamma## with ##\boldsymbol \xi_\gamma## , I get ##\boldsymbol \nabla_\gamma T^\beta_\alpha = T^\beta_{\alpha,\gamma}## which seems to be what MTW says it should be.

I don't understand what you're doing here either. It doesn't help that you're throwing in your own notation ##\boldsymbol \xi_\gamma##, which doesn't appear anywhere in MTW. MTW always uses ##\boldsymbol e## for the basis vectors, not ##\boldsymbol \xi##.

FreeThinking said:
MTW has used ##\boldsymbol \nabla## in such a way that it generates gamma correction terms when applied to a general tensor. But applying it to just the components of a tensor does not generate those components unless we interpret it as the semicolon operator, which they do not seem to do in (8.20).

You are confused. You don't apply the ##\boldsymbol \nabla## operator to the components of a tensor.

##\boldsymbol \nabla##, by itself, with no subscripts, is a differential operator that takes an ##(m, n)## tensor (a tensor with ##m## upper indexes and ##n## lower indexes, or, in MTW's coordinate-free terminology, a tensor with ##m## slots that accept 1-forms and ##n## slots that accept vectors) to an ##(m, n+1)## tensor. (This is all explained in section 3.5 of MTW.) In other words, if I have a tensor ##\boldsymbol T##, then ##\boldsymbol \nabla \boldsymbol T## is another tensor with one more lower index. Applying ##\boldsymbol \nabla## by itself to the components of a tensor makes no sense.

If I want to express ##\boldsymbol \nabla \boldsymbol T## in component notation, then if ##\boldsymbol T## is a ##(1, 1)## tensor, i.e., in components it is ##T^\alpha{}_\beta##, then ##\boldsymbol \nabla \boldsymbol T## will be ##T^\alpha{}_{\beta ; \gamma}##.

MTW also use the notation ##\boldsymbol \nabla_{\boldsymbol u}##, i.e., ##\boldsymbol \nabla## with a subscript, to denote a different operator, the directional derivative along the 4-vector ##\boldsymbol u##. In component notation, ##\boldsymbol \nabla_{\boldsymbol u} T## is ##u^\gamma T^\alpha{}_{\beta ; \gamma}##.

In neither case described above do we apply the operator ##\boldsymbol \nabla## (with or without a subscript) to the components of a tensor.
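
For concreteness, here is a sketch of the two component expressions written out with the connection coefficients. This is the standard textbook expansion for a ##(1, 1)## tensor; the dummy index ##\mu## is just a label chosen here, and the index ordering on ##\Gamma## (derivative index last) is an assumed convention.

```latex
\begin{align}
(\boldsymbol\nabla \boldsymbol T)^\alpha{}_{\beta\gamma}
  &= T^\alpha{}_{\beta;\gamma}
   = T^\alpha{}_{\beta,\gamma}
     + \Gamma^\alpha{}_{\mu\gamma}\, T^\mu{}_\beta
     - \Gamma^\mu{}_{\beta\gamma}\, T^\alpha{}_\mu \\
(\boldsymbol\nabla_{\boldsymbol u} \boldsymbol T)^\alpha{}_\beta
  &= u^\gamma\, T^\alpha{}_{\beta;\gamma}
\end{align}
```

Choosing ##\boldsymbol u = \boldsymbol e_\gamma## in the second line picks out the single column ##T^\alpha{}_{\beta;\gamma}##, which is why ##\boldsymbol \nabla_\gamma## and the semicolon notation carry the same information.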

FreeThinking said:
MTW just defined ##\boldsymbol e_\beta## to be a general basis, not necessarily a coordinate basis. Yet in (8.20) the ##\Lambda^\sigma_\gamma## needed to define the general basis is nowhere to be found

I don't know where you're getting this from. You don't use a Lorentz transformation to go to general curvilinear coordinates. See above.

FreeThinking said:
This is a case where the math itself confuses me even if we ignore the text.

Have you encountered covariant derivatives in other textbooks? Have they confused you there?

For example, Carroll discusses covariant derivatives in his lecture notes. Were you able to follow his presentation?
 
  • #38
Peter: My apologies. I was trying to be brief and may have pulled an MTW on you. Just ignore it for now and don't spend any more time on it. I'm working on a longer version that will hopefully explain things more clearly. I've got things going on so it may be a few days before I can post it. I want to make sure I get it right this time.
 
  • #39
I have a question.

MTW says that the covariant derivative is a machine with slots that accepts inputs and produces an output. Looking specifically on page 255, Box 10.3, part A, sub-parts 3 through 5, here's how I interpret what they're saying there:

## \boldsymbol \nabla ## is a machine, called the covariant derivative, with 3 slots. If we plug certain types of objects into the appropriate slots, we get out new machines depending on which slots we fill. These new machines also have slots that accept the proper kind of object. Depending on which slots we fill, we get different outputs: One selection gives us a directional derivative, another selection gives us a gradient, and filling all the slots gives us a number.

The key point of all of this is that the machine called the "directional derivative" and the machine called the "gradient" are both instances of the more general machine called "covariant derivative". A "directional derivative" is a "covariant derivative", but a "covariant derivative" is not necessarily a "directional derivative".

It's analogous to how a car, a truck, and a bus are each an instance of a motor vehicle, but not every motor vehicle is a car, nor is every one a truck, and so on.

So, when MTW writes the term "covariant derivative" but then writes a mathematical expression that looks for all the world like a directional derivative, this practice is consistent with their definition of the "covariant derivative" being the generator of other kinds of derivatives.

Is this the correct view to take of how and why MTW keeps calling expressions that are the directional derivative by the name "covariant derivative"?
 
  • #40
All derivatives are directional derivatives by construction. However, we can consider the direction as a variable, a slot to be filled. This makes it a covariant derivative, since the direction hasn't been specified yet. The tricky point here, and what MTW tried to describe via the slot machine, is the fact that a derivative can be considered from many different perspectives, each resulting in a different object.

It is the path from the narrow high school perspective as ##f'(x)## being a "slope" to ##D_p f(v)## being a covariant derivative. As you can see, we can consider the differential process ##D##, the evaluation at a certain point ##p##, or in a certain direction ##v## or all of them to get a number ##D_pf(v)##. Even the function ##f## can be considered as a variable for the process ##D##.
It is always more or less the same thing, only differing in the point of view. But the objects are different as well. We have e.g. ##D_p(f+g)(v)=D_p(f)(v)+D_p(g)(v)## and ##D_p(f)(v+w)=D_p(f)(v)+D_p(f)(w)## but the same cannot be done on the location level ##p##. We also have the product rule ##D_p(f\cdot g)(v)=D_p(f)(v)\cdot g(p)+f(p)\cdot D_p(g)(v)## but this is not true on the direction level ##v##. So depending on what you consider variable, you get different results from the slot machine ##D##. And in the end, you can even consider all these on the component level with different coordinate systems.
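
A minimal numerical sketch of these properties, assuming scalar functions on ##\mathbb{R}^2## and a central-difference estimate of ##D_p f(v)##. The helper `directional_derivative` and the sample functions `f` and `g` are made up for illustration; the identities checked are additivity in the function slot, additivity in the direction slot, and the standard product (Leibniz) rule.

```python
import math

def directional_derivative(f, p, v, h=1e-6):
    """Central-difference estimate of D_p f(v) for a scalar function f on R^n."""
    forward = [pi + h * vi for pi, vi in zip(p, v)]
    backward = [pi - h * vi for pi, vi in zip(p, v)]
    return (f(forward) - f(backward)) / (2 * h)

# Hypothetical sample functions, chosen only for illustration.
def f(x):
    return x[0] ** 2 * x[1]

def g(x):
    return math.sin(x[0]) + x[1]

p = [1.0, 2.0]    # the location slot
v = [0.5, -1.0]   # one direction
w = [2.0, 0.25]   # another direction

D = directional_derivative

# Additivity in the function slot: D_p(f + g)(v) = D_p f(v) + D_p g(v)
sum_lhs = D(lambda x: f(x) + g(x), p, v)
sum_rhs = D(f, p, v) + D(g, p, v)

# Additivity in the direction slot: D_p f(v + w) = D_p f(v) + D_p f(w)
dir_lhs = D(f, p, [a + b for a, b in zip(v, w)])
dir_rhs = D(f, p, v) + D(f, p, w)

# Product rule in the function slot: D_p(f g)(v) = D_p f(v) g(p) + f(p) D_p g(v)
prod_lhs = D(lambda x: f(x) * g(x), p, v)
prod_rhs = D(f, p, v) * g(p) + f(p) * D(g, p, v)

print(abs(sum_lhs - sum_rhs), abs(dir_lhs - dir_rhs), abs(prod_lhs - prod_rhs))
```

All three printed differences should be at the level of finite-difference noise. Each identity fixes every slot except one, which is the sense in which the "same" derivative behaves differently depending on what is treated as the variable.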

Here's a list I once gathered:
https://www.physicsforums.com/insights/journey-manifold-su2mathbbc-part/
and "slope" wasn't even mentioned. If you want to read more, have a look at
https://www.physicsforums.com/insights/the-pantheon-of-derivatives-i/
 
  • #41
@fresh_42: Thanks for your reply.

The first part of your first paragraph sounds to me like you are saying the exact opposite of what I said:

Me: Dir deriv is a case of the more general covar deriv.
You: Covar deriv is a case of the more general dir deriv.
The last part of your first paragraph sounds to me like the "derivative" (with no adjectives) is the main thing & what kind of derivative (with adjectives) we get depends on how we look at it. Seems like what we call things just depends on our mood at the time. Ok, if that's the case, I'll adapt.

The rest of your post and the two references you gave will certainly keep me quiet for a while, which is always a worthy goal. I had already skimmed through those Insights some months ago trying to answer my own questions, but they seemed well over my head. I'll give them a closer look in light of what I've learned since last I read them.

Finally, since my last post, I have picked up a copy of Hobson. On first, preliminary skimming, it appears to me that he uses terms & symbols much closer to what I am used to from previous books I've read. So I'm going to concentrate on Hobson for a while before I return to re-reading MTW.

So, I've got a lot of reading to do now. Thanks to everyone for your help.
 
  • #42
FreeThinking said:
Me: Dir deriv is a case of the more general covar deriv.
You: Covar deriv is a case of the more general dir deriv.
This is because "more general" is without meaning! A covariant derivative is certainly a rather abstract construction. E.g. Wikipedia says: "A covariant derivative is a certain connection on a tensor bundle". So in this sense a covariant derivative is more general, as almost nothing is specified. If we say directional derivative, then we automatically ask: which direction? which function?, and this is less abstract.

What I wanted to say is that, whatever you take, at its core a derivative is a linear approximation of something curved. I was referring to this underlying principle, which is "more general".

If you choose specific examples, then the covariant derivative is probably "more general", but this is semantics as long as you do not define a measure for generality.

We always have derivative = (a topological object ##\mathcal{T}##, a location ##P##, a linear approximation ##D##, a flow ##\phi(t)## with direction ##v##). The topological object is usually the only part which isn't considered variable. This means depending on what is variable, the thing has different names:
  • ##(\mathcal{T}, - , - , - , -) = \text{ manifold }##
  • ##(\mathcal{T}, - , - , \phi(t) , -) = \text{ flow }##
  • ##(\mathcal{T}, - , -, \phi(t), v ) = \text{ vector field }##
  • ##(\mathcal{T}, - , D, - , - ) = \text{ differentiation }##
  • ##(\mathcal{T}, - , D , - , - ) = \text{ derivation }##
  • ##(\mathcal{T}, - , D , - , v ) = \text{ tangent bundle }##
  • ##(\mathcal{T}, - , D, \phi(t) , - ) = \text{ cotangent bundle }##
  • ##(\mathcal{T}, - , D, \phi(t), v) = \text{ connection }##
  • ##(\mathcal{T}, P , -, - , v) = \text{ tangent }##
  • ##(\mathcal{T}, P , D, - , -) = \text{ section }##
  • ##(\mathcal{T}, P , D, - , v) = \text{ directional derivative }##
  • ##(\mathcal{T}, P , D, - , x_i) = \text{ partial derivative }##
  • ##(\mathcal{T}, P , D, - , (x_1,\ldots,x_n) ) = \text{ total differential }##
  • ##(\mathcal{T}, P , D, \phi(t) , v) = \text{ slope }##
However, please, please, do not take this literally, even less than your "more general". This list is very loosely speaking and only meant to stress from how many different sides you can look at what in the end is only a slope. I pressed some terms into the scheme, that's why it does not serve as a definition - only as an impression! The most general term in the sense of common language is probably 'section of a fiber bundle'.
 
  • #43
o_O

Ok, thanks. This is way, way, way over my head, but I get your drift. There's more than one way to look at this, and one can get very abstract about it. For now, I'm going to have to stick to a more concrete view until I get more experience with it all.

I've been reading Hobson and I find that he definitely uses the same terms & notations that I picked up from previously read books. Coupled with his providing more steps in the derivations, I find him much easier to follow than MTW, even easier than "A first course ..." by Schutz. He defines the covariant derivative as I first learned it: ## \nabla_\gamma T^\rho_\lambda = T^\rho_{\lambda ; \gamma} = T^\rho_{\lambda , \gamma} + \Gamma^\rho_{\sigma \gamma} T^\sigma_\lambda - \Gamma^\sigma_{\lambda \gamma} T^\rho_\sigma ## .
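
As a sanity check on that formula, here is a small numerical sketch. It assumes the flat plane in polar coordinates ##(r, \theta)##, whose only nonzero connection coefficients are ##\Gamma^r_{\theta\theta} = -r## and ##\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r##, and it uses the tensor ##\boldsymbol e_x \otimes \boldsymbol{d}x##, which is constant in Cartesian coordinates. All function names are made up for illustration, and the partial derivatives are estimated by central differences.

```python
import math

R, THETA = 0, 1  # index labels for the polar coordinates (r, theta)

def christoffel(x):
    """G[rho][sigma][gamma] for the flat plane in polar coordinates."""
    r = x[R]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[R][THETA][THETA] = -r
    G[THETA][R][THETA] = 1.0 / r
    G[THETA][THETA][R] = 1.0 / r
    return G

def tensor(x):
    """Components T[rho][lam] of e_x (tensor) dx in the polar coordinate basis."""
    r, th = x
    c, s = math.cos(th), math.sin(th)
    return [[c * c, -r * s * c],
            [-s * c / r, s * s]]

def partial(T, x, gam, h=1e-6):
    """Central-difference estimate of the comma derivative T^rho_{lam,gam}."""
    xp, xm = list(x), list(x)
    xp[gam] += h
    xm[gam] -= h
    Tp, Tm = T(xp), T(xm)
    return [[(Tp[a][b] - Tm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

def covariant_derivative(T, x):
    """Return out[rho][lam][gam] = T^rho_{lam;gam} using the formula above."""
    G = christoffel(x)
    Tx = T(x)
    out = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for gam in range(2):
        dT = partial(T, x, gam)
        for rho in range(2):
            for lam in range(2):
                val = dT[rho][lam]                      # comma term
                for sig in range(2):
                    val += G[rho][sig][gam] * Tx[sig][lam]  # + Gamma^rho_{sig gam} T^sig_lam
                    val -= G[sig][lam][gam] * Tx[rho][sig]  # - Gamma^sig_{lam gam} T^rho_sig
                out[rho][lam][gam] = val
    return out

point = [2.0, 0.7]
nabla_T = covariant_derivative(tensor, point)
max_abs = max(abs(nabla_T[a][b][c])
              for a in range(2) for b in range(2) for c in range(2))
comma_only = partial(tensor, point, THETA)[R][R]
print(max_abs, comma_only)
```

The comma derivatives alone are not zero (the polar components vary from point to point), but once the two ##\Gamma## terms are added every component of ##T^\rho_{\lambda;\gamma}## should vanish up to finite-difference noise, as expected for a tensor that is constant in Cartesian coordinates.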

I'll keep a copy of your post & keep it handy as I read your Insights. Thanks for taking so much time & effort for me. I sincerely appreciate it.
 
