Confusion about Einstein notation

In summary: in the Einstein convention, summation is implied over an index that appears once as an upper index and once as a lower index. The thread clears up confusion about the two common conventions for writing a vector in components, what (if anything) the transpose of a vector means, how the inner product requires a metric, index placement in matrix-vector and matrix-matrix multiplication, and why contraction of indices does not preserve all of the information in the original tensor.
  • #1
TimeRip496
In the Einstein summation convention, summation is implied when an index appears once as an upper index and once as a lower index. However, I have some confusion.

1) $${\displaystyle v=v^{i}e_{i}={\begin{bmatrix}e_{1}&e_{2}&\cdots &e_{n}\end{bmatrix}}{\begin{bmatrix}v^{1}\\v^{2}\\\vdots \\v^{n}\end{bmatrix}},\ \qquad w=w_{i}e^{i}={\begin{bmatrix}w_{1}&w_{2}&\cdots &w_{n}\end{bmatrix}}{\begin{bmatrix}e^{1}\\e^{2}\\\vdots \\e^{n}\end{bmatrix}}}$$
Won't each of the above give me a scalar? And most texts seem to label ##v## here as a vector, including Wikipedia. I understand that the vector components are labelled ##v^i## and the coordinate basis vectors ##e_i##, but is the definition of a vector different in the Einstein convention?

In addition, how does the transpose of the above then work?
E.g. $$v^T=v^ie^i$$
Does it change only the coordinate basis and not the coefficients?

1a) For the transpose of a matrix, we just need to switch the two indices around. What about the transpose of a vector? Does it remain the same?

2) Inner product of vectors
To take the inner product of two vectors, I first need to convert one of them into a covector, right? In that case, the inner product of two vectors should be expressed as
$$v\cdot u=v^iu_i=g_{ij}v^iu^j$$

3) For repeated indices on a 4th-order tensor, can I rewrite it as a 2nd-order tensor without it losing its meaning?
E.g.
$$R^{\mu}{}_{\nu\mu\kappa}=R_{\nu\kappa}$$
Is the above equivalence valid? It doesn't seem correct to me, since the repeated index requires summation, so removing it removes the summation and the result seems to contain less information.

4) For matrix-vector or matrix-matrix multiplication, can they only be done when the repeated upper and lower indices from each tensor are side by side?
E.g. $$u_i=A^{j}_iv_j=v^jA_j^i$$
But this multiplication is not possible, right: $$u_i\neq A_{ij}v_j$$

5) As for derivatives, can the partial derivative tensor be arranged anywhere in the equation?
E.g. $$A_{ij}\partial_{\mu}\partial_{\nu}f(x)=\partial_{\nu}A_{ij}\partial_{\mu}f(x)$$
It should be possible based on the commutativity of partial derivatives, unless it is a covariant derivative.
 
  • #2
TimeRip496 said:
Won't each of the above give me a scalar?

It depends on what the ##e##'s are supposed to be. I don't know what textbooks or other sources you are looking at (Wikipedia is not the best place to learn about this stuff rigorously), but there are two notation conventions here that are easy to confuse.

Convention #1 says that a thing with upper indexes, like ##v^i##, is a vector, and a thing with lower indexes, like ##w_i##, is a covector (also called a 1-form). Then you can use the Einstein summation convention to form the scalar ##v^i w_i##. In fact, this can be taken as the definition of a covector: a linear mapping from vectors to scalars.
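For concreteness, in three dimensions the implied sum expands to
$$v^i w_i = v^1 w_1 + v^2 w_2 + v^3 w_3,$$
a single number, i.e. a scalar, obtained by feeding the vector ##v^i## into the linear map ##w_i##.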

Convention #2 says that you express a vector in components by multiplying each component by its matching basis vector. So you would write ##\vec{v} = v^i e_i##, where ##e_i## is the basis vector with index ##i##. However, calling this the Einstein summation convention is a bit of a misnomer, because the lower index on the basis vector ##e_i## does not mean it's a covector or a 1-form, and the sum is not a scalar, it's the vector ##\vec{v}##.

TimeRip496 said:
What about the transpose of a vector?

The idea of a "transpose" doesn't really apply to a vector. But in particular cases (basically, when you are working in a metric space), you can find a one-to-one correspondence between vectors and covectors, and then you can think of the covector ##v_i## that corresponds to the vector ##v^i## as being the "transpose" of a vector. However, this terminology has limited usefulness, since it really depends on thinking of vectors as columns and covectors as rows, with operators as matrices, and that representation gets problematic when you start working with more complicated vector spaces.
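For instance, when a metric ##g_{ij}## is available, the covector corresponding to ##v^i## is obtained by lowering the index,
$$v_i = g_{ij} v^j,$$
so in Euclidean space with ##g_{ij} = \delta_{ij}## the "transpose" has numerically the same components, while with the Minkowski metric ##\mathrm{diag}(-1,1,1,1)## the time component changes sign.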

TimeRip496 said:
To take the inner product of two vectors, I first need to convert one of them into a covector, right?

Yes, which means you can only do this if you have a metric, i.e., a correspondence between vectors and covectors. If you don't have a metric, then the concept of an inner product of two vectors has no meaning, nor does the concept of an inner product of two covectors. Only the inner product of a vector and a covector has meaning.
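With a metric, the inner product of two vectors is formed by lowering one index first,
$$v \cdot u = g_{ij} v^i u^j = v^i u_i,$$
which is exactly the expression in the original post; without ##g_{ij}## there is no invariant way to pair two upper indices.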

TimeRip496 said:
For repeated indices on a 4th-order tensor, can I rewrite it as a 2nd-order tensor without it losing its meaning?

This operation is called "contraction", and it does not preserve all of the information in the original tensor. But it is a valid operation when you have an upper and lower index on a tensor, yes.
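For example, contracting a (1-1) tensor produces its trace, ##A^i{}_i = A^1{}_1 + A^2{}_2 + \cdots + A^n{}_n##, a single scalar built from an ##n \times n## array of components. In the case asked about, contracting the first and third indices of the Riemann tensor gives the Ricci tensor, ##R_{\nu\kappa} = R^{\mu}{}_{\nu\mu\kappa}##; the summation over ##\mu## is still performed, it is just left implicit by the repeated index.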

TimeRip496 said:
For matrix-vector or matrix-matrix multiplication, can they only be done when the repeated upper and lower indices from each tensor are side by side?

Again, a proper understanding of this requires separating vectors and tensors from their representations as rows (or columns for covectors) and matrices. You don't multiply matrices and vectors or matrices and matrices. You combine vectors and tensors to form new vectors and tensors. For example, ##A^i_j v^j## is really two separate operations: first, combining the (1-1) tensor ##A^i_j## and the vector, or (1-0) tensor, ##v^k##, into the (2-1) tensor ##A^i_j v^k##, and then contracting the lower index with the second upper index. These operations have meaning even if the vectors and tensors do not have matrix representations.
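Written out, the two steps are
$$T^{i}{}_{j}{}^{k} = A^i{}_j\, v^k \qquad\text{(tensor product)},$$
$$u^i = T^{i}{}_{j}{}^{j} = A^i{}_j\, v^j \qquad\text{(contraction of the lower index with the second upper index)}.$$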

Also, the matrix representation of a tensor is ambiguous: it doesn't really distinguish between a (2-0) tensor, a (1-1) tensor, and a (0-2) tensor. This is a key reason for keeping vectors and tensors separate conceptually from particular representations.

TimeRip496 said:
the partial derivative tensor

The partial derivative is not a tensor; it's an operator. More precisely, it's an operation that can be used to build various operators on vectors and tensors.
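For instance, acting on a scalar field ##f##, the partials ##\partial_\mu f## do give the components of the 1-form ##df##, but acting on the components of a vector or tensor, ##\partial_\mu## alone does not produce the components of a tensor; that is one reason the covariant derivative is introduced.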

TimeRip496 said:
E.g.
$$
A_{ij}\partial_{\mu}\partial_{\nu}f(x)=\partial_{\nu}A_{ij}\partial_{\mu}f(x)
$$

I don't know what this equation is supposed to mean.
 
  • #3
My 2 cents: I strongly suggest that for basis vectors, you put parentheses around the index,

[tex]
V = V^i e_{(i)}
[/tex]

to emphasize that the index ##i## of ##e_{(i)}## labels not components, but whole basis vectors.
 
  • #4
TimeRip496 said:
In the Einstein summation convention, summation is implied when an index appears once as an upper index and once as a lower index. However, I have some confusion.

1) $$v=v^{i}e_{i}$$

This is not a lot different from the "standard" notation for linear algebra, except the sum has been suppressed:
$$\mathbf{v} = \sum_{i=1}^{n} v^i \mathbf{e}_i = \sum_i v^i \mathbf{e}_i = v^i \mathbf{e}_i$$
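In ##\mathbb{R}^2##, for example, ##v^1 = 3## and ##v^2 = 4## give ##\mathbf{v} = 3\mathbf{e}_1 + 4\mathbf{e}_2##: the result of the implied sum is the vector itself, not a scalar.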
 
  • #5
haushofer said:
My 2 cents: I strongly suggest that for basis vectors, you put parentheses around the index,

[tex]
V = V^i e_{(i)}
[/tex]

to emphasize that the index ##i## of ##e_{(i)}## labels not components, but whole basis vectors.

PeroK said:
This is not a lot different from the "standard" notation for linear algebra, except the sum has been suppressed:
$$\mathbf{v} = \sum_{i=1}^{n} v^i \mathbf{e}_i = \sum_i v^i \mathbf{e}_i = v^i \mathbf{e}_i$$

@PeroK 's post clarifies that [itex]v^i[/itex] is the [itex]i[/itex]th component and [itex]\mathbf{e}_i[/itex] is the [itex]i[/itex]th basis vector. Instead of boldface [which isn't so easy to do when writing], you could use the arrowhead notation.

$${\vec v} = v^i {\vec e}_i \qquad\mbox{implied summation}$$

Introducing greek abstract indices [not to be summed over, but the label of a "slot"] instead of the arrowheads,
$${v^\mu} = v^i {{ e}_i}^\mu \qquad \mbox{implied summation}$$

In column-vector form, for example,
[tex]
\begin{bmatrix}v^{1}\\v^{2}\\\vdots \\v^{n}\end{bmatrix}=
v^{1}\begin{bmatrix}1\\0\\\vdots \\0\end{bmatrix}+
v^{2}\begin{bmatrix}0\\1\\\vdots \\0\end{bmatrix}+
\cdots+
v^{n}\begin{bmatrix}0\\0\\\vdots \\1\end{bmatrix}
[/tex]
 
  • #6
robphy said:
Introducing greek abstract indices [not to be summed over, but the label of a "slot"] instead of the arrowheads

I don't think it's correct to "mix" slot notation and component notation like this. Where are you getting this from?
 
  • #7
PeterDonis said:
I don't think it's correct to "mix" slot notation and component notation like this. Where are you getting this from?

It's not ideal to have both (multiple) types of indices... especially for a novice... but sometimes it might be needed.

Here are sections from Penrose & Rindler's Spinors and Space-Time.

From p. 93 in Vol. I, in Ch. 2 on the abstract index notation:
[attached excerpt]

Then, at the top of the next page:
[attached excerpt]

Here's something from p. 81 showing an explicit summation symbol for a different combination of indices:
[attached excerpt]
 

  • #8
robphy said:
Here are sections from Penrose & Rindler's Spinors and Spacetime

Thanks for the reference! I admit there are a lot of complexities in this subject that I am not an expert on.
 
  • #9
PeterDonis said:
Thanks for the reference! I admit there are a lot of complexities in this subject that I am not an expert on.

Although different folks might have the same general idea of what they want to say, there is a wide variety of [possibly idiosyncratic] notations that they employ. Unfortunately, sometimes the notation gets too condensed and too abstract (pun intended). Many times, I feel I need a translator to unpack the notation.
 
  • #10
PeterDonis said:
It depends on what the ##e##'s are supposed to be. [...]
Thanks a lot!
 

FAQ: Confusion about Einstein notation

What is Einstein notation?

Einstein notation, also known as index notation or tensor notation, is a mathematical notation used to represent and manipulate multilinear functions in the context of tensor calculus. It was developed by physicist Albert Einstein and is commonly used in physics and engineering.

Why is Einstein notation used?

Einstein notation simplifies the representation and manipulation of multilinear functions, particularly in the context of tensor calculus. It allows for the concise expression of complex equations and helps to avoid repetitive calculations.

How does Einstein notation work?

In Einstein notation, repeated indices imply summation over all possible values of that index. This is known as the Einstein summation convention. This means that instead of writing out each individual term in a summation, we can simply use repeated indices to represent all the terms. Additionally, indices in Einstein notation are often used to indicate the type of tensor being represented (e.g. upper indices for contravariant tensors, lower indices for covariant tensors).
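For readers who compute numerically, here is a minimal sketch using NumPy's einsum function (assuming NumPy is installed), which applies the same "repeated index means sum" rule. Note that plain arrays do not track upper versus lower index placement, so this illustrates only the summation bookkeeping, not the covariant/contravariant distinction.

[code]
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # 2x2 array standing in for the components A^i_j
v = np.array([5.0, 6.0])     # components v^j of a vector
w = np.array([1.0, -1.0])    # components w_i of a covector

# u^i = A^i_j v^j : the repeated index j is summed over
u = np.einsum('ij,j->i', A, v)   # -> [17., 39.]

# s = w_i v^i : a full contraction yields a single scalar
s = np.einsum('i,i->', w, v)     # -> -1.0
[/code]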

What are the benefits of using Einstein notation?

Using Einstein notation can greatly simplify the representation and manipulation of complex equations in tensor calculus. It also allows for a more compact and efficient notation, making it easier to identify patterns and symmetries in equations.

Are there any limitations to using Einstein notation?

While Einstein notation is useful for representing and manipulating multilinear functions, it may not be the most intuitive notation for those who are unfamiliar with tensor calculus. Additionally, it may not be the most efficient notation for certain types of calculations, such as matrix operations or vector calculus.
