Basic Question That Will Eat Your Brains

  • Thread starter sponsoredwalk
  • Start date
In summary, the conversation revolves around matrix multiplication and confusion over its various definitions and explanations. The original poster expresses frustration with the lack of clear explanations in textbooks and forums, but praises Sal of Khan Academy for providing a thorough explanation. The transpose operation is also brought up in connection with row and column vectors. The poster mentions previous struggles with linear algebra, memorizing algorithms and faking proofs. The conversation turns to whether "magic" or shortcuts in mathematics, particularly in differential forms, have a logical reason behind them or are simply defined because they work.
  • #1
sponsoredwalk
I'm so angry :mad: You know basic matrix multiplication? Every book I've
looked in (& I've spent all frickin' day on Google Books & Amazon checking
this out) defines matrix multiplication either in the shorthand summation
notation or else as the standard row-column algorithm you're supposed to parrot off.
Every forum I've read defines matrix multiplication as these things -
I mean, it's the definition, who am I to question it? - or else says
'it's because it works, it's just a convenient way to define it this way',
or else uses some linear transformation explanation that I haven't studied
yet (but will in 1 chapter!), but this linear transformation thing doesn't look
convincing to me from what I understand of it. Basically the only person
who went to the trouble of explaining this properly was Sal of
khanacademy :cool: http://www.khanacademy.org/video/linear-algebra--matrix-product-examples?playlist=Linear Algebra Did you know about this (I'll explain in a minute)?
Where &/or in what book did you learn about it?
Well, not every book defines it as these things. http://books.google.ie/books?id=2w-...resnum=4&ved=0CDIQ6AEwAw#v=onepage&q&f=false by Lang gives
a slightly better explanation in terms of dot products, but it wasn't
satisfying enough. It was enough of a hint at the right way to do this,
but unfortunately he didn't explain it properly.

Basically I am partly posting this to find out more about a specific operation
known as the transpose. I'm actually a little confused because in one
book, http://books.google.ie/books?id=Gv4...resnum=1&ved=0CCwQuwUwAA#v=onepage&q&f=false, he defines a vector in two ways:

X = (x,y,z)

or

___|x|
X= |y|
___|z|
(Obviously the ___ are just to get the | | things to form a column shape :p)

which are equivalent but then in the video I linked to above Sal calls
the vector:
___|x|
X= |y|
___|z|
as if it's normal but calls X = (x,y,z) the transpose, Xᵀ = (x,y,z).

I remember from my old study (http://tutorial.math.lamar.edu/Classes/LinAlg/LinAlg.aspx) of linear algebra (which I hated & quit
because it made no sense memorizing algorithms and faking my way through proofs)
that the transpose is different somehow and is used in inverting a matrix,
I think, but are a vector and its transpose the same thing or something?
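
If it helps to see this concretely, here is a minimal NumPy sketch, with made-up numbers of my own, of what transposing a vector does: a 3x1 column and its transpose, a 1x3 row, hold the same three entries but have different shapes.

[code]
import numpy as np

x_col = np.array([[1.0], [2.0], [3.0]])  # a 3x1 column vector
x_row = x_col.T                          # its transpose: a 1x3 row vector

print(x_col.shape)  # (3, 1)
print(x_row.shape)  # (1, 3)

# Same entries, different shape; transposing twice gives the column back.
print(np.array_equal(x_row.T, x_col))  # True
[/code]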

Anyway, using this idea of a transpose makes the whole concept of
matrix multiplication 100% completely, lovingly, passionately, painfully,
hatefully, relievingly intelligible. I think the picture is clear:

http://img155.imageshack.us/img155/2433/blaea.jpg



In part 3 you just take the transpose of each row of the 2x3 matrix &
dot it with the 3x1 matrix. I just wrote part 4 in as well because
in that book around page 4 he defines both modes of dot product as
being the same, which they are.
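
If you want to check part 3 numerically, here is a minimal NumPy sketch (with a made-up 2x3 matrix and 3x1 column, not the ones from the picture): each entry of the product is the dot product of a row of the left matrix with the column on the right.

[code]
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # 2x3
x = np.array([[7], [8], [9]])    # 3x1 column

# Built-in matrix product
Ax = A @ x                       # shape (2, 1)

# "Picture" method: dot each row of A with the column x
by_rows = np.array([[A[0, :] @ x[:, 0]],
                    [A[1, :] @ x[:, 0]]])

print(np.array_equal(Ax, by_rows))  # True
[/code]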
It seems like a trick the way I've decomposed the matrix though,
I mean I could use these techniques to multiply matrices regardless
of their dimensions.
I've just written down a method using these to multiply two matrices
of dimensions 2x3, i.e. (2x3)•(2x3) and gotten a logical answer.

If we copy the exact algorithm I've used in the picture then multiplying
two matrices of equal size is indeed meaningless as you take the dot
product of two differently sized vectors but if I play with the techniques
used to decompose a big matrix I can swing it so that I get a logical
answer.

[1,2,3][a,b,c] = [1,2,3][a] [1,2,3][b] [1,2,3][c]
[4,5,6][d,e,g] = [4,5,6][d] [4,5,6][e] [4,5,6][g]

(This is the same as I do in the picture, then instead of transposing
straight away I just use more of this decomposition only this time I
decompose the left matrix instead of the right one):

[1,2,3][a] [1,2,3][b] [1,2,3][c] = [1][a] [2][a] [1][b] [2][b] [3][b] ...
[4,5,6][d] [4,5,6][e] [4,5,6][g] = [4][d] [5][d] [6][d] [4][e] [5][e] ...

I can view this as a dot product:

[1]•[a] [2]•[a] [1]•[b] [2]•[b] [3]•[b] ...
[4]•[d] [5]•[d] [6]•[d] [4]•[e] [5]•[e] ...

(Obviously I just wrote 2 •'s @ each row vector to keep it neat ;))

and I end up with some ridiculously crazy matrix, essentially meaningless,
but still following the "rules" I came up with.
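
As a quick sanity check in NumPy (again with made-up matrices), the dimension rule shows up directly: a (2x3)·(3x1) product is defined, but asking for a (2x3)·(2x3) product raises an error, because the row-by-column dot products would pair vectors of different lengths.

[code]
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # 2x3
B = np.array([[7, 8, 9],
              [10, 11, 12]])   # 2x3

print(A @ np.array([[1], [0], [1]]))  # (2x3)(3x1) works fine

try:
    A @ B  # (2x3)(2x3): inner dimensions 3 and 2 don't match
except ValueError as e:
    print("Shape mismatch:", e)
[/code]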

This is important: my little knowledge of Hamilton is that he just defined
i² = j² = k² = ijk = -1 because it worked. Maybe this is true, and from what I know
using this kind of algebra is useful in special relativity, but I think it's
literally a cheat, there is no explanation other than "it works".
Hopefully I'm wrong! But, with my crazy matrix up here, why is it wrong?
Is it really just that "it works" when we do it the way described in
the picture but it doesn't work (i.e. it doesn't describe physical reality)
when I do it the way I did here? What does this say about mathematics
being independent from reality when we ignore things like this, my
ridiculous matrix, and focus on the ones that describe reality?
I know it's stupid but I don't know why :-p

I'm also worried because just magically defining these things seems to
be common. Looking at differential forms, I think (& this is because I
haven't studied them properly) that you literally invoke this [B]witchcraft[/B]
when doing algebra with the dx's and dy's.

I seriously hope that there are reasons behind all of this. Thinking
about the thread [URL]https://www.physicsforums.com/showthread.php?t=423992[/URL] I made, there was a perfect reason why
things like [B]i[/B]x[B]j[/B] = [B]k[/B] & [B]j[/B] x [B]i[/B] = -[B]k[/B] make sense, but here I'm worried.
In the cross product example the use of a determinant, an abuse of
notation, is a clear sign we're invoking magic spells to get the right
answers, but with matrix multiplication I haven't even located the
source of the sorcery yet & it's driving me crazy :blushing:
Honestly, tell me now: have I got more of this to expect with
differential forms, or will I get a solid answer? :-p

[B]TL;DR[/B] - The method in the picture of multiplying matrices seems to me
to be the most logical explanation of matrix multiplication, but why is
it done that particular way & not the way I described in this part:
[1,2,3][a,b,c] = [1,2,3][a] [...
of the post? Also, with differential forms, when you multiply differentials
dx's and dy's etc... you are using magic sorcery, adding minuses yada
yada yada by definition, how come? Is there a beautiful reason for
all of this like that described in the thread [URL]https://www.physicsforums.com/showthread.php?t=423992[/URL]? Oh, and what's
the deal with transposes? Transposing vectors is the reason why I can
use this method in the picture, but I mean I could stupidly take the matrix
[1,2,3]
[4,5,6]
as being either:
(1,2,3) transposed from its column vector form, or
(1,4) & (2,5) & (3,6) as being the vectors, it's so weird...
Also, I could have taken part 2 of the picture differently, multiplying
the Y matrix by 3 1x2 X vectors, again it's so weird... :cry:

[SIZE=1][I]/pent_up_rant...[/I][/SIZE]
 
  • #2
Let me tell you how I think matrix multiplication should be explained. First you explain the relationship between linear operators and matrices, e.g. like this:
Fredrik said:
Suppose [itex]A:U\rightarrow V[/itex] is linear, and that [itex]\{u_j\}[/itex] is a basis for U, and [itex]\{v_i\}[/itex] is a basis for V. Consider the equation y=Ax, and expand in basis vectors.

[tex]y=y_i v_i[/tex]

[tex]Ax=A(x_j u_j)=x_j Au_j= x_j (Au_j)_i v_i[/tex]

I'm using the Einstein summation convention: Since we're always supposed to do a sum over the indices that appear exactly twice, we can remember that without writing any summation sigmas (and since the operator is linear, it wouldn't matter if we put the summation sigma to the left or right of the operator). Now define [itex]A_{ij}=(Au_j)_i[/itex]. The above implies that

[tex]y_i=x_j(Au_j)_i=A_{ij}x_j[/tex]
This gives you the motivation for the definition of the product of an n×m matrix and an m×1 matrix.

To find the motivation for the general case, just use the above on the composition of two linear operators, and show that [itex](A\circ B)_{ij}=A_{ik}B_{kj}[/itex]. It's not hard if you understand the stuff I quoted. This next quote does it for the special case when the domain and codomain of the linear operators are the same vector space (and the basis we're working with is written as {ei}).

Fredrik said:
Suppose that A and B are linear operators on a vector space V, and that {ei} is a basis for V. We want to show that the ij component of [itex]A\circ B[/itex] is given by matrix multiplication. In the notation used in the post I linked to above, the proof is

[tex](A\circ B)_{ij}=(A\circ B(e_j))_i=(A(Be_j))_i=(A((Be_j)_k e_k))_i=(A(B_{kj}e_k))_i=(Ae_k)_i B_{kj}=A_{ik}B_{kj}[/tex]

The stuff in the first quote tells you that (for each choice of bases of U and V) there's a matrix associated with each linear operator. What I said after that means that if we define AB to be the matrix associated with the composition [itex]A\circ B[/itex], then matrix multiplication must be defined the way it is.
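
A minimal NumPy sketch of that last point, with arbitrary matrices standing in for the operators in the standard basis: building the matrix of the composition column by column (apply B, then A, to each basis vector) gives exactly the matrix product AB.

[code]
import numpy as np

# Arbitrary matrices representing linear operators A and B in the standard basis
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [3.0, 0.0, 0.0]])

# Build the matrix of the composition A∘B column by column:
# column j is (A∘B) applied to the j-th basis vector e_j.
n = 3
AB_from_composition = np.column_stack(
    [A @ (B @ np.eye(n)[:, j]) for j in range(n)]
)

print(np.allclose(AB_from_composition, A @ B))  # True
[/code]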

Don't be angry at your books and teachers for defining matrix multiplication in component form or for using linear operators. Be angry at them for not defining linear operators in the first chapter of the book and the first or second day of the course.
 
  • #3
Not really :redface: I did read your post in that thread many times yesterday & other posts
of yours and everyone's, but none explain matrices the way I've described in my post.
If you watch the khanacademy video I linked to you'll see what a cool idea it is & I'd
like to explore that way of looking at it before I get more formal & do it the way
you've written it. While I'm sure the definition works when thinking of the basic equation
y = Ax and is set up to satisfy it, I just get the feeling it is too unnatural & is a formalism
that came after multiplication was defined in the manner I've shown in that big yellow
picture I gave. When thinking about multiplication of matrices as being a load of dot
products in disguise it just makes more sense in a way, but all the questions I asked in
my OP are still there. I'll just have to keep studying to learn the proper theory of bases;
all I know is the physics-vector idea. I am sure I'll get it, but I just wonder if there is any
more depth to the ideas I mentioned in my post, as I'd like to have that clear in my head
before I get more formal on this, you know.
 
  • #4
sponsoredwalk said:
While I'm sure the definition works when thinking of the basic equation
y = Ax and is set up to satisfy it, I just get the feeling it is too unnatural & is a formalism
that came after multiplication was defined in the manner I've shown in that big yellow
picture I gave. When thinking about multiplication of matrices as being a load of dot
products in disguise it just makes more sense in a way, but all the questions I asked in
my OP are still there.
It's fine to think of matrix multiplication as a bunch of dot products. Row i, column j of AB is the dot product of the ith row of A with the jth column of B. But this is exactly what the definition

[tex](AB)_{ij}=\sum_k A_{ik}B_{kj}[/tex]

says. This is easiest to see in the notation that puts the row index upstairs. From now on I'll write [itex]A^i_j[/itex] instead of [itex]A_{ij}[/itex]. I will also write [itex]A^i[/itex] for the ith row of A, and [itex]A_i[/itex] for the ith column of A. (This is why I changed the notation). I will also use the convention to not write any summation sigmas, because it's easy to remember that there's always a sum over every index that appears twice. In this notation, the definition takes the form

[tex](AB)^i_j=A^i_k B^k_j=(A^i)_k (B_j)^k=\vec{A^i}\cdot\vec{B_j}[/tex]

Note that the definition of matrix multiplication also implies that

[tex](A^i)_k (B_j)^k=A^i B_j[/tex]

What you have discovered is a good way to remember the algorithm you despise, but it's nothing more than that. The reason why matrix multiplication is defined this way is the relationship between linear operators and matrices that I described, and the choice to make sure that the matrix corresponding to [itex]T\circ S[/itex] is the matrix corresponding to T times the matrix corresponding to S.
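
A small NumPy check of that identity, with made-up matrices: entry (i, j) of AB is the dot product of row i of A with column j of B, which is exactly the sum over k in the definition.

[code]
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])       # 3x2
B = np.array([[7.0, 8.0, 9.0],
              [0.0, 1.0, 2.0]])  # 2x3

# Entry (i, j) as the dot product of row i of A with column j of B
entrywise = np.array([[A[i, :] @ B[:, j] for j in range(B.shape[1])]
                      for i in range(A.shape[0])])

print(np.allclose(entrywise, A @ B))                     # True
print(np.allclose(np.einsum('ik,kj->ij', A, B), A @ B))  # same sum over k
[/code]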

sponsoredwalk said:
I'll just have to keep studying to learn the proper theory of bases,
Yes, the concepts "vector space", "linear", "linearly independent", "span" and "basis" are so important that you will have a hard time learning anything in linear algebra (or quantum mechanics) without understanding all of them perfectly.
 
  • #5
Thanks very much, it shouldn't be too long until I've gotten there in my book, so once I finish
that I'll re-read your explanations & see how I feel about everything.
 
  • #6
Holy jesus I've cracked it!

I went back and read some of Arthur Cayley's "A Memoir on the Theory of
Matrices"
(.pdf) because I was confronted with the issue of matrix multiplication
again. Needless to say, the idea of linear mappings never satisfied me in the
way that what I'm about to write does.

If you read Cayley's paper he denotes his matrices as follows:

(a, b, c)
|α, β, γ|
|θ, λ, μ|

He includes the "( )" in the top line of his matrix. Why? Well, I thought
it was some notational hazard, seeing as he's writing in 1858; basically it
just put me on the alert for typos relating to the paper's age.

Then he wrote something very novel: he tells us that a more convenient
way to represent the system of linear equations:

T = ax + by + cz
U = αx + βy + γz
V = θx + λy + μz

is as follows:

(T,U,V) = (a, b, c)(x,y,z)
________|α, β, γ|
________|θ, λ, μ|

Notice the shape of the (T,U,V) & (x,y,z)! It just looks far more natural
this way. At first I thought it was a notational hazard (1858!), but then
I thought no, because looking at the paper he certainly took advantage
of http://ecx.images-amazon.com/images/I/61H1FSKP88L.jpg so I thought about it &, jeesh, is this notation far clearer!
I want to stress this point because if we write the above system as
a linear combination, in the way the notation clearly suggests, it is
extremely clear what's going on (first of all), and later on it is pivotal,
so I'll do it:

(T,U,V) = (a, b, c)(x,y,z) = _|a| __ |b| __ |c| = ax + by + cz
________|α, β, γ| _____ = x|α| + y|β| + z|γ| = αx + βy + γz
________|θ, λ, μ| _____ = _|θ| __ |λ| __ |μ| = θx + λy + μz

He goes on to explain that

(T,U,V) = (a, b, c)(x,y,z)
________|α, β, γ|
________|θ, λ, μ|

represents the set of linear functions:

((a, b, c)(x,y,z),(α, β, γ)(x,y,z),(θ, λ, μ)(x,y,z))

I think it's clear that

(T,U,V) = ((a, b, c)(x,y,z),(α, β, γ)(x,y,z),(θ, λ, μ)(x,y,z))

So we see that

T = (a, b, c)(x,y,z) = ax + by + cz
U = (α, β, γ)(x,y,z) = αx + βy + γz
V = (θ, λ, μ)(x,y,z) = θx + λy + μz

If you compare this notation to modern notation:

|T| = |a, b, c||x| = |(ax + by + cz)|
|U| = |α, β, γ||y|=_|(αx + βy + γz)|
|V| = |θ, λ, μ||z| = |(θx + λy + μz)|

multiplication of matrices is extremely clear when multiplying an (A)mxn
matrix by an (X)nx1 matrix. It just follows from unfurling a system.
We're going to use this idea when multiplying general matrices.
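
Here is a minimal NumPy sketch of that "unfurling" (my own numbers rather than Cayley's symbols): multiplying a matrix by a column (x, y, z) is the same as taking x times the first column plus y times the second plus z times the third.

[code]
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 0.0]])
x, y, z = 2.0, -1.0, 3.0
v = np.array([x, y, z])

# (T, U, V) = A(x, y, z), written as a linear combination of the columns of A
as_combination = x * A[:, 0] + y * A[:, 1] + z * A[:, 2]

print(np.allclose(A @ v, as_combination))  # True
[/code]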

He then goes on to define all of the standard abelian operations on
matrices in a nice way in the paper, which I really recommend reading,
but what comes next is absolutely astonishing.

He defines the standard matrix representation
________
(T,U,V) = (a, b, c)(x,y,z)
________|α, β, γ|
________|θ, λ, μ|

but goes on to define

(x,y,z) = (a', b', c')(ξ,η,ζ)
________|α', β', γ'|
________|θ', λ', μ'|

Just think about all of this so far! No magic or defining strange operations,
matrix decomposition is natural and through it we see that matrix
multiplication is like a kind of linear algebra chain rule, very natural.
It is absolutely brilliant how he just morphed (x,y,z) there!

So:

(T,U,V) = (a, b, c)(x,y,z) = (a, b, c)_(a', b', c')(ξ,η,ζ)
________|α, β, γ| _____ = |α, β, γ| |α', β', γ'|
________|θ, λ, μ| _____ = |θ, λ, μ|_|θ', λ', μ'|

If you read his paper you'll see that he then defines

(a, b, c)_(a', b', c')(ξ,η,ζ) = |A _ B__C |(ξ,η,ζ)
|α, β, γ| |α', β', γ'|_____ = |A'_ B' _C' |
|θ, λ, μ|_|θ', λ', μ'| _____= |A''_B''_ C''|

This is the genius of his idea and derivation. How he gets from the L.H.S.
to the R.H.S. is what I'm going to do in the rest of my post; he skips this
step & it took me quite a while to figure it out. If you're looking for a
challenge, read the first few pages of Cayley's paper & try to figure it out
without peeking at my solution, it's an exercise :cool:

(NB: This could be extremely easy & I just don't recognise it)
----

(T,U,V) = (a, b, c)(x,y,z) = _|a| __ |b| __ |c|
________|α, β, γ| _____ = x|α| + y|β| + z|γ|
________|θ, λ, μ| _____ = _|θ| __ |λ| __ |μ|

(x,y,z) = (a', b', c')(ξ,η,ζ)
________|α', β', γ'|
________|θ', λ', μ'|

x = a'ξ + b'η + c'ζ
y = α'ξ + β'η + γ'ζ
z = θ'ξ + λ'η + μ'ζ

(T,U,V) = T = _|a| __ |b| __ |c| = _____________|a| + _____________|b| + _____________|c|
________ U = x|α| + y|β| + z|γ| = (a'ξ + b'η + c'ζ)|α| + (α'ξ + β'η + γ'ζ)|β| + (θ'ξ + λ'η + μ'ζ)|γ|
________ V = _|θ| __ |λ| __ |μ| = _____________|θ| + _____________|λ| + _____________|μ|

T = (a'ξ + b'η + c'ζ)•a + (α'ξ + β'η + γ'ζ)•b + (θ'ξ + λ'η + μ'ζ)•c
U = (a'ξ + b'η + c'ζ)•α + (α'ξ + β'η + γ'ζ)•β + (θ'ξ + λ'η + μ'ζ)•γ
V = (a'ξ + b'η + c'ζ)•θ + (α'ξ + β'η + γ'ζ)•λ + (θ'ξ + λ'η + μ'ζ)•μ

T = a'ξ•a + b'η•a + c'ζ•a + α'ξ •b+ β'η•b + γ'ζ•b + θ'ξ•c + λ'η•c + μ'ζ•c
U = a'ξ•α + b'η•α + c'ζ•α + α'ξ•β + β'η•β + γ'ζ•β + θ'ξ•γ + λ'η•γ + μ'ζ•γ
V = a'ξ•θ + b'η•θ + c'ζ•θ + α'ξ•λ + β'η•λ + γ'ζ•λ + θ'ξ•μ + λ'η•μ + μ'ζ•μ

T = (a'a + α'b + θ'c)ξ + (b'a + β'b + λ'c)η + (c'a + γ'b + μ'c)ζ
U = (a'α + α'β + θ'γ)ξ + (b'α + β'β + λ'γ)η + (c'α + γ'β + μ'γ)ζ
V = (a'θ + α'λ + θ'μ)ξ + (b'θ + β'λ + λ'μ)η + (c'θ + γ'λ + μ'μ)ζ

T = |(a'a + α'b + θ'c) (b'a + β'b + λ'c) (c'a + γ'b + μ'c)| (ξ,η,ζ)
U = |(a'α + α'β + θ'γ) (b'α + β'β + λ'γ) (c'α + γ'β + μ'γ)|
V = |(a'θ + α'λ + θ'μ) (b'θ + β'λ + λ'μ) (c'θ + γ'λ + μ'μ)|

(Change a'a to aa' in every entry, makes more sense for what follows)

T = |(aa' + bα' + cθ') (ab' + bβ' + cλ') (ac' + bγ' + cμ')_| (ξ,η,ζ)
U = |(αa' + βα' + γθ') (αb' + ββ' + γλ') (αc' + βγ' + γμ') |
V = |(θa' + λα' + μθ') (θb' + λβ' + μλ') (θc' + λγ' + μμ')_|

(T,U,V) = |(a,b,c)(a',α',θ') (a,b,c)(b',β',λ') (a,b,c)(c',γ',μ')_| (ξ,η,ζ)
_____ ___|(α,β,γ)(a',α',θ') (α,β,γ)(b',β',λ') (α,β,γ)(c',γ',μ')_|
_____ ___|(θ,λ,μ)(a',α',θ') (θ,λ,μ)(b',β',λ') (θ,λ,μ)(c',γ',μ')_|

There we have it! :biggrin:

If you multiply out the R.H.S. of:

(a, b, c)(x,y,z) = (a, b, c)_(a', b', c')(ξ,η,ζ)
|α, β, γ| _____ = |α, β, γ| |α', β', γ'|
|θ, λ, μ| _____ = |θ, λ, μ|_|θ', λ', μ'|

in the normal way you do it you see it agrees with the above derivation.

There we have it: a formal justification, and it's all self-contained in a
standard (A)nxn matrix exploiting the unknowns (x,y,z). Obviously this can
be turned into a never-ending rabbit hole if we want it to.

I'll add that looking at matrix multiplication this way is an irrefutable explanation of
the reason why an (A)mxn matrix times an (X)nxp matrix gives a (B)mxp matrix:
it explains why it doesn't make sense to multiply a matrix with n columns by a matrix
with anything other than n rows.
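
A quick NumPy check of the substitution argument, with arbitrary 3x3 matrices playing the roles of the unprimed and primed arrays: substituting (x,y,z) = A'(ξ,η,ζ) into (T,U,V) = A(x,y,z) gives the same result as acting on (ξ,η,ζ) with the single compound matrix AA'.

[code]
import numpy as np

A      = np.array([[1.0, 2.0, 0.0],
                   [0.0, 1.0, 3.0],
                   [4.0, 0.0, 1.0]])   # plays the role of (a, b, c; α, β, γ; θ, λ, μ)
Aprime = np.array([[2.0, 0.0, 1.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 3.0, 2.0]])   # plays the role of the primed matrix
xi = np.array([1.0, -2.0, 0.5])        # (ξ, η, ζ)

# Substitute first, then apply A ...
two_steps = A @ (Aprime @ xi)
# ... versus applying the single compound matrix A·A'
one_step = (A @ Aprime) @ xi

print(np.allclose(two_steps, one_step))  # True
[/code]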
 
  • #7
Jesus christ here's another way to multiply matrices :eek:

I got Hoffman/Kunze Linear Algebra in a second-hand bookshop the
other day & this book is apparently one of the most rigorous there is at
this level. It's a great book so far and look what fruitful gifts it bears!

Let's recap, there are:

1) The dot-product formulation,
2) Cayley's clear explanation I've written above,
3) What I'll explain next:

Since two systems of linear equations are equivalent if the equations of one
system are linear combinations of the equations in the other system, we can
use the idea of row-equivalence (the matrix version of system equivalence) to
construct a new matrix whose entries are simply linear combinations of the
(matrix-representative) equations making up a certain system.

So, if B is an mxn matrix we construct a matrix C whose rows are simply
linear combinations of the rows of B.

____|β₁|
____|β₂|
B = _|. |
____|._|
____|._|
____|β₊| (Where the β's are rows)

and ____|γ₁|
____|γ₂|
C =_|._|
____|._|
____|._|
____|γ₊|

The γ's are linear combinations of the rows of B, so

γ₁ = α₁β₁ + α₂β₂ + ... + α₊β₊
γ₂ = α₁β₁ + α₂β₂ + ... + α₊β₊
.
.
.
γ₊ = α₁β₁ + α₂β₂ + ... + α₊β₊

Now comes the conceptual leap: the α's are the column elements of a
different matrix A :eek:

It took me a while to figure this one out, but if you take the o'th row:

γ₀ = α₁β₁ + α₂β₂ + ... + α₊β₊

and attach the o to the α's as a first index, as follows:

γ₀ = α₀₁β₁ + α₀₂β₂ + ... + α₀₊β₊

for every row, i.e.

γ₁ = α₁₁β₁ + α₁₂β₂ + ... + α₁₊β₊
γ₂ = α₂₁β₁ + α₂₂β₂ + ... + α₂₊β₊
.
.
.
γ₊ = α₊₁β₁ + α₊₂β₂ + ... + α₊₊β₊

you've got a new way of multiplying matrices giving equivalent results, & it's grounded in
some clever theory. Remember, the β's are whole rows, it's crazy! :biggrin:
This could be called "the linear combination derivation" because it's basically just using
that idea, or "the equivalent linear combination derivation" since the definition of
equivalent systems is so important.
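
A minimal NumPy sketch of that row picture, with made-up matrices where A holds the α coefficients and B holds the rows β: row i of C = AB is the linear combination of the rows of B with the coefficients from row i of A.

[code]
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0],
              [4.0, 1.0]])          # the coefficients (the α's), 3x2
B = np.array([[5.0, 6.0, 7.0],
              [8.0, 9.0, 0.0]])     # the rows β₁, β₂, 2x3

C = A @ B

# Row i of C as a linear combination of the rows of B
rows_as_combinations = np.array(
    [sum(A[i, j] * B[j, :] for j in range(B.shape[0])) for i in range(A.shape[0])]
)

print(np.allclose(C, rows_as_combinations))  # True
[/code]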
 
  • #8
http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/lecture-3-multiplication-and-inverse-matrices/
 
  • #9
Ritalin?
 
  • #10
Warning: Long post.

Since you seem to have found what you want, I won't add much about your original question (matrix multiplication). However, I frequently tutor linear algebra, and your basic confusion over unmotivated definitions and "witchcraft" techniques seems to be pretty common to students of the subject (at all levels--I recently tutored a differential geometry/general relativity student who had many similar-sounding questions). In general, linear algebra is a very geometric subject, with the potential to be very intuitive. Unfortunately, different people find different presentations of the subject more or less intuitive; it's largely a matter of taste. Linear algebra is also unique in that there are very often quite a few ways of saying exactly the same thing that sound very different. In other words, there's no single way of presenting the material that's guaranteed to be clear to everyone. So I thought I'd use this post to try to show you one more alternative perspective that might (or might not) help clear things up a bit. I'll admit I've never tried to explain it this way before. This is just how _I_ like to think about things. Feel free to ignore it if you find it confusing or unhelpful.

My preference is for a particular kind of pseudo-physicsy "hybrid" approach common to mathematical physicists, because it makes the relationship of commonplace objects like the cross product to deeper, more abstract concepts like Hodge duality and the wedge product clear while remaining concrete enough to allow you to actually calculate things. There are even more abstract approaches where no indices are used at all, and everything is expressed in terms of a few basic objects (the wedge, Lie derivative, musical isomorphisms, Hodge star, exterior derivative, codifferential, etc.). It's not necessarily important what all of these things are; the point is that all of the things you do in linear algebra (and a good chunk of differential geometry) can be expressed in terms of a handful of fundamental geometric and algebraic objects with (for the most part) clear motivations behind their definitions. You can build up the whole theory by starting with the tangent and cotangent bundles, and tensoring them together as needed. For instance, a "matrix" (i.e., a linear map) in this language becomes an element of the tensor product of the tangent space with the cotangent space (in that order) of a base manifold (flat Euclidean space in this case), say [tex] A = A^i_j \hat{e}_{(i)} \otimes \omega^{(j)} [/tex] (where I've used the sum convention). The action of this object on a vector is then defined in essentially the only way that makes sense, i.e., [tex] A(v) = A^i_j \hat{e}_{(i)} \omega^{(j)}(v) [/tex]. "Matrix multiplication" is similarly straightforward; there are two possible "orderings" you could use to define a product:

(a) [tex] AB = (A^i_j \hat{e}_{(i)} \otimes \omega^{(j)} ) (B^k_l \hat{e}_{(k)} \otimes \omega^{(l)} ) \equiv A^i_j B^k_l \hat{e}_{(i)} \otimes \omega^{(j)}(\hat{e}_{(k)}) \omega^{(l)} [/tex], and

(b) [tex] AB = (A^i_j \hat{e}_{(i)} \otimes \omega^{(j)} ) (B^k_l \hat{e}_{(k)} \otimes \omega^{(l)} ) \equiv A^i_j B^k_l \hat{e}_{(k)} \otimes \omega^{(l)}(\hat{e}_{(i)}) \omega^{(j)} [/tex].

By convention, definition (a) corresponds to what we write as [tex] AB [/tex], and (b) corresponds to [tex] BA [/tex]. If you think about it, this choice doesn't reflect anything more profound than the fact that we like to read left-to-right.
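
In components both orderings are just contractions over one repeated index, which is easy to play with numerically; here is a small einsum sketch (made-up 2x2 matrices) showing that contraction (a) reproduces AB and contraction (b) reproduces BA.

[code]
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [5.0, 2.0]])

# Ordering (a): contract the lower index of A with the upper index of B
AB = np.einsum('ik,kl->il', A, B)
# Ordering (b): contract the lower index of B with the upper index of A
BA = np.einsum('ki,ij->kj', B, A)

print(np.allclose(AB, A @ B))  # True
print(np.allclose(BA, B @ A))  # True
[/code]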

I'll add just one more basic example, where I try to explain the origin of the "determinant" method for calculating the cross product that you seem to find annoying.

This trick is a legitimate concern, since the determinant really isn't supposed to be defined on matrices with "vector" entries. But in index notation, it's pretty clear what's going on: In three dimensions, the alternating spaces of ranks one and two have the same dimension, i.e., 3 (these spaces are natural subspaces of the cotangent space [tex] C [/tex] and the tensor product [tex] C \otimes C [/tex], respectively; specifically, they are the spaces of totally antisymmetric tensors in these tensor products). Thus, there's a natural isomorphism (the Hodge star operator) between them that turns two-forms into vectors (and vice-versa). In this language, the cross product is really a two-form (i.e., a "two-index" object; specifically, the wedge product of two one-forms) whose dual, under the Hodge star, corresponds to the usual vector calculated via the shorthand trick which appears to have drawn your ire. In symbols, we have
[tex]
(V \wedge W)_{\mu \nu} = \frac{1}{2} V_{[\mu} W_{\nu]} \textrm{,}
[/tex]
which, as you can see, has two indices. The Hodge star of this is then [tex](*V \wedge W)^{\mu} = \frac{1}{2} \epsilon^{\nu \sigma \mu} V_{[\nu} W_{\sigma]} = \epsilon^{\nu \sigma \mu} V_{\nu} W_{\sigma} [/tex] (I've raised the index because I don't know an easy way to do mixed-index tensors in TeX). (Intuitively, the Hodge star is simple: It takes basis forms [tex] dx \wedge dy, dy \wedge dz [/tex], and [tex] dz \wedge dx [/tex] to [tex] dz, dx [/tex], and [tex] dy [/tex] (respectively)).

Incidentally, it's also easy to prove from this that the cross product is orthogonal to both [tex] V [/tex] and [tex] W [/tex]; for example, the inner product [tex] V \cdot (*V \wedge W) [/tex] is just [tex] \epsilon^{\nu \sigma \mu} V_{\mu} V_{\nu} W_{\sigma} = (\epsilon^{\nu \sigma \mu} W_{\sigma} ) V_{\mu} V_{\nu}[/tex], which has to be zero, since it's the contraction of a manifestly antisymmetric object [tex] \epsilon^{\nu \sigma \mu} W_{\sigma} [/tex] with a manifestly symmetric one [tex] V_{\mu} V_{\nu} [/tex].

The "determinant" trick you mentioned comes from the fact that the determinant is also an antisymmetric object, with a similar expression: [tex] \det(A) = \epsilon^{\mu \nu \sigma} A^1_{\mu} A^2_{\nu} A^3_{\sigma} [/tex], so if you define the first row of a "matrix" to consist of the three basis vectors [tex] \hat{e}_{(\mu)} [/tex], then the above expression for the cross product [tex] (*V \wedge W) [/tex] looks the same as the expression for the determinant. You can think of this as a useful coincidence if you want, but the reason is actually pretty deep: To get a "vector product" involving [tex] V [/tex] which is orthogonal to [tex] V [/tex], you're going to have to exploit the symmetry of [tex] V_{\mu} V_{\nu} [/tex] by introducing a totally antisymmetric object with which to contract it. The determinant, on the other hand, is the unique totally antisymmetric (multilinear) function on three vectors in three-dimensional space, up to normalization. So it really shouldn't be surprising that the final expression for a cross product "looks like" a determinant.
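
If a numerical check is useful, here is a small NumPy sketch (ordinary Euclidean R^3 with the standard orientation, made-up vectors) that the epsilon contraction really is the familiar cross product and is orthogonal to both inputs:

[code]
import numpy as np

# Levi-Civita symbol ε_{ijk} in three dimensions
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0

V = np.array([1.0, 2.0, 3.0])
W = np.array([-1.0, 0.5, 2.0])

# (V x W)_i = ε_{ijk} V_j W_k
cross_via_eps = np.einsum('ijk,j,k->i', eps, V, W)

print(np.allclose(cross_via_eps, np.cross(V, W)))  # True
print(np.isclose(cross_via_eps @ V, 0.0))          # orthogonal to V
print(np.isclose(cross_via_eps @ W, 0.0))          # orthogonal to W
[/code]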
 

