Reasoning behind determinants of high n square matrices

In summary: Cofactor expansion writes the determinant of a larger square matrix as a signed sum of determinants of smaller matrices. The thread explains why this works by interpreting the determinant as a signed n-dimensional volume.
  • #1
BiGyElLoWhAt
1st: Not a specific problem, I just didn't know where else to put it.

We just covered this today in class. Basically what we're doing is reducing higher level matrices to 2x2 matrices and using them to calculate the determinant.

I asked my teacher where that came from, and he was really vague, said it was really complicated, and said it stemmed from a long process over time. That really didn't answer my question, but it's a summer class and I really didn't want to tie up the whole session trying to get more out of him.

Just to clarify what I'm talking about:

##\left | \begin{array}{ccc}
a & b & c \\
d & e & f \\
g & h &i \\
\end{array} \right | ##

and what you do is "reduce" this array into three 2x2 arrays by "eliminating" a row or a column, where each element of that row or column becomes the multiplier of one of the 2x2 arrays. For each element you then also ignore one line of the other kind (if you "eliminate" a row, then you ignore a column, and vice versa).

Example:
"eliminate" row 1:
ignore column 1:
a is the multiple of the first determinant:
##a\ \text{det} \left | \begin{array}{cc}
e & f \\
h & i \\
\end{array} \right |
##
For the second array, keep the first row "eliminated" and now ignore the second column. I forgot to mention the sign convention: if A is a matrix with entries ##A_{ij}##, then you multiply each element by ##(-1)^{i + j}##, BUT ONLY FOR THE "ELIMINATED" ROW.

In the second array we have a multiple of -b:
##-b\ \text{det} \left | \begin{array}{cc}
d & f \\
g & i \\
\end{array} \right |
##

and the third follows as:

##c\ \text{det} \left | \begin{array}{cc}
d & e \\
g & h \\
\end{array} \right |
##
Thus the determinant of our original 3x3 array is given by the sum of these three terms, each a multiple of a 2x2 determinant.
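
Written out in full, as a sketch that only collects the three terms above and assumes the standard 2x2 formula (the determinant of a 2x2 array with rows p, q and r, s is ps - qr), the expansion reads:
$$\left | \begin{array}{ccc}
a & b & c \\
d & e & f \\
g & h & i \\
\end{array} \right | = a(ei - fh) - b(di - fg) + c(dh - eg)$$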

This is not intuitive to me. I'm hoping someone can shed some light, as I'm sure mathematicians didn't use the "guess and check" method for defining determinants in such an 'abstract' way.
 
  • #2
Google "expansion by cofactors" or "expansion by minors" to see lots of examples. While expanding a determinant in this fashion may seem pretty arcane, it can simplify things greatly. It is usually taught in conjunction with learning how row operations affect the determinant. If you can manipulate the determinant so that one row or column has mostly zeroes, you can use that row (or column) for the expansion, cutting the algebra way down.
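
As a minimal sketch of what "expansion by cofactors" looks like for any size, the following Python applies the expansion along the first row recursively. The function names `minor` and `det_cofactor` are made up for this illustration, and the matrix is assumed to be a square list of lists. Skipping zero entries shows why a row with mostly zeroes cuts the work down.

```python
def minor(matrix, row, col):
    """Return a copy of the matrix with the given row and column removed."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(matrix) if i != row]


def det_cofactor(matrix):
    """Determinant by cofactor expansion along the first row (row index 0)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry == 0:
            continue  # a zero entry contributes nothing, so skip its minor
        # Sign is (-1)^(i + j) with i = 0 for the first row.
        total += (-1) ** j * entry * det_cofactor(minor(matrix, 0, j))
    return total


# The first row has two zeroes, so only one 2x2 minor is ever evaluated.
example = [[2, 0, 0],
           [1, 3, 4],
           [5, 6, 7]]
print(det_cofactor(example))  # 2 * (3*7 - 4*6) = -6
```

This recursion does on the order of n! multiplications, so it is only meant to mirror the hand calculation, not to be an efficient method.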
 
  • #3
Ok, I see lots of examples, but I'm not having problems doing it. The problem is I don't really understand why I'm doing it, and I'm having trouble getting a good explanation. Ok, a 2x2 matrix is easy enough to get, but once you get up to higher dimensions, it gets more abstract (at least to me), and I would like to know where this method came from.

And we are talking about various elementary row operations and how they affect (or don't) the determinant.
 
  • #4
And also, while we're on the topic of determinants, I've been reading, and was told in class today, that determinants are only defined for square matrices.

My question, thus, is this:
I've heard of what you do when setting up a matrix to calculate a cross product, being called a determinant; but this is in contradiction to the statement above, as this process involves a 2x3 matrix.

Is this just loose terminology or a matter of differing definitions of a determinant?
 
  • #5
BiGyElLoWhAt said:
And also, while we're on the topic of determinants, I've been reading, and was told in class today, that determinants are only defined for square matrices.

My question, thus, is this:
I've heard of what you do when setting up a matrix to calculate a cross product, being called a determinant; but this is in contradiction to the statement above, as this process involves a 2x3 matrix.

Is this just loose terminology or a matter of differing definitions of a determinant?
It's a 3×3 matrix when doing the cross-product via a determinant.

[itex]\displaystyle \vec{A}\times\vec{B}=
\left | \begin{array}{ccc}
\hat{i} & \hat{j} & \hat{k} \\
A_x & A_y & A_z \\
B_x & B_y & B_z \\
\end{array} \right |[/itex]
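
Expanding this array along its first row gives the familiar component formula. This is a sketch in which the unit vectors are treated as formal symbols, so the array is best read as a mnemonic rather than a determinant of numbers:
$$\vec{A}\times\vec{B} = \hat{i}\,(A_y B_z - A_z B_y) - \hat{j}\,(A_x B_z - A_z B_x) + \hat{k}\,(A_x B_y - A_y B_x)$$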
 
  • #6
There is a fundamental problem with teaching about determinants too early in math courses, and that is that they don't have much use except for learning how to jump through hoops and calculate them. (You will probably learn Cramer's Rule for solving sets of equations using determinants, but that is not much practical use except for 2 equations in two variables, and you don't really need a special method just for that!)

A better way into the subject is related to geometry. If you interpret the rows (or columns) of a 2x2 matrix as vectors, the determinant gives you (up to sign) the area of the parallelogram whose sides are the two vectors. For a 3x3 matrix, the determinant gives you the volume of the parallelepiped whose sides are the 3 vectors, and for bigger matrices it measures "n-dimensional volume".

That geometrical interpretation only makes sense for a square matrix, and indeed determinants are only defined for square matrices.

Statements about what happens to the determinant when you do row or column operations on the matrix can be interpreted geometrically. The same is true for the relation between zero determinants and linearly dependent rows and columns of the matrix. For example if a 3x3 determinant is zero, the geometrical volume is zero so the 3 vectors must all be in the same plane, so one of them is a linear combination of the other two.

With that geometrical interpretation, there is some motivation for the "formulas" for calculating determinants - obviously they have to give the correct areas and volumes. For example the 3x3 determinant is closely related to the "triple product" ##a \cdot (b \times c)## of three vectors.

That still might leave you asking the question "OK, but what are n-dimensional volumes good for?" The answer to that is when you interpret matrices as transformations that change the shape of objects ... but often you first meet determinants a long time before you reach that point in your math courses, if ever.
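
A small numerical sketch of the triple-product remark, assuming the three vectors are used as the rows of the matrix. The helper names `cross`, `dot` and `det3` and the sample vectors are chosen only for this illustration.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def det3(rows):
    """3x3 determinant by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


a, b, c = (1, 2, 3), (0, 1, 4), (5, 6, 0)
print(dot(a, cross(b, c)), det3([a, b, c]))  # the two numbers agree

# Three coplanar vectors (the third is a + b) span zero volume.
coplanar = (a[0] + b[0], a[1] + b[1], a[2] + b[2])
print(det3([a, b, coplanar]))  # 0
```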
 
  • #7
BiGyElLoWhAt said:
Ok, I see lots of examples, but I'm not having problems doing it. The problem is I don't really understand why I'm doing it, and I'm having trouble getting a good explanation. Ok, a 2x2 matrix is easy enough to get, but once you get up to higher dimensions, it gets more abstract (at least to me), and I would like to know where this method came from.
The determinant is essentially a way of measuring volume. Its properties and the various formulas for calculating it all stem from this fact.

It is helpful to think of a matrix ##M## as an operator which takes a vector ##x## and produces another vector ##y##, via the rule ##y=Mx##. Now consider the 2x2 case:
$$M = \begin{pmatrix}a & b\\ c & d\end{pmatrix}$$
Note that
$$\begin{pmatrix}a \\ c\end{pmatrix} = \begin{pmatrix}a & b\\ c & d\end{pmatrix}\begin{pmatrix} 1 \\ 0\end{pmatrix}$$
and
$$\begin{pmatrix}b \\ d\end{pmatrix} = \begin{pmatrix}a & b\\ c & d\end{pmatrix}\begin{pmatrix} 0 \\ 1\end{pmatrix}$$
In other words, the columns of ##M## are the vectors that result from applying ##M## to the canonical basis vectors ##(1,0)## and ##(0,1)##. Note that these basis vectors form two sides of the unit square, which has area 1. Also, ##(a,c)## and ##(b,d)## form two sides of a parallelogram, and it turns out that this parallelogram has area ##|ad - bc| = |\det(M)|##. Moreover, we can even assign a sign (positive or negative) to the area, depending on whether we have to rotate clockwise or counterclockwise to get from ##(a,c)## to ##(b,d)## via the smaller of the two possible angles.

A similar result is true in higher dimensions: ##M## maps the unit ##n##-cube (##n##-dimensional generalization of a cube) to a parallelepiped which has ##n##-dimensional volume equal to ##|\det(M)|##, and the sign gives us information about the ordering of the column vectors.

Now if we want the determinant to represent volume, it must satisfy the following rules:

  1. ##\det(I) = 1##, where ##I## is the identity matrix, since its columns are simply the sides of the unit cube, which has volume 1.
  2. If any of the columns of ##M## are repeated, then ##\det(M) = 0##. This is because in this case, ##M## "squashes" the unit cube into a parallelepiped of dimension smaller than ##n##, so the parallelepiped has ##n##-dimensional volume equal to zero. (Just as a square has zero volume when considered as a three-dimensional object.)
  3. If we hold all of the columns of ##M## fixed except one, and we scale that column by a scale factor ##c##, then the volume should scale by ##|c|##. (We also use the convention that the sign will be inverted if we change the direction of that column vector.)
  4. If we hold all of the columns of ##M## fixed except one, and we decompose the remaining column into the sum of two vectors ##x+y##, then the volume should be additive: ##\det(M_{x+y}) = \det(M_x) + \det(M_y)##, where ##M_v## is the matrix where we replace the specified column with ##v##.

It turns out that the above rules completely determine all of the other properties of the determinant function, including the formulas for calculating it.
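
As a sketch of how the rules pin down a formula, take the 2x2 case and write the columns as ##(a,c) = a e_1 + c e_2## and ##(b,d) = b e_1 + d e_2##, where ##e_1 = (1,0)## and ##e_2 = (0,1)##. The notation ##\det(u, v)## for the determinant of the matrix with columns ##u## and ##v## is introduced here only for illustration. The calculation assumes the signed form of rule 3 (scaling one column by a factor ##t## scales the determinant by ##t##), and it uses the fact that swapping two columns flips the sign, which follows from rules 2 and 4 by expanding ##\det(x+y, x+y) = 0##. Then
$$\det(M) = ab\,\det(e_1,e_1) + ad\,\det(e_1,e_2) + cb\,\det(e_2,e_1) + cd\,\det(e_2,e_2) = ad\cdot 1 + cb\cdot(-1) = ad - bc$$
since the terms with a repeated column vanish by rule 2 and ##\det(e_1,e_2) = \det(I) = 1## by rule 1.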

You might find this document interesting as it is quite elementary but derives all of the properties and formulas carefully, with special attention paid to the 2x2 and 3x3 cases. See in particular section 6, where the author derives the "expansion by cofactors" formula for the 3x3 case.

http://www.math.brown.edu/~mangahas/det.pdf
 
  • #8
For me, the best way is to learn all you can about determinants and treat them as something known. The determinant has a purpose, to determine if n linear equations in n variables have a unique solution. Learn all the ways to calculate them (although Cramer's rule is pretty rubbish), learn the algebraic rules (any good book should cover them). When you know them as well as possible, I think you'll find that you are satisfied with what they are and how they work.

You'll find that the geometric reasoning is a little forced. For example, the document jbunniii linked to talks about geometry but then defines the 2x2 determinant as ##ad - bc##. What I mean is, it didn't say "we see that the determinant of a 2x2 matrix is ##ad - bc##"; it pronounced it after a ton of confusing math, which is not really geometric in spirit.
 
  • #9
SammyS said:
It's a 3×3 matrix when doing the cross-product via a determinant.

[itex]\displaystyle \vec{A}\times\vec{B}=
\left | \begin{array}{ccc}
\hat{i} & \hat{j} & \hat{k} \\
A_x & A_y & A_z \\
B_x & B_y & B_z \\
\end{array} \right |[/itex]

ahhh... It's actually really funny you should mention that. I did that in my calc 3 class when we were "learning" how to calculate cross products "for the first time". My teacher said that's not how I should be doing it... but then again we butted heads quite a bit in that class...
Thanks for the reassurance.
 
  • #10
verty said:
You'll find that the geometric reasoning is a little forced. For example, the document jbunniii linked to talks about geometry but then defines the 2x2 determinant as ##ad - bc##. What I mean is, it didn't say "we see that the determinant of a 2x2 matrix is ##ad - bc##"; it pronounced it after a ton of confusing math, which is not really geometric in spirit.
I agree with what you're saying in general - any formula for the determinant beyond the 2x2 case is lamentably nasty.

But the 2x2 case is not so hopeless. The document I linked doesn't seem to include the pictures for some reason, and they make the calculation a bit worse than it needs to be.

Referring to the attached figure, the area of the parallelogram is simply ##A = |x| |y|\sin(\theta)##. Since ##x \cdot y = |x| |y| \cos(\theta)##, we have ##A^2 + (x \cdot y)^2 = |x|^2 |y|^2## (since ##\sin^2(\theta) + \cos^2(\theta) = 1##), so
$$A^2 = |x|^2 |y|^2 - (x\cdot y)^2$$
If we set ##x## and ##y## to be the two columns of the matrix, then ##x = (a,c)## and ##y = (b,d)##, so ##|x|^2 = a^2 + c^2##, ##|y|^2 = b^2 + d^2##, and ##(x \cdot y)^2 = (ab + cd)^2##, and therefore
$$A^2 = (a^2 + c^2)(b^2 + d^2) - (ab + cd)^2$$
which after a bit of algebra simplifies to
$$A^2 = a^2d^2 - 2abcd + b^2 c^2 = (ad - bc)^2$$
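
A quick numerical sanity check of this derivation, as a sketch: the function name `parallelogram_area` and the sample entries are assumptions made for the illustration, and the two printed values should agree only up to floating-point rounding.

```python
import math


def parallelogram_area(x, y):
    """Area |x||y|sin(theta), with theta recovered from the dot product."""
    norm_x = math.hypot(*x)
    norm_y = math.hypot(*y)
    cos_theta = (x[0] * y[0] + x[1] * y[1]) / (norm_x * norm_y)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding drift
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    return norm_x * norm_y * sin_theta


# Matrix entries a, b, c, d; the columns are x = (a, c) and y = (b, d).
a, b, c, d = 3.0, 1.0, 2.0, 4.0
area = parallelogram_area((a, c), (b, d))
print(area, abs(a * d - b * c))
```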
 

Attachment: parallelogram.png
  • #11
Wow. I haven't gotten to the paper yet, but that was actually a really insightful and well written post. Many many thanks to bunniii and all others who chimed in. I guess I "knew" all those things, but I just never thought to put them together in such a manner.

Thanks again.
 
  • #12
It's not quite correct that a determinant should be looked at as the volume spanned by the vectors comprising it. Unless you can tell me what a negative volume is? So a set of n n-vectors can be represented as a matrix, and their volume (as defined) is the absolute value of the determinant, while the sign of the determinant is the 'orientation' of this volume (which is easy enough to picture for n=2 or 3 (but not so much for n=4 on up, LOL) but which I won't explain here).
One EASY way to calculate the determinant of any matrix (n >2) is to clone the matrix.
given the matrix
a b c d
e f g h
i j k l
m n o p
we write
a b c d a b c d
e f g h e f g h
i j k l i j k l
m n o p m n o p
which, despite how it looks here, should be a n row by 2n column matrix.
We then draw n diagonal lines (down and to the right):
We draw the first starting at a, and intersecting a, f, k and p
Next we draw a line parallel to the first starting at b. (so it intersects b, g, l and the second m)
We do two more starting at the c and then the d.
The terms connected by each of these four lines are afkp+bglm+chin+dejo
Then we draw four more lines, this time diagonally up and to the right:
Starting at m - giving the term mjgd
And continuing to get the other three terms.
These terms are subtracted from the first four. So we have n terms added, and n terms subtracted.
Their sum is the determinant.
If you try this on a 2x2 matrix you'd get ad+bc-cb-da which equals zero. In other words this method fails for n=2. So, in a way, 2x2 matrices are different from larger matrices.
-=-=
Matrices appear all over the place in physics and in many models of real-world systems. You learn a little bit about them every time you run across them. While you could study them for their own 'pure' properties, they are most often used to abbreviate more complex systems of equations, or to simplify calculations. Matrices used to be taught first as a way to solve systems of linear (simultaneous) equations, like x+3y = 7 and 2x-y = 9, or systems with 3 (or more) variables.
The reason they are introduced there is that once you've solved enough of these linear equations you realize that there is a general method to solve them. It turns out that this method can be encapsulated extremely well with matrices. You probably know that when you have n unknowns, you need n independent equations to get a unique solution. In other words you must have an nxn matrix (or the information to create one). So square matrices are important in broad areas of math and science, and in economics and statistics; even sports uses them nowadays (believe it or not, sometimes you need to crunch the numbers to make your roster as strong as possible; see the film Moneyball).
And you also probably know how central determinants are in finding solutions using matrices. If your first reaction to using matrices is that they are a lot more work than most problems require, you aren't wrong. Sometimes simple is not short. OTOH, if I say that the determinant for a system of equations is 0, then by golly, it can keep me from wasting massive amounts of time chasing snipe.
 
  • #13
abitslow said:
It's not quite correct that a determinant should be looked at as the volume spanned by the vectors comprising it. Unless you can tell me what a negative volume is?
This was addressed (admittedly briefly) in my previous post. The absolute value of the determinant is a volume, and the sign indicates whether or not the orientation of the canonical basis vectors is preserved by the mapping.
One EASY way to calculate the determinant of any matrix (n >2) is to clone the matrix.
given the matrix
a b c d
e f g h
i j k l
m n o p
we write
a b c d a b c d
e f g h e f g h
i j k l i j k l
m n o p m n o p
which, despite how it looks here, should be a n row by 2n column matrix.
We then draw n diagonal lines (down and to the right):
We draw the first starting at a, and intersecting a, f, k and p
Next we draw a line parallel to the first starting at b. (so it intersects b, g, l and the second m)
We do two more starting at the c and then the d.
The terms connected by each of these four lines are afkp+bglm+chin+dejo
Then we draw four more lines, this time diagonally up and to the right:
Starting at m - giving the term mjgd
And continuing to get the other three terms.
These terms are subtracted from the first four. So we have n terms added, and n terms subtracted.
Their sum is the determinant.
If you try this on a 2x2 matrix you'd get ad+bc-cb-da which equals zero. In other words this method fails for n=2. So, in a way, 2x2 matrices are different from larger matrices.
This method is called the Rule of Sarrus. It works for 3x3 matrices, but not for any larger matrices. In your 4x4 example, your calculation gave you 8 terms, but in fact you need 24 terms, one for each permutation of the set ##\{1,2,3,4\}##. In general, for an ##n \times n## matrix, the determinant will consist of ##n!## terms, half of them added and half subtracted. The Rule of Sarrus only gives you ##2n## of the terms. For the ##n=3## case, we have ##n! = 2n##, which is why the method works in that case.
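
To make the term count concrete, here is a sketch comparing the sum over all ##n!## permutations with the diagonal recipe quoted above. The function names `det_leibniz` and `diagonal_rule` and the test matrices are chosen for this illustration only; the 4x4 matrix is the identity with its first two rows swapped, whose determinant is -1.

```python
from itertools import permutations


def det_leibniz(m):
    """Determinant as a signed sum over all n! permutations of the columns."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        # An even number of inversions gives sign +1, an odd number gives -1.
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        term = -1 if inversions % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def diagonal_rule(m):
    """The 'cloned matrix' recipe: n wrapped diagonals added, n subtracted."""
    n = len(m)
    total = 0
    for start in range(n):
        down, up = 1, 1
        for k in range(n):
            down *= m[k][(start + k) % n]        # down and to the right
            up *= m[n - 1 - k][(start + k) % n]  # up and to the right
        total += down - up
    return total


m3 = [[2, 0, 1], [1, 3, 4], [5, 6, 7]]
m4 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(det_leibniz(m3), diagonal_rule(m3))  # agree for 3x3
print(det_leibniz(m4), diagonal_rule(m4))  # disagree for 4x4
```

For the 4x4 matrix the only nonzero product belongs to a permutation that none of the 8 diagonals covers, so the diagonal recipe misses it entirely.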
 
  • #14
I'll give the alternative justification. Suppose we have these two linear equations:

$$ax + by = z$$
$$cx + dy = z'$$

Is there a unique solution? Each equation defines a line in the plane and there will be a unique solution whenever the lines are not parallel. If the lines are parallel,

$$\frac{a}{\sqrt{a^2 + b^2}} = \frac{c}{\sqrt{c^2 + d^2}} \quad\text{and}\quad \frac{b}{\sqrt{a^2 + b^2}} = \frac{d}{\sqrt{c^2 + d^2}}$$

because ##(a, b)## and ##(c, d)## are normal vectors to the two lines, and once they are normalized their components must be the same. Multiplying, we see that ##ad = bc##.

This was for parallel lines, so for there to be a unique solution, ##ad - bc \neq 0##.

This explanation just clicks for me; perhaps it is that I learned it this way first. The geometrical arguments are really quite convincing though, one could in fact learn about determinants without ever knowing about simultaneous equations.

EDIT: A slight clarification, I realize now that this argument has a slight problem, the normalized normals could differ in sign, that is, they could have opposite orientations. But this is easily fixed, so I'll say no more.
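
As a sketch of how the condition is used in practice, the following solves the two equations exactly with Cramer's rule when ##ad - bc \neq 0## and reports that no unique solution exists otherwise. The function name `solve_2x2` is hypothetical, and integer coefficients are assumed so that `Fraction` gives exact results. The sample system x+3y = 7, 2x-y = 9 is the one mentioned earlier in the thread.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d, z, z_prime):
    """Solve ax + by = z, cx + dy = z' exactly; None if ad - bc == 0."""
    det = a * d - b * c
    if det == 0:
        return None  # parallel or coincident lines: no unique solution
    # Cramer's rule: replace one column of coefficients by the right-hand side.
    x = Fraction(z * d - b * z_prime, det)
    y = Fraction(a * z_prime - c * z, det)
    return x, y


print(solve_2x2(1, 3, 2, -1, 7, 9))  # x = 34/7, y = 5/7
print(solve_2x2(1, 2, 2, 4, 1, 5))   # None: the two lines are parallel
```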
 

FAQ: Reasoning behind determinants of high n square matrices

What is the purpose of determining the determinant of a high n square matrix?

The determinant of a square matrix is a numerical value that provides important information about the matrix. It can be used to determine if a matrix is invertible, calculate the area or volume of a parallelogram or parallelepiped, and solve systems of linear equations.

How is the determinant of a high n square matrix calculated?

The determinant of a high n square matrix is calculated by expanding along any row or column using the cofactor expansion method. This involves multiplying the elements in the chosen row or column by their corresponding cofactors, which are determined using the minor matrix of each element.

What factors determine the value of the determinant for a high n square matrix?

The value of the determinant depends on every entry of the matrix and on where each entry sits. For example, swapping two rows or two columns changes the sign of the determinant.

How can the determinant of a high n square matrix be used to solve systems of linear equations?

For a square system, Cramer's rule expresses each variable as a ratio of two determinants, although it is rarely practical beyond small systems. A nonzero determinant of the coefficient matrix means the system has exactly one solution. A zero determinant means the system has either no solution or infinitely many, and the determinant alone does not say which.

What other applications does the determinant of a high n square matrix have?

The determinant has numerous applications in mathematics, physics, and engineering. It can be used in calculating eigenvalues and eigenvectors, finding the area and volume of a region in calculus, and solving optimization problems. It also has applications in computer graphics and image processing.
