Intuition behind elementary operations on matrices

In summary, elementary operations on matrices manipulate the rows and columns of a matrix to simplify or transform it. These operations include scaling a row or column by a nonzero constant, adding a multiple of one row or column to another, and swapping two rows or columns. They are based on the concept of linear combinations and can be used to solve systems of linear equations, find inverses, and perform other computations on matrices. These operations are essential for understanding advanced concepts in linear algebra and are used extensively in many fields of mathematics and science.
  • #1
Mr Real
For finding the inverse of a matrix A, we start from the equality A = I A (where I is the identity matrix) and, by employing elementary row or column operations, convert it into I = B A (here B is the inverse of matrix A). But why do these operations work? Why does changing the elements of a complete row using another row work, while e.g. adding/subtracting the same number on both sides of the equality doesn't?
Is there some proof of these operations? What is the principle/intuition behind these operations? Can they be explained by any first principles?
 
  • #2
Mr Real said:
Why does changing the elements of a complete row using another row work, while e.g. adding/subtracting the same number on both sides of the equality doesn't?

You have to be clear about what you mean by "work".

If the only requirement for a procedure to "work" is that it transforms an equation into another equation, then we can add and subtract on both sides of a matrix equation. Each "side" of a matrix equation, when completely simplified, is a matrix. For example, suppose we have the equation ##C = D## where ##C## and ##D## are 2x3 matrices. If we add 7 to the entry in the 2nd row, 3rd column on both sides of the equation, we get another equation. So adding 7 to the corresponding entry of both matrices does "work" in that sense.

Suppose we have the equation ##C = (F)(G)## where ##C, F, G## are 3x3 matrices (so the product is defined). It is not true that we always transform this into an equivalent equation if we add 7 to the 2nd row, 3rd column entry of ##C## and do the same only to the matrix ##F##. That isn't surprising. By analogy, if you have the equation in real variables ##0 = (x-2)(x+1)##, we don't preserve the solution set if we add 7 to the left hand side and add 7 only to the factor ##(x-2)## on the right hand side.

With an equation in real variables, you can multiply both sides by the same number. For example, we can transform the equation ##0 = (x/7 - 1/7)(x-2)## into an equivalent equation by multiplying both sides by ##7##. When we multiply the right side by 7, we are permitted to multiply only the factor ##(x/7 - 1/7)## by 7. By analogy, if we have the matrix equation ##C = (F)(G)##, we can multiply both sides of the equation on the left by an invertible matrix ##E##, obtaining an equivalent equation ##(E)(C) = (E)(F)(G)##. In evaluating the right hand side, we can compute it as ##((E)(F))(G)##.

The reason an elementary row operation "works" is that it is the same as multiplying both sides of a matrix equation of the form ##C = (F)(G)## by a matrix ##E##. Have you studied how to find the "elementary matrix" that performs a given elementary row operation?
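As a concrete check, here is a minimal numpy sketch (the matrices are arbitrary illustrations, not anything from the thread):

```python
import numpy as np

# Hypothetical example: C = F @ G by construction.
F = np.array([[1.0, 2.0], [3.0, 4.0]])
G = np.array([[5.0, 6.0, 7.0], [8.0, 9.0, 0.0]])
C = F @ G

# E encodes the row operation R2 <- R2 + R1.
E = np.array([[1.0, 0.0], [1.0, 1.0]])

# Multiplying both sides on the left by E preserves the equation,
# and associativity lets the right side be grouped as (EF)(G).
assert np.allclose(E @ C, (E @ F) @ G)
```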
 
  • #3
Okay, I understood all of what you said above. But I still have some doubts. Consider C = (F)(G): why does performing operations like replacing a complete row by another row of the same matrix, or changing R2 to R1 - R3 (R refers to a row), give an equivalent equation? What's the logic behind this?
Also, while doing a row operation and changing C and G by the same amount (e.g. by replacing a row), it wouldn't give an equivalent equation, but changing C and F by the same amount would. Why is it so?
 
  • #5
Mr Real said:
Consider C = (F)(G): why does performing operations like replacing a complete row by another row of the same matrix, or changing R2 to R1 - R3 (R refers to a row), give an equivalent equation?

The operations you described may not give an equivalent equation, depending on which things in the equation are the variables. Not all manipulations with rows are "elementary row operations". An example of an elementary row operation is to replace R2 by R1 + R2.

That elementary row operation changes the matrix ## A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}## to ##\begin{pmatrix} a & b \\ a+c & b+d \end{pmatrix}##.

The same effect can be accomplished by the multiplication ##\begin{pmatrix} 1 &0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c& d \end{pmatrix}##. That's why I asked you if you have studied the "elementary matrices" that correspond to "elementary row operations".

When you perform an elementary row operation on a matrix equation ##C = (F)(G)## you are implementing a short-cut for multiplying both sides of the equation by an invertible matrix ##E##.
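To make the short-cut explicit, here is a small numpy sketch (entries chosen arbitrarily). A useful fact: the elementary matrix for a row operation is just the identity with that operation applied to it.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])  # stands in for [[a, b], [c, d]]

# Apply R2 <- R1 + R2 to the identity to get the elementary matrix:
E = np.eye(2)
E[1] = E[0] + E[1]        # E is now [[1, 0], [1, 1]]

# Left multiplication by E performs the same operation on A:
B = A.copy()
B[1] = B[0] + B[1]        # R2 <- R1 + R2 applied directly to A
assert np.allclose(E @ A, B)
```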
 
  • #6
Stephen Tashi said:
The operations you described may not give an equivalent equation
Can you provide an example of an equation of the form C = (F)(G) where changing R2 to R1 - R3 doesn't give an equivalent equation?
And no, I didn't know about elementary matrices. Just read a bit about them and have understood these operations a little better. Will continue doing so.
Thanks for your response.
Mr R
 
  • #7
Mr Real said:
Can you provide an example of an equation of the form C = (F)(G) where changing R2 to R1 - R3 doesn't give an equivalent equation?
Without giving it very much thought, I don't believe that replacing R2 by R1 - R3 is a valid row operation.
There are three row operations. Each of these row operations can be represented by an elementary matrix.
1. Replace a row by a nonzero multiple of itself: ##R_i \leftarrow kR_i##
2. Switch two rows: ##R_i \leftrightarrow R_j##
3. Replace a row by itself plus a constant multiple of another row: ##R_i \leftarrow R_i + kR_j##

Each row operation results in a changed matrix, but one that is equivalent to the one you started with. In other words, when the matrix represents a system of equations, the three row operations don't change the solution set of that system.
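Here is a hedged numpy illustration of all three operations as elementary matrices (the sample matrix and the constant k are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
I = np.eye(2)
k = 5.0                      # any nonzero constant

E1 = I.copy(); E1[0, 0] = k  # 1. scale:        R1 <- k*R1
E2 = I[[1, 0]]               # 2. swap:         R1 <-> R2
E3 = I.copy(); E3[1, 0] = k  # 3. add multiple: R2 <- R2 + k*R1

# Each elementary matrix is invertible, so no operation loses
# information and each can be undone:
for E in (E1, E2, E3):
    assert abs(np.linalg.det(E)) > 1e-12
    assert np.allclose(np.linalg.inv(E) @ (E @ A), A)
```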

Mr Real said:
And no, I didn't know about elementary matrices. Just read a bit about them and have understood these operations a little better. Will continue doing so.
Thanks for your response.
Mr R
 
  • #8
Mark44 said:
Without giving it very much thought, I don't believe that replacing R2 by R1 - R3 is a valid row operation.
Okay, so one of the terms needs to be the row that is to be changed.
Mark44 said:
3. Replace a row by itself plus a constant multiple of another row: ##R_i \leftarrow R_i + kR_j##
So won't replacing R2 by R1 + 7R2 be fine?
Just some other queries regarding this: so we can't replace, let's say, R1 by R1 - R2 + R3 (as it has 3 terms), or by R1 - R1/R2?
Thanks for your response.
Mr R
 
  • #9
Mr Real said:
So won't replacing R2 by R1 + 7R2 be fine?
This isn't an elementary row operation, but it could be done by two of these operations.
1. ##R_2 \leftarrow 7R_2##
2. ##R_2 \leftarrow R_2 + R_1##
Mr Real said:
Just some other queries regarding this: so we can't replace, let's say, R1 by R1 - R2 + R3 (as it has 3 terms), or by R1 - R1/R2?
For the first, this could be done in two steps, similar to what I did above.
For your second: dividing one row by another isn't a valid operation. The elementary row operations involve only the addition of one row to another, multiplication of a row by a nonzero constant, and swapping two rows. They don't involve multiplying one row by another, dividing one row by another, taking the square root of a row, or other such operations.
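A small numpy sketch of that two-step decomposition (sample entries are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

E1 = np.array([[1.0, 0.0], [0.0, 7.0]])  # step 1: R2 <- 7*R2
E2 = np.array([[1.0, 0.0], [1.0, 1.0]])  # step 2: R2 <- R2 + R1

# Applying step 1 then step 2 yields R2 <- R1 + 7*R2:
B = E2 @ (E1 @ A)
assert np.allclose(B[1], A[0] + 7 * A[1])

# The composite E2 @ E1 is still invertible (det = 7), so the combined
# operation preserves equivalence even though it is not itself elementary.
assert np.isclose(np.linalg.det(E2 @ E1), 7.0)
```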
 
  • #10
Mr Real said:
Can you provide an example of an equation of the form C = (F)(G) where changing R2 to R1 - R3 doesn't give an equivalent equation?
An "equation" is statement involving variables. By contrast, an "equality" is simply a statement that two constant expressions are equal. For two equations to be "equivalent" they must have the same set of solutions.

The matrix equation ##\begin{pmatrix}x&0&0 \\0&y&0 \\0&0&z \end{pmatrix} = \begin{pmatrix} 1&0&0 \\0&1&0 \\0&0&1 \end{pmatrix} \begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ## has solution ##x = 3, y =4, z = 5##.

The matrix equation ##\begin{pmatrix} x&0&0 \\x&0&-z \\0&0&z \end{pmatrix} = \begin{pmatrix} 1&0&0 \\1&0&-1 \\0&0&1 \end{pmatrix} \begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ## has additional solutions such as ##x = 3, y = 13, z = 5## because the equation allows ##y## to be arbitrary.

It is correct that the row operation you mention does change an "equality" into another equality.

##\begin{pmatrix}3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} = \begin{pmatrix} 1&0&0 \\0&1&0 \\0&0&1 \end{pmatrix} \begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ##

##\begin{pmatrix}3&0&0 \\3&0&-5 \\0&0&5 \end{pmatrix} = \begin{pmatrix} 1&0&0 \\1&0&-1 \\0&0&1 \end{pmatrix}\begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ##

However, changing from one equality to another is not sufficient to prove that such a series of steps reaches a desired goal. For example, one could begin with a matrix equality and apply the operation "change all rows to rows of zeroes". This would produce another equality, but serve no useful purpose.

The goal we are pursuing in manipulating the equality ##A = (I)(A)## is to reach another equality of the form ##I = (B)(A)##. For this goal to be reached, we must avoid producing an equality like
##\begin{pmatrix}3&0&0 \\3&0&-5 \\0&0&5 \end{pmatrix} = \begin{pmatrix} 1&0&0 \\1&0&-1 \\0&0&1 \end{pmatrix}\begin{pmatrix} 3&0&0 \\0&4&0 \\0&0&5 \end{pmatrix} ## because this leaves us no way to proceed that will change the left hand side to the identity matrix. The fact that the second column of entries on the left hand side is all zeroes prevents us from getting the second row ##0,1,0## of the identity matrix.
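Both claims are easy to verify numerically; a minimal numpy check of the equality and of the fact that this particular ##E## is a dead end:

```python
import numpy as np

D = np.diag([3.0, 4.0, 5.0])

# E encodes "R2 <- R1 - R3", which overwrites R2 entirely:
E = np.array([[1.0, 0.0,  0.0],
              [1.0, 0.0, -1.0],
              [0.0, 0.0,  1.0]])

# The equality above checks out: the second row of E @ I @ D is [3, 0, -5].
lhs = E @ np.eye(3) @ D
assert np.allclose(lhs[1], [3.0, 0.0, -5.0])

# But E is singular, so no sequence of further row operations can
# undo it and reach the identity on the left:
assert np.isclose(np.linalg.det(E), 0.0)
```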
 
  • #11
matrices represent linear transformations between two vector spaces. there are several natural equivalence relations in these terms. one is left equivalence, where two transformations between the same two spaces are "left-equivalent" iff they become equal after composing one of them with an isomorphism of the target space. by the convention for writing compositions, this means composing on the left with an isomorphism. this turns out to be the same as saying the two transformations have the same row space, also iff they have the same kernel.

in terms of the matrices of the linear transformations, they are left equivalent iff multiplying one of them from the left by an invertible matrix changes it into the other. since every isomorphism can be written as a composition of a sequence of "elementary matrices", this is the same as saying one of the matrices can be changed into the other by a sequence of elementary row operations. I.e. left multiplying by an elementary matrix just performs an elementary row operation.

since any matrix can be changed into any other matrix with the same kernel by a sequence of row operations, any square matrix with zero kernel can be changed into the identity matrix by such a sequence. Necessarily such a sequence will have as its product the inverse of the matrix. This is why we can compute an inverse by these operations.
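as an illustration of that last sentence, here is a minimal Gauss-Jordan sketch in numpy (no pivoting safeguards; it assumes every pivot it meets is nonzero):

```python
import numpy as np

def inverse_by_row_ops(A):
    """Apply the row operations that reduce A to I to a copy of I as well;
    that copy then holds A's inverse, since the operations' product is it."""
    n = len(A)
    M = A.astype(float)        # working copy of A      (A = I A)
    B = np.eye(n)              # starts as the identity
    for i in range(n):
        p = M[i, i]            # assumed nonzero (no pivoting here)
        M[i] /= p; B[i] /= p                 # R_i <- R_i / p
        for j in range(n):
            if j != i:
                f = M[j, i]
                M[j] -= f * M[i]             # R_j <- R_j - f*R_i
                B[j] -= f * B[i]
    return B                   # now  I = B A

A = np.array([[2.0, 1.0], [1.0, 1.0]])
assert np.allclose(inverse_by_row_ops(A) @ A, np.eye(2))
```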

another equivalence relation is right equivalence, where two matrices are R-equivalent iff one can be changed into the other by right multiplication by an invertible matrix, or equivalently by a sequence of column operations. this happens iff the two matrices have the same image, i.e. same column space.

the 3rd equivalence relation for arbitrary matrices makes two matrices equivalent iff one can be changed into the other by using both row and column operations. for square matrices, the key equivalence relation is similarity, where we change one into the other by conjugating it by an invertible matrix.

there are normal forms for matrices under all these equivalences. the normal form for row equivalence is the row reduced echelon form; for column operations it is the column-reduced analogue, i.e. two matrices are column equivalent iff their transposes have the same row reduced echelon form. the row reduced echelon form of every invertible matrix, for example, is the identity matrix.

for 2-sided equivalence, the normal form is diagonal, with only 1's and 0's on the diagonal, so two matrices are 2-sided equivalent iff they have the same rank. finally, two square matrices are similar iff they have the same rational canonical form, iff their characteristic matrices are 2-sided equivalent over the polynomial ring, i.e. iff their characteristic matrices have the same diagonal form (when the diagonal entries are made to successively divide one another). these forms can be computed by the euclidean algorithm.

there is another normal form for similarity, the jordan form, but it cannot be computed by hand unless one is lucky enough to be able to factor the characteristic polynomial, which in general is infeasible. but in cooked examples found in books it can be done by the rational root theorem, since in those examples the characteristic polynomial is rigged to have all of its irreducible factors linear.
 
  • #12
I don't know if this is what you are looking for, but, formally, the set of all elementary matrices generates the group of all invertible matrices under the operation of multiplication. This means that if you start with the identity matrix (for convenience; you can start with any other invertible matrix) and apply a suitable finite sequence of elementary transformations, you can end up with any invertible matrix.
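A quick numpy sanity check of the easy direction (the sequence of operations here is arbitrary): starting from I, elementary operations can only produce invertible matrices, since each step is invertible.

```python
import numpy as np

M = np.eye(3)
M[1] += 2 * M[0]       # R2 <- R2 + 2*R1
M[[0, 2]] = M[[2, 0]]  # R1 <-> R3
M[2] *= -4.0           # R3 <- -4*R3

# Still invertible: det is 1 * (-1) * (-4) = 4.
assert np.isclose(np.linalg.det(M), 4.0)
```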
 
  • #13
Stephen Tashi said:
An "equation" is statement involving variables. By contrast, an "equality" is simply a statement that two constant expressions are equal.
But I have studied that an "expression" is a statement involving variables and an equation is an expression that has an equality sign.
Stephen Tashi said:
It is correct that the row operation you mention does change an "equality" into another equality.
How is the second equation an equality? The product of the two matrices on the RHS does not give the matrix on the LHS (the second row of the matrix on the LHS should be [0 0 0] instead of [3 0 -5]).
Stephen Tashi said:
this leaves us no way to proceed that will change the left hand side to the identity matrix. The fact that the second column of entries on the left hand side is all zeroes prevents us from getting the second row 0,1,0 of the identity matrix.
How did you come to know, just by seeing in the equality that the second column elements are all zero, that the matrix on the left cannot be converted into an identity matrix? How does the second column's elements all being zero stop us from converting it into an identity matrix? Is it some property related to matrices?
Thanks
Mr R
 
  • #14
Mr Real said:
But I have studied that an "expression" is a statement involving variables and an equation is an expression that has an equality sign.
An equation has an equality sign, but an equation implies there are variables for which we wish to find solutions. For example, "3 = 2 + 1" does not say anything about a variable being involved. You could pose a problem by saying "Find x such that 3 = 2 + 1" and force the reader to consider a variable. In that case the solution set for the "equation" 3 = 2 + 1 would be all numbers.

How is the second equation an equality? The product of the two matrices on the RHS does not give the matrix on the LHS (the second row of the matrix on the LHS should be [0 0 0] instead of [3 0 -5]).
Second "equation" or second "equality"? In either case, check your multiplication.
For example, the computation for the first element in the second row is ##(1)(3) + (0)(0) + (-1)(0) = 3##.

How did you come to know, just by seeing in the equality that the second column elements are all zero, that the matrix on the left cannot be converted into an identity matrix? How does the second column's elements all being zero stop us from converting it into an identity matrix? Is it some property related to matrices?
In performing elementary row operations, once you get a column of zeroes, no amount of adding rows together or multiplying rows by numbers will get rid of the zeroes. It's just something you'll learn from experience.
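The reason a zero column is permanent: every row operation replaces rows by linear combinations of rows, and any linear combination of zeros is zero. A small numpy sketch (E below is an arbitrary invertible matrix, i.e. the net effect of some sequence of row operations):

```python
import numpy as np

A = np.array([[3.0, 0.0,  0.0],
              [3.0, 0.0, -5.0],
              [0.0, 0.0,  5.0]])   # second column already all zero

E = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])    # invertible (det = 5)

# The zero column survives any row operations:
assert np.allclose((E @ A)[:, 1], 0.0)
```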
 
  • #15
i apologize if my answer was too long, but i was trying to be complete. WWGD's nice answer, for example, is the second sentence of the second paragraph of my post 11.

maybe this point indeed is what the OP wants to grasp, i.e. we are trying to transform a matrix into a simple form by multiplying on the left by a suitable invertible matrix. Now elementary matrices are chosen because they are basic building blocks for all invertible matrices. Thus we can achieve the same result as left multiplying by any invertible matrix, in stages, by performing a sequence of elementary row operations. So elementary row operations are a tool for accomplishing the left multiplication of a matrix by any invertible matrix. In particular, starting from an invertible matrix, we can reach the identity only by left multiplying by its inverse. Hence if we can find any sequence of operations that results in the identity, then those operations must have as their product the inverse matrix.

as to why some of your operations are not acceptable: all operations must be invertible, i.e. it must be possible to reverse their effect by some other operation. but some of your operations lose information which cannot be recovered by another operation, as stephen tashi is explaining to you.
 
  • #16
Stephen Tashi said:
Second "equation" or second "equality" ? In either case, check your multiplication.
For example the computation for the first element in the second row is (1)(3)+(0)(0)+(−1)(0)=3(1)(3)+(0)(0)+(−1)(0)=3(1)(3) + (0)(0)+ (-1)(0) = 3
Yeah, you are right, it was my mistake.
 
  • #17
mathwonk said:
i apologize if my answer was too long, but i was trying to be complete.
Actually, I've just started to learn about matrices, so many of the terms used in your previous answer were alien to me, but I understood this answer.
mathwonk said:
as to why some of your operations are not acceptable: all operations must be invertible, i.e. it must be possible to reverse their effect by some other operation. but some of your operations lose information which cannot be recovered by another operation
But how is an operation like converting R2 to R1 - R3 not invertible, while converting it to R2 - R3 is invertible? Can you explain this with an example?
Thanks
Mr R
 
  • #18
How would you get back R2 in your matrix? You can't add or subtract any rows now; you've lost R2 permanently.

However, the R2 - R3 replacement for R2 means you can revert back to the old R2 just by adding R3 to it.
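A short numpy demonstration of the difference (the sample matrix is arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

B = A.copy()
B[1] = B[1] - B[2]   # R2 <- R2 - R3: old R2 is still recoverable...
B[1] = B[1] + B[2]   # ...by the inverse operation R2 <- R2 + R3
assert np.allclose(B, A)

C = A.copy()
C[1] = C[0] - C[2]   # R2 <- R1 - R3: old R2 is overwritten; no row
                     # operation on C can bring it back
```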
 
  • #19
jedishrfu said:
How would you get back R2 in your matrix? You can't add or subtract any rows now; you've lost R2 permanently.
Yeah, I get it now. So, would this work for all elementary row/column operations and for any matrix?
Thanks
Mr R
 
  • #20
Mr Real said:
Yeah, I get it now. So, would this work for all elementary row/column operations and for any matrix?
Thanks
Mr R

In this context matrices are just a convenient way of representing a set of equations. As row operations involve at most two rows, you can explain it all using just two equations. These can have any number of variables, but let's have three in this example:

##a_1x + b_1y + c_1z = d_1##
##a_2x + b_2y + c_2z = d_2##

Now, let's suppose we have found ##(x, y, z)## that solve these equations. It's clear that if:

##a_1x + b_1y + c_1z = d_1##

Then:

##k(a_1x + b_1y + c_1z) = k(d_1)##

Where ##k## is some non-zero constant. And, likewise, if this second equation holds (with ##k## in it) then so does the first without the ##k##.

That's why you can multiply a row by a constant.

It's even clearer that if:

##a_1x + b_1y + c_1z = d_1##
##a_2x + b_2y + c_2z = d_2##

Then:

##a_2x + b_2y + c_2z = d_2##
##a_1x + b_1y + c_1z = d_1##

That's just the same equations written in a different order, so must have the same solutions.

Finally, if

##a_1x + b_1y + c_1z = d_1##
##a_2x + b_2y + c_2z = d_2##

Then adding these together gives:

##(a_1+a_2)x + (b_1+b_2)y + (c_1+c_2)z = d_1+d_2##

And, if we put this together with either of the original equations we have:

##a_1x + b_1y + c_1z = d_1##
##(a_1+a_2)x + (b_1+b_2)y + (c_1+c_2)z = d_1+d_2##

This has exactly the same solutions as the original equations. That's why you can add one row to another. Adding a multiple of one row to another isn't really a new operation, just a combination of these ones.
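To see the preserved solution set numerically, here is a minimal numpy sketch, using a square system for concreteness (coefficients chosen arbitrarily):

```python
import numpy as np

# A x = d
A = np.array([[1.0, 2.0], [3.0, 4.0]])
d = np.array([5.0, 6.0])
x = np.linalg.solve(A, d)

# Row operation R2 <- R2 + k*R1, applied to the whole augmented system:
k = 7.0
A2, d2 = A.copy(), d.copy()
A2[1] += k * A2[0]
d2[1] += k * d2[0]

# The transformed system has exactly the same solution:
assert np.allclose(np.linalg.solve(A2, d2), x)
```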
 
  • #21
PeroK said:
This has exactly the same solutions as the original equations. That's why you can add one row to another. Adding a multiple of one row to another isn't really a new operation, just a combination of these ones.
So, with regard to what jedishrfu said that we can always get R2 back if we apply any elementary operation, does it hold for every matrix?
 
  • #22
Mr Real said:
So, with regard to what jedishrfu said that we can always get R2 back if we apply any elementary operation, does it hold for every matrix?

You can see for yourself. If you start with:

R1
R2

If you then add R1 to R2, you have an equivalent set of equations:

R1
R2 + R1

Which has exactly the same solutions.

If you want R2 back, you simply subtract R1 from the second row and you are back to:

R1
R2

The key point is that these operations change the set of equations and change the matrix, but they do not change the solutions. The new set of equations always has the same set of solutions.
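At the matrix level, the undo step is itself an elementary matrix; a quick numpy sketch (the matrix A is arbitrary):

```python
import numpy as np

E     = np.array([[1.0, 0.0], [ 1.0, 1.0]])  # add R1 to R2
E_inv = np.array([[1.0, 0.0], [-1.0, 1.0]])  # subtract R1 from R2
assert np.allclose(E_inv @ E, np.eye(2))

# So the operation is undone exactly, for any number of columns:
A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
assert np.allclose(E_inv @ (E @ A), A)
```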
 
  • #23
Thank you everyone for giving such brilliant answers to my doubts, especially Stephen Tashi. Thanks to all of you, matrices have become clearer to me.
p.s. Thanks for suggesting the awesome playlist @jedishrfu. Just watched most of them!
:smile:
Mr R
 
  • #24
PeroK said:
In this context matrices are just a convenient way of representing a set of equations.
This is an important point, and one that many students overlook.
 
  • #25
Hey, I have come to know a little more about matrices, so I had doubts about this particular statement.
PeroK said:
In this context matrices are just a convenient way of representing a set of equations.
Well, I know that matrices can be used to represent a set of equations (and then we can find solutions for x, y, z, etc.), but isn't that just one of the many uses of matrices, not what matrices themselves are? Because matrices can also be used for other things, like to describe linear transformations in 2D, 3D, etc., can't they?
 
  • #26
Mr Real said:
Hey, I have come to know a little more about matrices, so I had doubts about this particular statement.
Well, I know that matrices can be used to represent a set of equations (and then we can find solutions for x, y, z, etc.), but isn't that just one of the many uses of matrices, not what matrices themselves are? Because matrices can also be used for other things, like to describe linear transformations in 2D, 3D, etc., can't they?

Yes, but if they stand for linear transformations, you can't do elementary row operations on them and still get the same transformation. Which was more or less my point. The matrices change, but in this context (the context of sets of linear equations) the change is not relevant.

In other contexts you cannot apply row operations without fundamentally changing what the matrix represents.
 
  • #27
PeroK said:
Yes, but if they stand for linear transformations, you can't do elementary row operations on them and still get the same transformation. Which was more or less my point.
But if we have an equality like the one we were discussing, C = (F)(G):
even if we consider these matrices as linear transformations and apply elementary operations on both sides, wouldn't we get the same transformation on both sides?
 
  • #28
Mr Real said:
But if we have an equality like the one we were discussing, C = (F)(G):
even if we consider these matrices as linear transformations and apply elementary operations on both sides, wouldn't we get the same transformation on both sides?

No. For example:

##\begin{pmatrix} 1 &1 \\ 1 & 1 \end{pmatrix}##

Is not the same linear transformation as:

##\begin{pmatrix} 1 &1 \\ 0 & 0 \end{pmatrix}##
 
  • #29
PeroK said:
No. For example:

##\begin{pmatrix} 1 &1 \\ 1 & 1 \end{pmatrix}##

Is not the same linear transformation as:

##\begin{pmatrix} 1 &1 \\ 0 & 0 \end{pmatrix}##
I know they are not the same, but that's not what I had asked. What I had asked is: if we have an equality C = (F)(G) and we apply an elementary row operation on both sides, then the new LHS will be different from C and the new RHS will be different from (F)(G), but the LHS will still be equal to the RHS; that is, we would get the same transformation on both sides even after applying an elementary operation.
 
  • #30
Mr Real said:
I know they are not the same, but that's not what I had asked. What I had asked is: if we have an equality C = (F)(G) and we apply an elementary row operation on both sides, then the new LHS will be different from C and the new RHS will be different from (F)(G), but the LHS will still be equal to the RHS; that is, we would get the same transformation on both sides even after applying an elementary operation.

I'll let you find your own counterexample to that. Hint: think about the identity matrix.

But, if you think for a moment about matrix multiplication you will see that there is no way that row operations would support that sort of result.
 
  • #31
Mr Real said:
I know they are not the same, but that's not what I had asked. What I had asked is: if we have an equality C = (F)(G) and we apply an elementary row operation on both sides, then the new LHS will be different from C and the new RHS will be different from (F)(G), but the LHS will still be equal to the RHS; that is, we would get the same transformation on both sides even after applying an elementary operation.

The only three operations that preserve the solution set are: exchanging two rows, scaling a row (by a non-zero constant), and adding a multiple of one row to another. It is an interesting exercise to show these are the unique operations with these properties.
 
  • #32
PeroK said:
I'll let you find your own counterexample to that. Hint: think about the identity matrix.

But, if you think for a moment about matrix multiplication you will see that there is no way that row operations would support that sort of result.
I don't know what you mean. If I consider the equality A = (I)(A) and then apply any elementary row operation on both sides, which is equivalent to multiplying by an elementary matrix (let's say B), then the equality becomes (B)(A) = (B)(I)(A). This new equality is different from the previous one, but we still have an equality; that is, the transformations we have obtained are different from the previous ones but are equal to each other. This is what I've been trying to say, and I think it is correct.
 
  • #33
Mr Real said:
I don't know what you mean. If I consider the equality A = (I)(A) and then apply any elementary row operation on both sides, which is equivalent to multiplying by an elementary matrix (let's say B), then the equality becomes (B)(A) = (B)(I)(A). This new equality is different from the previous one, but we still have an equality; that is, the transformations we have obtained are different from the previous ones but are equal to each other. This is what I've been trying to say, and I think it is correct.

If you are talking about applying the row operation to the first matrix on the RHS, then the proof is trivial - effectively what you have done. I assumed you were talking about applying the row operations to all matrices.

In any case, I can't see where this is leading.
 
  • #34
PeroK said:
If you are talking about applying the row operation to the first matrix on the RHS, then the proof is trivial - effectively what you have done. I assumed you were talking about applying the row operations to all matrices.

In any case, I can't see where this is leading.
Sorry, I forgot one thing: when we were considering matrices as a set of equations and considering equalities of matrices, not only did we obtain another equality when we performed the valid elementary operations, but the solutions of the new equality were the same as the solutions of the previous one. Is this what I've been missing here: that it doesn't matter that we get another equality after applying the elementary operation, because what matters is that we didn't get the previous transformation?

Another doubt: then, do elementary operations have no meaning when we consider matrices as linear transformations?
 
  • #35
Mr Real said:
Another doubt: then, do elementary operations have no meaning when we consider matrices as linear transformations?

Elementary operations are themselves linear transformations.
 
