Show that the eigenvalues of any matrix are unaltered by a similarity transform

In summary: if $Av = \lambda v$, then $SAS^{-1}(Sv) = \lambda (Sv)$, and $S(xI)S^{-1} = x(SIS^{-1}) = xI$; thus the only thing that changes is the order of the matrix brackets, so the eigenvalues are unchanged.
  • #1
ognik
Show that the eigenvalues of any matrix are unaltered by a similarity transform - the book says this follows from the invariance of the secular equation under a similarity transform - which is news to me.

The secular equation is found from \(\displaystyle \det(A-\lambda I)=0\) and is a polynomial in \(\displaystyle \lambda \), so I can't see how that can even undergo a similarity transform; it's not a matrix?

(Not sure how to start this at all)
 
  • #2
ognik said:
Show that the eigenvalues of any matrix are unaltered by a similarity transform - the book says this follows from the invariance of the secular equation under a similarity transform - which is news to me.

The secular equation is found from \(\displaystyle \det(A-\lambda I)=0\) and is a polynomial in \(\displaystyle \lambda \), so I can't see how that can even undergo a similarity transform; it's not a matrix?

(Not sure how to start this at all)
Given a matrix A we have \(\displaystyle AV = \lambda V\), where V is an eigenvector and \(\displaystyle \lambda\) is an eigenvalue. Define a similarity transformation via an invertible matrix S. Then:
\(\displaystyle S(AV) = S( \lambda V) = \lambda (SV)\) (Yes, I'm doing this somewhat backward.)

Insert a \(\displaystyle S^{-1}S\) on the left:
\(\displaystyle \left ( SAS^{-1} \right ) (SV) = \lambda (SV)\)

Can you finish from here?

-Dan
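
As a quick numerical sanity check of this claim (a sketch assuming NumPy is available, not part of the thread's argument):

```python
# Sanity check (a sketch, not part of the proof): the eigenvalues of A and
# of S A S^{-1} agree, and if A v = lam v then (S A S^{-1})(S v) = lam (S v).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
S = rng.standard_normal((4, 4))            # almost surely invertible
A_sim = S @ A @ np.linalg.inv(S)           # the similarity transform of A

# Same spectrum (sorted to make the comparison order-independent).
print(np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                  np.sort_complex(np.linalg.eigvals(A_sim))))   # True

# Eigenvector statement: S v is an eigenvector of S A S^{-1} for the same lam.
lam, V = np.linalg.eig(A)
v = V[:, 0]
print(np.allclose(A_sim @ (S @ v), lam[0] * (S @ v)))           # True
```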
 
  • #3
ognik said:
Dan, that utterly confuses me.

Note that $A$ and $SAS^{-1}$ have the same characteristic polynomial:

$\det(xI - SAS^{-1}) = \det(x(SIS^{-1}) - SAS^{-1}) = \det(S(xI)S^{-1} - SAS^{-1})$

$= \det(S(xI - A)S^{-1}) = \det(S)\det(xI - A)\det(S^{-1}) = \det(S)\det(S^{-1})\det(xI - A)$

$= \det(xI - A)$, since $\det(S)\det(S^{-1}) = \det(SS^{-1}) = \det(I) = 1$.
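
A minimal numerical illustration of this invariance (assuming NumPy; `np.poly` of a square matrix returns the coefficients of its characteristic polynomial):

```python
# Check that A and S A S^{-1} have the same characteristic polynomial.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
S = rng.standard_normal((3, 3))
A_sim = S @ A @ np.linalg.inv(S)

print(np.allclose(np.poly(A), np.poly(A_sim)))   # True: identical coefficients
```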
 
  • #4
I think it should be in the middle instead of the left:
$$S(AV) = S( \lambda V)
\quad\Rightarrow\quad S(A(S^{-1}S)V)= S( \lambda V)
\quad\Rightarrow\quad (SAS^{-1})(SV) = \lambda (SV)
$$
where $S$ is any invertible matrix.
 
  • #5
I'm afraid I can't see why the difference between Topsquark's and ILS's solutions is significant; it seems the $S^{-1}S$ goes in the same place, and it's just the bracket order that changes?

I believe $S$ must be real orthogonal? (If $S$ is complex, my book refers to a unitary transformation, which makes sense to me.)

There's not much more to be done once $S^{-1}S$ has been inserted? From \(\displaystyle (SAS^{-1})(SV)=\lambda(SV)\) I get

$ A' (SV)=\lambda(SV) , \therefore A' (SS^{-1}V)=\lambda(SS^{-1}V), \therefore A'V=\lambda V $
 
  • #6
Deveno said:
Note that $A$ and $SAS^{-1}$ have the same characteristic polynomial:
Hi Deveno, why?

I can see something like
$ \left( A -\lambda I\right)V =0, \therefore S\left( A -\lambda I\right)VS^{-1}=0, $
$\therefore S AVS^{-1} -S\lambda IVS^{-1}=0$ ... but this is not going anywhere useful...
 
  • #7
ognik said:
I'm afraid I can't see why the difference between Topsquark's and ILS's solutions is significant; it seems the $S^{-1}S$ goes in the same place, and it's just the bracket order that changes?

I believe $S$ must be real orthogonal? (If $S$ is complex, my book refers to a unitary transformation, which makes sense to me.)

There's not much more to be done once $S^{-1}S$ has been inserted? From \(\displaystyle (SAS^{-1})(SV)=\lambda(SV)\) I get

$ A' (SV)=\lambda(SV) , \therefore A' (SS^{-1}V)=\lambda(SS^{-1}V), \therefore A'V=\lambda V $
No, they're right. What I presented is what I call "Physics Math." Such things tend to skip a number of critical steps that turn out, in the end, to be correct. But it can occasionally make for an incomplete proof.

-Dan
 
  • #8
topsquark said:
No, they're right. What I presented is what I call "Physics Math." Such things tend to skip a number of critical steps that turn out, in the end, to be correct. But it can occasionally make for an incomplete proof.

-Dan
Thanks - I am doing 'Maths for Physicists', so I don't mind ;-). Anyway I wrote down both methods side by side to see exactly what the difference implied - that was clear enough.
 
  • #9
ognik said:
Hi Deveno, why?

I can see something like
$ \left( A -\lambda I\right)V =0, \therefore S\left( A -\lambda I\right)VS^{-1}=0, $
$\therefore S AVS^{-1} -S\lambda IVS^{-1}=0$ ... but this is not going anywhere useful...

The characteristic equation is in $x$ (or, as you write it, $\lambda$). It has nothing to do with any vector $v$, it's purely a function of the matrix.

What I wrote shows that $A$ and $SAS^{-1}$ have the same characteristic polynomial:

$\det(\lambda I - A) = \det(\lambda I - SAS^{-1})$.

Obviously, if $\lambda$ is such that $\det(\lambda I - A) = 0$ (that is, $\lambda$ is a *root* of $f(\lambda) = \det(\lambda I - A)$), then there exists some $v$ for which $(\lambda I - A)v = 0$ (because, for this $\lambda$, the matrix $\lambda I - A$ is singular).

Dan's post makes more sense to me, now-he is approaching it from the *eigenvectors*, that is, if:

$Av = \lambda v$, then:

$SAS^{-1}(Sv) = \lambda (Sv)$

that is, if $v$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $Sv$ is an eigenvector of $SAS^{-1}$, *also* with eigenvalue $\lambda$. This is still somewhat unsatisfactory to me-it doesn't address any eigenvalues $SAS^{-1}$ might have that $A$ does not (there are none, but I like to see a proof).

*****

I think what you are "not seeing" is that:

$S(xI)S^{-1} = x(SIS^{-1})$ (we can take the scalar $x$ out front)

$x(SIS^{-1}) = x(SS^{-1}) = xI$.

So to go from $xI - A$ to $S(xI - A)S^{-1}$ (in effect, applying the similarity transform $B \to SBS^{-1}$ to the matrix $xI - A$), we can get away with just putting the "$S$" 's around $A$.

Some general *algebraic* properties of similarity transforms (in what follows, matrices are assumed square):

Let $T_S(A) = SAS^{-1}$. Then:

1. $T_S(A + B) = T_S(A) + T_S(B)$ (this is due to the distributive laws of matrices)

2. $T_S(kA) = k(T_S(A))$ (multiplication by a scalar $k$ can be replaced by multiplication by the matrix $kI$, which commutes with any square matrix).

1&2 together say $T_S$ is linear.

3. $T_S(AB) = T_S(A)T_S(B)$ (the $S$'s in the middle cancel).

This means that $T_S$ is an *algebra automorphism* of $\text{Mat}_n(R)$ for any invertible square matrix $S$, and commutative ring $R$. This is actually lots more restrictive than being a linear automorphism (an invertible linear map from a space to itself).

All of this holds for *any* invertible $S$, however, when $A$ has some "special forms" (such as being diagonalizable), we can often find special forms for $S$, as well.
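
A short numerical sketch of properties 1-3 above (assuming NumPy; `T` below plays the role of $T_S$):

```python
# Verify that T_S(A) = S A S^{-1} is additive, homogeneous and multiplicative.
import numpy as np

rng = np.random.default_rng(2)
n = 4
A, B = rng.standard_normal((2, n, n))
S = rng.standard_normal((n, n))
S_inv = np.linalg.inv(S)

def T(M):
    """The similarity transform T_S(M) = S M S^{-1}."""
    return S @ M @ S_inv

k = 3.7
print(np.allclose(T(A + B), T(A) + T(B)))    # 1. additivity
print(np.allclose(T(k * A), k * T(A)))       # 2. homogeneity
print(np.allclose(T(A @ B), T(A) @ T(B)))    # 3. multiplicativity
```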
 
  • #10
Deveno said:
The characteristic equation is in $x$ (or, as you write it, $\lambda$). It has nothing to do with any vector $v$, it's purely a function of the matrix.
- thanks, the point I was missing here (but knew)
Deveno said:
that is, if $v$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $Sv$ is an eigenvector of $SAS^{-1}$, *also* with eigenvalue $\lambda$.
thanks again, clear.

Deveno said:
Some general *algebraic* properties of similarity transforms (in what follows, matrices are assumed square):
... and again, whether I know some of them or not, these supplements can be of enormous value to me because of the 20 year gap I am still filling :-)

PS:
If the eigenvalues are invariant, then to me it follows that the trace and determinant are also invariant?

Now a similarity transform is used to rotate ('realign') the matrix to its 'principal axes', in effect diagonalising it if it can be diagonalised? What is the impact if A is NOT diagonalisable?
 
  • #11
ognik said:
- thanks, the point I was missing here (but knew) thanks again, clear.

... and again, whether I know some of them or not, these supplements can be of enormous value to me because of the 20 year gap I am still filling :-)

PS:
If the eigenvalues are invariant, then to me it follows that the trace and determinant are also invariant?

Now a similarity transform is used to rotate ('realign') the matrix to its 'principal axes', in effect diagonalising it if it can be diagonalised? What is the impact if A is NOT diagonalisable?
A similarity transform is best viewed as the result of a change-of-basis. Obviously, if a matrix has a "full set of eigenvectors" (that is, we can form an eigenbasis), this eigenbasis would be the easiest for computation, as it reduces our matrix to scalar stretches along each eigenvector (this is clearly a diagonal matrix). But this may not be the optimum basis for some calculations (how we want vectors expressed can be another factor-expressing polynomials, for example, in terms of a basis other than $\{1,t,t^2,\dots\}$ can be cumbersome to work with).

Yes, the trace and determinant are also invariant under similarity transforms. It is trivial to prove this directly for the determinant (since the scalar ring a determinant takes values in is commutative-usually this is a field, matrix rings over commutative rings have subtleties to them that are greatly simplified by considering just linear maps of vector spaces), and easier just to use the fact that the trace of a matrix is the sum of the roots (including multiplicities) of its characteristic polynomial (so the trace of $2I$ is $4$, not $2$, even though it has only the single eigenvalue $2$).
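
A minimal check of the trace and determinant claims (a NumPy sketch, not part of the original reply):

```python
# Trace and determinant are unchanged by a similarity transform.
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
S = rng.standard_normal((4, 4))
A_sim = S @ A @ np.linalg.inv(S)

print(np.isclose(np.trace(A), np.trace(A_sim)))             # True
print(np.isclose(np.linalg.det(A), np.linalg.det(A_sim)))   # True
```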

The answer to your question-"What if $A$ is not diagonalizable?" is given in full by the Jordan Normal Form-basically, for a non-diagonalizable matrix, one considers, for each eigenvalue $\lambda$, the *nilpotency* of $\lambda I - A$, which decomposes a matrix like so:

$SAS^{-1} = D + N$

where $D$ is diagonal, and $N$ is nilpotent in canonical form. The canonical form of a nilpotent matrix is to put $1$'s on the super-diagonal, and this allows us to arrange the decomposed matrix (in the *generalized eigenbasis*) into Jordan blocks, the sizes of which are directly tied to the nilpotency indices of the various eigenvalues.

For example, the matrix:

$A = \begin{bmatrix}1&1\\0&1\end{bmatrix}$

has but one eigenvalue, $1$ (the characteristic polynomial is $(x - 1)^2$), and the eigenspace $E_1$ is spanned by a single eigenvector $(1,0)$. So we consider the kernel (null space) of $(I - A)^2$. As $(I - A)^2$ is the zero matrix, its kernel is all of $\Bbb R^2$, so all we need is *any* element of $\Bbb R^2$ not in $\text{span}((1,0))$. $(0,1)$ will do nicely. That is, the matrix:

$I - A = \begin{bmatrix}0&-1\\0&0\end{bmatrix}$ has nilpotency two. Our Jordan block will therefore be $2 \times 2$, and our similarity transform can be via the identity matrix (since its columns are composed of generalized eigenvectors).

In this case, then, $D = I$, and $N = \begin{bmatrix}0&1\\0&0\end{bmatrix}$.

Although this may seem like a very "special case" it shares many features with the "general case", at least conceptually.
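
A small sketch of this $2 \times 2$ example (assuming NumPy): it confirms $A = D + N$ with $N^2 = 0$, the characteristic polynomial $(x - 1)^2$, and that there is only one independent eigenvector.

```python
# The Jordan-form example: A = D + N with D = I diagonal and N nilpotent.
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
D = np.eye(2)
N = A - D

print(np.allclose(N @ N, 0))                  # True: N has nilpotency two
print(np.poly(A))                             # [ 1. -2.  1.]  i.e. (x - 1)^2
print(np.linalg.matrix_rank(A - np.eye(2)))   # 1, so dim E_1 = 2 - 1 = 1
```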
 
  • #12
As you might have gathered I am basically teaching myself this section... found some notes that, inter alia, said:

The matrix of unit eigenvectors $\equiv$ a rotation matrix relating coords in one frame (x, y) to coords in an orthogonal ref. frame (x’, y’)...

Does it have to be the UNIT eigenvectors? It seems to me that the eigenvectors only supply direction? In the form $SAS^{-1}$ it seems to me to make no difference.
 
  • #13
ognik said:
As you might have gathered I am basically teaching myself this section... found some notes that, inter alia, said:

The matrix of unit eigenvectors $\equiv$ a rotation matrix relating coords in one frame (x, y) to coords in an orthogonal ref. frame (x’, y’)...

Does it have to be the UNIT eigenvectors? It seems to me that the eigenvectors only supply direction? In the form $SAS^{-1}$ it seems to me to make no difference.

They need to be unit vectors only so it will be a rotation matrix.
Btw, additionally the determinant needs to be $+1$, otherwise it's still not a rotation.
 
  • #14
I like Serena said:
They need to be unit vectors only so it will be a rotation matrix.
Btw, additionally the determinant needs to be $+1$, otherwise it's still not a rotation.
I was thinking that $SS^{-1} = I$ already ... Are those 2 points by definition?
 
  • #15
ognik said:
I was thinking that $SS^{-1} = I$ already ... Are those 2 points by definition?

Not by definition - it's a consequence.
A 2x2 or 3x3 matrix defines a rotation if and only if it is an orthogonal matrix with determinant +1.
A matrix is orthogonal if and only if its column vectors have unit length and each pair is perpendicular.
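
A minimal sketch of these two conditions (assuming NumPy): a $2 \times 2$ rotation passes both tests, while a reflection is orthogonal but fails the determinant test.

```python
# Orthogonality (Q^T Q = I) plus determinant +1 characterises a rotation.
import numpy as np

t = 0.7
Q = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])   # rotation by angle t
R = np.array([[1.0,  0.0],
              [0.0, -1.0]])               # reflection in the x-axis

for M in (Q, R):
    orthogonal = np.allclose(M.T @ M, np.eye(2))
    det_plus_one = np.isclose(np.linalg.det(M), 1.0)
    print(orthogonal, det_plus_one)
# Q: True True (a rotation); R: True False (orthogonal, but not a rotation)
```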
 
  • #16
Wonder why they didn't call it an orthonormal matrix?

Rotations preserve length - as a consequence of orthonormality?
 
  • #17
ognik said:
Wonder why they didn't call it an orthonormal matrix?

Rotations preserve length - as a consequence of orthonormality?

I don't really know.
Apparently the distinction between an orthogonal and orthonormal matrix is not considered interesting, so people just stuck with orthogonal.

I believe a rotation is usually defined as a transformation that preserves distances, that preserves at least one point, and that preserves orientation.
A consequence of that definition is that the corresponding matrix is orthogonal and has a positive determinant.
Then again, being an orthogonal matrix with a positive determinant is probably an equivalent definition.
 
  • #18
ognik said:
Wonder why they didn't call it an orthonormal matrix?

Rotations preserve length - as a consequence of orthonormality?

If we define *length* of a vector $x \in \Bbb R^n$ as $\|x\| = \sqrt{\langle x,x\rangle}$, and we have that:

$U^{-1} = U^T$, for $U: \Bbb R^n \to \Bbb R^n$ then as a pure consequence of this:

$\langle Ux,Ux\rangle = (Ux)^TUx = x^TU^TUx = x^T(U^{-1}U)x = x^TIx = x^Tx = \langle x,x\rangle$, so that:

$\|Ux\| = \sqrt{\langle Ux,Ux\rangle} = \sqrt{\langle x,x\rangle} = \|x\|$.

That is, all orthogonal matrices represent isometries (there are other isometries that are not linear maps, such as translations, and glide reflections).
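
A quick numerical check of this isometry property (a sketch assuming NumPy; the orthogonal $U$ is produced here via a QR factorization, just one convenient way to obtain one):

```python
# An orthogonal U preserves the Euclidean length of every vector.
import numpy as np

rng = np.random.default_rng(4)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # U is orthogonal
x = rng.standard_normal(3)

print(np.allclose(U.T @ U, np.eye(3)))                        # U^T U = I
print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))   # ||Ux|| = ||x||
```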

The question of orientation is a subtle one-it represents the difference between a basis (an arbitrary set of linearly independent spanning vectors) and an *ordered basis* (given an ordered basis, any other ordering of it can be assigned a *sign* determined by the sign of the permutation that takes one ordering to another). A determinant can be essentially regarded as a "signed volume" (the general formula for computing an $n$-dimensional content of a region in $\Bbb R^n$ essentially uses "oriented $n$-cubes" to compute the "volume element" $dV$, which students of differentiable manifolds may recognize as a differential $n$-form). In other words, an "improper rotation" (all such rotations can be regarded as the composition of a "proper rotation"-an element of $SO(n)$-and the matrix:

$\begin{bmatrix}1&0&\dots&0\\0&1&\dots&0\\ \vdots&\vdots&\ddots&\vdots\\0&0&\dots&-1\end{bmatrix}$)

is only improper due to an established-beforehand choice of orientation.

The "standard" orientation is such that the volume of the $n$-cube $[0,1]^n$ is $+1$. This corresponds to the ordered basis:

$\{e_1,\dots,e_n\}$.

*****************

You ask why there is no "special name" for a matrix whose columns form an orthogonal, but not orthonormal basis. I do not know, but I suspect the answer is this: when measuring actual physical quantities, we typically "normalize our units" (so if we are plotting meters versus seconds, we choose the "unit second vector" and the "unit meter vector"). Often this is accomplished by introducing "appropriate scaling constants" (such as the physicist's $\hbar$).
 

FAQ: Show that the eigenvalues of any matrix are unaltered by a similarity transform

What is a similarity transform?

A similarity transform is a mathematical operation that transforms a matrix into a similar matrix, where the two matrices have the same eigenvalues but potentially different eigenvectors.

How is a similarity transform different from other matrix operations?

A similarity transform is different from other matrix operations because it preserves the eigenvalues of a matrix, while other operations such as multiplication and addition can change the eigenvalues.

Can you provide an example of a similarity transform?

One example of a similarity transform is a diagonalization, where a matrix is transformed into a diagonal matrix with the same eigenvalues.

Why is it important to show that eigenvalues are unaltered by a similarity transform?

It is important to show that eigenvalues are unaltered by a similarity transform because it allows us to study the behavior and properties of a matrix without being affected by changes in the basis or coordinate system used to represent it.

Is the proof for this statement complex?

The proof for this statement can be complex, depending on the level of mathematical background and understanding. However, the basic concept is relatively simple and can be understood with some knowledge of linear algebra and matrix operations.
