Proof of Secular Equation for Hermitian Matrices

In summary, the conversation discusses a proof of the secular equation, another name for the characteristic equation, for a Hermitian matrix. The proof involves diagonalizing the matrix and applying the Cayley-Hamilton theorem. There is a side debate about the notation used, particularly writing "A - a" for "A - aI" and "1" for the identity matrix, but ultimately everyone understands the intended meaning, and the moral is to consider all reasonable interpretations of notation. The term "secular" is the professor's name for the equation; the thread eventually finds it is a known synonym for "characteristic".
  • #1
emob2p
Hi,
I'm looking for a proof of the following theorem:
If A is a hermitian matrix with eigenvalues a_1, a_2...a_n, then the secular equation holds:
(A - a_1)(A - a_2)...(A - a_n) = 0.


The proof escapes me right now but I think it has to do with diagonalizing the hermitian matrix. I'm just struggling to put together the details. Assume non-degeneracy. Thanks.
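A quick numerical sanity check of the claimed identity (not a proof) can be sketched with NumPy; the random matrix and all names below are my own illustration, assuming the factors are read as (A - a_i I):

```python
import numpy as np

# Build a random Hermitian matrix A, then verify that the product
# (A - a_1 I)(A - a_2 I)...(A - a_n I) is the zero matrix to machine precision.
rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2              # Hermitian by construction

eigenvalues = np.linalg.eigvalsh(A)   # real, since A is Hermitian

product = np.eye(n, dtype=complex)
for a in eigenvalues:
    product = product @ (A - a * np.eye(n))

print(np.max(np.abs(product)))        # tiny, i.e. numerically the zero matrix
```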
 
  • #2
I don't know anything about hermitian matrices besides the definition, but what does A - a_1 mean? You can't subtract a scalar from a matrix.
 
  • #3
Right, there is an implied multiplication by the identity matrix.
 
  • #4
Scalars are clearly identified with scalar matrices (another one to add to your list of new things, 0rthodontist, like inverse functions). Surely you know the Cayley-Hamilton theorem, that a matrix satisfies its own characteristic polynomial? That's all that this is. Hermitian matrices are diagonalizable, hence the minimal poly only has linear factors. Non-degeneracy, by which I presume you mean A is not invertible, is not important. The odd thing in your post is I have no idea what the word secular is doing there. Is there some meaning of secular I'm not aware of?
 
  • #5
matt grime said:
Scalars are clearly identified with scalar matrices (another one to add to your list of new things, 0rthodontist like inverse functions).
Well what the hell do I know? I've never seen an inverse function defined for a noninvertible function before, a double check just now at http://mathworld.wolfram.com/InverseFunction.html shows not everyone thinks that way, and in my book (Linear Algebra and its Applications by David C. Lay) an expression like that is always written like A - aI. From looking at what he has, it makes as much sense to me to assume A - a means a is subtracted from every element of A. Different books and different people use different conventions. I'm just trying to help.
 
  • #6
I admit I'm being lazy, though I am being lazy in the accepted way: properly, f^{-1}(D) should be called the preimage and is defined for all functions, but it is a common abuse of notation to refer to it as the inverse image of a set. The key thing is that it is defined as a map from subsets to subsets and not element to element, i.e. think of it as a function from power set to power set. The preimage of a point is then a bunch of points. It is useful, for instance (to cross-reference another thread in an attempt to confuse the OP here), to define the kernel of a linear map M as M^{-1}({0}), the pullback of the zero vector.

We are not saying f^{-1} is the inverse function to f, but the pullback, though that is an unnecessarily fancy word for it.

Wolfram, whilst useful, is not a de facto standard, and cannot be expected to mention every variation on a theme. As I said, f^{-1} is not in general an inverse function, which is why it fails to be mentioned on a page that contains a definition of inverse function; that is not at all surprising.

A better notation would undoubtedly be f*(D), but books like to keep it simple and suggestive.
 
  • #7
If A is a Hermitian matrix, then there exists a basis for the vector space consisting entirely of eigenvectors of A.

What do you get if you apply (A - a_1)(A - a_2)...(A - a_n) to an eigenvector?

Any vector in the space can be written as a linear combination of those basis eigenvectors. What does that tell you about (A - a_1)(A - a_2)...(A - a_n) applied to any vector?
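The two questions above can be answered numerically as a sketch (the setup and names are my own, using a real symmetric matrix as the Hermitian example): each factor (A - a_j I) kills the j-th eigenvector, the factors commute, so the full product kills every basis eigenvector and hence every vector.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                     # real symmetric, a Hermitian example

eigenvalues, eigenvectors = np.linalg.eigh(A)   # columns form an eigenbasis

# (A - a_j I) applied to the j-th eigenvector gives the zero vector:
for j in range(n):
    v = eigenvectors[:, j]
    assert np.allclose((A - eigenvalues[j] * np.eye(n)) @ v, 0)

# Therefore the full product annihilates an arbitrary vector, since any
# vector is a linear combination of the basis eigenvectors:
x = rng.standard_normal(n)
full = np.eye(n)
for a in eigenvalues:
    full = full @ (A - a * np.eye(n))
print(np.linalg.norm(full @ x))       # tiny, i.e. numerically zero
```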
 
  • #8
0rthodontist said:
From looking at what he has, it makes as much sense to me to assume A - a means a is subtracted from every element of A.

The f^{-1} notation as a preimage is a standard usage. Not explicitly writing the I's isn't, but the context is clear from the question.


Assuming the OP hasn't seen Cayley-Hamilton yet, diagonalize A=P*D*P^{-1}. If f is a polynomial then f(A)=P*f(D)*P^{-1}. What happens to f(D) when f is the characteristic polynomial?
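A minimal sketch of this suggestion (all variable names here are my own): since f(D) is diagonal with entries f(a_i), and the characteristic polynomial f(x) = (x - a_1)...(x - a_n) contains the factor (a_i - a_i) = 0 at each eigenvalue, f(D) = 0 and hence f(A) = P f(D) P^{-1} = 0.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2                     # real symmetric, hence diagonalizable

a, P = np.linalg.eigh(A)              # A = P @ np.diag(a) @ P.T, P orthogonal

# Evaluate f(x) = (x - a_1)...(x - a_n) at each diagonal entry of D;
# each evaluation contains an exact zero factor (a_i - a_i):
f_on_diagonal = np.array([np.prod(ai - a) for ai in a])
f_of_D = np.diag(f_on_diagonal)       # the zero matrix
f_of_A = P @ f_of_D @ P.T

print(np.max(np.abs(f_of_A)))         # 0.0
```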
 
  • #9
Writing 1 for the identity matrix is perfectly normal in the world of algebra, though perhaps not in a first course in linear algebra. We are talking about a ring, after all, and hence the scalars are multiples of the identity; it is perfectly normal to write, say, 2 for twice the identity in a ring.
 
  • #10
matt grime said:
Writing 1 for the identity matrix is perfectly normal in the world of algebra, though perhaps not in a first course in linear algebra.

Which world do you think this question is coming from? From the level of the question, early on in linear algebra land seems most likely, and here it's not so standard. Even higher up, it's not something I've seen often (when talking specifically about matrices and not rings in general), though this is just my own experience.

This is all pointless though. Everyone knows what the OP means now, and the moral, if there is one, is to consider all 'reasonable' interpretations of notation. If you can't think of one, there should be no harm in asking: either the poster made an error or you'll learn something new upon clarification.
 
  • #11
Thanks guys, all your suggestions are quite helpful... I didn't mean to start such a controversy. It's been a few years since I took a Linear Algebra course, so the Cayley-Hamilton theorem wasn't the first thing to come to mind.

To Matt Grime:
I used the word secular because that is what my prof called the equation.
 
  • #12
Would you mind asking him what he meant by secular? The only meaning I know of for it relates to 'modern life; things civic and not religious'. Bugger all to do with maths.
 
  • #13
Apparently a secular equation is another name for a characteristic equation and I think it's probably misused here.
 
  • #14
shmoe said:
Which world do you think this question is coming from? From the level of the question early on in linear algebra land seems most likely and here it's not so standard. Even higher up, it's not something I've seen often (when talking specifically about matrices and not rings in general), though this is just my own experience.

I wasn't talking about the OP's view, which he clarified, or introductory linear algebra, but responding to the assertion of 0rthodontist that it is "impossible to subtract a scalar from a matrix" (with the obvious exception), and indicating that it is perfectly acceptable to write it the way it was written. It has nothing to do with the level of the question by the OP; it is correcting a misapprehension by someone else.
 
  • #15
matt grime said:
I admit I'm being lazy, though I am being lazy in the accepted way: properly, f^{-1}(D) should be called the preimage and is defined for all functions, but it is a common abuse of notation to refer to it as the inverse image of a set. The key thing is that it is defined as a map from subsets to subsets and not element to element, i.e. think of it as a function from power set to power set. The preimage of a point is then a bunch of points. It is useful, for instance (to cross-reference another thread in an attempt to confuse the OP here), to define the kernel of a linear map M as M^{-1}({0}), the pullback of the zero vector.

We are not saying f^{-1} is the inverse function to f, but the pullback, though that is an unnecessarily fancy word for it.

Wolfram, whilst useful, is not a de facto standard, and cannot be expected to mention every variation on a theme. As I said, f^{-1} is not in general an inverse function, which is why it fails to be mentioned on a page that contains a definition of inverse function; that is not at all surprising.

A better notation would undoubtedly be f*(D), but books like to keep it simple and suggestive.
Thanks, that was very informative.
shmoe said:
Everyone knows what the OP means now, and the moral, if there is one, is to consider all 'reasonable' interpretations of notation. If you can't think of one, there should be no harm in asking: either the poster made an error or you'll learn something new upon clarification.
Can't argue with that.
 
  • #16
I would like to point out that we are all presumably happy with the assertion of the Cayley-Hamilton theorem, that a matrix satisfies its own characteristic equation, and ask those who think that identifying a scalar with a scalar matrix is not normal: how do they interpret the constant coefficient in the characteristic equation? Chi(x) is a poly in F[x] (F the underlying field), so strictly speaking Chi(M) = 0 is meaningless unless we accept that constants are scalar matrices.
 
  • #17
matt grime said:
I wasn't talking about the OP's view, which he clarified, or introductory linear algebra, but responding to the assertion of 0rthodontist that it is "impossible to subtract a scalar from a matrix" (with the obvious exception), and indicating that it is perfectly acceptable to write it the way it was written. It has nothing to do with the level of the question by the OP; it is correcting a misapprehension by someone else.

I see, I had thought you were responding to my post where I said it wasn't standard notation, at least in the sense that it's not something someone with undergrad linear algebra would likely have seen.

Typically the definition of f(A), for f a polynomial and A a square matrix, is f(A)=a_n*A^n+...+a_1*A+a_0*I, and the *I is made explicit. I'm not confused if this *I is left off; it's just not how it's usually written in my experience. Alternatively, the constant term of f will have an x^0 factor instead, with A^0 defined to be I for all A. I can't say I've ever seen a characteristic polynomial written as det(x-A) instead of det(x*I-A) either (this isn't to say it doesn't happen, though).
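The explicit definition above, constant term times the identity and all, can be sketched as a Horner evaluation (the names are my own; np.poly returns the coefficients of det(x*I - A), so the result being zero is just Cayley-Hamilton):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
coeffs = np.poly(A)            # [1, c_{n-1}, ..., c_0] of det(x*I - A)

n = A.shape[0]
f_of_A = np.zeros((n, n))
for c in coeffs:
    # Horner's scheme; the final step really adds c_0 * I, with the I explicit
    f_of_A = f_of_A @ A + c * np.eye(n)

print(np.max(np.abs(f_of_A)))  # tiny, i.e. f(A) = 0 by Cayley-Hamilton
```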

This is getting more and more pointless. Hopefully we're done with this and can get on to learning some obscure looking terminology like 'secular'.

0rthodontist - that is the characteristic polynomial. Where have you seen "secular" mean "characteristic"? I've never seen secular used like this before either, so I'm curious.
 
  • #18
shmoe said:
0rthodontist - that is the characteristic polynomial. Where have you seen "secular" mean "characteristic"? I've never seen secular used like this before either, so I'm curious.

Well, I looked it up. http://mathworld.wolfram.com/SecularEquation.html
 
  • #19
0rthodontist said:

Disappointing in two ways: that I didn't think to check MathWorld myself (thanks though), and that it isn't, as I was hoping, a term from some obscure long-lost linear algebra tome.
 

FAQ: Proof of Secular Equation for Hermitian Matrices

What is a secular equation for Hermitian matrices?

A secular equation for Hermitian matrices is a mathematical expression used to find the eigenvalues of a Hermitian matrix, which is a square matrix that is equal to its own complex conjugate transpose. It is an important concept in linear algebra and has various applications in quantum mechanics and other fields of science.

How is the secular equation derived?

The secular equation is obtained by setting the characteristic polynomial of the matrix equal to zero. The characteristic polynomial is the determinant det(λI − A), formed by subtracting the matrix from λ times the identity matrix and taking the determinant; the roots of the resulting equation are the eigenvalues.
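This derivation can be sketched numerically (my own illustration, assuming NumPy): np.poly(A) returns the coefficients of det(x*I - A), and the roots of that polynomial, i.e. the solutions of the secular equation, match the eigenvalues of A.

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                         # Hermitian (real symmetric)

coeffs = np.poly(A)                       # coefficients of det(x*I - A)
roots = np.sort(np.roots(coeffs).real)    # roots are real for Hermitian A
eigs = np.sort(np.linalg.eigvalsh(A))

print(np.max(np.abs(roots - eigs)))       # tiny: same numbers both ways
```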

What is the significance of the secular equation for Hermitian matrices?

The secular equation allows us to find the eigenvalues of a Hermitian matrix, which are values that represent the possible states of a quantum mechanical system. It also helps in solving various problems in physics and engineering, such as calculating the energy levels of particles in a potential well.

Are there any special properties of the secular equation for Hermitian matrices?

Yes, the secular equation has some important properties. All of its roots are real numbers, because the eigenvalues of a Hermitian matrix are always real. The eigenvalues need not be distinct, however: a Hermitian matrix can have repeated eigenvalues, in which case the corresponding eigenspace has dimension greater than one.

How is the secular equation related to the spectral theorem?

The spectral theorem states that every Hermitian matrix can be diagonalized by a unitary matrix: A = UDU^H, where the columns of U are orthonormal eigenvectors of A and D is a diagonal matrix whose entries are the eigenvalues. The secular equation is used to find those eigenvalues, which make up the diagonal matrix in the spectral theorem.
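A minimal check of the decomposition as stated, with my own made-up example matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = (B + B.conj().T) / 2                        # Hermitian by construction

eigs, U = np.linalg.eigh(A)
D = np.diag(eigs)

assert np.allclose(U.conj().T @ U, np.eye(3))   # U is unitary
assert np.allclose(A, U @ D @ U.conj().T)       # A = U D U^H
print("spectral decomposition verified")
```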
