Proof of Secular Equation for Hermitian Matrices

In summary, the conversation discusses a proof of the secular equation, another name for the characteristic equation, for a Hermitian matrix. The proof involves diagonalizing the matrix and applying the Cayley-Hamilton theorem. There is a side debate about the notation used, particularly writing "A - a" for "A - aI" and "1" for the identity matrix, but ultimately everyone understands the intended meaning, and the moral is to consider all reasonable interpretations of notation. The term "secular" is the professor's name for the equation; the thread eventually finds it is a known synonym for "characteristic".
  • #1
emob2p
Hi,
I'm looking for a proof of the following theorem:
If A is a hermitian matrix with eigenvalues a_1, a_2...a_n, then the secular equation holds:
(A - a_1)(A - a_2)...(A - a_n) = 0.


The proof escapes me right now but I think it has to do with diagonalizing the hermitian matrix. I'm just struggling to put together the details. Assume non-degeneracy. Thanks.
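A quick numerical sanity check of the claimed identity (not a proof) can be sketched with NumPy; the random matrix and all names below are my own illustration, assuming the factors are read as (A - a_i I):

```python
import numpy as np

# Build a random Hermitian matrix A, then verify that the product
# (A - a_1 I)(A - a_2 I)...(A - a_n I) is the zero matrix to machine precision.
rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2              # Hermitian by construction

eigenvalues = np.linalg.eigvalsh(A)   # real, since A is Hermitian

product = np.eye(n, dtype=complex)
for a in eigenvalues:
    product = product @ (A - a * np.eye(n))

print(np.max(np.abs(product)))        # tiny, i.e. numerically the zero matrix
```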
 
  • #2
I don't know anything about hermitian matrices besides the definition, but what does A - a_1 mean? You can't subtract a scalar from a matrix.
 
  • #3
Right, there is an implied multiplication by the identity matrix.
 
  • #4
Scalars are clearly identified with scalar matrices (another one to add to your list of new things, 0rthodontist, like inverse functions). Surely you know the Cayley-Hamilton theorem, that a matrix satisfies its own characteristic polynomial? That's all that this is. Hermitian matrices are diagonalizable, hence the minimal poly only has linear factors. Non-degeneracy, by which I presume you mean A is not invertible, is not important. The odd thing in your post is I have no idea what the word secular is doing there. Is there some meaning of secular I'm not aware of?
 
  • #5
matt grime said:
Scalars are clearly identified with scalar matrices (another one to add to your list of new things, 0rthodontist like inverse functions).
Well what the hell do I know? I've never seen an inverse function defined for a noninvertible function before, a double check just now at http://mathworld.wolfram.com/InverseFunction.html shows not everyone thinks that way, and in my book (Linear Algebra and its Applications by David C. Lay) an expression like that is always written like A - aI. From looking at what he has, it makes as much sense to me to assume A - a means a is subtracted from every element of A. Different books and different people use different conventions. I'm just trying to help.
 
  • #6
I admit I'm being lazy, though I am being lazy in the accepted way: properly, f^{-1}(D) should be called the preimage and is defined for all functions, but it is a common abuse of notation to refer to it as the inverse image of a set. The key thing is that it is defined as a map from subsets to subsets and not element to element, i.e. think of it as a function from power set to power set. The preimage of a point is then a bunch of points. It is useful, for instance (to cross-reference another thread in an attempt to confuse the OP here), to define the kernel of a linear map M as M^{-1}({0}), the pullback of the zero vector.

We are not saying f^{-1} is the inverse function to f, but the pullback, though that is an unnecessarily fancy word for it.

Wolfram, whilst useful, is not a de facto standard, and cannot be expected to mention every variation on a theme. As I said, f^{-1} is not in general an inverse function, which is why it fails to be mentioned on a page that contains a definition of inverse function; that is not at all surprising.

A better notation would undoubtedly be f*(D), but books like to keep it simple and suggestive.
 
  • #7
If A is a Hermitian matrix, then there exists a basis for the vector space consisting entirely of eigenvectors of A.

What do you get if you apply (A - a_1)(A - a_2)...(A - a_n) to an eigenvector?

Any vector in the space can be written as a linear combination of those basis eigenvectors. What does that tell you about (A - a_1)(A - a_2)...(A - a_n) applied to any vector?
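The two questions above can be answered numerically as a sketch (the setup and names are my own, using a real symmetric matrix as the Hermitian example): each factor (A - a_j I) kills the j-th eigenvector, the factors commute, so the full product kills every basis eigenvector and hence every vector.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                     # real symmetric, a Hermitian example

eigenvalues, eigenvectors = np.linalg.eigh(A)   # columns form an eigenbasis

# (A - a_j I) applied to the j-th eigenvector gives the zero vector:
for j in range(n):
    v = eigenvectors[:, j]
    assert np.allclose((A - eigenvalues[j] * np.eye(n)) @ v, 0)

# Therefore the full product annihilates an arbitrary vector, since any
# vector is a linear combination of the basis eigenvectors:
x = rng.standard_normal(n)
full = np.eye(n)
for a in eigenvalues:
    full = full @ (A - a * np.eye(n))
print(np.linalg.norm(full @ x))       # tiny, i.e. numerically zero
```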
 
  • #8
0rthodontist said:
From looking at what he has, it makes as much sense to me to assume A - a means a is subtracted from every element of A.

The f^{-1} notation as a preimage is a standard usage. Not explicitly writing the I's isn't, but the context is clear from the question.


Assuming the OP hasn't seen Cayley-Hamilton yet, diagonalize A=P*D*P^{-1}. If f is a polynomial then f(A)=P*f(D)*P^{-1}. What happens to f(D) when f is the characteristic polynomial?
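A minimal sketch of this suggestion (all variable names here are my own): since f(D) is diagonal with entries f(a_i), and the characteristic polynomial f(x) = (x - a_1)...(x - a_n) contains the factor (a_i - a_i) = 0 at each eigenvalue, f(D) = 0 and hence f(A) = P f(D) P^{-1} = 0.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))
A = (B + B.T) / 2                     # real symmetric, hence diagonalizable

a, P = np.linalg.eigh(A)              # A = P @ np.diag(a) @ P.T, P orthogonal

# Evaluate f(x) = (x - a_1)...(x - a_n) at each diagonal entry of D;
# each evaluation contains an exact zero factor (a_i - a_i):
f_on_diagonal = np.array([np.prod(ai - a) for ai in a])
f_of_D = np.diag(f_on_diagonal)       # the zero matrix
f_of_A = P @ f_of_D @ P.T

print(np.max(np.abs(f_of_A)))         # 0.0
```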
 
  • #9
Writing 1 for the identity matrix is perfectly normal in the world of algebra, though perhaps not in a first course in linear algebra. We are talking about a ring, after all, and hence the scalars are multiples of the identity; it is perfectly normal to write, say, 2 for twice the identity in a ring.
 
  • #10
matt grime said:
Writing 1 for the identity matrix is perfectly normal in the world of algebra, though perhaps not in a first course in linear algebra.

Which world do you think this question is coming from? From the level of the question, early on in linear algebra land seems most likely, and here it's not so standard. Even higher up, it's not something I've seen often (when talking specifically about matrices and not rings in general), though this is just my own experience.

This is all pointless though. Everyone knows what the OP means now, and the moral, if there is one, is to consider all 'reasonable' interpretations of notation. If you can't think of one, there should be no harm in asking: either the poster made an error or you'll learn something new upon clarification.
 
  • #11
Thanks guys, all your suggestions are quite helpful... I didn't mean to start such a controversy. It's been a few years since I took a Linear Algebra course, so the Cayley-Hamilton theorem wasn't the first thing to come to mind.

To Matt Grime:
I used the word secular because that is what my prof called the equation.
 
  • #12
Would you mind asking him what he meant by secular? The only meaning I know of for it relates to 'modern life; things civic and not religious'. Bugger all to do with maths.
 
  • #13
Apparently a secular equation is another name for a characteristic equation and I think it's probably misused here.
 
  • #14
shmoe said:
Which world do you think this question is coming from? From the level of the question early on in linear algebra land seems most likely and here it's not so standard. Even higher up, it's not something I've seen often (when talking specifically about matrices and not rings in general), though this is just my own experience.

I wasn't talking about the OP's view, which he clarified, or introductory linear algebra, but responding to the assertion of 0rthodontist that it is "impossible to subtract a scalar from a matrix" (with the obvious exception), and indicating that it is perfectly acceptable to write it the way it was written. It has nothing to do with the level of the question by the OP; it is correcting a misapprehension by someone else.
 
  • #15
matt grime said:
I admit I'm being lazy, though I am being lazy in the accepted way: properly, f^{-1}(D) should be called the preimage and is defined for all functions, but it is a common abuse of notation to refer to it as the inverse image of a set. The key thing is that it is defined as a map from subsets to subsets and not element to element, i.e. think of it as a function from power set to power set. The preimage of a point is then a bunch of points. It is useful, for instance (to cross-reference another thread in an attempt to confuse the OP here), to define the kernel of a linear map M as M^{-1}({0}), the pullback of the zero vector.

We are not saying f^{-1} is the inverse function to f, but the pullback, though that is an unnecessarily fancy word for it.

Wolfram, whilst useful, is not a de facto standard, and cannot be expected to mention every variation on a theme. As I said, f^{-1} is not in general an inverse function, which is why it fails to be mentioned on a page that contains a definition of inverse function; that is not at all surprising.

A better notation would undoubtedly be f*(D), but books like to keep it simple and suggestive.
Thanks, that was very informative.
shmoe said:
Everyone knows what the OP means now, and the moral, if there is one, is to consider all 'reasonable' interpretations of notation. If you can't think of one, there should be no harm in asking: either the poster made an error or you'll learn something new upon clarification.
Can't argue with that.
 
  • #16
I would like to point out that we are all presumably happy with the assertion of the Cayley-Hamilton theorem, that a matrix satisfies its own characteristic equation, and ask those who think that identifying a scalar with a scalar matrix is not normal: how do they interpret the constant coefficient in the characteristic equation? Chi(x) is a poly in F[x] (F the underlying field), so strictly speaking Chi(M) = 0 is meaningless unless we accept that constants are scalar matrices.
 
  • #17
matt grime said:
I wasn't talking about the OP's view, which he clarified, or introductory linear algebra, but responding to the assertion of 0rthodontist that it is "impossible to subtract a scalar from a matrix" (with the obvious exception), and indicating that it is perfectly acceptable to write it the way it was written. It has nothing to do with the level of the question by the OP; it is correcting a misapprehension by someone else.

I see, I had thought you were responding to my post where I said it wasn't standard notation, at least in the sense that it's not something someone with undergrad linear algebra would likely have seen.

Typically the definition of f(A), for f a polynomial and A a square matrix, is f(A)=a_n*A^n+...+a_1*A+a_0*I, and the *I is made explicit. I'm not confused if this *I is left off; it's just not how it's usually written in my experience. Alternatively, the constant term of f will have an x^0 factor instead, with A^0 defined to be I for all A. I can't say I've ever seen a characteristic polynomial written as det(x-A) instead of det(x*I-A) either (this isn't to say it doesn't happen, though).
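The explicit definition above, constant term times the identity and all, can be sketched as a Horner evaluation (the names are my own; np.poly returns the coefficients of det(x*I - A), so the result being zero is just Cayley-Hamilton):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
coeffs = np.poly(A)            # [1, c_{n-1}, ..., c_0] of det(x*I - A)

n = A.shape[0]
f_of_A = np.zeros((n, n))
for c in coeffs:
    # Horner's scheme; the final step really adds c_0 * I, with the I explicit
    f_of_A = f_of_A @ A + c * np.eye(n)

print(np.max(np.abs(f_of_A)))  # tiny, i.e. f(A) = 0 by Cayley-Hamilton
```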

This is getting more and more pointless. Hopefully we're done with this and can get on to learning some obscure looking terminology like 'secular'.

0rthodontist - that is the characteristic polynomial. Where have you seen "secular" mean "characteristic"? I've never seen secular used like this before either, so I'm curious.
 
  • #18
shmoe said:
0rthodontist - that is the characteristic polynomial. Where have you seen "secular" mean "characteristic"? I've never seen secular used like this before either, so I'm curious.

Well, I looked it up. http://mathworld.wolfram.com/SecularEquation.html
 
  • #19
0rthodontist said:

Disappointing in two ways: that I didn't think to check MathWorld myself (thanks though), and that it isn't, as I was hoping, a term from some obscure long-lost linear algebra tome.
 

FAQ: Proof of Secular Equation for Hermitian Matrices

What is a secular equation for Hermitian matrices?

A secular equation for Hermitian matrices is a mathematical expression used to find the eigenvalues of a Hermitian matrix, which is a square matrix that is equal to its own complex conjugate transpose. It is an important concept in linear algebra and has various applications in quantum mechanics and other fields of science.

How is the secular equation derived?

The secular equation is obtained by setting the characteristic polynomial of the matrix equal to zero. The characteristic polynomial is the determinant det(λI − A), formed by subtracting the matrix from λ times the identity matrix and taking the determinant; the roots of the resulting equation are the eigenvalues.
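This derivation can be sketched numerically (my own illustration, assuming NumPy): np.poly(A) returns the coefficients of det(x*I - A), and the roots of that polynomial, i.e. the solutions of the secular equation, match the eigenvalues of A.

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                         # Hermitian (real symmetric)

coeffs = np.poly(A)                       # coefficients of det(x*I - A)
roots = np.sort(np.roots(coeffs).real)    # roots are real for Hermitian A
eigs = np.sort(np.linalg.eigvalsh(A))

print(np.max(np.abs(roots - eigs)))       # tiny: same numbers both ways
```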

What is the significance of the secular equation for Hermitian matrices?

The secular equation allows us to find the eigenvalues of a Hermitian matrix, which are values that represent the possible states of a quantum mechanical system. It also helps in solving various problems in physics and engineering, such as calculating the energy levels of particles in a potential well.

Are there any special properties of the secular equation for Hermitian matrices?

Yes, the secular equation has some important properties. All of its roots are real numbers, because the eigenvalues of a Hermitian matrix are always real. The eigenvalues need not be distinct, however: a Hermitian matrix can have repeated eigenvalues, in which case the corresponding eigenspace has dimension greater than one.

How is the secular equation related to the spectral theorem?

The spectral theorem states that every Hermitian matrix can be diagonalized by a unitary matrix: A = UDU^H, where the columns of U are orthonormal eigenvectors of A and D is a diagonal matrix whose entries are the eigenvalues. The secular equation is used to find those eigenvalues, which make up the diagonal matrix in the spectral theorem.
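A minimal check of the decomposition as stated, with my own made-up example matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = (B + B.conj().T) / 2                        # Hermitian by construction

eigs, U = np.linalg.eigh(A)
D = np.diag(eigs)

assert np.allclose(U.conj().T @ U, np.eye(3))   # U is unitary
assert np.allclose(A, U @ D @ U.conj().T)       # A = U D U^H
print("spectral decomposition verified")
```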
