Is the Inverse of a Matrix a Continuous Function?

In summary, the authors are most likely referring to the standard norm (and hence the standard topology) on a finite-dimensional vector space when they speak of continuity of the matrix inverse. Continuity can be shown directly with the epsilon-delta definition, for instance by constructing the inverse in blocks, or via Cramer's rule: each entry of the inverse is a quotient of polynomials in the entries of the original matrix whose denominator, the determinant, is nonzero at every nonsingular matrix, so the inverse is continuous there.
  • #1
tjm7582
When doing some self-study in probability, I have seen a number of authors state, without proof or justification, that the inverse of a matrix is continuous. For instance, a passage in a popular econometrics text (White (2001)) reads: "The matrix inverse function is continuous at every point that represents a nonsingular matrix" (p. 16). After poring over a number of references on linear algebra, I have yet to find even a definition of what it means for a function of a matrix to be continuous, let alone how I would go about showing that the inverse has this property. I therefore have two questions:
(1) What does it mean for a function accepting a matrix as an argument to be continuous?
(2) Do you know of any references in which I could learn more about such functions?
Any help is greatly appreciated.
 
  • #2
You have to give the set of matrices a topology for continuity to mean something; you should be able to do that by making a metric space out of the set of matrices, using any matrix norm. From that, you can define continuous functions of matrices. Once you've done that, you should be able to use the epsilon-delta stuff to test continuity of matrix functions.
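To make that concrete, here is a minimal numerical sketch (my own illustration in Python with NumPy, not something from the thread), using the Frobenius norm as the distance between matrices:

[code]
import numpy as np

# Use d(A, B) = ||A - B|| (Frobenius norm) as the metric on matrices.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])            # a fixed nonsingular matrix
A_inv = np.linalg.inv(A)

rng = np.random.default_rng(0)
E = rng.standard_normal((2, 2))
E /= np.linalg.norm(E)                # a unit-norm perturbation direction

for delta in [1e-1, 1e-3, 1e-6]:
    B = A + delta * E                 # a matrix within distance delta of A
    diff = np.linalg.norm(np.linalg.inv(B) - A_inv)
    print(f"||A - B|| = {delta:.0e}   ||A^-1 - B^-1|| = {diff:.2e}")
[/code]

The change in the inverse shrinks in proportion to the size of the perturbation, which is the behavior the epsilon-delta definition asks for at a nonsingular matrix.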
 
  • #3
The statement "the function f(M)= M-1", or, for that matter, that any function is continuous, makes no sense without some kind of topology. Have those authors defined a topology for matrices? In other words have they defined a "distance" between matrices in order to make "close to" precise? Have they defined a "norm"? That can be used to define a topology the way absolute value is used to define distance in the real numbers.
 
  • #4
Thanks for the replies. There has been no mention of a matrix topology, and a norm has never been defined when this claim of continuity was made. This is actually why I was so confused and felt that I was missing something entirely (that is, was the result so obvious that it needed no further explanation?). Unfortunately, this is a common frustration with the exposition in many econometrics books. Do either of you happen to know of a reference that covers analysis-type issues for matrices (e.g., this continuity question)? I am going to try to work out some examples on my own, but I always find it somewhat comforting to have some worked examples for reference.
Thanks again.
 
  • #5
They are probably talking about the standard norm on a finite-dimensional vector space. In fact, all norms on a finite-dimensional vector space are equivalent, in the sense that any one norm is related to any other by
[tex]C\|a\|_2 \leq \|a\|_1 \leq D\|a\|_2,[/tex]
where C and D are positive constants. Hence they all give the same topology.

So pick any norm you want, be it the max of the absolute values of all entries, or the sup of |Ax| over all x on the unit sphere (with |x| the Euclidean norm). The latter is the norm used in Rudin's analysis book; he gives a very clear proof of the continuity of the inverse.
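As a quick numerical illustration of that equivalence (again a Python/NumPy sketch of my own, not from the post): for n by n matrices the max-absolute-entry norm and the operator (spectral) norm bound each other up to constants that depend only on n.

[code]
import numpy as np

rng = np.random.default_rng(0)
n = 4
for _ in range(3):
    A = rng.standard_normal((n, n))
    max_entry = np.max(np.abs(A))     # max of the absolute values of all entries
    spectral = np.linalg.norm(A, 2)   # sup of |Ax| over the Euclidean unit sphere
    # For any n x n matrix:  max_entry <= spectral <= n * max_entry
    print(max_entry <= spectral <= n * max_entry)   # True every time
[/code]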
 
  • #6
tim_lou,
Thanks for the help. Are you referring to "baby" Rudin (Principles of Mathematical Analysis), or his other analysis book? I am away from my books, but I look forward to going over this when I return.

Tom
 
  • #7
I think there's something called Cramer's rule (the adjugate/cofactor formula) that tells you how to construct the inverse of a matrix in terms of the determinant of the matrix and its cofactors. Since both of these objects are polynomials in the entries of the original matrix, the entries of the inverse matrix are quotients of polynomials in those entries, and such rational functions are continuous wherever the denominator is nonzero.

(You have to be a little careful to make sure you aren't dividing by 0, but the requirement that the determinant of the original matrix is nonzero takes care of exactly that.)
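Here is a small symbolic check of that idea (a sketch in Python using SymPy; the 2 by 2 example is my own, not from the post):

[code]
import sympy as sp

a, b, c, d = sp.symbols('a b c d')
M = sp.Matrix([[a, b],
               [c, d]])

det = M.det()          # a*d - b*c, a polynomial in the entries
adj = M.adjugate()     # transposed matrix of cofactors, again polynomial entries
M_inv = M.inv()        # each entry is a cofactor divided by the determinant

print(M_inv)           # Matrix([[d/(a*d - b*c), -b/(a*d - b*c)], ...])
print((M_inv - adj / det).applyfunc(sp.simplify))   # the zero matrix
[/code]

Each entry of the inverse is a quotient of polynomials in a, b, c, d with denominator a*d - b*c, so it is continuous wherever the determinant is nonzero.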

Hope that gives you a start.
 
  • #8
You could proceed by induction on the size k of the matrices. Suppose M and W are nearby invertible k by k matrices, and split them into blocks, say,

[tex]M = \left(\begin{matrix}a_m & b_m^* \\ c_m & D_m\end{matrix}\right)[/tex]

[tex]W = \left(\begin{matrix}a_w & b_w^* \\ c_w & D_w\end{matrix}\right)[/tex]

where for each matrix a is a scalar, b* a row vector, c a column vector, and D is an invertible k-1 by k-1 matrix. The inverses can be explicitly calculated in terms of the blocks:

[tex]\left(\begin{matrix}a & b^* \\ c & D\end{matrix}\right)^{-1} = \left(\begin{matrix}(a-b^*D^{-1}c)^{-1} & -a^{-1}b^*(D-a^{-1}cb^*)^{-1} \\ -D^{-1}c(a-b^*D^{-1}c)^{-1} & (D-a^{-1}cb^*)^{-1}\end{matrix}\right)[/tex]

Now we just have to verify that each block of [itex]M^{-1}-W^{-1}[/itex] can be made sufficiently small. Choose M and W close enough that

[tex]a_m = a_w(1+O(\epsilon))[/tex]

[tex]b^*_m = b^*_w(I+O(\epsilon))[/tex]

[tex]c_m = c_w(I+O(\epsilon))[/tex]

[tex]D_m^{-1} = D_w^{-1}(I+O(\epsilon))[/tex]

The last statement, about [itex]D^{-1}[/itex], follows from the inductive hypothesis: [itex]D_m[/itex] and [itex]D_w[/itex] are close invertible (k-1) by (k-1) matrices, so [itex]D_m^{-1}[/itex] and [itex]D_w^{-1}[/itex] are close as well.

Then the first element is:

[tex](M^{-1}-W^{-1})_{1,1} = (a_m-b_m^*D_m^{-1}c_m)^{-1} - (a_w-b_w^*D_w^{-1}c_w)^{-1}[/tex]

[tex]= (a_w(1+O(\epsilon))-b_w^*(I+O(\epsilon))D_w^{-1}(I+O(\epsilon))c_w(I+O(\epsilon)))^{-1} - (a_w-b_w^*D_w^{-1}c_w)^{-1} [/tex]

[tex]= (a_w-b_w^*D_w^{-1}c_w)^{-1}(1+O(\epsilon)) - (a_w-b_w^*D_w^{-1}c_w)^{-1}[/tex]

[tex]=O(\epsilon)(a_w-b_w^*D_w^{-1}c_w)^{-1} = O(\epsilon) \cdot const[/tex]

So [itex](M^{-1}-W^{-1})_{1,1}[/itex] can be made arbitrarily small. The other blocks [itex](M^{-1}-W^{-1})_{1,2}[/itex], [itex](M^{-1}-W^{-1})_{2,1}[/itex], and [itex](M^{-1}-W^{-1})_{2,2}[/itex] follow similarly, though you will need to invoke the Sherman-Morrison formula (about the inverse of a rank-1 update of a matrix):

[tex](A+xy^*)^{-1} = A^{-1}-\frac{A^{-1}xy^*A^{-1}}{1+y^*A^{-1}x}[/tex]

to deal with the (1,2) and (2,2) blocks.
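If it helps, the block formula can be sanity-checked numerically. The following is a rough sketch of my own in Python/NumPy (using a symmetric positive definite matrix so that all the blocks and Schur complements are guaranteed to be invertible):

[code]
import numpy as np

rng = np.random.default_rng(0)
k = 5
B = rng.standard_normal((k, k))
M = B @ B.T + np.eye(k)         # symmetric positive definite, so every principal block is invertible

a  = M[0, 0]                    # scalar block a
bs = M[0:1, 1:]                 # row vector b*
c  = M[1:, 0:1]                 # column vector c
D  = M[1:, 1:]                  # (k-1) x (k-1) block D

D_inv = np.linalg.inv(D)
s = a - bs @ D_inv @ c          # Schur complement of D (a 1x1 array)
T = D - c @ bs / a              # Schur complement of a

M_inv_blocks = np.block([
    [np.linalg.inv(s),              -(bs / a) @ np.linalg.inv(T)],
    [-D_inv @ c @ np.linalg.inv(s),  np.linalg.inv(T)],
])

print(np.allclose(M_inv_blocks, np.linalg.inv(M)))   # True
[/code]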
 
  • #9
Oh, here is an easier way using the Sherman Morrison formula more directly,

[tex](A+\epsilon xy^*)^{-1} = A^{-1}-\epsilon \frac{A^{-1}xy^*A^{-1}}{1+\epsilon y^*A^{-1}x} = A^{-1} + \epsilon M[/tex]

where [itex]\epsilon M[/itex] slightly perturbs [itex]A^{-1}[/itex], with [itex]\|M\| < \|xy^*\|\,\|A^{-1}\|^2[/itex] up to the denominator factor, which is close to 1 for small [itex]\epsilon[/itex]. More generally,

[tex]\left(A + \epsilon \sum_{j=1}^{n} x_j y_j^*\right)^{-1} = \left( \left(A + \epsilon \sum_{j=1}^{n-1} x_j y_j^*\right) + \epsilon x_n y_n^*\right)^{-1} = \left(A + \epsilon \sum_{j=1}^{n-1} x_j y_j^*\right)^{-1} + \epsilon M_n[/tex]

[tex]= \left(A + \epsilon \sum_{j=1}^{n-2} x_j y_j^*\right)^{-1} + \epsilon M_{n-1} + \epsilon M_n = \cdots = A^{-1} + \epsilon \left(M_1 + M_2 + \cdots + M_n\right) = A^{-1} + \epsilon M[/tex]

with [itex]||M|| < c\, ||\sum_{j=1}^{n} x_j y_j^*||[/itex]. You could calculate the exact constant c if you are careful about keeping track of the errors, but I'm too lazy. The key thing is that the constant depends only on A and n, not on the [itex]x_j[/itex]'s and [itex]y_j[/itex]'s. But any matrix can be written as a sum of rank-1 matrices (i.e., [itex]W = \sum_j x_j y_j^*[/itex]), so for every W of norm at most 1 there exists an M such that [itex]\left(A + \epsilon W\right)^{-1} = A^{-1} + \epsilon M[/itex] with ||M|| < c.

Then for [itex]||A-B|| < \epsilon[/itex], we can rewrite B as a perturbation of A: [itex]B = A + \epsilon W[/itex] with ||W|| less than or equal to 1. So by the above, we have
[tex]||A^{-1} - B^{-1}|| = ||A^{-1} - \left(A^{-1} + \epsilon M\right)|| = \epsilon ||M|| < \epsilon c[/tex]
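A quick numerical check of the rank-1 step (my own sketch in Python/NumPy, not part of the post):

[code]
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)          # a comfortably nonsingular matrix
A_inv = np.linalg.inv(A)
x = rng.standard_normal((n, 1))
y = rng.standard_normal((n, 1))

eps = 1e-3
# Sherman-Morrison: (A + eps x y^*)^{-1} = A^{-1} - eps A^{-1} x y^* A^{-1} / (1 + eps y^* A^{-1} x)
eps_M = -eps * (A_inv @ x @ y.T @ A_inv) / (1.0 + eps * (y.T @ A_inv @ x))
exact = np.linalg.inv(A + eps * x @ y.T)

print(np.allclose(exact, A_inv + eps_M))     # the identity holds
print(np.linalg.norm(exact - A_inv))         # small: the inverse moved only by O(eps)
[/code]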
 
  • #10
Isn't it obvious that an n by n matrix is equivalent to a sequence of n^2 numbers, hence to a point in R^(n^2)? Hence there is a natural topology, the usual one on Euclidean space.

Then Cramer's rule says that, in these coordinates, each entry of the inverse matrix is a quotient of polynomials with nonzero denominator, hence continuous.
 

FAQ: Is the Inverse of a Matrix a Continuous Function?

What is the definition of continuity of matrix inverse?

Continuity of the matrix inverse means that the map sending an invertible matrix to its inverse is a continuous function: small changes in the input matrix (measured in any matrix norm) produce only small changes in the inverse matrix.
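In epsilon-delta terms (with respect to any matrix norm, since all norms on a finite-dimensional space are equivalent), continuity at a nonsingular matrix A reads:

[tex]\forall \epsilon > 0 \;\exists \delta > 0: \quad \|B - A\| < \delta \text{ and } B \text{ nonsingular} \;\Longrightarrow\; \|B^{-1} - A^{-1}\| < \epsilon.[/tex]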

Why is continuity of matrix inverse important?

Continuity of the matrix inverse is important because it ensures that the inverse map is well behaved and can be used safely in limit arguments and further calculations. It guarantees that sufficiently small changes in a nonsingular input matrix produce only small changes in its inverse.

Does every matrix have a continuous inverse?

No, not every matrix has an inverse: singular matrices are not invertible at all, so the question of continuity does not arise for them. On the set of invertible matrices, however, the inverse map is continuous at every point. What can happen is that a matrix is nearly singular, in which case its inverse is very sensitive to perturbations (the problem is ill-conditioned), even though the map remains continuous.

What is the relationship between continuity of matrix inverse and invertibility?

Continuity of the matrix inverse is closely tied to invertibility: the inverse map is defined only at invertible matrices. The invertible matrices form an open set (a small enough perturbation of a nonsingular matrix is still nonsingular), and on this set each matrix is sent to its unique inverse by a well-defined, continuous function.

How can continuity of matrix inverse be tested?

Numerically, continuity can be probed by applying small perturbations to a nonsingular input matrix and checking that the inverse changes by a correspondingly small amount; such experiments suggest continuity but do not prove it. A proof uses the epsilon-delta definition with a matrix norm, for example via Cramer's rule (the entries of the inverse are rational functions of the entries of the matrix) or via perturbation identities such as the Sherman-Morrison formula.
