Matrix inverse equals power-series

In summary, the conversation revolved around a statement from an economics book about input-output analysis that was presented without proof. The statement was (I - A)^{-1} = (I + A + A^2 + A^3 + ... + A^n), and the question was how to prove it. It was mentioned that there are assumptions about A, such as all elements being less than 1 and greater than 0, and that n goes to infinity. The conversation then delved into different conditions that could be applied to A, such as all eigenvalues having absolute value less than 1, and how this affects the convergence of the series. It was also discussed how this power expansion for the inverse matrix is calculated in computers. Ultimately, a proof was given using the norm defined as the largest absolute value of a matrix component, relying on the column sums of A being less than 1.
  • #1
Mårten
Hi!

In an economics book about input-output analysis the following statement is presented, but I cannot find the proof:

[tex](I - A)^{-1} = (I + A + A^2 + A^3 + ... + A^n)[/tex]

Can someone help me show why this is the case?

P.s. I think there are some assumptions made about [itex]A[/itex], such that all elements [itex]a_{ij}[/itex] are less than 1 and greater than 0. Btw, n goes to infinity.
 
  • #2
So you mean [itex](I - A)^{-1} = \sum_{k=0}^\infty A^k[/itex]. This is true if the right side converges, which is true if and only if all of the eigenvalues of A have absolute value smaller than 1.

To prove it, multiply both sides by [itex]I - A[/itex].
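For a quick numerical sanity check (a minimal sketch assuming numpy; the 2×2 matrix is an arbitrary example whose eigenvalues, 0.5 and 0.1, lie inside the unit disc):

[code]
import numpy as np

# Arbitrary example; its eigenvalues are 0.5 and 0.1, both inside the unit disc
A = np.array([[0.2, 0.3],
              [0.1, 0.4]])
I = np.eye(2)

# Partial sum of the Neumann series: I + A + A^2 + ... + A^50
S = np.zeros_like(A)
term = I.copy()
for _ in range(51):
    S += term
    term = term @ A

print(np.allclose(S, np.linalg.inv(I - A)))  # True
[/code]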
 
  • #3
Thanks a lot! Easier than I thought.

Now I found another assumption regarding A, namely that the row sums are all less than 1, and the column sums are also all less than 1. Could that be the same as saying that all eigenvalues must have absolute value less than 1, and in that case why are these statements equivalent?
 
  • #4
What exactly do you mean by row sums and column sums?
 
  • #5
If the matrix A is

[tex]\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}[/tex]

then the row sums are [itex]a_{11} + a_{12} < 1[/itex] and [itex]a_{21} + a_{22} < 1[/itex]. The column sums are [itex]a_{11} + a_{21} < 1[/itex] and [itex]a_{12} + a_{22} < 1[/itex].
 
  • #6
Well then that's certainly not true. Maybe you meant absolute values, or something?

A sufficient condition for the eigenvalue condition is that [itex]\lvert Ax \rvert < 1[/itex] for all [itex]x[/itex] such that [itex]\lvert x \rvert = 1[/itex]; that is, that the operator norm of A is less than 1. (It's sufficient rather than equivalent, since the operator norm can exceed the largest eigenvalue magnitude.)
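For example,

[tex]A = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}[/tex]

has both eigenvalues equal to 0, yet [itex]\lvert Ae_2 \rvert = 2[/itex] for the unit vector [itex]e_2 = (0, 1)^T[/itex], so the eigenvalue condition can hold while the norm condition fails.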
 
  • #7
But if, at the same time, all [itex]a_{ij} > 0[/itex], doesn't it work then?

Anyhow, is there a way to set up criteria on A that make all its eigenvalues less than 1 in absolute value?
 
  • #8
adriank said:
So you mean [itex](I - A)^{-1} = \sum_{k=0}^\infty A^k[/itex]. This is true if the right side converges, which is true if and only if all of the eigenvalues of A have absolute value smaller than 1.

To prove it, multiply both sides by [itex]I - A[/itex].

Mårten said:
Thanks a lot! Easier than I thought.

Now I found another assumption regarding A, namely that the row sums are all less than 1, and the column sums are also all less than 1. Could that be the same as saying that all eigenvalues must have absolute value less than 1, and in that case why are these statements equivalent?
This isn't a valid proof. If you multiply both sides by I-A and simplify the right-hand side, you're making assumptions about the infinite sum that you can't make at this point. And if you intend to consider the equality in post #1 with a finite number of terms in the sum, the two sides aren't actually equal, so you'd be starting with an equality that's false.

(Maybe you understand this already, but it didn't look that way to me, and you did say that you're an economics student. :smile:)

What you need to do is to calculate

[tex]\left(\lim_{n\rightarrow\infty}(I+A+A^2+\cdots+A^n)\right)(I-A)[/tex]

Start by rewriting it as

[tex]=\lim_{n\rightarrow\infty}\left((I+A+A^2+\cdots+A^n)(I-A)\right)[/tex]

and then prove that this equals I. This result implies that I - A is invertible and that the inverse is the series. The information you were given about the components of A is needed to see that the unwanted term goes to zero in that limit.
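Concretely, the product telescopes:

[tex](I+A+A^2+\cdots+A^n)(I-A)=I+A+\cdots+A^n-A-A^2-\cdots-A^{n+1}=I-A^{n+1}[/tex]

so the unwanted term is [itex]A^{n+1}[/itex], and the whole problem reduces to showing that it goes to zero.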
 
  • #9
Thanks a lot! I really appreciate this help, I think I understand it pretty well now.

Adriank, it seems that my conditions on A satisfy your condition that [itex]\lvert Ax \rvert < 1[/itex] for all [itex]x[/itex] such that [itex]\lvert x \rvert = 1[/itex]. I cannot formally prove that these two conditions are the same (or that my condition follows from yours), but some easy calculations and checks I've made make it seem reasonable. (If someone has the energy to prove it, I'd gladly take a look.)

My conditions about A once more (sorry to not have given them at the same time before): Row sums and column sums are all, one by one, less than 1, and at the same time [itex]0 \leq a_{ij} < 1[/itex].

P.s. Actually, I didn't say I'm an economics student, I just said I read an economics book. :smile:
 
  • #10
Btw, it occurred to me that I don't really understand why all eigenvalues of A have to have absolute value less than 1 in order to make the series above converge. Why does that eigenvalue condition affect the convergence of the series?

Another follow-up question: Now that we have this nice power expansion for the inverse matrix, is that actually the way (some) matrix inverses are calculated in computers?
 
  • #11
Following on what Fredrik said, note that
[tex]\Bigl(\sum_{k=0}^n A^k\Bigr) (I - A) = I - A^{n+1}.[/tex]
Now we want to take the limit as [itex]n \to \infty[/itex] to show that
[tex]\Bigl(\sum_{k=0}^\infty A^k\Bigr) (I - A) = I.[/tex]
But for that to be true, you need that
[tex]\lim_{n \to \infty} A^{n+1} = 0.[/tex]
This is true if and only if the spectral radius of A, i.e. the largest absolute value of the eigenvalues of A, is less than 1. In particular, it's enough that the operator norm of A is less than 1, since the operator norm is always at least the spectral radius (the two coincide when A is normal). If you have some other condition that implies this, then that works too.
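A small numpy illustration (the matrix is a contrived example with large operator norm but small spectral radius, so only the spectral radius criterion applies):

[code]
import numpy as np

# Both eigenvalues are 0.5, but the operator (2-)norm is about 10
A = np.array([[0.5, 10.0],
              [0.0, 0.5]])
print(np.linalg.norm(A, 2))                  # about 10.0
print(np.abs(np.linalg.eigvals(A)).max())    # 0.5

# The powers still go to zero, because the spectral radius is < 1
print(np.abs(np.linalg.matrix_power(A, 200)).max())  # ~2.5e-57
[/code]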

As for computation, usually that's a very inefficient way to calculate it directly, unless A is nilpotent (so that the series only has finitely many nonzero terms). However, if A is n by n, then you can express [itex]A^n[/itex] in terms of lower powers of A by the Cayley-Hamilton theorem. You could also apply various numerical techniques to directly find the inverse of I - A; a lot of the time, all you care about is [itex](I - A)^{-1}x[/itex] for some vector x, which can often be computed even more efficiently than the full matrix inverse.
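For instance (a minimal numpy sketch; the numbers are arbitrary):

[code]
import numpy as np

A = np.array([[0.2, 0.3],
              [0.1, 0.4]])
c = np.array([1.0, 2.0])
I = np.eye(2)

# Solving the linear system (I - A) x = c avoids forming the inverse at all
x = np.linalg.solve(I - A, c)
print(np.allclose(x, np.linalg.inv(I - A) @ c))  # True
[/code]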
 
  • #12
Let's try the Hilbert-Schmidt norm, defined by

[tex]\langle A,B\rangle=\operatorname{Tr}A^\dagger B[/tex]

[tex]\|A\|^2=\langle A,A\rangle=\sum_{i,j}a_{ji}^*a_{ji}=\sum_{i,j}|a_{ij}|^2[/tex]

[tex]\|A\|\geq 0[/tex]

This definition and the information given together imply that if A is an m×m matrix,

[tex]\|A\|^2<\sum_{i,j}1=m^2[/tex]

The norm satisfies

[tex]\|AB\|\leq\|A\|\|B\|[/tex]

so

[tex]\|A^{n+1}\|\leq\|A\|^{n+1}<m^{n+1}[/tex]

but this bound blows up as n→∞ (when m > 1), so it tells us nothing. It looks like the assumption [itex]|a_{ij}|<1[/itex] isn't strong enough to guarantee convergence. It looks like we would need something like [itex]|a_{ij}|<1/m[/itex] (or another norm to define convergence). Hmm...
 
  • #13
Fredrik said:
It looks like the assumption [itex]|a_{ij}|<1[/itex] isn't strong enough to guarantee convergence. It looks like we would need something like [itex]|a_{ij}|<1/m[/itex] (or another norm to define convergence). Hmm...
Did you take into account that my assumption wasn't just that [itex]|a_{ij}|<1[/itex] (actually [itex]0 \leq a_{ij}<1[/itex]), but also that the row sums and column sums are, one by one, less than 1? In a way, the latter thing seems to imply that [itex]|a_{ij}|<1/m[/itex], "on the average" at least.

I'll take a look at the operator norm in the meantime.
 
  • #14
Well, the Hilbert-Schmidt norm [itex]\lVert \cdot \rVert_{HS}[/itex] is always at least as large as the operator norm, so if A has Hilbert-Schmidt norm less than 1, then its operator norm is also less than 1.

And the Hilbert-Schmidt norm happens to satisfy
[tex]\lVert A \rVert_{HS}^2 = \sum_{i,j} \lvert a_{ij} \rvert^2.[/tex]
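To see why the comparison with the operator norm holds, apply the Cauchy-Schwarz inequality row by row: for any x with [itex]\lvert x \rvert = 1[/itex],

[tex]\lvert Ax \rvert^2 = \sum_i \Big\lvert \sum_j a_{ij} x_j \Big\rvert^2 \leq \sum_i \Big( \sum_j \lvert a_{ij} \rvert^2 \Big) \Big( \sum_j \lvert x_j \rvert^2 \Big) = \lVert A \rVert_{HS}^2[/tex]

so the operator norm is at most [itex]\lVert A \rVert_{HS}[/itex].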
 
  • #15
Mårten said:
Did you take into account that my assumption wasn't just that [itex]|a_{ij}|<1[/itex] (actually [itex]0 \leq a_{ij}<1[/itex]), but also that the row sums and column sums are, one by one, less than 1? In a way, the latter thing seems to imply that [itex]|a_{ij}|<1/m[/itex], "on the average" at least.
No, I didn't take that into account. When I do, I get

[tex]\|A\|^2=\langle A,A\rangle=\sum_{i,j}|a_{ij}|^2<\sum_{i,j}|a_{ij}|=\sum_i\bigg(\sum_j a_{ij}\bigg)<\sum_i 1=m[/tex]

which is better, but not good enough. We want [itex]\|A\|<1[/itex]. The condition on the "row sums" and "column sums" allows the possibility of a diagonal matrix with entries close to 1 on the diagonal. Such a matrix doesn't have a Hilbert-Schmidt norm <1 (unless m=1). Edit: On the other hand, the components of [itex]A^{n+1}[/itex] go to 0 individually for such an A, so the problem with such an A isn't that [itex]A^{n+1}[/itex] doesn't go to zero in the H-S norm. It's just that the usual way of showing that it goes to zero doesn't work. The first step of "the usual way" is [itex]\|A^{n+1}\|\leq\|A\|^{n+1}[/itex], and the right-hand side is simply too big to be useful.
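A quick numpy check of that diagonal example (a sketch; the entries 0.99 are chosen just for illustration):

[code]
import numpy as np

# Diagonal matrix allowed by the row/column-sum conditions
A = np.diag([0.99, 0.99])
print(np.linalg.norm(A))  # Frobenius (Hilbert-Schmidt) norm ~1.4, not < 1

# Yet its powers still go to zero componentwise
print(np.linalg.matrix_power(A, 2000).max())  # ~1.9e-9
[/code]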
 
  • #16
Fredrik said:
No, I didn't take that into account. When I do, I get

[tex]\|A\|^2=\langle A,A\rangle=\sum_{i,j}|a_{ij}|^2<\sum_{i,j}|a_{ij}|=\sum_i\bigg(\sum_j a_{ij}\bigg)<\sum_i 1=m[/tex]

which is better, but not good enough. We want [itex]\|A\|<1[/itex]. The condition on the "row sums" and "column sums" allows the possibility of a diagonal matrix with entries close to 1 on the diagonal. Such a matrix doesn't have a Hilbert-Schmidt norm <1 (unless m=1). Edit: On the other hand, the components of [itex]A^{n+1}[/itex] go to 0 individually for such an A, so the problem with such an A isn't that [itex]A^{n+1}[/itex] doesn't go to zero in the H-S norm.
Hm, that was interesting! Could you explain in more detail how your equation above implies that all the components of [itex]A^{n+1}[/itex] go to zero when n goes to infinity? Where exactly in the equation above does one see that? It's been a couple of years since I took my linear algebra classes... :blushing:

(Edit: I do understand that it's enough to show that the elements of [itex]A^2[/itex] are less than the elements of A, because then the elements of [itex]A^3[/itex] should be even smaller, and so on, but I cannot yet see why the elements of [itex]A^2[/itex] are less than the elements of A.)
 
  • #17
Mårten said:
Hm, that was interesting! Could you explain in more detail how your equation above implies that all the components of [itex]A^{n+1}[/itex] go to zero when n goes to infinity? Where exactly in the equation above does one see that?
It doesn't, and you don't. For this method of proof to work, we needed the norm of A to be <1.

But I realized something last night. Both adriank and I have been making this more complicated than it needs to be. We may not need to consider the "fancy" Hilbert-Schmidt and operator norms. We can define the norm of A to be the largest absolute value of its components. For an arbitrary B, and an A that satisfies the conditions we've been given, this norm satisfies

[tex]\|BA\|=\max\{|(BA)_{ij}|\}=\max\Big\{\Big|\sum_k b_{ik}a_{kj}\Big|\Big\} \leq \max\Big\{\sum_k|b_{ik}a_{kj}|\Big\}\leq\max\Big\{\|B\|\sum_k|a_{kj}|\Big\}<\|B\|a[/tex]

where a is the largest column sum of A. This a is <1.

So

[tex]\|A^{n+1}\|<\|A^{n}\|a<\|A^{n-1}\|a^2<\cdots<\|A\|a^n<a^n\rightarrow 0[/tex]

Since the left-hand side is the largest (in absolute value) component of [itex]A^{n+1}[/itex], all components of [itex]A^{n+1}[/itex] go to zero.
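A numpy check of this bound (a sketch; the matrix is an arbitrary example with entries in [0, 1) and column sums 0.7 and 0.9):

[code]
import numpy as np

# Entries in [0, 1); the column sums are 0.7 and 0.9
A = np.array([[0.3, 0.5],
              [0.4, 0.4]])
a = A.sum(axis=0).max()  # largest column sum

for n in (5, 10, 20):
    largest = np.abs(np.linalg.matrix_power(A, n + 1)).max()
    print(largest < a**n)  # True: largest component of A^(n+1) is below a^n
[/code]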
 
  • #18
Thanks a lot for this derivation, it was really a nifty one! Just a minor thing: I think the rightmost inequality in the uppermost equation above should say "[itex]=\|B\|a[/itex]" rather than "[itex]<\|B\|a[/itex]", since you already defined a to be the largest column sum of A, with a<1.

I actually found, just the other day, another proof in Miller & Blair (1985): Input-output analysis: Foundations and extensions, p. 23, which uses the [itex]\|A\|_1[/itex] definition of a matrix norm (that is, the maximum column sum, according to http://en.wikipedia.org/wiki/Matrix_norm#Induced_norm), and then uses similar reasoning as in your lowermost equation above, but with the Cauchy-Schwarz inequality instead. I think that's not as intuitive as your proof above.

Some more information about input-output analysis in economics for the interested reader. The columns of the matrix A describe, one by one, the input industry j needs from all the different industries i in the rows to be able to produce an output worth 1 dollar. The rows correspondingly describe the outputs from industry i. [itex](I-A)^{-1}[/itex] is called the Leontief inverse matrix, and column j there describes the total output all industries i have to produce as a result of the consumption of products worth 1 dollar from industry j. This is useful when you want to know the energy (or any other environmental pressure) needed as a result of consuming x dollars' worth of products from a certain industry (if you also know the energy intensities, i.e. J/$, of the industries). The power series is the result of the recursive procedure that happens when industries need input from industries, which in turn need input from industries, and so on.
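To make this concrete, here is a toy two-industry example (the coefficient values are invented purely for illustration):

[code]
import numpy as np

# Hypothetical technical coefficients; column sums (0.6 and 0.4) are < 1
A = np.array([[0.2, 0.3],
              [0.4, 0.1]])
L = np.linalg.inv(np.eye(2) - A)  # the Leontief inverse

# Total output each industry must produce to deliver $1 of final demand
# for industry 1's products
print(L @ np.array([1.0, 0.0]))   # approximately [1.5, 0.667]
[/code]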

P.s. Sorry for this late reply, I've done some other things for a while.
 
  • #19
Correction

I looked through this again, and realized that I was wrong about the condition I gave that the row sums are less than 1. Fortunately, this doesn't have any implications for the proof above, since it only uses the fact that the column sums are less than 1. So the proof still holds! :smile:

For anyone surfing by this thread, the following background information could therefore be useful. What you start out with in input-output analysis is the input-output table, which can be written as

[tex]Fi + C = x[/tex],

where [itex]f_{ij}[/itex] represents the input of products from industry i to industry j, i is a column vector with just ones in it (a summation vector), C is a column vector representing final demand (non-industry use from households, government, etc.), and x is a column vector representing total output (as well as total input). Not included in the above equation is a row of value added (VA) below the F matrix; this row can be interpreted as the labour input (i.e., salaries) required for the industries to produce their products. The different inputs of products and labour for a certain industry j are shown in the rows of column j in the F matrix and the VA row; the sum of this column equals the total input [itex]x_j[/itex]. This value is the same as the total output [itex]x_j[/itex] from that industry, which is the row sum of F and C for row j. That is, total costs equal total revenues.

Now, if the matrix A is generated via [itex]a_{ij} = f_{ij}/x_j[/itex], the cells in a certain column j of A represent shares of the total input [itex]x_j[/itex]. That implies that each column sum of A is less than 1, since the value-added row takes up the remaining share of [itex]x_j[/itex]. And that's not the same as saying that the row sums of A are less than 1, since the cells in a row of A don't represent shares of that row's row sum, according to [itex]a_{ij} = f_{ij}/x_j[/itex]. Finally, the above equation can now be written as

[tex]Ax + C = x \Leftrightarrow x = (I-A)^{-1}C[/tex]

(where x becomes a function of the final demand C), and that explains why the topic of this thread was about matrix inverses.
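A small numpy illustration of this accounting (the flow numbers are invented; value added implicitly fills out each column so that total input equals total output):

[code]
import numpy as np

# Hypothetical flow table F (f_ij = dollars of product i used by industry j)
# and final demand C; total output x is row sums of F plus final demand
F = np.array([[20.0, 30.0],
              [40.0, 10.0]])
C = np.array([50.0, 50.0])
x = F.sum(axis=1) + C  # [100, 100]

A = F / x  # a_ij = f_ij / x_j (numpy broadcasting divides column j by x_j)
print(np.allclose(A @ x + C, x))                          # True
print(np.allclose(np.linalg.solve(np.eye(2) - A, C), x))  # True
[/code]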
 

FAQ: Matrix inverse equals power-series

What is a matrix inverse?

The inverse of a matrix is the matrix that, when multiplied by the original matrix, gives the identity matrix as the product. In other words, it "undoes" the original matrix.

What is a power-series?

A power-series is an infinite sum of terms involving a variable raised to successive powers, each multiplied by a coefficient. It is a useful tool in mathematics for representing functions as infinite sums.

How is a matrix inverse related to a power-series?

If every eigenvalue of a matrix A has absolute value less than 1, then I - A is invertible and its inverse equals the power series I + A + A^2 + A^3 + ..., in direct analogy with the geometric series 1/(1 - x) = 1 + x + x^2 + ... for |x| < 1. This matrix series is known as the Neumann series.

What is the significance of a matrix inverse equaling a power-series?

The equality allows (I - A)^{-1} to be approximated by truncating the series, and it underlies applications such as input-output analysis in economics, where the Leontief inverse (I - A)^{-1} gives the total output required to meet a given final demand.

How is the inverse of a matrix with a power-series representation calculated?

Truncating the power series gives an approximation, but this is usually an inefficient way to compute the inverse unless A is nilpotent, so that the series has only finitely many nonzero terms. In practice, standard numerical methods are applied to I - A directly; often only the product (I - A)^{-1}x for some vector x is needed, which can be obtained by solving a linear system without forming the inverse at all.
