Python’s Sympy Module and the Cayley-Hamilton Theorem
Two of my favorite areas of study are linear algebra and computer programming. In this article I combine these areas by using Python to confirm that a given matrix satisfies the Cayley-Hamilton theorem. The theorem due to Arthur Cayley and William Hamilton states that if ##f(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \dots + c_1\lambda + c_0## is the characteristic polynomial for a square matrix A , then A is a solution to this characteristic equation. That is, ##f(A) = A^n + c_{n-1}A^{n-1} + \dots + c_1A + c_0I = 0##. Here I is the identity matrix of order n, 0 is the zero matrix, also of order n.
Characteristic polynomial – the determinant |A – λI|, where A is an n x n square matrix, I is the n x n identity matrix, and λ is a scalar variable, real or complex. The characteristic polynomial for a square matrix is a function of the variable, λ.
Identity matrix – a square matrix with with 1’s along the main diagonal, and 0’s everywhere else. For example, the 2 x 2 identity matrix is ##\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}##
Determinant – the determinant of a 2 x 2 matrix A, where ##A = \begin{bmatrix}a & b \\ c & d \end{bmatrix}##, is ab – cd, and is usually written as |A|. For determinants of higher-order matrices, see Determinant – Wikipedia.
Table of Contents
Finding the characteristic polynomial
The SymPy library makes it easy to find the characteristic polynomial for a matrix. In what follows, I’ll be using the 3 x 3 matrix M, defined as ##M = \begin{bmatrix} 2 & 1 & 1 \\ 2 & 3 & 2 \\ 1 & 1 & 2\end{bmatrix}##.
The following snippet defines the matrix M, and calls the charpoly() method to get the characteristic polynomial for M. It also uses the rank() method to set a constant, MATSIZE, to the number of non-zero rows in the matrix. Letting the program determine the size of the matrix means that I can change the matrix definition without having to change any hard-coded “magic” numbers anywhere else in the program.
M = Matrix([[2, 1, 1], [2, 3, 2], [1, 1, 2]]) MATSIZE = M.rank() lamda = symbols('lamda') poly = M.charpoly(lamda) # Get the characteristic polynomial print(poly)
(Yes, I know that “lamda” is misspelled. Python already has a keyword named lambda, so the name has to be altered slightly.) When the code snippet above is executed, the last line displays the following:
PurePoly(lamda**3 - 7*lamda**2 + 11*lamda - 5, lamda, domain='ZZ')
From the above, you can see that the characteristic polynomial for this matrix is ##f(\lambda) = \lambda^3 – 7\lambda^2 + 11\lambda – 5##. The domain value, ‘ZZ’, indicates the set of numbers that can be used in polynomial calculations.
We can also get the polynomial in factored form, assuming our polynomial is relatively simple, by using the factor() and as_expr() methods. The code below treats the polynomial as an expression and then converts it to its factored form before displaying it.
pprint(factor(poly.as_expr()))
The pprint() method “pretty prints” the expression passed to it. For the example matrix and characteristic polynomial in this article, the program displays the factored characteristic polynomial like this:
2 (λ - 5)⋅(λ - 1)
Next, we’ll separate the terms of this polynomial.
Parsing the terms of the characteristic polynomial
To construct the characteristic polynomial as a function of the matrix M, we use another SymPy method — terms(). This method returns a Python list object that contains all of the nonzero terms of the given polynomial.
poly_terms = poly.terms() print("Terms ", poly_terms)
The returned list contains tuples, one for each non-zero term of the polynomial. Each tuple contains the exponent on the term and the coefficient of that term. The exponent part is also a tuple, to account for the possibility of a polynomial in two or more variables. Because our polynomial uses only a single variable, each exponent tuple is a singleton.
For our example, the terms list looks like this:
[((3,), 1), ((2,), -7), ((1,), 11), ((0,), -5)]
The index-0 element, ((3, ), 1), corresponds to the ##\lambda^3## term of the characteristic polynomial. The remaining list elements correspond to the lower-degree terms in this polynomial.
Now that we have a list that contains the exponent and coefficient for each term of the polynomial, the next step is to extract them and place them in separate lists.
Extracting the coefficients of the characteristic polynomial
For each tuple in the terms list, the item at index 1 contains the coefficient of a given term. The coefficient of the i-th term of a list named poly is poly[i][1]. The function shown below extracts the coefficients and returns them as a list. The Complete Python Code section contains a fully-commented version of this code.
def getCoeffs(poly): coefficients = [poly[i][1] for i in range(len(poly))] return coefficients
Rather than use a for loop to populate coefficients, I am using list comprehension, a more “Pythonic” technique. For each top-level tuple in the list of exponents and coefficients, getCoeffs() extracts each coefficient and adds it to coefficients. A call to this function looks like this:
coefficients = getCoeffs(poly_terms)
Extracting the exponents of the characteristic polynomial
Extracting the exponents is a bit more difficult, as the first item in each tuple in the terms list is itself a tuple. In more detail, the index-0 tuple in the list is ((3, ), 1). poly[0] evaluates to this tuple, poly[0][0] is the singleton tuple that contains (3, ), and poly[0][0][0] evaluates to 3. The exponent of the i-th term of a list named poly is poly[i][0][0].
The function shown below extracts the exponents and returns them in a list.
def getExpons(poly): exponents = [poly[i][0][0] for i in range(len(poly))] return exponents
The Complete Python Code section contains a fully-commented version of this function, including a call to it.
Constructing the matrix characteristic polynomial
Now that we have a list of the polynomial coefficients and another one of the exponents, we can use either the expanded form or the factored form of the characteristic polynomial. In my code I chose to use the factored form.
To evaluate the matrix characteristic polynomial, I’ll need to build an expression of the form ##(M – coefficient_i * I)^{exponent_i}## for each factor, and then multiply them all together. I do this using a for loop. The final value of Result will contain the value of the characteristic polynomial when it is evaluated for the matrix M.
I = eye(MATSIZE) Result = zeros(MATSIZE, MATSIZE) for i in range(MATSIZE): Result *= (M - coefficients[i]*I)**exponents[i]
In the snippet above, I is the identity matrix of order MATSIZE, the constant that represents the number of nonzero rows of the given matrix.
The program also generates a string that represents the expanded form of the characteristic polynomial of this article, which looks like this.
1M**3 -7M**2 + 11M**1 -5I
Verifying that the matrix satisfies Cayley-Hamilton
If the matrix product Result is equal to the zero matrix, then we have verified that this matrix is a solution to its characteristic polynomial equation.
if (Result == zeros(MATSIZE, MATSIZE)): print("The matrix satisfies its characteristic equation.") else: print("The matrix DOES NOT satisfy its characteristic equation.")
Program Output
When I run the program describe in this article, the IDLE shell displays the following results. The next section contains the code for the complete program.
Complete Python Code
I’m running Python 3.10 and SymPy 1.9 in IDLE, the integrated development environment that comes with Python.
#CayleyHamilton.py - Confirm that a matrix is a solution of its char. equation, f(lamda). from sympy import Matrix, symbols, pprint, factor, eye, zeros, Poly, degree def getCoeffs(poly): """ Get the coefficients of the terms of a polynomial. Parameters ---------- poly: list - a list of tuples that contain the coefficients and exponents of the given polynomial. In the i-th tuple, the subtuple at index 1 holds the coefficient. Returns ---------- coefficients: list - A list of the coefficients. """ coefficients = [poly[i][1] for i in range(len(poly))] return coefficients def getExpons(poly): """ Get the exponents of the terms of a polynomial. Parameters ---------- poly: list - a list of tuples that contain the coefficients and exponents of the given polynomial. In the i-th tuple, the subtuple at index 1 holds the coefficient. Returns ---------- exponents: list - A list of the exponents. """ exponents = [poly[i][0][0] for i in range(len(poly))] return exponents # The matrix we're checking M = Matrix([[2, 1, 1], [2, 3, 2], [1, 1, 2]]) MATSIZE = M.rank() I = eye(MATSIZE) # Print a label and pretty print the matrix. print("M") pprint(M) lamda = symbols('lamda') # Get and print the characteristic polynomial poly = M.charpoly(lamda) print(poly) # Get the terms of the polynomial. # The terms() method returns a list of tuples of tuples. # Each tuple contains the polynomial exponent and its coefficient. # E.g., [((3,), 1), ((2,), -7), ((1,), 11), ((0,), -5] print("Terms ", poly.terms()) poly_terms = poly.terms() # Get a list of the coefficients of the terms of the polynomial. coefficients = getCoeffs(poly_terms) print("Coeffs: ", coefficients) # Get a list of the exponents of the terms of the polynomial. exponents = getExpons(poly_terms) print("Expons: ", exponents) # Print a label and pretty print the factored polynomial. print("Char. polynomial: ") pprint(factor(poly.as_expr())) print() # Substitute the matrix into the characteristic polynomial. Result = zeros(MATSIZE, MATSIZE) for i in range(MATSIZE): Result *= (M - coefficients[i]*I)**exponents[i] # Write the matrix equation in expanded form. # This is just a string that is used only for informational purposes. strng = "Matrix polynomial: " for i in range(MATSIZE): strng += str(coefficients[i]) + 'M**' + str(exponents[i]) + (' + ' if coefficients[i+1]>0 else ' ') strng += str(coefficients[MATSIZE]) + 'I' print (strng) # Print the value of the characteristic polynomial, evaluated for the matrix M. This should result in the zero matrix. print("Result...") pprint(Result) # Check that our matrix is a solution of Cayley-Hamilton. if (Result == zeros(MATSIZE, MATSIZE)): print("The matrix satisfies its characteristic equation.") else: print("The matrix DOES NOT satisfy its characteristic equation.")
Former college mathematics professor for 19 years where I also taught a variety of programming languages, including Fortran, Modula-2, C, and C++. Former technical writer for 15 years at a large software firm headquartered in Redmond, WA. Current associate faculty at a nearby community college, teaching classes in C++ and computer architecture/assembly language.
I enjoy traipsing around off-trail in Olympic National Park and the North Cascades and elsewhere, as well as riding and tinkering with my four motorcycles.
Leave a Reply
Want to join the discussion?Feel free to contribute!