
9.2. The Characteristic Polynomial

So far, we’ve found the eigenvalues and eigenvectors of a matrix by eyeballing them or reasoning about them geometrically, but this is not a sustainable strategy. Let’s develop a more systematic approach.


Deriving the Characteristic Polynomial

We’re looking for combinations of scalars, $\lambda$, and vectors, $\vec v$, such that

$$A \vec v = \lambda \vec v$$

In some ways, we’re trying to “solve” for $\lambda$ and $\vec v$ given $A$. Let’s experiment a little:

$$\begin{align*} A \vec v &= \lambda \vec v \\ A \vec v - \lambda \vec v &= \vec 0 \\ (A - \lambda I) \vec v &= \vec 0 \end{align*}$$

What the above says is that if $\vec v$ is an eigenvector of $A$ with eigenvalue $\lambda$, then $\vec v$ is in the null space of $A - \lambda I$. But, since eigenvectors can’t be the zero vector, this means that $A - \lambda I$ is not invertible, since it has a non-trivial null space!

Thinking back to Chapter 6.2, we know that there are several equivalent ways to check if a matrix is not invertible. Perhaps the most computational approach is to compute its determinant; if a (square) matrix’s determinant is 0, then it is not invertible, otherwise it is.

So, since $A - \lambda I$ is not invertible, its determinant must be 0!

$$A \vec v = \lambda \vec v \implies \underbrace{\det(A - \lambda I)}_{p(\lambda)} = 0$$

(We won’t always use the symbol $p(\lambda)$; I just introduced it above to make it clear that $\det(A - \lambda I)$ is a polynomial function of $\lambda$.)

Let’s revisit our example $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$.

The matrix $A - \lambda I$ is

$$A - \lambda I = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 - \lambda & 2 \\ 2 & 1 - \lambda \end{bmatrix}$$

Note that $A - \lambda I$ involves subtracting $\lambda$ from the diagonal elements of $A$, and leaving all other elements unchanged.

The determinant of $A - \lambda I$ is

$$\det(A - \lambda I) = (1 - \lambda)(1 - \lambda) - 2 \cdot 2 = \lambda^2 - 2\lambda + 1 - 4 = \underbrace{\lambda^2 - 2\lambda - 3}_{p(\lambda)}$$

The eigenvalues of $A$ are the values of $\lambda$ where $\lambda^2 - 2\lambda - 3 = 0$.

*[Plot of the characteristic polynomial $p(\lambda) = \lambda^2 - 2\lambda - 3$, produced in Jupyter.]*

(In general, we’re not going to plot the characteristic polynomial each time, but I think it’s useful to see once or twice to give the idea some context.)

By factoring, I can write this equation as

$$(\lambda + 1)(\lambda - 3) = 0$$

which tells me that the eigenvalues of $A$ are $\lambda_1 = -1$ and $\lambda_2 = 3$. If this equation weren’t factorable, I’d need to use the quadratic formula to find the eigenvalues. For now, there’s no particular “ordering” to the eigenvalues, i.e. I could have said $\lambda_1 = 3$ and $\lambda_2 = -1$; all that matters is that I stay consistent throughout a particular example.
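In practice, once we have the coefficients of $p(\lambda)$, we can hand the root-finding off to a computer. A minimal sketch using numpy’s `np.roots`, which takes a polynomial’s coefficients in descending order of degree:

```python
import numpy as np

# p(lambda) = lambda^2 - 2*lambda - 3, as a list of coefficients
coeffs = [1, -2, -3]

# The eigenvalues of A are the roots of p(lambda)
print(np.sort(np.roots(coeffs)))  # [-1.  3.]
```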

Once we find the eigenvalues by solving $p(\lambda) = 0$, we can find the eigenvectors by solving $(A - \lambda I) \vec v = \vec 0$ for each eigenvalue.

  • For $\lambda_1 = -1$, we’re looking for vectors $\vec v$ such that $A \vec v = -1 \vec v$, i.e. if $\vec v = \begin{bmatrix} a \\ b \end{bmatrix}$, then

    $$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} -a \\ -b \end{bmatrix}$$

    (I’m using $a$ and $b$ instead of $v_1$ and $v_2$ because I’ll refer to the vectors $\vec v_1$ and $\vec v_2$ in just a moment.) As a system of equations, this says

    $$\begin{align*} a + 2b &= -a \\ 2a + b &= -b \end{align*}$$

    The first and second equations both tell us that $b = -a$. Remember, we’d expect there to be infinitely many solutions to this system, since any scalar multiple of an eigenvector is still an eigenvector. So, the “simple” solution is $\vec v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, but $\begin{bmatrix} 2 \\ -2 \end{bmatrix}$, $\begin{bmatrix} 3 \\ -3 \end{bmatrix}$, etc. are also solutions.

  • For $\lambda_2 = 3$, let me introduce another way of finding the corresponding eigenvector. There’s nothing wrong with the system of equations approach, but it’s useful to have multiple techniques for solving problems in our toolkit. $\lambda_2 = 3$ tells us that $A \vec v = 3 \vec v$, or equivalently, $(A - 3I) \vec v = \vec 0$. So, the eigenvector we’re looking for is in the null space of $A - 3I$.

    $$A - 3I = \begin{bmatrix} 1 - 3 & 2 \\ 2 & 1 - 3 \end{bmatrix} = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix}$$

    Here, we notice that both rows of $A - 3I$ sum to 0, meaning that

    $$(A - 3I) \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

    So, $\vec v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector for $\lambda_2 = 3$.
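However we find them, eigenpairs are easy to double-check: multiply by $A$ and confirm the result is the eigenvalue times the vector. A quick sketch with numpy:

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])

# The two eigenpairs we found by hand
pairs = [(-1, np.array([1, -1])),
         (3,  np.array([1,  1]))]

for lam, v in pairs:
    # For an eigenpair, A @ v should equal lam * v
    print(np.allclose(A @ v, lam * v))  # True
```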


Examples

Example: $2 \times 2$ Matrices

Find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & 1 \\ 2 & 4 \end{bmatrix}$.

Example: Diagonal Matrices

Find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & 0 \\ 0 & -4 \end{bmatrix}$. What do you notice about the eigenvalues? The eigenvectors?

Example: Sharing a Characteristic Polynomial

Find two different $2 \times 2$ matrices that have the characteristic polynomial

$$p(\lambda) = \lambda^2 - 4 \lambda + 3$$

Example: $3 \times 3$ Matrices

Recall, if $A$ is a $3 \times 3$ matrix, like

$$A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$

then $\det(A)$ is

$$\det(A) = a \begin{vmatrix} e & f \\ h & i \end{vmatrix} - b \begin{vmatrix} d & f \\ g & i \end{vmatrix} + c \begin{vmatrix} d & e \\ g & h \end{vmatrix}$$

where $| \cdot |$ denotes the determinant of the matrix.

Find the eigenvalues and eigenvectors of the upper triangular matrix

$$A = \begin{bmatrix} 3 & 3 & 0 \\ 0 & 1 & 5 \\ 0 & 0 & 2 \end{bmatrix}$$

What do you notice about the eigenvalues?

```python
# When in doubt, check with numpy!
import numpy as np

np.linalg.eig(
    np.array([
        [2, 0, 0],
        [-1, 5, 0],
        [4, 3, -7]
    ])
)
```

```
EigResult(eigenvalues=array([-7.,  5.,  2.]),
          eigenvectors=array([[0.        , 0.        , 0.83925433],
                              [0.        , 0.9701425 , 0.27975144],
                              [1.        , 0.24253563, 0.4662524 ]]))
```

Number of Eigenvalues

The characteristic polynomial of an $n \times n$ matrix is a polynomial of degree $n$. A fact from algebra is that a polynomial of degree $n$ has exactly $n$ roots. The intuitive way of thinking about this is that degree $n$ polynomials can have at most $n - 1$ “bends” (a line can’t bend, a quadratic can bend once, a cubic can bend twice, etc.), and each time it bends, it can change directions to cross the $x$-axis again.

The issue is that some of these roots may be repeated, and some may be complex numbers, meaning that they don’t actually cross the $x$-axis in the standard $xy$-plane (or in our case, the $\lambda$-axis and $\lambda, p(\lambda)$-plane).

For example, the matrix $A = \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}$ is diagonal, meaning it’s easy to read off its characteristic polynomial:

$$p(\lambda) = \det(A - \lambda I) = \begin{vmatrix} 4 - \lambda & 0 & 0 & 0 \\ 0 & 1 - \lambda & 0 & 0 \\ 0 & 0 & -\lambda & 0 \\ 0 & 0 & 0 & 4 - \lambda \end{vmatrix} = \lambda (\lambda - 1)(\lambda - 4)^2$$

This characteristic polynomial has a double root at $\lambda = 4$, a single root at $\lambda = 1$, and a single root at $\lambda = 0$. This $A$ has 3 distinct eigenvalues, but one of them, $\lambda = 4$, is repeated, and has an algebraic multiplicity of 2.

*[Plot of $p(\lambda) = \lambda(\lambda - 1)(\lambda - 4)^2$, produced in Jupyter.]*
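numpy reports a repeated eigenvalue once per occurrence, so the algebraic multiplicity is visible directly in the output. A quick check for the diagonal matrix above:

```python
import numpy as np

# The diagonal matrix with entries 4, 1, 0, 4 from above
A = np.diag([4.0, 1.0, 0.0, 4.0])

# The eigenvalue 4 shows up twice, matching its algebraic multiplicity of 2
print(np.sort(np.linalg.eigvals(A)))  # [0. 1. 4. 4.]
```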

As another example, the matrix $A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix}$ has the characteristic polynomial

$$p(\lambda) = \begin{vmatrix} 3 - \lambda & -4 \\ 4 & 3 - \lambda \end{vmatrix} = (\lambda - 3)^2 + 16$$

I can expand this out to get $\lambda^2 - 6\lambda + 25$, but in a case like this I think the above form is more telling. Visually, $p(\lambda)$ here is a parabola that sits entirely above the $\lambda$-axis, meaning it has no real roots, so $A$ has no real eigenvalues.

*[Plot of $p(\lambda) = (\lambda - 3)^2 + 16$, produced in Jupyter.]*

It should not be all that surprising that $A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix}$ has no real eigenvalues. $A$ is a rotation matrix, scaled by a factor of 5:

$$A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix} = 5 \begin{bmatrix} \frac{3}{5} & -\frac{4}{5} \\ \frac{4}{5} & \frac{3}{5} \end{bmatrix} = 5 \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}$$

where $\theta = \cos^{-1}\left(\frac{3}{5}\right)$. This matrix rotates vectors by $\theta$ radians counterclockwise, which means that no real-valued vector will remain in the same direction after being multiplied by $A$. In the $2 \times 2$ case, the only way to get a real-valued eigenvalue out of a rotation matrix is if the matrix rotates by an integer multiple of $\pi$ radians (180 degrees), which either corresponds to reflecting/negating the vector (like in $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$) or returning back the same vector itself (like in $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, the identity matrix, which can be thought of as a rotation by $2\pi$).

The solutions to $p(\lambda) = (\lambda - 3)^2 + 16 = 0$ are the complex numbers $\lambda = 3 + 4i$ and $\lambda = 3 - 4i$, again where $i$ is the imaginary unit, defined by $i^2 = -1$. The corresponding eigenvectors are complex too, and we won’t worry about finding them. (If you’re familiar with complex numbers, you might recognize these eigenvalues as $5e^{i\theta}$ and $5e^{-i\theta}$, where $\theta = \cos^{-1}\left(\frac{3}{5}\right)$. This comes from Euler’s formula, $e^{i\theta} = \cos \theta + i \sin \theta$.)
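numpy doesn’t shy away from the complex case: `np.linalg.eigvals` returns complex eigenvalues when they exist. A quick check on the rotation matrix above:

```python
import numpy as np

A = np.array([[3, -4],
              [4,  3]])

# The eigenvalues come back as the complex pair 3 +/- 4i
print(np.sort_complex(np.linalg.eigvals(A)))  # [3.-4.j 3.+4.j]
```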

Trace and Determinant

As a final example, consider the arbitrary $2 \times 2$ matrix

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$

Its characteristic polynomial is

$$p(\lambda) = \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = (a - \lambda)(d - \lambda) - bc = \lambda^2 - (a + d)\lambda + ad - bc$$

The coefficient on $\lambda$ is $-(a + d)$, which is the negative of the sum of the diagonal entries of $A$. A term often used to refer to the sum of the diagonal entries of a matrix is its trace, $\text{trace}(A)$. And, the constant term of $p(\lambda)$ above is just $\det(A)$! For $2 \times 2$ matrices, this says that the characteristic polynomial is of the form

$$p(\lambda) = \lambda^2 - \text{trace}(A) \lambda + \det(A)$$

For example, consider $A = \begin{bmatrix} 3 & 1 \\ 7 & 9 \end{bmatrix}$. The fact above says that $A$’s characteristic polynomial is

$$p(\lambda) = \lambda^2 - (3 + 9)\lambda + (3 \cdot 9 - 1 \cdot 7) = \lambda^2 - 12\lambda + 20$$

which, with enough practice, is a calculation you might be able to do in your head. But, this gives way to a more general fact, true for any $n \times n$ matrix: the sum of the eigenvalues of $A$ equals $\text{trace}(A)$, and the product of the eigenvalues equals $\det(A)$.

Example 1: $A = \begin{bmatrix} 3 & 1 \\ 7 & 9 \end{bmatrix}$’s eigenvalues add up to $3 + 9 = 12$ and multiply to $\det(A) = 3 \cdot 9 - 1 \cdot 7 = 20$. This gives me a quick way of figuring out that $A$’s eigenvalues are 10 and 2.
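This trace/determinant sanity check is easy to automate. A sketch, using the same matrix:

```python
import numpy as np

A = np.array([[3, 1],
              [7, 9]])

eigs = np.linalg.eigvals(A)  # 10 and 2

# Sum of eigenvalues == trace, product of eigenvalues == determinant
print(np.isclose(eigs.sum(),  np.trace(A)))       # True
print(np.isclose(eigs.prod(), np.linalg.det(A)))  # True
```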

Example 2: $A = \begin{bmatrix} 0.5 & 0.5 & 0 \\ 0.4 & 0.3 & 0.3 \\ 0 & 0.5 & 0.5 \end{bmatrix}$ has an eigenvalue of 1, corresponding to the eigenvector $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, since the sum of each row of $A$ is 1. Let $\lambda_2$ and $\lambda_3$ be the other two eigenvalues. The trace fact tells me that

$$1 + \lambda_2 + \lambda_3 = 0.5 + 0.3 + 0.5 = 1.3 \implies \lambda_2 + \lambda_3 = 0.3$$

and the determinant fact tells me that

$$\begin{align*} 1 \cdot \lambda_2 \cdot \lambda_3 &= \begin{vmatrix} 0.5 & 0.5 & 0 \\ 0.4 & 0.3 & 0.3 \\ 0 & 0.5 & 0.5 \end{vmatrix} \\ &= 0.5 (0.3 \cdot 0.5 - 0.3 \cdot 0.5) - 0.5 (0.4 \cdot 0.5 - 0.3 \cdot 0) + 0 (0.4 \cdot 0.5 - 0.3 \cdot 0) \\ &= -0.5 \cdot 0.4 \cdot 0.5 \\ &= -0.1 \end{align*}$$

In other words, the other two eigenvalues sum to 0.3 and multiply to -0.1. This sounds like a job for 0.5 and -0.2, which are indeed the other two eigenvalues of $A$ (other than 1).

Example 3: $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ has a determinant of $1 \cdot 4 - 2 \cdot 2 = 0$, which means that its eigenvalues multiply to 0, which is another reminder that $A$ has an eigenvalue of 0. The trace fact tells me that the other eigenvalue $\lambda_2$ must satisfy $1 + 4 = 0 + \lambda_2$, so $\lambda_2 = 5$.


Key Takeaways

Let’s summarize the key ideas from both this section and the last.

Suppose $A$ is an $n \times n$ matrix.

  1. $\lambda$ is an eigenvalue of $A$, corresponding to the eigenvector $\vec v$, if

    $$A \vec v = \lambda \vec v$$

    The English interpretation of this is that $\vec v$ is an eigenvector of $A$ if, when multiplied by $A$, $\vec v$ still points in the same direction, and is just stretched by a factor of $\lambda$.

  2. An $n \times n$ matrix $A$ has exactly $n$ eigenvalues. Some of these may be complex, and some may be repeated, but all of them are roots of the characteristic polynomial of $A$,

    $$p(\lambda) = \det(A - \lambda I)$$

  3. A quick way to check if you’ve found the right eigenvalues is that

  • The product of the eigenvalues of $A$ is equal to $\det(A)$, i.e. $\lambda_1 \lambda_2 \cdots \lambda_n = \det(A)$.

  • The sum of the eigenvalues of $A$ is equal to the trace of $A$, which is the sum of the diagonal elements of $A$.

  4. 0 is an eigenvalue of $A$ if and only if $A$ is not invertible.

  5. If $\lambda$ is an eigenvalue of $A$ with eigenvector $\vec v$, then $\lambda^k$ is an eigenvalue of $A^k$ with the same eigenvector $\vec v$.
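The last fact is easy to see numerically: each multiplication by $A$ just scales the eigenvector by $\lambda$ again. A sketch, using our running example:

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])
v = np.array([1, 1])  # eigenvector of A with eigenvalue 3

# A^5 applied to v should scale it by 3^5 = 243
print(np.allclose(np.linalg.matrix_power(A, 5) @ v, 3**5 * v))  # True
```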