9.1. Eigenvalues and Eigenvectors

Overview

In Chapter 9 – which will last us the majority of the remainder of the semester – we’re going to introduce a new lens through which we can view the information stored in a matrix: through its eigenvalues and eigenvectors. As Gilbert Strang says in his book, eigenvalues and eigenvectors allow us to look into the “heart” of a matrix and see what it’s really doing.

Throughout this chapter, we’ll see how eigen-things can help us more deeply understand the topics we’ve already covered, like linear regression, the normal equations, gradient descent, and convexity.

On top of all of that, eigenvalues and eigenvectors will unlock a new set of applications – those that involve some element of time. My favorite such example is Google’s PageRank algorithm. The algorithm, first published in 1998 by Sergey Brin and Larry Page (Google’s cofounders, the latter of whom is a Michigan alum), is used to rank pages on the internet based on their relative importance.


From the research paper linked above:

PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web.

We’ll make sense of the algorithm in Homework 10: just know that this is where we’re heading.


Introduction

Before we get started, keep in mind that everything we’re about to introduce only applies to square matrices. This was also true when we first studied invertibility, and for the same reason: we should think of eigenvalues and eigenvectors as properties of a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^n$ (that is, from a vector space to itself), not between vector spaces of different dimensions. Rectangular matrices will have their moment in Chapter 10.1.

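Definition: Let $A$ be an $n \times n$ matrix. A non-zero vector $\vec v$ is an eigenvector of $A$ if $A \vec v = \mathbf{\lambda} \vec v$ for some scalar $\mathbf{\lambda}$, which is called the eigenvalue corresponding to $\vec v$.

This definition is a bit hard to parse when you first look at it. But here’s the intuitive interpretation.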

“Eigen” is a German word meaning “own”, as in “one’s own”. So, an eigenvector is a vector that still points in its own direction when transformed by $A$.

A First Example

Let’s start with a $2 \times 2$ matrix:

$$A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$

I’ve chosen the numbers in $A$ to be small enough that we can roughly eyeball the eigenvectors. Here’s how I look at $A$:

  • First, notice that both rows of $A$ sum to 3, meaning that

    $$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

    This tells me that $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector of $A$ with eigenvalue 3. But, $\begin{bmatrix} 2 \\ 2 \end{bmatrix}$ is also an eigenvector of $A$ with the same eigenvalue, since

    $$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 6 \\ 6 \end{bmatrix} = 3 \begin{bmatrix} 2 \\ 2 \end{bmatrix}$$

    Indeed, if $\vec v$ is an eigenvector of $A$ with eigenvalue $\mathbf{\lambda}$, then so is $c \vec v$ for any non-zero scalar $c$. So really, eigenvectors define directions.

  • Additionally, noticing that there’d be some symmetry if I took the difference of the entries in each row of $A$, consider

    $$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} = -1 \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

    This tells me that $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ is an eigenvector of $A$ with eigenvalue -1.

So, $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ has two eigenvalues, 3 and -1, and two corresponding eigenvectors, $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$. In general, an $n \times n$ matrix has $n$ eigenvalues, but some of them may be the same, and some of them may not be real numbers. We’ll see how to systematically find these eigenvalues and eigenvectors in just a bit.
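If you’d like to double-check the eyeballed eigenvectors numerically, here’s a minimal sketch. The variable names (`v1`, `v2`, `c`, and the `check_*` flags) are my own choices for illustration; the only thing being tested is whether $A \vec v$ equals $\mathbf{\lambda} \vec v$ for each claimed pair.

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])

v1 = np.array([1, 1])   # eyeballed eigenvector for eigenvalue 3
v2 = np.array([-1, 1])  # eyeballed eigenvector for eigenvalue -1

# A v should equal lambda v for each claimed eigenpair.
check_v1 = np.array_equal(A @ v1, 3 * v1)
check_v2 = np.array_equal(A @ v2, -1 * v2)

# Scaling an eigenvector should keep it an eigenvector with the same eigenvalue.
c = 2  # an arbitrary non-zero scalar, chosen only for illustration
check_scaled = np.array_equal(A @ (c * v1), 3 * (c * v1))

print(check_v1, check_v2, check_scaled)
```

All three checks use exact integer arithmetic, so `np.array_equal` is safe here; with floating-point entries, `np.allclose` would be the more forgiving choice.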

Visually, this means that $\color{orange} \vec v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ lives on the same line as $\color{orange} A \vec v_1$, which is also the line that $\color{orange} c \vec v_1$ and $\color{orange} A(c \vec v_1)$ live on (for any non-zero scalar $c$). And, $\color{#3d81f6} \vec v_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$ lives on the same line as $\color{#3d81f6} A \vec v_2$.

But, if a vector isn’t already on one of the two aforementioned lines – like $\color{#d81a60} \vec x = \begin{bmatrix} -1 \\ 0 \end{bmatrix}$ – then it will change directions when multiplied by $A$, and it is not an eigenvector.
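One way to make “lives on the same line” concrete in $\mathbb{R}^2$ is to check whether a vector and its image under $A$ are parallel. The sketch below uses a hypothetical helper, `on_same_line` (my own name, not something from numpy), which tests whether the $2 \times 2$ determinant formed by the two vectors is zero.

```python
import numpy as np

def on_same_line(u, w):
    """Return True if the 2D vectors u and w lie on one line through the origin."""
    # The determinant of the matrix with columns u and w is zero
    # exactly when u and w are parallel (or one of them is the zero vector).
    # The exact comparison is fine here because all entries are integers.
    return u[0] * w[1] - u[1] * w[0] == 0

A = np.array([[1, 2],
              [2, 1]])

v1 = np.array([1, 1])
v2 = np.array([-1, 1])
x = np.array([-1, 0])

print(on_same_line(v1, A @ v1))  # v1 and A v1
print(on_same_line(v2, A @ v2))  # v2 and A v2
print(on_same_line(x, A @ x))    # x and A x
```

Here $A \vec x = \begin{bmatrix} -1 \\ -2 \end{bmatrix}$, which points in a different direction than $\vec x$, so the last check fails while the first two pass.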

Image produced in Jupyter

Just to be 100% clear, I’ve used $\color{#3d81f6} 2 \vec v_2$ instead of $\color{#3d81f6} \vec v_2$ above to illustrate the fact that eigenvectors are only defined up to a scalar multiple; $\color{#3d81f6} 2 \vec v_2$ is just as good an eigenvector as $\color{#3d81f6} \vec v_2$ is, and both correspond to the same eigenvalue, $\mathbf{\lambda} = -1$.

You might notice that the two eigenvectors of $A$ corresponding to the two different eigenvalues are orthogonal in the example above. This is not true for every $2 \times 2$ matrix, but there’s a specific reason it’s true for $A$: it’s symmetric. I’ll elaborate more on this idea in Chapter 9.5, but for now, just remember that for symmetric matrices, eigenvectors corresponding to different eigenvalues are orthogonal.
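Here’s a quick sketch of that orthogonality check. The dot product of the two eyeballed eigenvectors is computed directly. I’ve also called `np.linalg.eigh`, numpy’s eigenvalue routine for symmetric matrices, which this section hasn’t introduced; treat it as an optional aside. It returns eigenvalues in ascending order along with unit-length eigenvectors as columns.

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])

v1 = np.array([1, 1])
v2 = np.array([-1, 1])

# Orthogonal vectors have a dot product of zero.
dot = v1 @ v2
print(dot)

# eigh is meant for symmetric matrices; Q's columns are unit eigenvectors.
eigvals, Q = np.linalg.eigh(A)

# If the columns of Q are orthonormal, Q^T Q is the identity matrix.
print(np.allclose(Q.T @ Q, np.eye(2)))
```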

Finding Eigenvalues using numpy

Just to show you another example, consider

$$B = \begin{bmatrix} 2 & 5 \\ 1 & 0 \end{bmatrix}$$

Its eigenvalues and eigenvectors aren’t particularly nice, and since we don’t yet have a way to find them by hand, now is as good a time as any to use numpy:

import numpy as np

B = np.array([[2, 5],
              [1, 0]])

np.linalg.eig(B)
EigResult(eigenvalues=array([ 3.44948974, -1.44948974]), eigenvectors=array([[ 0.96045535, -0.82311938],
       [ 0.27843404,  0.56786837]]))

So, $B$ has eigenvalues of $\approx 3.45$ and $\approx -1.45$. Note that the eigenvectors are the columns of the matrix returned, not the rows!

eigvals, eigvecs = np.linalg.eig(B)
for i in range(eigvecs.shape[1]):
    print(f"Eigenvector {i+1}: {eigvecs[:, i]}")
    print(f"Eigenvalue {i+1}: {eigvals[i]}")
    print()
Eigenvector 1: [0.96045535 0.27843404]
Eigenvalue 1: 3.4494897427831783

Eigenvector 2: [-0.82311938  0.56786837]
Eigenvalue 2: -1.449489742783178
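To convince yourself that the columns (and not the rows) are the eigenvectors, you can check $B \vec v = \mathbf{\lambda} \vec v$ for every column at once. This is a sketch; `lhs` and `rhs` are names I’ve made up for the two sides of that equation.

```python
import numpy as np

B = np.array([[2, 5],
              [1, 0]])

eigvals, eigvecs = np.linalg.eig(B)

# Column j of B @ eigvecs is B times the j-th eigenvector.
lhs = B @ eigvecs

# Broadcasting multiplies column j of eigvecs by eigvals[j].
rhs = eigvecs * eigvals

# Compare with a tolerance, since these are floating-point numbers.
print(np.allclose(lhs, rhs))
```

If you mistakenly treat the rows as eigenvectors, for example by checking `B @ eigvecs[0]` against `eigvals[0] * eigvecs[0]`, the two sides won’t match.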

eigvecs is a matrix where each column is an eigenvector of $B$. For now, call this matrix $P$. Below, I calculate $P^TP$, which contains the dot products of all pairs of eigenvectors of $B$. The diagonal of this matrix contains the dot products of each eigenvector with itself; since these are 1, this tells us that the returned eigenvectors are unit vectors. This was a design decision by the implementors of np.linalg.eig – remember that we can scale an eigenvector by any non-zero scalar and it is still an eigenvector. The off-diagonal entries of -0.632 tell us that the two eigenvectors are not orthogonal.

eigvecs.T @ eigvecs
array([[ 1.        , -0.63245553],
       [-0.63245553,  1.        ]])

Let’s take a look at the directions of the eigenvectors for $B$.

Image produced in Jupyter

While the eigenvectors of $B$ are not orthogonal, they are still linearly independent and span all of $\mathbb{R}^2$. This was also the case in the $A$ example above. It is not guaranteed that an $n \times n$ matrix has $n$ linearly independent eigenvectors that span all of $\mathbb{R}^n$, though it’s a desirable property. Matrices that have this property are called diagonalizable, and they are the focus of Chapter 9.4.
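A rough numerical way to see that the eigenvectors of $B$ span $\mathbb{R}^2$: the matrix $P$ of eigenvectors should have rank 2, and any vector should be expressible as a combination of its columns. In the sketch below, the target vector `x` and the name `coeffs` are arbitrary choices of mine.

```python
import numpy as np

B = np.array([[2, 5],
              [1, 0]])

_, P = np.linalg.eig(B)  # columns of P are eigenvectors of B

# Rank 2 means the two eigenvectors are linearly independent.
rank = np.linalg.matrix_rank(P)
print(rank)

# Since the columns span R^2, we can write any x as P @ coeffs.
x = np.array([-1.0, 0.0])
coeffs = np.linalg.solve(P, x)
print(np.allclose(P @ coeffs, x))
```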

Let’s consider a few more examples, each of which is meant to highlight a different key property of eigenvalues and eigenvectors.

Example: Matrix Powers

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ be the matrix from the first example above. What are the eigenvalues and eigenvectors of $A^2$? Find them manually. You should notice the property below.
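Once you’ve tried this by hand, a check like the following can confirm your answer. The reasoning it tests is that if $A \vec v = \mathbf{\lambda} \vec v$, then $A^2 \vec v = A(A \vec v) = \mathbf{\lambda} A \vec v = \mathbf{\lambda}^2 \vec v$, so the eigenvectors stay the same and the eigenvalues get squared. This is a sketch; `A2` is just my name for $A^2$.

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])

v1 = np.array([1, 1])   # eigenvector of A with eigenvalue 3
v2 = np.array([-1, 1])  # eigenvector of A with eigenvalue -1

# A squared, computed as a matrix product (not element-wise squaring).
A2 = np.linalg.matrix_power(A, 2)
print(A2)

# The same vectors should be eigenvectors of A^2, with squared eigenvalues.
print(np.array_equal(A2 @ v1, (3 ** 2) * v1))
print(np.array_equal(A2 @ v2, ((-1) ** 2) * v2))
```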

Example: Non-Invertible Matrices

Let $A = \begin{bmatrix} 1 & 4 \\ 3 & 12 \end{bmatrix}$. Notice that $\text{rank}(A) = 1$. $A$ has an eigenvalue of 13 with eigenvector $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ – verify that this is the case. Does it have another eigenvalue? What is the corresponding eigenvector?
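After working through this yourself, here’s one way to check it. The sketch assumes the second eigenvector is $\begin{bmatrix} 4 \\ -1 \end{bmatrix}$, which I picked because it’s in the null space of $A$; any non-zero multiple would work equally well. Since $A \vec w = \vec 0 = 0 \vec w$, a non-zero vector in the null space is an eigenvector with eigenvalue 0.

```python
import numpy as np

A = np.array([[1, 4],
              [3, 12]])

# The second column is 4 times the first, so the rank is 1.
rank = np.linalg.matrix_rank(A)
print(rank)

v = np.array([1, 3])   # claimed eigenvector with eigenvalue 13
w = np.array([4, -1])  # assumed null-space vector (my choice for illustration)

print(np.array_equal(A @ v, 13 * v))
print(np.array_equal(A @ w, 0 * w))
```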

Example: The Identity Matrix

What are the eigenvalues and eigenvectors of $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$?
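Since $I \vec x = \vec x = 1 \vec x$ for every $\vec x$, every non-zero vector is an eigenvector of $I$ with eigenvalue 1. Here’s a small sketch that tries this on a handful of random vectors; the seed and the number of vectors are arbitrary choices of mine.

```python
import numpy as np

I = np.eye(2)

# Draw a few random vectors in R^2; the seed is arbitrary.
rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 2))

# Each vector should be unchanged by I, i.e. I x = 1 * x.
all_fixed = all(np.allclose(I @ x, 1 * x) for x in xs)
print(all_fixed)
```

Note that the identity matrix has the eigenvalue 1 repeated twice, but its eigenvectors still span all of $\mathbb{R}^2$.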

A question on your mind might be, how do we find the eigenvalues of a generic matrix, when they aren’t easy to eyeball? Keep reading! The bottom of Chapter 9.2 also has a great summary of the key ideas from this section.