5.1. Eigenvalues and Eigenvectors

import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib_inline.backend_inline import set_matplotlib_formats

set_matplotlib_formats("svg")
sns.set_context("poster")
sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (10, 5)

A = np.array([[0, 1 / 2, 1 / 2, 1 / 3],
              [1, 0, 0, 1 / 3],
              [0, 0, 0, 1 / 3],
              [0, 1 / 2, 1 / 2, 0]])

def plot_from_adjacency(adjacency_matrix, node_sizes=0.25):
    np.random.seed(25)
    plt.figure(figsize=(8, 5))
    G = nx.from_numpy_array(adjacency_matrix.T, create_using=nx.DiGraph)
    layout = nx.spring_layout(G)
    labels_dict={i: i+1 for i in range(adjacency_matrix.shape[0])}
    nx.draw(G, layout, 
            node_size=15000 * node_sizes, labels=labels_dict, with_labels=True, font_color='white', font_weight='bold', font_size=15, 
            connectionstyle='arc3, rad = 0.1')
    plt.show()

plot_from_adjacency(A)

Loading...

From the research paper linked above:

PageRank or PR(A) can be calculated using a simple iterative algorithm, and corresponds to the principal eigenvector of the normalized link matrix of the web.

We’ll make sense of the algorithm in Homework 10: just know that this is where we’re heading.

Introduction¶

Before we get started, keep in mind that everything we’re about to introduce only applies to square matrices. This was also true when we first studied invertibility, and for the same reason: we should think of eigenvalues and eigenvectors as properties of a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^n$ (that is, from a vector space to itself), not between vector spaces of different dimensions. Rectangular matrices will have their moment in Chapter 5.3.

This definition is a bit hard to parse when you first look at it. But here’s the intuitive interpretation.

“Eigen” is a German word meaning “own”, as in “one’s own”. So, an eigenvector is a vector who still points in its own direction when transformed by $A$ .

A First Example¶

Let’s start with a $2 \times 2$ matrix:

A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}

I’ve chosen the numbers in $A$ to be small enough that we can roughly eyeball the eigenvectors. Here’s how I look at $A$ :

First, notice that both rows of $A$ sum to 3, meaning that
$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix}$
This tells me that $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector of $A$ with eigenvalue 3. But, $\begin{bmatrix} 2 \\ 2 \end{bmatrix}$ is also an eigenvector of $A$ with the same eigenvalue, since
$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 6 \\ 6 \end{bmatrix} = 3 \begin{bmatrix} 2 \\ 2 \end{bmatrix}$
Indeed, if $\vec v$ is an eigenvector of $A$ with eigenvalue $\mathbf{\lambda}$ , then so is $c \vec v$ for any non-zero scalar $c$ . So really, eigenvectors define directions.
Additionally, noticing that there’d be some symmetry if I took the difference of the entries in each row of $A$ , consider
$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} = -1 \begin{bmatrix} -1 \\ 1 \end{bmatrix}$
This tells me that $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ is an eigenvector of $A$ with eigenvalue -1.

So, $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ has two eigenvalues, 3 and -1, and two corresponding eigenvectors, $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ . In general, an $n \times n$ matrix has $n$ eigenvalues, but some of them may be the same, and some of them may not be real numbers. We’ll see how to systematically find these eigenvalues and eigenvectors in just a bit.

Visually, this means that $\color{orange} \vec v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ lives on the same line as $\color{orange} A \vec v_1$ , which is also the line that $\color{orange} c \vec v_1$ and $\color{orange} A(c \vec v_1)$ live on (for any non-zero scalar $c$ ). And, $\color{#3d81f6} \vec v_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$ lives on the same line as $\color{#3d81f6} A \vec v_2$ .

But, if a vector isn’t already on one of the two aforementioned lines – like $\color{#d81a60} \vec x = \begin{bmatrix} -1 \\ 0 \end{bmatrix}$ – then it will change directions when multiplied by $A$ , and it is not an eigenvector.

from utils import plot_vectors
import numpy as np

# Matrix
A = np.array([[1, 2], [2, 1]])

# Vectors
v1 = np.array([1, 1])      # eigenvector direction
v2 = np.array([-2, 2])    # same eigenvector direction, negative/scaled
x = -np.array([1, 0])
y = np.array([-2, -1])

Av1 = A @ v1
Av2 = A @ v2
Ax = A @ x
Ay = A @ y

# To set alpha=0.5 in hex, convert as follows:
# orange -> #FFA500   ===> with alpha=0.5: #80FFA500
# #3d81f6 -> with alpha=0.5: #803d81f6
# #d81a60 -> with alpha=0.5: #80d81a60
# #004d40 -> with alpha=0.5: #80004d40

vectors = [
    (tuple(v2), 'rgba(61,129,246,1.0)', r'$2\vec v_2$'),      # dimmed blue
    (tuple(Av2), 'rgba(61,129,246,0.5)', r'$A\vec v_2 = -1 (2\vec v_2)$'),    # full blue

    (tuple(v1), 'rgba(255,165,0,1.0)', r'$\vec v_1$'),       # dimmed orange
    (tuple(Av1), 'rgba(255,165,0,0.5)', r'$A\vec v_1 = 3 \vec v_1$'),     # full orange

    (tuple(x),  'rgba(216,26,96,1.0)', r'$\vec x$'),         # dimmed pink
    (tuple(Ax), 'rgba(216,26,96,0.5)', r'$A\vec x$'),        # full pink

]

# Prepare faint lines along the two eigenvector directions: [1,1] and [-1,1]
line_length = 6
eig1_dir = np.array([1, 1]) / np.linalg.norm([1, 1])
eig2_dir = np.array([-1, 1]) / np.linalg.norm([-1, 1])

eig1_line1 = eig1_dir * line_length
eig1_line2 = -eig1_dir * line_length
eig2_line1 = eig2_dir * line_length
eig2_line2 = -eig2_dir * line_length

background_lines = [
    ((eig1_line1[0], eig1_line1[1]), (eig1_line2[0], eig1_line2[1]), "#bbbbbb", "dot"),
    ((eig2_line1[0], eig2_line1[1]), (eig2_line2[0], eig2_line2[1]), "#cccccc", "dot"),
]

fig = plot_vectors(
    vectors,
    # background_lines=background_lines,
    vdeltay=0.1
)

# Add dotted lines in the eigenvector directions
for line in background_lines:
    (x0, y0), (x1, y1), color, dash = line
    fig.add_shape(
        type="line",
        x0=x0, y0=y0, x1=x1, y1=y1,
        line=dict(color=color, width=1, dash="dot"),
        layer="below"
    )

fig.update_layout(
    xaxis=dict(range=[-3, 3], dtick=1),
    yaxis=dict(scaleanchor="x", range=[-3, 3], dtick=1),
    title=r"$$\text{Visualizing the eigenvectors of } A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$$",
    margin=dict(l=0, r=0, t=80, b=0)
)

# Annotate [-1, 0] is not an eigenvector at the bottom left in #d81a60
fig.add_annotation(
    x=-2, y=-0.5,
    text=r"$$\begin{bmatrix} -1 \\ 0 \end{bmatrix} \text{ is not an eigenvector!}$$",
    font=dict(color="#d81a60", size=16),
    showarrow=False,
    align="left"
)

fig.show(scale=3, renderer='png')

Just to be 100% clear, I’ve used $\color{#3d81f6} \vec v_2$ instead of $\color{#3d81f6} \vec v_2$ above just to illustrate the fact that eigenvectors are only defined up to a scalar multiple; $\color{#3d81f6} 2 \vec v_2$ is just as good as an eigenvector as $\color{#3d81f6} \vec v_2$ is, and both correspond to the same eigenvalue, $\mathbf{\lambda} = 1$ .

You might notice that the two eigenvectors of $A$ corresponding to the two different eigenvalues are orthogonal in the example above. This is not true in general for any $2 \times 2$ matrix, but there’s a specific reason it’s true for $A$ : it’s symmetric. I’ll elaborate more on this idea in Chapter 5.3, but for now, just remember that symmetric matrices have orthogonal eigenvectors.

Finding Eigenvalues using `numpy`¶

Just to show you another example, consider

B = \begin{bmatrix} 2 & 5 \\ 1 & 0 \end{bmatrix}

Its eigenvalues and eigenvectors aren’t particularly nice, and since we don’t yet have a way to find them by hand, now is as good as a time as any to use numpy:

B = np.array([[2, 5],
              [1, 0]])

np.linalg.eig(B)

EigResult(eigenvalues=array([ 3.44948974, -1.44948974]), eigenvectors=array([[ 0.96045535, -0.82311938],
       [ 0.27843404,  0.56786837]]))

So, $B$ has eigenvalues of $\approx 3.45$ and $\approx -1.45$ . Note that the eigenvectors are the columns of the matrix returned, not the rows!

eigvals, eigvecs = np.linalg.eig(B)
for i in range(eigvecs.shape[1]):
    print(f"Eigenvector {i+1}: {eigvecs[:, i]}")
    print(f"Eigenvalue {i+1}: {eigvals[i]}")
    print()

Eigenvector 1: [0.96045535 0.27843404]
Eigenvalue 1: 3.4494897427831783

Eigenvector 2: [-0.82311938  0.56786837]
Eigenvalue 2: -1.449489742783178

eigvecs is a matrix where each column is an eigenvector of $B$ . For now, call this matrix $P$ . Below, I calculate $P^TP$ , which contains the dot products of all pairs of eigenvectors of $B$ . The diagonal of this matrix contains the dot products of each eigenvector with itself; since these are 1, this tells us that the returned eigenvectors are unit vectors. This was a design decision by the implementors of np.linalg.eig – remember that we can scale an eigenvector by any non-zero scalar and it is still an eigenvector. The off-diagonal entries of -0.632 tell us that the two eigenvectors are not orthogonal.

eigvecs.T @ eigvecs

array([[ 1.        , -0.63245553],
       [-0.63245553,  1.        ]])

Let’s take a look at the directions of the eigenvectors for $B$ .

from utils import plot_vectors
import numpy as np

# Matrix
A = np.array([[2, 5], [1, 0]])

# Find eigenvalues and eigenvectors
eigvals, eigvecs = np.linalg.eig(A)

# Pick v1 to be the *unit* eigenvector for the first eigenvalue (use eigvecs[:,0])
v1_unit = eigvecs[:, 0]# / np.linalg.norm(eigvecs[:, 0])
v1 = v1_unit

# Pick v2 to be *three times* the unit eigenvector for the second eigenvalue (eigvecs[:,1])
v2 = 3 * eigvecs[:, 1]# / np.linalg.norm(eigvecs[:, 1])

x = -np.array([1, 0])

Av1 = A @ v1
Av2 = A @ v2
Ax = A @ x

# To set alpha=0.5 in hex, convert as follows:
# orange -> #FFA500   ===> with alpha=0.5: #80FFA500
# #3d81f6 -> with alpha=0.5: #803d81f6
# #d81a60 -> with alpha=0.5: #80d81a60
# #004d40 -> with alpha=0.5: #80004d40

vectors = [
    (tuple(v2), 'rgba(61,129,246,1.0)', r'$3\vec v_2$'),      # dimmed blue (same as A example but 3*v2)
    (tuple(Av2), 'rgba(61,129,246,0.5)', r'$A\vec v_2 = %.2f (3\vec v_2)$' % eigvals[1]),    # full blue, faded
    (tuple(v1), 'rgba(255,165,0,1.0)', r'$\vec v_1$'),       # dimmed orange (matches prior coloring)
    (tuple(Av1), 'rgba(255,165,0,0.5)', r'$A\vec v_1 = %.2f \vec v_1$' % eigvals[0]),     # full orange, faded
    (tuple(x),  'rgba(216,26,96,1.0)', r'$\vec x$'),         # dimmed pink
    (tuple(Ax), 'rgba(216,26,96,0.5)', r'$A\vec x$'),        # full pink, faded
]

# Prepare faint lines along the two eigenvector directions (use actual directions from eigvecs)
line_length = 6
eig1_dir = v1_unit
eig2_dir = eigvecs[:, 1] / np.linalg.norm(eigvecs[:, 1])

eig1_line1 = eig1_dir * line_length
eig1_line2 = -eig1_dir * line_length
eig2_line1 = eig2_dir * line_length
eig2_line2 = -eig2_dir * line_length

background_lines = [
    ((eig1_line1[0], eig1_line1[1]), (eig1_line2[0], eig1_line2[1]), "#bbbbbb", "dot"),
    ((eig2_line1[0], eig2_line1[1]), (eig2_line2[0], eig2_line2[1]), "#cccccc", "dot"),
]

fig = plot_vectors(
    vectors,
    # background_lines=background_lines,
    vdeltay=0.1
)

# Add dotted lines in the eigenvector directions
for line in background_lines:
    (x0, y0), (x1, y1), color, dash = line
    fig.add_shape(
        type="line",
        x0=x0, y0=y0, x1=x1, y1=y1,
        line=dict(color=color, width=1, dash="dot"),
        layer="below"
    )

fig.update_layout(
    xaxis=dict(range=[-3, 3], dtick=1),
    yaxis=dict(scaleanchor="x", range=[-3, 3], dtick=1),
    title=r"$$\text{Visualizing the eigenvectors of } B = \begin{bmatrix} 2 & 5 \\ 1 & 0 \end{bmatrix}$$",
    margin=dict(l=0, r=0, t=80, b=0)
# 
)

# Annotate [-1, 0] is not an eigenvector at the bottom left in #d81a60
fig.add_annotation(
    x=-2.5, y=-0.25,
    text=r"$$\begin{bmatrix} -1 \\ 0 \end{bmatrix} \text{ is still not an eigenvector!}$$",
    font=dict(color="#d81a60", size=16),
    showarrow=False,
    align="left"
)

fig.show(scale=3, renderer='png')

While the eigenvectors of $B$ are not orthogonal, they are still linearly independent and span all of $\mathbb{R}^2$ . This was also the case in the $A$ example above. The fact that the eigenvectors of an $n \times n$ matrix are linearly independent and span all of $\mathbb{R}^n$ is also not guaranteed to be true, though it’s a desirable property. The class of matrices that have this property are called diagonalizable, which are the focus of Chapter 5.3.

Let’s consider a few more examples, each of which is meant to highlight a different key property of eigenvalues and eigenvectors.

Example: Matrix Powers¶

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ be the matrix from the first example above. What are the eigenvalues and eigenvectors of $A^2$ ? Find them manually. You should notice the property below.

Solution

\begin{align*} A^2 &= \begin{bmatrix} 1 & 2 \\ 2 & 1\end{bmatrix}\begin{bmatrix} 1 & 2 \\ 2 & 1\end{bmatrix} \\&= \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} \end{align*}

Similar to the original example, $A^2$ ’s rows both sum to the same number, 9, meaning that

\begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix}\begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix}9 \\ 9\end{bmatrix} = 9\begin{bmatrix}1 \\ 1\end{bmatrix}

We can do the same thing with $A^2 \begin{bmatrix}1 \\ -1\end{bmatrix}$ :

A^2 \begin{bmatrix}1 \\ -1\end{bmatrix} = \begin{bmatrix}1 \\ -1\end{bmatrix}= 1\begin{bmatrix}1 \\ -1\end{bmatrix}

So the eigenvalues of $A^2$ are 9 and 1, and their respective eigenvectors are $\begin{bmatrix}1 \\ 1\end{bmatrix}$ and $\begin{bmatrix}1 \\ -1\end{bmatrix}$ .

But, these are the same two eigenvectors that $A$ has, and the corresponding eigenvalues are the squares of $A$ ’s eigenvalues, since $3^2 = 9$ and $(-1)^2 = 1$ . Is this true more generally? Yes!

If

\mathbf{\lambda}

is an eigenvalue of

A

with eigenvector

\vec v

, then

\mathbf{\lambda}^k

is an eigenvalue of

A^k

with the same eigenvector

\vec v

. (Click to see the proof.)

A quick proof: suppose $A \vec v = \lambda \vec v$ . Then,

A^2 \vec v = A (A \vec v) = A (\lambda \vec v) = \lambda (A \vec v) = \lambda^2 \vec v

This logic can be extended to $A^3$ , then $A^4$ , and so on. (If you are familiar with induction from EECS 203, you can think of this as an inductive proof.)

Note that the converse of this statement is not necessarily true, meaning it’s possible for $A^2$ to have an eigenvector that is not an eigenvector of $A$ . For example, if $A$ corresponds to a rotation by 90º degrees (or $\frac{\pi}{2}$ radians), then $A^2$ corresponds to a rotation by 180º degrees (or $\pi$ radians). No vector lies on the same line after rotation by 90º degrees, but all vectors lie on the same line after rotation by 180º degrees. So, in this example, $A^2$ has plenty of eigenvectors that $A$ does not have.

Example: Non-Invertible Matrices¶

Let $A = \begin{bmatrix} 1 & 4 \\ 3 & 12 \end{bmatrix}$ . Notice that $\text{rank}(A) = 1$ . $A$ has an eigenvalue of 13 with eigenvector $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ – verify that this is the case. Does it have another eigenvalue? What is the corresponding eigenvector?

Solution

Since $\text{rank}(A) = 1$ , $A$ has a non-trivial null space, and any vector in the null space will get sent to 0 times itself when multiplied by $A$ . Since $4 \cdot \text{column 1} = \text{column 2}$ , the null space of $A$ is spanned by the vector $\begin{bmatrix} 4 \\ -1 \end{bmatrix}$ . So,

A \begin{bmatrix} 4 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 & 4 \\ 3 & 12 \end{bmatrix} \begin{bmatrix} 4 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = 0 \begin{bmatrix} 4 \\ -3 \end{bmatrix}

$\vec 0$ can’t be an eigenvector, but 0 is a perfectly good eigenvalue.

Our intuition tells us that $A$ should have another eigenvalue, since it’s a $2 \times 2$ matrix. It happens to be 13, corresponding to the eigenvector $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ . In the section titled “The Characteristic Polynomial”, we’ll see how to find this eigenvalue-eigenvector pair without guesswork.

0 is an eigenvalue of

A

if and only if

A

is not invertible. (Click to see more details.)

If $A$ is invertible, then the only solution to $A \vec v = 0 \vec v = \vec 0$ is the trivial solution where $\vec v$ is the zero vector itself. But, we defined eigenvectors to be non-zero vectors. So, 0 can’t be an eigenvalue of an invertible matrix. If $A$ is not invertible, then $A$ has a non-trivial null space, and any non-zero vector in the null space is an eigenvector corresponding to the eigenvalue 0.

Example: The Identity Matrix¶

What are the eigenvalues and eigenvectors of $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ ?

Solution

The identity matrix multiplied by any vector $\vec v$ returns that same vector back, meaning that all vectors in $\mathbb{R}^2$ are eigenvectors of $I$ , all with eigenvalue 1.

This is the first example we’ve seen so far where there exist multiple “lines” or “directions” of eigenvectors for a single eigenvalue. The vectors $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ 4 \end{bmatrix}$ are both eigenvectors of $I$ with eigenvalue 1 but they don’t lie on the same line. We will study this idea – of having multiple eigenvector directions for a single eigenvalue – more precisely in Chapter 5.3.

The Characteristic Polynomial¶

So far, we’ve found the eigenvalues and eigenvectors of a matrix by eyeballing them or reasoning about them geometrically, but this is not a sustainable strategy. Let’s develop a more systematic approach.

We’re looking for combinations of scalars, $\mathbf{\lambda}$ , and vectors, $\vec v$ , such that

A \vec v = \lambda \vec v

In some ways, we’re trying to “solve” for $\mathbf{\lambda}$ and $\vec v$ given $A$ . Let’s experiment a little:

\begin{align*} A \vec v = \lambda \vec v \\ A \vec v - \lambda \vec v = \vec 0 \\ (A - \lambda I) \vec v = \vec 0 \end{align*}

What the above says is that if $\vec v$ is an eigenvector of $A$ with eigenvalue $\mathbf{\lambda}$ , then $\vec v$ is in the null space of $A - \lambda I$ . But, since eigenvectors can’t be the zero vector, this means that $A - \lambda I$ is not invertible, since it has a non-trivial null space!

Thinking back to Chapter 2.9, we know that there are several equivalent ways to check if a matrix is not invertible. Perhaps the most computational approach is to compute its determinant; if a (square) matrix’s determinant is 0, then it is not invertible, otherwise it is.

So, since $A - \lambda I$ is not invertible, its determinant must be 0!

A \vec v = \lambda \vec v \implies \det(A - \lambda I) = 0

(We won’t always use the symbol $p(\lambda)$ , I just introduced it above to make it clear that $\text{det}(A - \lambda I)$ is a polynomial function of $\mathbf{\lambda}$ .)

Let’s revisit our example $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ .

The matrix $A - \lambda I$ is

A - \lambda I = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 - \lambda & 2 \\ 2 & 1 - \lambda \end{bmatrix}

Note that $A - \lambda I$ involves subtracting $\mathbf{\lambda}$ from the diagonal elements of $A$ , and leaving all other elements unchanged.

The determinant of $A - \lambda I$ is

\det(A - \lambda I) = (1 - \lambda)(1 - \lambda) - 2 \cdot 2 = \lambda^2 - 2\lambda + 1 - 4 = \underbrace{\lambda^2 - 2\lambda - 3}_{p(\lambda)}

The eigenvalues of $A$ are the values of $\mathbf{\lambda}$ where $\lambda^2 - 2\lambda - 3 = 0$ .

xs = np.linspace(-3, 5)
p = lambda x: x ** 2 - 2 * x - 3
ys = p(xs)

# Roots of the polynomial (the eigenvalues)
eigenvalues = np.array([-1, 3])
eigen_y = p(eigenvalues)

fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=ys, mode='lines', line=dict(color='#3d81f6')))

# Add orange dots at eigenvalues
fig.add_trace(go.Scatter(
    x=eigenvalues, y=eigen_y,
    mode='markers',
    marker=dict(size=12, color='orange', line=dict(width=2, color='black')),
    name='eigenvalues'
))

# Annotate above each eigenvalue
for xv, yv in zip(eigenvalues, eigen_y):
    fig.add_annotation(
        x=xv, y=yv-4,  # slightly above the dot
        text="eigenvalue!",
        showarrow=False,
        font=dict(color="orange", size=16),
        yanchor='bottom'
    )

fig.update_layout(
    xaxis_title='$\lambda$',
    yaxis_title='$p(\lambda)$',
    width=600,
    height=300,
    font=dict(family="Palatino Linotype, Palatino, serif", size=16, color="#222"),
    plot_bgcolor="#ffffff",
    paper_bgcolor="#ffffff",
    xaxis=dict(
        showgrid=True,
        gridcolor="#f0f0f0",
        zeroline=True,
        zerolinecolor="#ccc",
        zerolinewidth=2,
        linecolor=None,
        linewidth=0,
        mirror=False,
        dtick=1
    ),
    yaxis=dict(
        showgrid=True,
        gridcolor="#f0f0f0",
        zeroline=True,
        zerolinecolor="#ccc",
        zerolinewidth=2,
        linecolor=None,
        linewidth=0,
        mirror=False
    ),
    margin=dict(l=60, r=20, t=50, b=60),
    title=r'$$p(\lambda) = \lambda^2 - 2\lambda - 3$$',
    showlegend=False
)

fig.show(scale=3, renderer='png')

(In general, we’re not going to plot the characteristic polynomial each time, but I think it’s useful to see once or twice to give the idea some context.)

By factoring, I can write this equation as

(\lambda + 1)(\lambda - 3) = 0

which tells me that the eigenvalues of $A$ are $\lambda_1 = -1$ and $\lambda_2 = 3$ . If this equation weren’t factorable, I’d need to use the quadratic formula to find the eigenvalues. For now, there’s no particular “ordering” to the eigenvalues, i.e. I could have said $\lambda_1 = 3$ and $\lambda_2 = -1$ ; all that matters is that I stay consistent throughout a particular example.

Once we find the eigenvalues by solving $p(\lambda) = 0$ , we can find the eigenvectors by solving $(A - \lambda I) \vec v = \vec 0$ for each eigenvalue.

For $\lambda_1 = -1$ , we’re looking for vectors $\vec v$ such that $A \vec v = -1 \vec v$ , i.e. if $\vec v = \begin{bmatrix} a \\ b \end{bmatrix}$ , then
$\begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} -a \\ -b \end{bmatrix}$
(I’m using $a$ and $b$ instead of $v_1$ and $v_2$ because I’ll refer to the vectors $\vec v_1$ and $\vec v_2$ in just a moment.) As a system of equations, this says
$\begin{align*} a + 2b &= -a \\ 2a + b &= -b \end{align*}$
The first and second equations both tell us that $b = -a$ . Remember, we’d expect there to be infinitely many solutions to this system, since any scalar multiple of an eigenvector is still an eigenvector. So, the “simple” solution is $\vec v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ , but $\begin{bmatrix} 2 \\ -2 \end{bmatrix}$ , $\begin{bmatrix} 3 \\ -3 \end{bmatrix}$ , etc. are also solutions.
For $\lambda_2 = 3$ , let me introduce another way of finding the corresponding eigenvector. There’s nothing wrong with the system of equations approach, but it’s useful to have multiple techniques for solving problems in our toolkit. $\lambda_2 = 3$ tells us that $A \vec v = 3 \vec v$ , or equivalently, $(A - 3I) \vec v = \vec 0$ . So, the eigenvector we’re looking for is in the null space of $A - 3I$ .
$A - 3I = \begin{bmatrix} 1 - 3 & 2 \\ 2 & 1 - 3 \end{bmatrix} = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix}$
Here, we notice that both rows of $A - 3I$ sum to 0, meaning that
$(A - 3I) \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$
So, $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector for $\lambda_2 = 3$ .

Example: $2 \times 2$ Matrices¶

Find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & 1 \\ 2 & 4 \end{bmatrix}$ .

Solution

\begin{align*} A-\lambda I &= \begin{bmatrix} 3 - \lambda & 1 \\ 2 & 4 - \lambda \end{bmatrix} \\\text{det}(A-\lambda I) &= \ (3 - \lambda)(4 - \lambda)-(1 \cdot 2) \\&= 12 -7\lambda + \lambda^2 - 2 \\&= \lambda^2 - 7\lambda +10 \\&= (\lambda-2)(\lambda-5) \end{align*}

So our eigenvalues are $\lambda_1 = 2$ and $\lambda_2=5$ . Next, let’s find eigenvectors for them. As I did in the first example, I’ll show two different techniques for finding eigenvectors.

For $\lambda_1 = 2$ , we’re looking for a vector $\vec v = \begin{bmatrix} a \\ b \end{bmatrix}$ such that $A \vec v = 2 \vec v$ :
$\begin{align*} A\vec v &= 2\vec v \\ \begin{bmatrix} 3 & 1 \\ 2 & 4 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} &= \begin{bmatrix} 2a \\ 2b \end{bmatrix} \\ 3a + b &= 2a \\ 2a + 4b &= 2b \end{align*}$
As expected, there are infinitely many choices for $a$ and $b$ (since both equations tell us that $b = -a$ ), and the resulting eigenvectors $\vec v$ all lie on the same line. So, if we let $a = 1$ , then we have that $\vec v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is an eigenvector for $\lambda_1$ .
For $\lambda_2 = 5$ , I’ll try the null space approach, just to illustrate it once more. Again, I’m looking for a vector $\vec v$ such that $A \vec v = 5 \vec v$ , or equivalently, $(A - 5I) \vec v = \vec 0$ .
$A - 5I = \begin{bmatrix} 3 - 5 & 1 \\ 2 & 4 - 5 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 2 & -1 \end{bmatrix}$
The vector $\vec v_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is orthogonal to both rows of $A - 5I$ , meaning $(A - 5I) \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ . So, $\vec v_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is an eigenvector for $\lambda_2 = 5$ , though again any non-zero scalar multiple of $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ will also be an eigenvector for $\lambda_2 = 5$ .

So, $A$ has eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 5$ , and corresponding eigenvectors $\vec v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $\vec v_2 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ .

Example: Diagonal Matrices¶

Find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 3 & 0 \\ 0 & -4 \end{bmatrix}$ . What do you notice about the eigenvalues? The eigenvectors?

Solution

The key is to notice that for diagonal matrices – that is, matrices where all the entries off the diagonal are 0 – the eigenvalues are simply the entries on the diagonal.

\begin{align*} A-\lambda I &= \begin{bmatrix} 3 - \lambda & 0 \\ 0 &-4 - \lambda \end{bmatrix} \\\text{det}(A-\lambda I) &= (3 - \lambda)(-4 - \lambda) \end{align*}

The eigenvalues $\lambda_1 = 3$ and $\lambda_2 = -4$ are the same as the values on the diagonal! The corresponding eigenvectors are $\vec v_1 = \begin{bmatrix}1 \\ 0\end{bmatrix}$ and $\vec v_2 = \begin{bmatrix}0 \\ 1\end{bmatrix}$ . Just to illustrate one of those cases out,

A \vec v_2 = \begin{bmatrix} 3 & 0 \\ 0 & -4 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ -4 \end{bmatrix} = -4 \begin{bmatrix} 0 \\ 1 \end{bmatrix}

The fact that eigenvalues and eigenvectors are so easy to find for diagonal matrices, coupled with the fact that diagonal matrices are really easy to compute powers of (e.g. $A^2$ is just the diagonal matrix with each entry squared), makes them very useful in practice.

Find two different $2 \times 2$ matrices who have the characteristic polynomial

p(\lambda) = \lambda^2 - 4 \lambda + 3

Solution

Recall that for a $2 \times 2$ matrix $A=\begin{bmatrix}a & b \\ c & d\end{bmatrix}$ :

\begin{align*} p(\lambda)&=\text{det}(A - \lambda I) \\&=\begin{vmatrix}a - \lambda & b \\ c & d-\lambda \end{vmatrix} \\&=(a-\lambda)(d-\lambda)-bc \end{align*}

A simple approach to this problem is to find 2 diagonal matrices $D = \begin{bmatrix}a & 0 \\ 0 & d\end{bmatrix}$ and $D'=\begin{bmatrix}d & 0 \\ 0 & a\end{bmatrix}$ . That way, all we have to do is factor $p(\lambda)$ into the form $(a-\lambda)(d-\lambda)$ :

\begin{align*} \lambda^2-4\lambda+3 &= (3-\lambda)(1-\lambda) \end{align*}

So, our two different matrices are $D = \begin{bmatrix}3 & 0 \\ 0 & 1\end{bmatrix}$ and $D' = \begin{bmatrix}1 & 0 \\ 0 & 3\end{bmatrix}$ .

There exist non-diagonal matrices with the same characteristic polynomial, too. For instance,

\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}

also has eigenvalues of 1 and 3, and its characteristic polynomial is also $p(\lambda) = \lambda^2 - 4\lambda + 3$ .

Example: $3 \times 3$ Matrices¶

Recall, if $A$ is a $3 \times 3$ matrix, like

A = \begin{bmatrix} {\color{3d81f6} a} & {\color{3d81f6} b} & {\color{3d81f6} c} \\ {\color{orange} d} & {\color{d81a60}e } & {\color{004d40}f } \\ {\color{orange} g} & {\color{d81a60}h } & {\color{004d40}i } \end{bmatrix}

then $\text{det}(A)$ is

\text{det}(A) = {\color{3d81f6} a} \begin{vmatrix} {\color{d81a60}e } & {\color{004d40}f } \\ {\color{d81a60}h } & {\color{004d40}i } \end{vmatrix} - {\color{3d81f6} b} \begin{vmatrix} {\color{orange} d} & {\color{004d40}f } \\ {\color{orange} g} & {\color{004d40}i } \end{vmatrix} + {\color{3d81f6} c} \begin{vmatrix} {\color{orange} d} & {\color{d81a60}e } \\ {\color{orange} g} & {\color{d81a60}h } \end{vmatrix}

where $| \cdot |$ denotes the determinant of the matrix.

Find the eigenvalues and eigenvectors of the upper triangular matrix

A = \begin{bmatrix} 3 & 3 & 0 \\ 0 & 1 & 5 \\ 0 & 0 & 2 \end{bmatrix}

What do you notice about the eigenvalues?

Solution

We’ll start by calculating $\text{det}(A-\lambda I)$ using the formula for the $3 \times 3$ determinant provided for us:

\begin{align*} A-\lambda I &= \begin{bmatrix} {\color{3d81f6}3 - \lambda} & {\color{3d81f6}3} & {\color{3d81f6}0} \\ {\color{orange}0} & {\color{d81a60}1-\lambda} & {\color{004d40}5} \\ {\color{orange}0} & {\color{d81a60}0} & {\color{004d40}2 - \lambda} \end{bmatrix} \\ \text{det}(A-\lambda I)= &~{\color{3d81f6}(3-\lambda)} \begin{vmatrix} {\color{d81a60}1-\lambda} & {\color{004d40}5} \\ {\color{d81a60}0} & {\color{004d40}2-\lambda} \end{vmatrix} -~ {\color{3d81f6}3} \begin{vmatrix} {\color{orange}0} & {\color{004d40}5} \\ {\color{orange}0} & {\color{004d40}2-\lambda} \end{vmatrix} +~ {\color{3d81f6}0} \begin{vmatrix} {\color{orange}0} & {\color{d81a60}1-\lambda} \\ {\color{orange}0} & {\color{d81a60}0} \end{vmatrix} \\ &= (3-\lambda)\big[(1-\lambda)(2-\lambda) - 0\big] -~ 3\big[0 - 0\big] +~ 0\big[0 - 0\big] \\ &= (3-\lambda)(1-\lambda)(2-\lambda) \end{align*}

So, the eigenvalues are the values along the diagonal: $\lambda_1=3$ , $\lambda_2=1$ , and $\lambda_3=2$ . This is a convenient property of triangular matrices (which is why I’ve planted this example here). Next, let’s find their eigenvectors.

For $\lambda_1 = 3$ , I’ll do it the null space way: I’m looking for a vector in the null space of $A - 3I$ . (I say “the null space way” but it’s really just a different way of writing the system of equations approach, and one that can sometimes be easier to eyeball.)
$A - 3I = \begin{bmatrix} 0 & 3 & 0 \\ 0 & -2 & 5 \\ 0 & 0 & -1 \end{bmatrix}$
The zeros in the first column tell me that $(A - 3I) \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$ , so $\vec v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ is an eigenvector for $\lambda_1 = 3$ . Because I have three unique eigenvalues and my matrix is $3 \times 3$ , I know that this is the only possible eigenvector direction for $\lambda_1 = 3$ ; in Chapter 5.3, we’ll run into situations where the null space of $A - 3I$ (for instance) is spanned by more than one vector, but that’s not the case here.
For $\lambda_2 = 1$ , I’ll do it the system of equations way: I’m looking for a vector $\vec v = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ such that $A \vec v = 1 \vec v$ .
$\begin{align} A\vec v &= \vec v \\ \begin{bmatrix} 3 & 3 & 0 \\ 0 & 1 & 5 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} &= \begin{bmatrix} a \\ b \\ c \end{bmatrix} \\3a + 3b &= a \\b+5c &= b \\2c &= c \end{align}$
Once again, there are infinitely many solutions to this system, which we’d expect since there’s a whole line of eigenvectors.
- The third equation above tells us that $c = 0$ .
- Substituting that into the second equation, we have that $b + 5 \cdot 0 = b$ , i.e. $b = b$ , so let’s treat $b$ as a free variable for now.
- The first equation above tells us that $3a + 3b = a$ , i.e. $a = -\frac{3}{2}b$ .
So, any vector of the form $\begin{bmatrix} -\frac{3}{2}b \\ b \\ 0 \end{bmatrix}$ is an eigenvector for $\lambda_2 = 1$ . But we just need to find one, so let’s take $b = 2$ , which gives us $\vec v_2 = \begin{bmatrix} -3 \\ 2 \\ 0 \end{bmatrix}$ .
Finally, for $\lambda_3 = 2$ , I’ll do it the null space way.
$A - 2I = \begin{bmatrix} 3 - 2 & 3 & 0 \\ 0 & 1 - 2 & 5 \\ 0 & 0 & 2 - 2 \end{bmatrix} = \begin{bmatrix} 1 & 3 & 0 \\ 0 & -1 & 5 \\ 0 & 0 & 0 \end{bmatrix}$
I’m looking for a vector $\vec v_3 = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ in the null space of $A - 2I$ , i.e. a vector whose dot product with each row of $A - 2I$ is 0. The last row of $A - 2I$ is all zeros, so I can focus my attention on the first two rows. If the first two components of $\vec v_3$ are -3 and 1, then the dot product of $\vec v_3$ with the first row of $A - 2I$ is $(1)(-3) + (3)(1) = 0$ , which is what I want.
Then, its dot product with the second row is $(0)(-3) + (-1)(1) + (5)(c) = -1 + 5c$ , where $c$ is the last component of $\vec v_3$ . For $-1 + 5c = 0$ , we need $c = \frac{1}{5}$ . So, one vector in the null space of $A - 2I$ – which is an eigenvector for $\lambda_3 = 2$ – is $\vec v_3 = \begin{bmatrix} -3 \\ 1 \\ \frac{1}{5} \end{bmatrix}$ . To clean the numbers up a bit, I’ll scale this vector by 5, which gives us $\vec v_3 = \begin{bmatrix} -15 \\ 5 \\ 1 \end{bmatrix}$ .

So, our eigenvectors for $\lambda_1=3$ , $\lambda_2=1$ , and $\lambda_3=2$ are $\vec v_1=\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}$ , $\vec v_2=\begin{bmatrix}-3 \\ 2 \\ 0\end{bmatrix}$ , and $\vec v_3 = \begin{bmatrix}-15 \\ 5 \\ 1 \end{bmatrix}$ .

For upper triangular and lower triangular matrices, the eigenvalues are the values along the diagonal!

Recall, an upper triangular matrix has all zeros below the diagonal, and a lower triangular matrix has all zeros above the diagonal.

This comes from the fact that the determinant of an upper triangular matrix is the product of the diagonal entries, and the determinant of a lower triangular matrix is the product of the diagonal entries.

So, for example, the eigenvalues of

\begin{bmatrix} 10 & 3 & 2 & -50 \\ 0 & -94 & 2 & 1 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 12 \end{bmatrix}

are 10, -94, 0, and 12. And, the eigenvalues of

\begin{bmatrix} 2 & 0 & 0 \\ -1 & 5 & 0 \\ 4 & 3 & -7 \end{bmatrix}

are 2, 5, and -7.

# When in doubt, check with numpy!
np.linalg.eig(
    np.array([
        [2, 0, 0],
        [-1, 5, 0],
        [4, 3, -7]
    ])
)

EigResult(eigenvalues=array([-7.,  5.,  2.]), eigenvectors=array([[0.        , 0.        , 0.83925433],
       [0.        , 0.9701425 , 0.27975144],
       [1.        , 0.24253563, 0.4662524 ]]))

Number of Eigenvalues¶

The characteristic polynomial of an $n \times n$ matrix is a polynomial of degree $n$ . A fact from algebra is that a polynomial of degree $n$ has exactly $n$ roots. The intuitive way of thinking about this is that degree $n$ polynomials can have at most $n-1$ “bends” (a line can’t bend, a quadratic can bend once, a cubic can bend twice, etc.), and each time it bends, it can change directions to cross the $x$ -axis again.

The issue is that some of these roots may be repeated, and some may be complex numbers, meaning that they don’t actually cross the $x$ -axis in the standard $xy$ -plane (or in our case, the $\mathbf{\lambda}$ -axis and $\lambda, p(\lambda)$ -plane).

For example, the matrix $A = \begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}$ is diagonal, meaning its easy to read off its characteristic polynomial:

p(\lambda) = \det(A - \lambda I) = \begin{vmatrix} 4 - \lambda & 0 & 0 & 0 \\ 0 & 1 - \lambda & 0 & 0 \\ 0 & 0 & -\lambda & 0 \\ 0 & 0 & 0 & 4 - \lambda \end{vmatrix} = \lambda (\lambda - 1)(\lambda - 4)^2

This characteristic polynomial has a double root at $\lambda = 4$ , a single root at $\lambda = 1$ , and a single root at $\lambda = 0$ . This $A$ has 3 distinct eigenvalues, but one of them, $\lambda = 2$ , is repeated, and has an algebraic multiplicity of 2.

xs = np.linspace(-0.5, 5)
p = lambda x: (x - 4)**2 * (x - 1) * x
ys = p(xs)

fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=ys, mode='lines', line=dict(color='#3d81f6')))

eigenvalues = np.array([0, 1, 4])
eigen_y = p(eigenvalues)

# Add orange dots at eigenvalues
fig.add_trace(go.Scatter(
    x=eigenvalues, y=eigen_y,
    mode='markers',
    marker=dict(size=12, color='orange', line=dict(width=2, color='black')),
    name='eigenvalues'
))

# Annotate above each eigenvalue
for xv, yv in zip(eigenvalues, eigen_y):
    fig.add_annotation(
        x=xv, y=yv + (-6 if xv < 4 else + 4),  # slightly above the dot
        text="eigenvalue!" if xv < 4 else r"<b>double</b><br>eigenvalue!",
        showarrow=False,
        font=dict(color="orange", size=16),
        yanchor='bottom'
    )


fig.update_layout(
    xaxis_title='$\lambda$',
    yaxis_title='$p(\lambda)$',
    width=600,
    height=300,
    font=dict(family="Palatino Linotype, Palatino, serif", size=16, color="#222"),
    plot_bgcolor="#ffffff",
    paper_bgcolor="#ffffff",
    xaxis=dict(
        showgrid=True,
        gridcolor="#f0f0f0",
        zeroline=True,
        zerolinecolor="#ccc",
        zerolinewidth=2,
        linecolor=None,
        linewidth=0,
        mirror=False
    ),
    yaxis=dict(
        showgrid=True,
        gridcolor="#f0f0f0",
        zeroline=True,
        zerolinecolor="#ccc",
        zerolinewidth=2,
        linecolor=None,
        linewidth=0,
        mirror=False
    ),
    margin=dict(l=60, r=20, t=50, b=60),
    title=r'$$p(\lambda) = \lambda (\lambda - 1)(\lambda - 4)^2$$',
    showlegend=False
)

fig.show(scale=3, renderer='png')

As another example, the matrix $A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix}$ has the characteristic polynomial

p(\lambda) = \begin{vmatrix} 3 - \lambda & -4 \\ 4 & 3 - \lambda \end{vmatrix} = (\lambda - 3)^2 + 16

I can expand this out to get $\lambda^2 - 6\lambda + 25$ , but in a case like this I think the above form is more telling. Visually, $p(\lambda)$ here is a parabola that sits entirely above the $\mathbf{\lambda}$ -axis, meaning it has no real roots, so $A$ has no real eigenvalues.

xs = np.linspace(3-5, 3+5)
p = lambda x: (x - 3)**2 + 16
ys = p(xs)

fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=ys, mode='lines', line=dict(color='#3d81f6')))

# eigenvalues = np.array([0, 1, 4])
# eigen_y = p(eigenvalues)

# # Add orange dots at eigenvalues
# fig.add_trace(go.Scatter(
#     x=eigenvalues, y=eigen_y,
#     mode='markers',
#     marker=dict(size=12, color='orange', line=dict(width=2, color='black')),
#     name='eigenvalues'
# ))

# # Annotate above each eigenvalue
# for xv, yv in zip(eigenvalues, eigen_y):
#     fig.add_annotation(
#         x=xv, y=yv + (-6 if xv < 4 else + 4),  # slightly above the dot
#         text="eigenvalue!" if xv < 4 else r"<b>double</b><br>eigenvalue!",
#         showarrow=False,
#         font=dict(color="orange", size=16),
#         yanchor='bottom'
#     )

fig.add_annotation(
    x=3, y=10,
    text='no real eigenvalues 😢',
    showarrow=False,
    font=dict(color="orange", size=16),
    yanchor='bottom'
)


fig.update_layout(
    xaxis_title='$\lambda$',
    yaxis_title='$p(\lambda)$',
    width=600,
    height=300,
    font=dict(family="Palatino Linotype, Palatino, serif", size=16, color="#222"),
    plot_bgcolor="#ffffff",
    paper_bgcolor="#ffffff",
    xaxis=dict(
        showgrid=True,
        gridcolor="#f0f0f0",
        zeroline=True,
        zerolinecolor="#ccc",
        zerolinewidth=2,
        linecolor=None,
        linewidth=0,
        mirror=False
    ),
    yaxis=dict(
        showgrid=True,
        gridcolor="#f0f0f0",
        zeroline=True,
        zerolinecolor="#ccc",
        zerolinewidth=2,
        linecolor=None,
        linewidth=0,
        mirror=False,
        range=[-1, 30]
    ),
    margin=dict(l=60, r=20, t=50, b=60),
    title=r'$$p(\lambda) = (\lambda - 3)^2 + 16$$',
    showlegend=False
)

fig.show(scale=3, renderer='png')

It should not be all that surprising that $A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix}$ has no real eigenvalues. $A$ is a rotation matrix, scaled by a factor of 5:

A = \begin{bmatrix} 3 & -4 \\ 4 & 3 \end{bmatrix} = 5 \begin{bmatrix} \frac{3}{5} & -\frac{4}{5} \\ \frac{4}{5} & \frac{3}{5} \end{bmatrix} = 5 \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}

where $\theta = \cos^{-1}\left(\frac{3}{5}\right)$ . This matrix rotates vectors by $\mathbf{\theta}$ radians counterclockwise, which means that no real-valued vector will remain in the same direction after being multiplied by $A$ . In the $2 \times 2$ case, the only way to get a real-valued eigenvalue out of a rotation matrix is if the matrix rotates by an integer multiple of $\mathbf{\pi}$ radians (180 degrees), which either corresponds to reflecting/negating the vector (like in $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$ ) or returning back the same vector itself (like in $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ , the identity matrix, which can be thought of as a rotation by $2 \pi$ ).

The solutions to $p(\lambda) = (\lambda - 3)^2 + 16 = 0$ are the complex numbers $\lambda = 3 + 4i$ and $\lambda = 3 - 4i$ , again where $i$ is the imaginary unit, defined by $i^2 = -1$ . The corresponding eigenvectors are complex too, and we won’t worry about finding them. (If you’re familiar with complex numbers, you might recognize these eigenvalues as $5e^{i\theta}$ and $5e^{-i\theta}$ , where $\theta = \cos^{-1}\left(\frac{3}{5}\right)$ . This comes from Euler’s formula, $e^{i\theta} = \cos \theta + i \sin \theta$ .)

Trace and Determinant¶

As a final example, consider the arbitrary $2 \times 2$ matrix

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

Its characteristic polynomial is

p(\lambda) = \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = (a - \lambda)(d-\lambda) - bc = \lambda^2 - (a + d)\lambda + ad-bc

The coefficient on $\mathbf{\lambda}$ is $-(a + d)$ , which is the negative of the sum of the diagonal entries of $A$ . A term often used to refer to the sum of the diagonal entries of a matrix is its trace, $\text{trace}(A)$ . And, the constant term of $p(\lambda)$ above is just $\text{det}(A)$ ! For $2 \times 2$ matrices, this says that the characteristic polynomial is of the form

p(\lambda) = \lambda^2 - \text{trace}(A) \lambda + \text{det}(A)

For example, consider $A = \begin{bmatrix} 3 & 1 \\ 7 & 9 \end{bmatrix}$ . The fact above says that $A$ ’s characteristic polynomial is

p(\lambda) = \lambda^2 - (3 + 9)\lambda + (3 \cdot 9 - 1 \cdot 7) = \lambda^2 - 12\lambda + 20

which, with enough practice, is a calculation you might be able to do in your head. But, this gives way to a more general fact, that is true for any $n \times n$ matrix.

Example 1: $A = \begin{bmatrix} 3 & 1 \\ 7 & 9 \end{bmatrix}$ 's eigenvalues add up to $3 + 9 = 12$ and multiply to $\text{det}(A) = 3 \cdot 9 - 1 \cdot 7 = 20$ . This gives me a quick way of figuring out that $A$ ’s eigenvalues are 10 and 2.

Example 2: $A = \begin{bmatrix} 0.5 & 0.5 & 0 \\ 0.4 & 0.3 & 0.3 \\ 0 & 0.5 & 0.5 \end{bmatrix}$ has an eigenvalue of 1, corresponding to the eigenvector $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ , since the sum of each row of $A$ is 1. Let $\lambda_2$ and $\lambda_3$ be the other two eigenvalues. The trace fact tells me that

1 + \lambda_2 + \lambda_3 = 0.5 + 0.3 + 0.5 = 1.3 \implies \lambda_2 + \lambda_3 = 0.3

and the determinant fact tells me that

\begin{align*} 1 \cdot \lambda_2 \cdot \lambda_3 &= \begin{vmatrix} 0.5 & 0.5 & 0 \\ 0.4 & 0.3 & 0.3 \\ 0 & 0.5 & 0.5 \end{vmatrix} \\ &= 0.5 (0.3 \cdot 0.5 - 0.3 \cdot 0.5) - 0.5 (0.4 \cdot 0.5 - 0.3 \cdot 0) + 0 (0.4 \cdot 0.5 - 0.3 \cdot 0) \\ &= -0.5 \cdot 0.4 \cdot 0.5 &= -0.1 \end{align*}

In other words, the other two eigenvalues sum to 0.3 and multiply to -0.1. This sounds like a job for 0.5 and -0.2, which are indeed the other two eigenvalues of $A$ (other than 1).

Example 3: $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ has a determinant of $1 \cdot 4 - 2 \cdot 2 = 0$ , which means that its eigenvalues multiply to 0, which is another reminder that $A$ has an eigenvalue of 0. The trace fact tells me that the other eigenvalue $\lambda_2$ must satisfy $1 + 4 = 0 + \lambda_2$ , so $\lambda_2 = 5$ .

Adjacency Matrices¶

I’ve pulled this section out into its own page, Chapter 5.1, Part 2: Adjacency Matrices.

Key Takeaways¶

Suppose $A$ is an $n \times n$ matrix.

$\mathbf{\lambda}$ is an eigenvalue of $A$ , corresponding to the eigenvector $\vec v$ , if
$A \vec v = \lambda \vec v$
The English interpretation of this is that $\vec v$ is an eigenvector of $A$ if, when multiplied by $A$ , $\vec v$ still points in the same direction, and is just stretched by a factor of $\mathbf{\lambda}$ .
An $n \times n$ matrix $A$ has exactly $n$ eigenvalues. Some of these may be complex, and some of these may be repeated, but all of them are roots to the characteristic polynomial of $A$ ,

p(\lambda) = \det(A - \lambda I)

A quick way to check if you’ve found the right eigenvalues is that

The product of the eigenvalues of $A$ is equal to $\det(A)$ , i.e. $\lambda_1 \lambda_2 \cdots \lambda_n = \det(A)$ .
The sum of the eigenvalues of $A$ is equal to the trace of $A$ , which is the sum of the diagonal elements of $A$ .

0 is an eigenvalue of $A$ if and only if $A$ is not invertible.
If $\mathbf{\lambda}$ is an eigenvalue of $A$ with eigenvector $\vec v$ , then $\mathbf{\lambda}^k$ is an eigenvalue of $A^k$ with the same eigenvector $\vec v$ .

Overview¶

Introduction¶

A First Example¶

Finding Eigenvalues using numpy¶

Example: Matrix Powers¶

Example: Non-Invertible Matrices¶

Example: The Identity Matrix¶

The Characteristic Polynomial¶

Example: 2×22 \times 22×2 Matrices¶

Example: Diagonal Matrices¶

Example: Sharing a Characteristic Polynomial¶

Example: 3×33 \times 33×3 Matrices¶

Number of Eigenvalues¶

Trace and Determinant¶

Adjacency Matrices¶

Key Takeaways¶

Finding Eigenvalues using `numpy`¶

Example: $2 \times 2$ Matrices¶

Example: $3 \times 3$ Matrices¶