5.3. Rank and Column Space

Column Space and Rank¶

Let’s keep working with the matrix $A$ from Chapter 5.1.

A = \begin{bmatrix} 5 & 3 & 2 \\ 0 & -1 & 1 \\ 3 & 4 & -1 \\ 6 & 2 & 4 \\ 1 & 0 & 1 \end{bmatrix}

$A$ is a $5 \times 3$ matrix. It can be thought of as either:

3 vectors in $\mathbb{R}^5$ (the columns of $A$ )
5 vectors in $\mathbb{R}^3$ (the rows of $A$ )

Let’s start with the column perspective. We saw in Chapter 5.1 that if $\vec x \in \mathbb{R}^3$ , then $A \vec x$ is a new vector in $\mathbb{R}^5$ that is a linear combination of the columns of $A$ . For instance, if we take $\color{orange} \vec x = \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}$ , then

A {\color{orange} \vec x} = \begin{bmatrix} 5 & 3 & 2 \\ 0 & -1 & 1 \\ 3 & 4 & -1 \\ 6 & 2 & 4 \\ 1 & 0 & 1 \end{bmatrix} {\color{orange} \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}} = {\color{orange} 2} \begin{bmatrix} 5 \\ 0 \\ 3 \\ 6 \\ 1 \end{bmatrix} + {\color{orange} 0} \begin{bmatrix} 3 \\ -1 \\ 4 \\ 2 \\ 0 \end{bmatrix} - {\color{orange} 1} \begin{bmatrix} 2 \\ 1 \\ -1 \\ 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 8 \\ -1 \\ 7 \\ 8 \\ 1 \end{bmatrix}

By definition, the vector $A \vec x$ is in the span of the columns of $A$ , since it’s just a linear combination of $A$ ’s columns. Given what we’ve learned in Chapter 4.1 and Chapter 4.3 about spans and vector spaces, it’s natural to try and describe the span of $A$ ’s columns.
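As a quick numerical check of the computation above, here is a minimal NumPy sketch (the variable names `product` and `combination` are my own). It computes $A \vec x$ two ways: once with the matrix-vector product, and once as an explicit linear combination of the columns of $A$ .

```python
import numpy as np

A = np.array([[5, 3, 2],
              [0, -1, 1],
              [3, 4, -1],
              [6, 2, 4],
              [1, 0, 1]])
x = np.array([2, 0, -1])

# The matrix-vector product A x.
product = A @ x

# The same vector, built as 2 * (column 1) + 0 * (column 2) - 1 * (column 3).
combination = 2 * A[:, 0] + 0 * A[:, 1] - 1 * A[:, 2]

print(product.tolist())                      # [8, -1, 7, 8, 1]
print(np.array_equal(product, combination))  # True
```

Both routes land on the same vector in $\mathbb{R}^5$ , which is all that “$A \vec x$ is in the span of the columns of $A$” means.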

Definition: Column Space

If $A$ is an $n \times d$ matrix, then the column space of $A$ , denoted $\text{colsp}(A)$ , is the span of the columns of $A$ . Equivalently, it is the set of all possible results of $A \vec x$ for $\vec x \in \mathbb{R}^d$ .

So, if $A$ ’s columns are $\vec a^{(1)}, \vec a^{(2)}, \ldots, \vec a^{(d)}$ , like in

A = \begin{bmatrix} | & | & & | \\ \vec a^{(1)} & \vec a^{(2)} & \ldots & \vec a^{(d)} \\ | & | & & | \end{bmatrix}

then

\underbrace{\text{colsp}(A) = \text{span}\left( \vec a^{(1)}, \vec a^{(2)}, \ldots, \vec a^{(d)} \right) = \{ A \vec x \mid \vec x \in \mathbb{R}^d \}}_{\text{subspace of } \mathbb{R}^n}

Notice that I’ve intentionally chosen not to use subscripts to refer to columns. That way, when we switch back to focusing on datasets and machine learning, subscripts will consistently refer to rows/data points, not columns/features.

“Column space” is just a new term for a concept we’re already familiar with: the span of a set of vectors. In the example $A$ we’ve been working with, the column space is

\text{colsp}(A) = \text{span}\left( \left\{ \begin{bmatrix} 5 \\ 0 \\ 3 \\ 6 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ -1 \\ 4 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ -1 \\ 4 \\ 1 \end{bmatrix} \right\} \right)

This column space is a 2-dimensional subspace of $\mathbb{R}^5$ . Why is it 2-dimensional? The last column is a linear combination of the first two columns. Specifically,

\text{column 3} = \text{column 1} - \text{column 2}
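Here is a small sketch, assuming NumPy, that checks this relationship and shows why it matters: any linear combination of all three columns can be rewritten using only the first two. The names `all_three` and `first_two` are just for illustration.

```python
import numpy as np

A = np.array([[5, 3, 2],
              [0, -1, 1],
              [3, 4, -1],
              [6, 2, 4],
              [1, 0, 1]])
col1, col2, col3 = A[:, 0], A[:, 1], A[:, 2]

# Column 3 is exactly column 1 minus column 2.
print(np.array_equal(col3, col1 - col2))  # True

# Substituting col3 = col1 - col2 gives
# x1*col1 + x2*col2 + x3*col3 = (x1 + x3)*col1 + (x2 - x3)*col2.
x1, x2, x3 = 2, 0, -1
all_three = x1 * col1 + x2 * col2 + x3 * col3
first_two = (x1 + x3) * col1 + (x2 - x3) * col2
print(np.array_equal(all_three, first_two))  # True
```

So the third column never lets us reach a vector that the first two couldn’t already reach.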

Remember that:

the dimension of a subspace is the number of vectors in a basis for the subspace, and
a basis for a subspace is a linearly independent set of vectors that spans the entire subspace.

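The first two columns of $A$ alone span the column space, and are linearly independent, and so $\text{dim}(\text{colsp}(A)) = 2$ . This number, 2, is the most important number associated with the matrix $A$ , so much so that we give it a special name: the rank of $A$ , written $\text{rank}(A)$ . In general, the rank of a matrix is the dimension of its column space, i.e. the number of linearly independent columns it has.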

The rank of a matrix tells us how “large” the space of possible linear combinations of the columns of $A$ is. We care about this because ultimately, our predictions in regression are just linear combinations of the columns of some data matrix.

To get a feel for the idea of rank, let’s work through some examples.

Example: Creating Matrices¶

Find three $3 \times 4$ matrices: one with rank 1, one with rank 2, and one with rank 3. Is it possible to have a $3 \times 4$ matrix with rank 4?

Solution

To create a $3 \times 4$ matrix with rank 1, we need there to only be one linearly independent column. For instance,
$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{bmatrix}$
has rank 1 because all columns are multiples of the first column.
To create a $3 \times 4$ matrix with rank 2, we need there to be two linearly independent columns. One way to construct such a matrix is to make the first two columns linearly independent, and make the last two columns linear combinations of the first two. For instance, in
$A = \begin{bmatrix} 1 & 1 & 2 & 2 \\ 1 & 2 & 2 & 3 \\ 1 & 3 & 2 & 4 \end{bmatrix}$
column 3 is a scalar multiple of column 1, and column 4 is column 1 + column 2.
To create a $3 \times 4$ matrix with rank 3, we need there to be three linearly independent columns, and the fourth column can be anything. One solution is
$A = \begin{bmatrix} 1 & 0 & 0 & 9 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & 7 \end{bmatrix}$
A $3 \times 4$ matrix cannot have rank 4, because it’s impossible to have four linearly independent vectors in $\mathbb{R}^3$ . Any three linearly independent vectors in $\mathbb{R}^3$ span all of $\mathbb{R}^3$ , so a fourth vector in $\mathbb{R}^3$ would have to be a linear combination of the first three vectors.
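If you’d like to double-check the three matrices above, one way is NumPy’s `np.linalg.matrix_rank`, which we’ll meet properly later in this section. The sketch below is only a sanity check, not a substitute for the reasoning; the variable names are mine.

```python
import numpy as np

rank_1 = np.array([[1, 2, 3, 4],
                   [1, 2, 3, 4],
                   [1, 2, 3, 4]])

rank_2 = np.array([[1, 1, 2, 2],
                   [1, 2, 2, 3],
                   [1, 3, 2, 4]])

rank_3 = np.array([[1, 0, 0, 9],
                   [0, 1, 0, 8],
                   [0, 0, 1, 7]])

# matrix_rank estimates the rank numerically; for small integer
# matrices like these, it matches the rank we found by hand.
ranks = [int(np.linalg.matrix_rank(M)) for M in (rank_1, rank_2, rank_3)]
print(ranks)  # [1, 2, 3]
```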

Example: $2 \times 2$ Matrices¶

Let

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

Find a condition on $a, b, c, d$ that ensures $\text{rank}(A) = 2$ .

Solution

In order for $\text{rank}(A) = 2$ , the columns of $A$ must be linearly independent. This means that the second column cannot be a scalar multiple of the first column.

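Suppose for the moment that $a \neq 0$ . Then $\frac{b}{a}$ is the number we multiply $a$ by to get $b$ , so the second column is a scalar multiple of the first exactly when $\frac{b}{a} \cdot c = d$ . So, we just need $\frac{b}{a} \cdot c$ to be different from $d$ .

\frac{b}{a} \cdot c \neq d \implies ad - bc \neq 0

The same condition works when $a = 0$ : then $ad - bc = -bc$ , which is non-zero exactly when $b$ and $c$ are both non-zero, and that is exactly when the columns $\begin{bmatrix} 0 \\ c \end{bmatrix}$ and $\begin{bmatrix} b \\ d \end{bmatrix}$ are linearly independent.

So, if $ad - bc \neq 0$ , then $\text{rank}(A) = 2$ . If $ad - bc = 0$ , then the columns are linearly dependent and $\text{rank}(A) \leq 1$ : the rank is 1 unless $A$ is the zero matrix, in which case the rank is 0.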

This expression, $ad - bc$ , is called the determinant of $A$ . We’ll learn more about determinants in Chapter 6.2.
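To see the $ad - bc$ test in action, here is a minimal sketch assuming NumPy. The helper `det_2x2` and the three example matrices are hypothetical, chosen only to illustrate the possible outcomes, and `np.linalg.matrix_rank` is used as an independent check.

```python
import numpy as np

def det_2x2(M):
    """Return ad - bc for a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    return a * d - b * c

examples = {
    "independent columns": np.array([[1, 2], [3, 4]]),
    "column 2 = 2 * column 1": np.array([[1, 2], [2, 4]]),
    "zero matrix": np.array([[0, 0], [0, 0]]),
}

results = {}
for name, M in examples.items():
    # Store (ad - bc, rank) for each example matrix.
    results[name] = (int(det_2x2(M)), int(np.linalg.matrix_rank(M)))
    print(name, results[name])
```

The rank is 2 only in the first case, where $ad - bc \neq 0$ . In the other two cases $ad - bc = 0$ ; the rank is 1 for the matrix with proportional columns and 0 for the zero matrix.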

Example: Diagonal Matrices¶

Suppose $d_1, d_2, \ldots, d_n$ are real numbers. What is the rank of the $n \times n$ diagonal matrix

D = \begin{bmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{bmatrix}

if

$d_i = i$ , for $i = 1, 2, \ldots, n$ ?
$d_1 = d_2 = \cdots = d_n = 0$ ?
$k$ of the $d_i$ are equal to 0, and the rest are positive?

Solution

$\text{rank}(D) = n$ : If $d_i = i$ , for $i = 1, 2, \ldots, n$ , meaning $d_1 = 1, d_2 = 2, \ldots, d_n = n$ , then the rank is $n$ , because all columns are linearly independent. None can be written as a linear combination of any others, since they all have their sole non-zero entry in a different row.
$\text{rank}(D) = 0$ : If $d_1 = d_2 = \cdots = d_n = 0$ , then the rank is 0, because all columns are the zero vector.
$\text{rank}(D) = n - k$ : If $k$ of the $d_i$ are equal to 0, and the rest are positive, then the rank is $n - k$ , because we can throw out the zero columns and the remaining columns are linearly independent.
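The three cases can be checked numerically with `np.diag`, which builds a diagonal matrix from a list of diagonal entries. This is a sketch for one particular choice, $n = 5$ and $k = 2$ ; the specific diagonal entries in `D_mixed` are made up for illustration.

```python
import numpy as np

n = 5

D_full = np.diag(np.arange(1, n + 1))  # d_i = i, so no zeros on the diagonal
D_zero = np.diag(np.zeros(n))          # every d_i is 0
D_mixed = np.diag([3, 0, 7, 0, 2])     # k = 2 zeros, the rest positive

ranks = [int(np.linalg.matrix_rank(D)) for D in (D_full, D_zero, D_mixed)]
print(ranks)  # [5, 0, 3], i.e. n, 0, and n - k
```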

Example: Vector Outer Product¶

Let $\vec u = \begin{bmatrix} 1 \\ -3 \\ 4 \end{bmatrix}$ and $\vec v = \begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix}$ .

As we’ve seen before, the dot product $\vec u \cdot \vec v = \vec u^T \vec v$ is a scalar, equal to -17 here.

The outer product of $\vec u$ and $\vec v$ is the matrix

\vec u \vec v^T = \begin{bmatrix} 1 \\ -3 \\ 4 \end{bmatrix} \begin{bmatrix} 2 & 5 & -1 \end{bmatrix} = \begin{bmatrix} 2 & 5 & -1 \\ -6 & -15 & 3 \\ 8 & 20 & -4 \end{bmatrix}

In general, for any two vectors $\vec u, \vec v \in \mathbb{R}^n$ , what is the rank of $\vec u \vec v^T$ ?

Solution

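As long as neither $\vec u$ nor $\vec v$ is the zero vector, the rank of $\vec u \vec v^T$ is 1, since each column in $\vec u \vec v^T$ is a scalar multiple of $\vec u$ . (If either vector is the zero vector, then $\vec u \vec v^T$ is the zero matrix, which has rank 0.) In the example above, column 2 is $\frac{5}{2}$ times column 1, and column 3 is $-\frac{1}{2}$ times column 1.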

In Chapter 10, we’ll see that any rank $r$ matrix can be written as a sum of $r$ rank 1 matrices, each of which is of the form $\vec u \vec v^T$ . So, rank 1 matrices can be thought of as the building blocks of all matrices!
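The outer product in this example can be reproduced with `np.outer`. The sketch below, assuming NumPy, also includes the edge case where one of the vectors is the zero vector; the name `zero_outer` is my own.

```python
import numpy as np

u = np.array([1, -3, 4])
v = np.array([2, 5, -1])

# The dot product is a scalar.
dot = int(u @ v)
print(dot)  # -17

# The outer product is a 3x3 matrix whose columns are all multiples of u.
outer = np.outer(u, v)
print(outer.tolist())  # [[2, 5, -1], [-6, -15, 3], [8, 20, -4]]
print(int(np.linalg.matrix_rank(outer)))  # 1

# Edge case: if either vector is the zero vector, the outer product
# is the zero matrix, which has rank 0.
zero_outer = np.outer(np.zeros(3), v)
print(int(np.linalg.matrix_rank(zero_outer)))  # 0
```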

Example: Basis for Column Space¶

Find a basis for the column space of

A = \begin{bmatrix} 3 & 6 & 0 & 9 & 3 \\ 2 & 4 & 0 & 6 & 2 \\ 0 & 0 & 1 & -5 & 0 \\ 1 & 2 & 0 & 3 & 1 \end{bmatrix}

Note that we’ve already seen plenty of problems of this form in earlier homeworks. It’s just that there, the spanning set of vectors was given to you directly, and here, they’re stored as columns in a matrix. The idea is the same.

Solution

Column 2 is a multiple of column 1.
Column 3 is independent from column 1.
Column 4 is 3 times column 1 plus -5 times column 3.
Column 5 is just column 1.

$A$ only has two linearly independent columns - columns 1 and 3 - and so $\text{rank}(A) = 2$ , and a basis for the column space is

\left\{ \begin{bmatrix} 3 \\ 2 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \right\}

That’s not the only possible basis for the column space; two others are

\left\{ \begin{bmatrix} 30 \\ 20 \\ 0 \\ 10 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ -15 \\ 0 \end{bmatrix} \right\}

and

\left\{ \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 9 \\ 6 \\ -5 \\ 3 \end{bmatrix} \right\}
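One way to sanity-check a proposed basis numerically is sketched below, assuming NumPy. Store the candidate vectors as the columns of a matrix $B$ . If $B$ has as many columns as $\text{rank}(A)$ , those columns are linearly independent, and placing $B$ next to $A$ doesn’t increase the rank, then $B$ and $A$ have the same column space. The helper `is_basis_for_colsp` is hypothetical, written just for this check.

```python
import numpy as np

A = np.array([[3, 6, 0, 9, 3],
              [2, 4, 0, 6, 2],
              [0, 0, 1, -5, 0],
              [1, 2, 0, 3, 1]])

# Each candidate basis is stored as the columns of a 4x2 matrix.
candidates = {
    "columns 1 and 3": np.array([[3, 0], [2, 0], [0, 1], [1, 0]]),
    "scaled versions": np.array([[30, 0], [20, 0], [0, -15], [10, 0]]),
    "columns 3 and 4": np.array([[0, 9], [0, 6], [1, -5], [0, 3]]),
}

def is_basis_for_colsp(B, A):
    rank_A = np.linalg.matrix_rank(A)
    # The columns of B are independent, and there are rank(A) of them.
    right_size = np.linalg.matrix_rank(B) == B.shape[1] == rank_A
    # Putting B next to A adds no new directions, so the spans agree.
    same_span = np.linalg.matrix_rank(np.hstack([B, A])) == rank_A
    return bool(right_size and same_span)

checks = {name: is_basis_for_colsp(B, A) for name, B in candidates.items()}
print(checks)  # every value is True
```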

Finding the Rank Using Python¶

Shortly, we’ll learn a new technique for finding the rank of a matrix by hand. But for the most part, we won’t need to do this, and can instead use the power of Python to help us.

Returning to

A = \begin{bmatrix} 5 & 3 & 2 \\ 0 & -1 & 1 \\ 3 & 4 & -1 \\ 6 & 2 & 4 \\ 1 & 0 & 1 \end{bmatrix}

we have

import numpy as np

A = np.array([[5, 3, 2], 
              [0, -1, 1], 
              [3, 4, -1], 
              [6, 2, 4], 
              [1, 0, 1]])
              
np.linalg.matrix_rank(A)

2

Python makes it easy to experiment with how different operations affect the rank of a matrix.

For instance, in Chapter 5.4, we’ll prove that the matrix $A^TA$ has the same rank as $A$ , for any $n \times d$ matrix $A$ .

A.T @ A

array([[71, 39, 32],
       [39, 30,  9],
       [32,  9, 23]])

# Same as rank of A from above!
np.linalg.matrix_rank(A.T @ A)

2

Row Space¶

So far, we’ve focused on thinking of a matrix as a collection of “column” vectors written next to each other. This is the more common perspective, since – as I’ve harped on – $A \vec x$ is a linear combination of the columns of $A$ .

But we can also think of a matrix as a collection of “row” vectors written on top of each other.

A = \begin{bmatrix} 5 & 3 & 2 \\ 0 & -1 & 1 \\ 3 & 4 & -1 \\ 6 & 2 & 4 \\ 1 & 0 & 1 \end{bmatrix}

$A$ contains 5 vectors in its rows, each in $\mathbb{R}^3$ . These vectors also have a span, which in this case is a subspace of $\mathbb{R}^3$ .

Definition: Row Space

If $A$ is an $n \times d$ matrix, then the row space of $A$ , denoted $\text{rowsp}(A)$ , is the span of the rows of $A$ . Equivalently, it is the set of all possible results of $A^T \vec y$ for $\vec y \in \mathbb{R}^n$ .

So, if $A$ ’s rows are $\vec a_1, \vec a_2, \ldots, \vec a_n$ , like in

A = \begin{bmatrix} -- & \vec a_1 & -- \\ -- & \vec a_2 & -- \\ & \vdots & \\ -- & \vec a_n & -- \end{bmatrix}

then

\underbrace{\text{rowsp}(A) = \text{span}\left( \vec a_1, \vec a_2, \ldots, \vec a_n \right) = \{ A^T \vec y \mid \vec y \in \mathbb{R}^n \}}_{\text{subspace of } \mathbb{R}^d}

Where did $A^T \vec y$ come from? Remember, $A \vec x$ is a linear combination of the columns of $A$ . If we transpose $A$ , then $A^T \vec y$ is a linear combination of the columns of $A^T$ , which are the rows of $A$ .

\begin{align*} A^T {\color{orange} \vec y} &= \begin{bmatrix} 5 & 0 & 3 & 6 & 1 \\ 3 & -1 & 4 & 2 & 0 \\ 2 & 1 & -1 & 4 & 1 \end{bmatrix} {\color{orange} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}} \\ &= \underbrace{{\color{orange} y_1} \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix} + {\color{orange} y_2} \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} + {\color{orange} y_3} \begin{bmatrix} 3 \\ 4 \\ -1 \end{bmatrix} + {\color{orange} y_4} \begin{bmatrix} 6 \\ 2 \\ 4 \end{bmatrix} + {\color{orange} y_5} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}}_\text{linear combination of rows of A} \end{align*}

Remember from Chapter 5.1 that $(A^T \vec y)^T = \vec y^T A$ . The product $\vec y^T A$ is also a linear combination of the rows of $A$ ; it just returns a row vector with shape $1 \times d$ rather than a vector with shape $d \times 1$ .

\begin{align*} {\color{orange} \vec y}^T A &= {\color{orange} \begin{bmatrix} y_1 & y_2 & y_3 & y_4 & y_5 \end{bmatrix}} \begin{bmatrix} 5 & 3 & 2 \\ 0 & -1 & 1 \\ 3 & 4 & -1 \\ 6 & 2 & 4 \\ 1 & 0 & 1 \end{bmatrix} \\ &= \underbrace{{\color{orange} y_1} \begin{bmatrix} 5 & 3 & 2 \end{bmatrix} + {\color{orange} y_2} \begin{bmatrix} 0 & -1 & 1 \end{bmatrix} + ... + {\color{orange} y_5} \begin{bmatrix} 1 & 0 & 1 \end{bmatrix}}_\text{linear combination of rows of A} \end{align*}

In $\vec y^T A$ , we left-multiplied $A$ by a vector; in $A \vec x$ , we right-multiplied $A$ by a vector. These are distinct types of multiplication, as they involve vectors of different shapes.

Since the columns of $A^T$ are the rows of $A$ , the row space of $A$ is the column space of $A^T$ , meaning

\text{rowsp}(A) = \text{colsp}(A^T)

To avoid carrying around lots of notation, I’ll often just use $\text{colsp}(A^T)$ to refer to the row space of $A$ .
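Here is a short sketch, assuming NumPy, comparing $A^T \vec y$ and $\vec y^T A$ for one arbitrary choice of $\vec y$ (the entries of `y` below are made up). I’ve stored $\vec y$ as a $5 \times 1$ array so that the difference in shapes is visible.

```python
import numpy as np

A = np.array([[5, 3, 2],
              [0, -1, 1],
              [3, 4, -1],
              [6, 2, 4],
              [1, 0, 1]])

# An arbitrary y in R^5, stored as a 5x1 column vector.
y = np.array([[1], [2], [0], [-1], [3]])

as_column = A.T @ y  # linear combination of the rows of A, shape (3, 1)
as_row = y.T @ A     # the same combination, as a row vector of shape (1, 3)

print(as_column.shape, as_row.shape)        # (3, 1) (1, 3)
print(as_row.tolist())                      # [[2, -1, 3]]
print(np.array_equal(as_column.T, as_row))  # True
```

Both products compute $1 \vec a_1 + 2 \vec a_2 + 0 \vec a_3 - 1 \vec a_4 + 3 \vec a_5$ ; only the orientation of the result differs.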

Dimension of the Row Space¶

What is the dimension of the row space of $A$ ? Other ways of phrasing this question are:

How many linearly independent rows does $A$ have?
What is the rank of $A^T$ ?

We know the answer can’t be more than 3, since the rows of $A$ are vectors in $\mathbb{R}^3$ . We could use the algorithm first presented in Chapter 4.1 to find a linearly independent set of rows with the same span as all 5 rows.

An easy way to see that $\text{dim}(\text{colsp}(A^T)) = 2$ is to pick out two of the five vectors, $\text{row 2} = \vec a_2 = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$ and $\text{row 5} = \vec a_5 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$ , and show that the other three vectors can be written as linear combinations of them. I chose $\vec a_2$ and $\vec a_5$ just because they have the simplest numbers, and because they’re linearly independent from one another.

$\vec a_1 = \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix} = -3 \vec a_2 + 5\vec a_5$
$\vec a_3 = \begin{bmatrix} 3 \\ 4 \\ -1 \end{bmatrix} = -4 \vec a_2 + 3\vec a_5$
$\vec a_4 = \begin{bmatrix} 6 \\ 2 \\ 4 \end{bmatrix} = -2 \vec a_2 + 6 \vec a_5$
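These three linear combinations are easy to verify numerically. The sketch below, assuming NumPy, unpacks the rows of $A$ into `a1` through `a5` (names of my choosing) and also asks `np.linalg.matrix_rank` for the rank of $A^T$ .

```python
import numpy as np

A = np.array([[5, 3, 2],
              [0, -1, 1],
              [3, 4, -1],
              [6, 2, 4],
              [1, 0, 1]])

# Unpacking a 2D array yields its rows, one at a time.
a1, a2, a3, a4, a5 = A

print(np.array_equal(a1, -3 * a2 + 5 * a5))  # True
print(np.array_equal(a3, -4 * a2 + 3 * a5))  # True
print(np.array_equal(a4, -2 * a2 + 6 * a5))  # True

# The dimension of the row space is the rank of A^T.
row_rank = int(np.linalg.matrix_rank(A.T))
print(row_rank)  # 2
```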

You might notice that the number of linearly independent rows of $A$ and the number of linearly independent columns of $A$ were both 2.

Equivalence of Column Rank and Row Rank¶

Sometimes, we say the dimension of $\text{colsp}(A)$ is the column rank of $A$ , and the dimension of $\text{colsp}(A^T)$ is the row rank of $A$ . For the example $A$ we’ve been working with, both the column rank and row rank are 2.

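But there’s no need for separate names: for any matrix $A$ , the column rank and the row rank are equal, so we simply call both of them the rank of $A$ .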

It’s not immediately obvious why this is true, and honestly, most of the proofs of it I’ve found aren’t all that convincing for a first-time linear algebra student. Still, I’ll try to prove this fact for you in just a little bit.

So, in general, what is $\text{rank}(A)$ ? Since the rank is equal to both the number of linearly independent columns and the number of linearly independent rows, the largest the rank can be is the smaller of the number of rows and the number of columns.

For example, if $A$ is a $7 \times 9$ matrix, then its rank is at most 7, since it cannot have more than 7 linearly independent rows. So in general, if $A$ is an $n \times d$ matrix, then

0 \leq \text{rank}(A) \leq \min(n, d)

\underbrace{\begin{bmatrix} 5 & 3 \\ 2 & 1 \\ -1 & \frac{1}{3} \\ 3 & 6 \\ 0 & 1 \end{bmatrix}}_{\text{if } n > d, \text{ rows can't be independent}} \qquad \underbrace{\begin{bmatrix} 1 & 0 & 3 & 2 & -1 \\ \frac{1}{9} & 3 & 0 & 0 & 2 \\ 9 & 0 & 0 & 6 & -3 \end{bmatrix}}_{\text{if } n < d, \text{ columns can't be independent}}
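As a sketch of this bound, assuming NumPy, the code below computes the rank of the two matrices shown above, along with a randomly generated $7 \times 9$ integer matrix. The seed and the range of the random entries are arbitrary choices of mine.

```python
import numpy as np

tall = np.array([[5, 3],
                 [2, 1],
                 [-1, 1 / 3],
                 [3, 6],
                 [0, 1]])

wide = np.array([[1, 0, 3, 2, -1],
                 [1 / 9, 3, 0, 0, 2],
                 [9, 0, 0, 6, -3]])

# A random 7x9 matrix with integer entries between -5 and 5.
rng = np.random.default_rng(0)
random_7x9 = rng.integers(-5, 6, size=(7, 9))

ranks = {}
for name, M in [("tall", tall), ("wide", wide), ("random_7x9", random_7x9)]:
    n, d = M.shape
    ranks[name] = int(np.linalg.matrix_rank(M))
    # The rank never exceeds the smaller of the two dimensions.
    print(name, (n, d), ranks[name], ranks[name] <= min(n, d))
```

The `tall` matrix has rank 2 and the `wide` matrix has rank 3, so both reach the upper bound $\min(n, d)$ for their shapes.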

We say a matrix is full rank if it has the largest possible rank for a matrix of its shape, i.e. if $\text{rank}(A) = \min(n, d)$ . We’ll mostly use this term when referring to square matrices, which we’ll focus more on in Chapter 6.2.

What if a matrix isn’t full rank? (What a cliffhanger!) Keep reading Chapter 5.4 for the full story.
