
2.4. Span and Linear Independence

Overview

At the end of Chapter 2.3, we considered the problem of approximating $\vec u = \begin{bmatrix} 1 \\ 3 \\ 6 \end{bmatrix}$ using a linear combination of $\vec v_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$ and $\vec v_2 = \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix}$.

(Interactive figure)

$\vec v_1$ and $\vec v_2$ define a plane in $\mathbb{R}^3$. But not all pairs of vectors in $\mathbb{R}^3$ define a plane – for example, the vectors $\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}$ define a line. How do we know if a set of vectors defines a plane, a line, or something else?

To motivate our discussion, let me recap “The Three Questions” involving linear combinations from Chapter 2.1.

So far, we’ve informally answered these questions in the context of various examples and problems. Here, I want to introduce a unifying framework for addressing these questions.

Above, I've used set-builder notation to describe the span of a set of vectors. You'll notice that I refer to $\vec v_1, \vec v_2, \ldots, \vec v_d$ as a set of vectors, but that I don't always write them inside $\{ \text{set brackets} \}$; this is just to save space. When referring to a span, though, I'll always write the spanning set inside $\{ \text{set brackets} \}$.

Let's look at several examples of spans, introducing important geometrical objects and ideas as we go. Chapter 2.5 will (eventually) cover how to describe lines, planes, and hyperplanes in $\mathbb{R}^n$, and is designed to complement this section.


Span of a Single Vector

Suppose we have just a single vector $\vec v \in \mathbb{R}^n$.

$\text{span}(\{\vec v\})$ is the set of all linear combinations of just $\vec v$. Since there is only one vector in the spanning set, there aren't any other vectors to add to $\vec v$, so the span is just the set of all scalar multiples of $\vec v$.

$$\text{span}(\{\vec v\}) = \left\{ a \vec v \mid a \in \mathbb{R} \right\}$$

As long as $\vec v \neq \vec 0$, the span of a single vector is a line that passes through the origin and the vector's coordinates. Another way of saying this is that a single vector spans a line. In Chapter 2.3, we learned how to project one vector onto the span of another vector.

Example in $\mathbb{R}^2$

For example, if we consider $\vec v = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ in $\mathbb{R}^2$, the span is the line through the origin and $(1, 2)$, which you might also recognize as the line $y = 2x$.

Image produced in Jupyter

There are infinitely many vectors in $\text{span}\left(\left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right\}\right)$, including $\begin{bmatrix} 2 \\ 4 \end{bmatrix}$ and $\begin{bmatrix} -1 \\ -2 \end{bmatrix}$, and the line shown above contains all of them. Notationally, we could say

$$\begin{bmatrix} 2 \\ 4 \end{bmatrix} \in \text{span}\left(\left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right\}\right)$$

Example in $\mathbb{R}^3$

As another example, consider $\vec v = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$ in $\mathbb{R}^3$. The span of $\vec v$ is the line through the origin and $(2, -1, 3)$.

(Interactive figure)

The two marked points above are at the origin and $(2, -1, 3)$; I've marked them to emphasize that $\text{span}\left(\left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \right\}\right)$ is the unique line that passes through both $(2, -1, 3)$ and the origin. There are infinitely many lines that pass through $(2, -1, 3)$, but only one of those also passes through the origin.

Note that lines are 1-dimensional objects, regardless of the dimension of the space they live in. The line shown above is 1-dimensional in the sense that any point on the line can be described using a single variable. That variable is $a$ from the definition of the span:

$$\text{span}\left(\left\{ \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \right\}\right) = \left\{ a \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \mid a \in \mathbb{R} \right\}$$

Give me an $a$, and I'll give you a point on the line.

Generalization to $\mathbb{R}^n$

The fact that a single vector spans a line is true regardless of the dimension of the vectors themselves. The span of the vector $\begin{bmatrix} 1 \\ 2 \\ \vdots \\ 100 \end{bmatrix} \in \mathbb{R}^{100}$ is the line through the origin and $(1, 2, \ldots, 100)$ in $\mathbb{R}^{100}$. We can't visualize what a line in 100-dimensional space looks like, but we know that it exists in this abstract sense.
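Here's a small numpy check (a sketch of my own, not from the text) of whether a vector $\vec b$ lies on the line spanned by $\vec v$: stack the two as columns and see whether the resulting matrix has rank at most 1. (`np.linalg.matrix_rank` is previewed here; it reappears later in this section.)

```python
import numpy as np

def on_span_of(v, b):
    """True if b is a scalar multiple of v, i.e. b is in span({v})."""
    # The columns [v, b] fit on a single line through the origin
    # exactly when the stacked matrix has rank at most 1.
    return np.linalg.matrix_rank(np.column_stack([v, b])) <= 1

v = np.array([2, -1, 3])
print(on_span_of(v, 4 * v))                  # True: 4v is on the line
print(on_span_of(v, np.array([2, -1, 4])))   # False: not a multiple of v
```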

Finally, I'll say that in the edge case where $\vec v = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \vec 0$, $\text{span}(\{\vec 0\}) = \{\vec 0\}$ is just the single point corresponding to the origin, not a line.

To read more about how to describe lines in $\mathbb{R}^n$ – especially those that aren't forced to pass through the origin – see Chapter 2.5.


Span of Two Vectors

Instead of considering just a single vector, let’s now consider the span of two vectors.

Examples in $\mathbb{R}^2$

Let's start with a simple example: the span of two vectors in $\mathbb{R}^2$. Consider

$$\vec v_1 = \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \qquad \vec v_2 = \begin{bmatrix} 3 \\ 5 \end{bmatrix}$$

$\text{span}(\{\vec v_1, \vec v_2\})$ is the set of all possible linear combinations of $\vec v_1$ and $\vec v_2$.

$$\text{span}(\{\vec v_1, \vec v_2\}) = \left\{ a_1 \vec v_1 + a_2 \vec v_2 \mid a_1, a_2 \in \mathbb{R} \right\}$$
Image produced in Jupyter

What is the set of all possible linear combinations of $\vec v_1$ and $\vec v_2$? Using $\vec v_1$ and $\vec v_2$ as building blocks, what are all possible vectors we can reach?

Since $\vec v_1$ and $\vec v_2$ point in different directions, we can reach any point in $\mathbb{R}^2$ by choosing appropriate values of $a_1$ and $a_2$.

Image produced in Jupyter

So, if $\vec v_1 = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$ and $\vec v_2 = \begin{bmatrix} 3 \\ 5 \end{bmatrix}$, then $\text{span}(\{\vec v_1, \vec v_2\}) = \mathbb{R}^2$, meaning that $\vec v_1$ and $\vec v_2$ span the entire $xy$-plane.

Equivalently, for any vector $\vec b \in \mathbb{R}^2$, there exist $a_1$ and $a_2$ such that $a_1 \vec v_1 + a_2 \vec v_2 = \vec b$.
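For a concrete $\vec b$, we can find $a_1$ and $a_2$ with numpy by solving the $2 \times 2$ system whose columns are $\vec v_1$ and $\vec v_2$ (a sketch of mine; the target $\vec b$ below is an arbitrary choice):

```python
import numpy as np

# Columns of A are v1 = (2, 0) and v2 = (3, 5).
A = np.array([[2.0, 3.0],
              [0.0, 5.0]])
b = np.array([8.0, 10.0])  # an arbitrary target vector

a = np.linalg.solve(A, b)  # coefficients (a1, a2)
print(a)      # [1. 2.], so b = 1*v1 + 2*v2
print(A @ a)  # reproduces b
```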


But, not every pair of vectors in $\mathbb{R}^2$ spans the entire plane. For example, consider the vectors

$$\vec v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \vec v_2 = \begin{bmatrix} 2 \\ 2 \end{bmatrix}$$

Image produced in Jupyter

$\vec v_1$ and $\vec v_2$ are scalar multiples of each other – i.e. they are collinear – so they only span a line, despite there being two vectors. They span the same line that either one of them spans individually: the line through the origin, $(1, 1)$, and $(2, 2)$.

If we start with just $\vec v_1$, the vector $\vec v_2$ doesn't "unlock" or "contribute" any new vectors to the span, since $\vec v_2$ is just a scalar multiple of $\vec v_1$. As we will soon see, this means that the vectors $\vec v_1$ and $\vec v_2$ are linearly dependent.

If we didn't immediately recognize that $\vec v_2$ is just a scalar multiple of $\vec v_1$ and tried to write $\vec b = \begin{bmatrix} 8 \\ 11 \end{bmatrix}$ as a linear combination of $\vec v_1$ and $\vec v_2$, we might think it's possible, since we're working with a system of two equations and two unknowns.

$$a_1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + a_2 \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ 11 \end{bmatrix}$$

But, when you go to solve, you'll find

$$a_1 + 2a_2 = 8$$
$$a_1 + 2a_2 = 11$$

which is a contradiction, since $8 \neq 11$.
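numpy reaches the same conclusion: the matrix whose columns are $\vec v_1$ and $\vec v_2$ is singular, so `np.linalg.solve` refuses to solve this system (my own sketch):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [1.0, 2.0]])  # columns are the collinear v1 and v2
b = np.array([8.0, 11.0])

print(np.linalg.matrix_rank(A))  # 1: the columns only span a line

try:
    np.linalg.solve(A, b)
except np.linalg.LinAlgError:
    print("singular matrix: no unique solution")
```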

And, for the vectors $\vec b$ that are in $\text{span}(\{\vec v_1, \vec v_2\})$, there are infinitely many ways to write them as linear combinations of $\vec v_1$ and $\vec v_2$. For example,

$$1 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + 3 \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 7 \\ 7 \end{bmatrix}$$
$$7 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + 0 \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 7 \\ 7 \end{bmatrix}$$
$$-35 \begin{bmatrix} 1 \\ 1 \end{bmatrix} + 21 \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 7 \\ 7 \end{bmatrix}$$
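All three coefficient pairs really do land on the same point, which we can check directly (my sketch):

```python
import numpy as np

v1 = np.array([1, 1])
v2 = np.array([2, 2])

# Three different coefficient pairs, one target vector.
for a1, a2 in [(1, 3), (7, 0), (-35, 21)]:
    print(a1 * v1 + a2 * v2)  # [7 7] every time
```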

When solving for $a_1$ and $a_2$, I like to avoid situations where there are infinitely many solutions, since I'd like to make sure that if we all solve the same problem, we'll all get the same answer. Remember that all of this connects back to linear regression and machine learning; the analogs of $a_1$ and $a_2$ are the parameters of a linear model, and we'd like to have interpretable parameter values so that we can see how the inputs of a model relate to the outputs.

Back to the main idea. In $\mathbb{R}^2$, the span of two (nonzero) vectors is a line if the vectors are collinear, and the entire $xy$-plane otherwise.

Example in $\mathbb{R}^3$

Similar results hold for the span of two vectors in $\mathbb{R}^3$. Hopefully I've convinced you that

$$\text{span}\left(\left\{ \begin{bmatrix} 5 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 10 \\ 4 \\ 2 \end{bmatrix} \right\}\right) = \underbrace{\text{span}\left(\left\{ \begin{bmatrix} 5 \\ 2 \\ 1 \end{bmatrix} \right\}\right)}_{\text{a line in } \mathbb{R}^3}$$

since $\begin{bmatrix} 10 \\ 4 \\ 2 \end{bmatrix}$ is just a scalar multiple of $\begin{bmatrix} 5 \\ 2 \\ 1 \end{bmatrix}$.

Let's see a more interesting example. Consider the vectors

$$\vec v_1 = \begin{bmatrix} 5 \\ 2 \\ 1 \end{bmatrix}, \quad \vec v_2 = \begin{bmatrix} -2 \\ 3 \\ 0 \end{bmatrix}$$

You’ll see them below, along with some of their linear combinations.

(Interactive figure)

As was the case in $\mathbb{R}^2$, since $\vec v_1$ and $\vec v_2$ aren't collinear, they span a plane. In $\mathbb{R}^2$, there was only one possible plane that two vectors could span, because $\mathbb{R}^2$ itself is a plane, but in $\mathbb{R}^3$ there are infinitely many planes.

A plane is a flat surface that extends infinitely in all directions, with the property that if you connect any two points on the plane, the line connecting them lies entirely on the plane.

Loading...

So,

$$\text{span}(\{\vec v_1, \vec v_2\}) = \underbrace{\left\{ a_1 \vec v_1 + a_2 \vec v_2 \mid a_1, a_2 \in \mathbb{R} \right\}}_{\text{the plane above}}$$

I hope to have convinced you earlier that lines are 1-dimensional objects, since you only need a single “variable” to describe any point on a given line.

Similarly, planes are 2-dimensional objects, since you need two "variables" to describe any point on a given plane. In the standard $xy$-plane, all you need to describe a point is an $x$ coordinate and a $y$ coordinate. Effectively, every vector in $\mathbb{R}^2$ can be written as a linear combination of the "default" (formally, basis) vectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$.

$$\text{any point in } \mathbb{R}^2 = x \begin{bmatrix} 1 \\ 0 \end{bmatrix} + y \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

The plane shown above is also 2-dimensional, despite living in $\mathbb{R}^3$, since all I need are two variables to describe any point on it. The variables I need are $a_1$ and $a_2$, which are the multipliers on $\vec v_1$ and $\vec v_2$, respectively.

$$\text{any point in } \text{span}(\{\vec v_1, \vec v_2\}) = a_1 \begin{bmatrix} 5 \\ 2 \\ 1 \end{bmatrix} + a_2 \begin{bmatrix} -2 \\ 3 \\ 0 \end{bmatrix}$$

Think of $\vec v_1$ and $\vec v_2$ as defining a new coordinate system for the plane that they span, where the coordinates themselves are the scalars $a_1$ and $a_2$. $a_1$ and $a_2$ can be arbitrarily large or small, which is what allows the plane to extend infinitely.

The picture above makes clear that there are many vectors in $\text{span}(\{\vec v_1, \vec v_2\})$, but also many vectors in $\mathbb{R}^3$ that are not in the span. $a_1 \vec v_1 + a_2 \vec v_2 = \vec b$ has a solution for $a_1$ and $a_2$ if and only if $\vec b$ lies in the plane above.
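One way to test membership computationally (a sketch of my own, using a rank comparison): $\vec b$ is in the span exactly when appending it as a third column doesn't increase the rank of the matrix whose columns are $\vec v_1$ and $\vec v_2$.

```python
import numpy as np

v1 = np.array([5, 2, 1])
v2 = np.array([-2, 3, 0])

def in_plane(b):
    """True if b is a linear combination of v1 and v2."""
    base = np.linalg.matrix_rank(np.column_stack([v1, v2]))
    extended = np.linalg.matrix_rank(np.column_stack([v1, v2, b]))
    return extended == base

print(in_plane(3 * v1 - 2 * v2))      # True: built from v1 and v2
print(in_plane(np.array([0, 0, 1])))  # False: not on the plane
```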

In Chapter 2.5, I'll say more about how to find the equation of a plane in the form $ax + by + cz + d = 0$, but that's not particularly important to us right now.

Subspaces of $\mathbb{R}^n$

What if we're dealing with two vectors in some arbitrary $\mathbb{R}^n$? Consider

$$\vec v_1 = \begin{bmatrix} 5 \\ 1 \\ 3 \\ 2 \\ -8 \end{bmatrix}, \qquad \vec v_2 = \begin{bmatrix} 0 \\ 1 \\ 4 \\ 0 \\ 1 \end{bmatrix}$$

These vectors live in $\mathbb{R}^5$, but what they span is a "plane" in $\mathbb{R}^5$. The more typical way of phrasing this is that these vectors span a 2-dimensional subspace of $\mathbb{R}^5$.

We will formally define vector spaces and subspaces in a later chapter, but for now, think of a subspace as a flat object that passes through the origin and contains all linear combinations of some set of vectors.

  • A line through the origin is a 1-dimensional subspace.

  • A plane through the origin is a 2-dimensional subspace.

  • In higher dimensions, we’ll have 3-dimensional subspaces, 4-dimensional subspaces, and so on.

(A line that doesn’t pass through the origin is still a line, but isn’t a subspace, since subspaces must pass through the origin.)

The dimension of a subspace is the number of coordinates you need to describe any point in the subspace.

So although $\vec v_1 = \begin{bmatrix} 5 \\ 1 \\ 3 \\ 2 \\ -8 \end{bmatrix}$ and $\vec v_2 = \begin{bmatrix} 0 \\ 1 \\ 4 \\ 0 \\ 1 \end{bmatrix}$ sit inside $\mathbb{R}^5$, what they span is not all of $\mathbb{R}^5$, but rather a 2-dimensional "slice" of it, consisting of all linear combinations of $\vec v_1$ and $\vec v_2$. Any point in that subspace can be described using two coordinates, like $a_1$ and $a_2$ in

$$\text{any point in } \text{span}(\{\vec v_1, \vec v_2\}) = a_1 \begin{bmatrix} 5 \\ 1 \\ 3 \\ 2 \\ -8 \end{bmatrix} + a_2 \begin{bmatrix} 0 \\ 1 \\ 4 \\ 0 \\ 1 \end{bmatrix}$$

Span of Three Vectors

So far, as special cases, we've considered the span of one vector and the span of two vectors. The case of three vectors is the last special case I'll cover, and then we'll generalize to any number of vectors in any $\mathbb{R}^n$.

Examples in $\mathbb{R}^2$

Suppose we have three vectors in R2\mathbb{R}^2. What are their possible arrangements?

  1. All three are the zero vector, $\vec 0$.

  2. All three are collinear, meaning they lie on the same line.

  3. Two are collinear, and the third points in a different direction.

  4. All three point in different directions.

In all of these cases, the "largest" their span can be is all of $\mathbb{R}^2$. Case 3 and Case 4 are shown below, on the left and right, respectively.

Image produced in Jupyter

In the left example, $\vec v_3 = \begin{bmatrix} -3 \\ 0 \end{bmatrix}$ travels in a direction that $\vec v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\vec v_2 = \begin{bmatrix} 2 \\ 2 \end{bmatrix}$ don't, so removing $\vec v_3$ would impact the span (it would drop from $\mathbb{R}^2$ to a line). But, we don't need both $\vec v_1$ and $\vec v_2$ to span $\mathbb{R}^2$, since $\vec v_1$ already travels in the direction of $\vec v_2$.

In the right example, where

$$\vec v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \vec v_2 = \begin{bmatrix} -1 \\ 2 \end{bmatrix}, \quad \vec v_3 = \begin{bmatrix} -3 \\ 0 \end{bmatrix}$$

any two of the vectors span the entirety of $\mathbb{R}^2$, meaning that if you pick any two of them, you can recreate the third one. These three vectors are linearly dependent.

$$\text{span}(\{\vec v_1, \vec v_2, \vec v_3\}) = \text{span}(\{\vec v_1, \vec v_2\}) = \text{span}(\{\vec v_1, \vec v_3\}) = \text{span}(\{\vec v_2, \vec v_3\}) = \mathbb{R}^2$$
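We can confirm this numerically (my own sketch): every pair of these vectors forms a rank-2 matrix, so each pair spans all of $\mathbb{R}^2$.

```python
import numpy as np
from itertools import combinations

v1 = np.array([1, 1])
v2 = np.array([-1, 2])
v3 = np.array([-3, 0])

for a, b in combinations([v1, v2, v3], 2):
    # Rank 2 means the pair spans all of R^2.
    print(np.linalg.matrix_rank(np.column_stack([a, b])))  # 2 for every pair
```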

Examples in $\mathbb{R}^3$

Moving on up to $\mathbb{R}^3$, instead of considering all possible arrangements, let me enumerate their possible spans.

  1. All three are the zero vector, $\vec 0$. (Annoying edge case, but I'm including it for completeness.)

  2. All three are collinear, meaning they span a line.

  3. All three are on the same plane, meaning they span that same plane.

  4. None of the above, meaning they span all of $\mathbb{R}^3$.

As before, Case 1 and Case 2 are easy enough to reason about. First, let’s consider Case 3. Suppose

$$\vec v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \vec v_2 = \begin{bmatrix} 1 \\ -2 \\ -3 \end{bmatrix}, \quad \vec v_3 = \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix}$$

(Interactive figure)

Notice that all three vectors lie on the same plane. But, as we saw earlier, you only need two vectors to span a plane. If you remove any one of the three vectors, the remaining two will still span the exact same plane.

This is a consequence of the fact that you can write $\vec v_3$ as a linear combination of $\vec v_1$ and $\vec v_2$:

$$\vec v_3 = -2\vec v_1 - \vec v_2$$

Starting from $\vec v_1$ and $\vec v_2$ alone, adding $\vec v_3$ doesn't unlock any new directions, since it's already a linear combination of $\vec v_1$ and $\vec v_2$.

There is nothing "especially wrong" about $\vec v_3$. We can rearrange the above to get $\vec v_2 = -2\vec v_1 - \vec v_3$ and $\vec v_1 = -\frac{1}{2}\vec v_2 - \frac{1}{2}\vec v_3$, too.
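It's worth verifying these relationships numerically (my sketch):

```python
import numpy as np

v1 = np.array([1, 1, 1])
v2 = np.array([1, -2, -3])
v3 = np.array([-3, 0, 1])

# Each vector is a linear combination of the other two.
print(np.array_equal(v3, -2 * v1 - v2))       # True
print(np.array_equal(v2, -2 * v1 - v3))       # True
print(np.allclose(v1, -0.5 * v2 - 0.5 * v3))  # True
```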

As we saw earlier with two collinear vectors in $\mathbb{R}^2$, if one vector in a set can be written as a linear combination of the others, then whenever some other vector can be written as a linear combination of the set at all, it can be written that way in infinitely many ways.

If we let $\vec b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$ be any arbitrary vector in $\mathbb{R}^3$, then the system

$$a_1 \vec v_1 + a_2 \vec v_2 + a_3 \vec v_3 = \vec b$$

either has no solutions, if $\vec b$ is not on the plane defined by $\text{span}(\{\vec v_1, \vec v_2, \vec v_3\})$, or it has infinitely many.


Let’s now consider Case 4, from the 4 cases above.

  1. All three are the zero vector, $\vec 0$. (Annoying edge case, but I'm including it for completeness.)

  2. All three are collinear, meaning they span a line.

  3. All three are on the same plane, meaning they span that same plane.

  4. None of the above, meaning they span all of $\mathbb{R}^3$.

Consider

$$\vec v_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \vec v_2 = \begin{bmatrix} 1 \\ 4 \\ -3 \end{bmatrix}, \quad \vec v_3 = \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix}$$

(Interactive figure)

Drag the plot around to see the space from different angles. If we look at any pair of these vectors, we see that they span a plane. But, since none of them is a linear combination of the other two, all three of them bring something new to the span; none are redundant.

So, the span of these three vectors is all of $\mathbb{R}^3$! You can think of these three vectors as defining a new coordinate system for $\mathbb{R}^3$.

The default coordinate system in $\mathbb{R}^3$ uses three numbers to describe any point in $\mathbb{R}^3$. Those three numbers are used to take a linear combination of the default basis vectors,

$$\text{any point in } \mathbb{R}^3 = x \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + y \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + z \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

The coordinate system defined by $\vec v_1$, $\vec v_2$, and $\vec v_3$ also uses three numbers to describe any point in $\mathbb{R}^3$; it's just that the coordinates multiply vectors other than the default ones.

$$\text{any point in } \mathbb{R}^3 = a_1 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + a_2 \begin{bmatrix} 1 \\ 4 \\ -3 \end{bmatrix} + a_3 \begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix}$$

So, any vector $\vec b \in \mathbb{R}^3$ can be written as a linear combination of $\vec v_1$, $\vec v_2$, and $\vec v_3$.
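To find the coordinates $a_1, a_2, a_3$ of a particular $\vec b$ in this new system, we can again solve a linear system (my sketch; the target $\vec b$ is an arbitrary choice):

```python
import numpy as np

# Columns of A are v1, v2, v3 from above.
A = np.array([[1.0, 1.0, -3.0],
              [1.0, 4.0, 0.0],
              [1.0, -3.0, 1.0]])
b = np.array([2.0, -1.0, 5.0])  # any vector in R^3 works

a = np.linalg.solve(A, b)  # the coordinates (a1, a2, a3)
print(np.allclose(A @ a, b))  # True: b is reachable
```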


Generalization to $\mathbb{R}^n$

Consider the following vectors in $\mathbb{R}^5$.

$$\vec v_1 = \begin{bmatrix} 5 \\ 1 \\ 3 \\ 2 \\ -8 \end{bmatrix}, \qquad \vec v_2 = \begin{bmatrix} 0 \\ 1 \\ 4 \\ 0 \\ 1 \end{bmatrix}, \qquad \vec v_3 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

None of these vectors is a linear combination of the other two. So what do they span? A 3-dimensional subspace of $\mathbb{R}^5$.

That subspace is a 3-dimensional slice of the 5-dimensional space that is $\mathbb{R}^5$, and any vector in that subspace can be written using three coordinates.

$$\text{any point in } \text{span}(\{\vec v_1, \vec v_2, \vec v_3\}) = a_1 \vec v_1 + a_2 \vec v_2 + a_3 \vec v_3$$

If, say, $\vec v_3$ were a linear combination of $\vec v_1$ and $\vec v_2$, then the span would be a 2-dimensional subspace of $\mathbb{R}^5$.


Thinking in Higher Dimensions

Let's generalize what we've discussed so far. What I'm about to say is abstract, but try to keep track of how it relates to the examples we've seen so far. If ever in doubt, plug in numbers for $d$ and $n$ to help make sense of it.

Remember that I’ve told you to think of a subspace as a flat object that passes through the origin and contains all linear combinations of some set of vectors. The dimension of a subspace is the number of coordinates you need to describe any point in the subspace; for instance, a line through the origin is a 1-dimensional subspace, and a plane through the origin is a 2-dimensional subspace.

Suppose we have $d$ vectors, all in $\mathbb{R}^n$, labeled $\vec v_1, \vec v_2, \ldots, \vec v_d$. Then,

  • If we have fewer than $n$ vectors, i.e. $d < n$, then the vectors span a 1-, or 2-, or 3-, ..., or $d$-dimensional subspace of $\mathbb{R}^n$.

  • If we have at least $n$ vectors, i.e. $d \geq n$, then the vectors span a 1-, or 2-, or 3-, ..., or $n$-dimensional subspace of $\mathbb{R}^n$. (An $n$-dimensional subspace of $\mathbb{R}^n$ is all of $\mathbb{R}^n$.)

Put succinctly, the dimension of the subspace spanned by the vectors is between 1 and $\min(d, n)$ (assuming at least one of the vectors is nonzero).

In general, given a set of $d$ vectors in $\mathbb{R}^n$, actually finding the dimension of the subspace they span involves solving a system of $n$ equations and $d$ unknowns. Later in this section, when we discuss how to find linearly independent subsets, we'll see how to do this by hand. But, know that numpy can help us.

For example, suppose we have $d = 5$ vectors in $\mathbb{R}^7$, given by

$$\vec v_1 = \begin{bmatrix} 5 \\ 3 \\ 2 \\ 0 \\ 14 \\ 3 \\ 1 \end{bmatrix}, \quad \vec v_2 = \begin{bmatrix} 2 \\ 0 \\ 0 \\ 3 \\ 19 \\ 0 \\ 0 \end{bmatrix}, \quad \vec v_3 = \begin{bmatrix} 3 \\ 3 \\ 2 \\ -3 \\ -5 \\ 3 \\ 1 \end{bmatrix}, \quad \vec v_4 = \begin{bmatrix} 3 \\ 2 \\ -3 \\ 1 \\ 0 \\ 0 \\ 5 \end{bmatrix}, \quad \vec v_5 = \begin{bmatrix} 0 \\ -3 \\ 11 \\ 3 \\ 52 \\ 3 \\ -14 \end{bmatrix}$$

If we store all 5 vectors as arrays, and use the magical np.linalg.matrix_rank function, we get back the dimension of the subspace they span.

import numpy as np

v1 = np.array([5, 3, 2, 0, 14, 3, 1])
v2 = np.array([2, 0, 0, 3, 19, 0, 0])
v3 = np.array([3, 3, 2, -3, -5, 3, 1])
v4 = np.array([3, 2, -3, 1, 0, 0, 5])
v5 = np.array([0, -3, 11, 3, 52, 3, -14])

# Stack the vectors as the columns of a matrix, and compute its rank.
np.linalg.matrix_rank(np.array([v1, v2, v3, v4, v5]).T)

3

We’ll explore matrices soon. The rank of a matrix is the dimension of the subspace spanned by its columns.

Since np.linalg.matrix_rank returned 3, the vectors $\vec v_1, \vec v_2, \vec v_3, \vec v_4, \vec v_5$ span a 3-dimensional subspace of $\mathbb{R}^7$.


Linear Independence

There’s an idea we’ve implicitly been using for a while now but haven’t given a name to.

Definitions

Intuitively, if a set of vectors is linearly dependent, then one of the vectors is a linear combination of the others. Equivalently, if a set of vectors is linearly dependent, there’s a non-trivial linear combination of the vectors that equals the zero vector (by non-trivial, I mean that at least one of the coefficients is non-zero).

Why are these two conditions equivalent? Here's one way to see it. Suppose $\vec v_1 = \alpha \vec v_2 + \beta \vec v_3$, meaning that $\vec v_1$ can be written as a linear combination of $\vec v_2$ and $\vec v_3$. Rearranging the equation above gives us

$$\vec v_1 - \alpha \vec v_2 - \beta \vec v_3 = \vec 0$$

which shows us a non-trivial linear combination of $\vec v_1, \vec v_2, \vec v_3$ that gives $\vec 0$ (the coefficient on $\vec v_1$ is 1, which is non-zero). The converse (reverse direction) is true too: if you start with a non-trivial linear combination of $\vec v_1, \vec v_2, \ldots, \vec v_d$ that gives $\vec 0$, then you can solve for whichever vector has a non-zero coefficient, writing it as a linear combination of the others.

Examples

Let’s look at several sets of vectors and comment on their linear independence (or lack thereof).

| Vectors | Linearly... | Why? |
| --- | --- | --- |
| $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}$ | Independent | Neither is a multiple of the other. |
| $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}, \begin{bmatrix} 5 \\ 2 \\ 10 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ | Dependent | These vectors live in $\mathbb{R}^3$, a universe that only has 3 independent directions, so you only need 3 vectors to span it. Given 4 vectors, we can write at least one of them as a linear combination of the others. |
| $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}, \begin{bmatrix} 5 \\ 2 \\ 13 \end{bmatrix}$ | Dependent | $\text{first vector} + 2(\text{second vector}) = \text{third vector}$ |
| $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}, \begin{bmatrix} 5 \\ 2 \\ 10 \end{bmatrix}$ | Independent | The first two were already linearly independent from the first example, and we can't write the third as a linear combination of the first two. |
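We can double-check each row of the table with numpy: a set of $d$ vectors is linearly independent exactly when the matrix with those vectors as columns has rank $d$ (my own sketch):

```python
import numpy as np

def is_independent(*vectors):
    """True when no vector is a linear combination of the others."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

u = np.array([1, 0, 3])
v = np.array([2, 1, 5])

print(is_independent(u, v))                        # True
print(is_independent(u, v, np.array([5, 2, 13])))  # False: u + 2v
print(is_independent(u, v, np.array([5, 2, 10])))  # True
```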

Note that if a set of vectors is linearly dependent, it doesn't mean that every vector in the set can be written as a linear combination of the others. It just means that there's at least one vector that can be written as a linear combination of the others. A good go-to example is the one below – $\vec v_1$ and $\vec v_2$ are scalar multiples of each other, making the entire set of three vectors linearly dependent, but $\vec v_3$ is not a linear combination of $\vec v_1$ and $\vec v_2$.

Image produced in Jupyter

Unique Linear Combinations

Fact: If a set of vectors $\vec v_1, \vec v_2, \ldots, \vec v_d$ is linearly independent, then any vector $\vec b$ in the span of the vectors can be written as a unique linear combination of the vectors.

We’ve built intuition for this above, but let’s give a formal proof.

Let's imagine an alternate universe where $\vec b \in \text{span}(\{\vec v_1, \vec v_2, \ldots, \vec v_d\})$ can be written as two different linear combinations of the vectors. (We're doing a proof by contradiction, if you're familiar with the idea.) In other words, suppose

$$a_1 \vec v_1 + a_2 \vec v_2 + \ldots + a_d \vec v_d = \vec b$$

and

$$c_1 \vec v_1 + c_2 \vec v_2 + \ldots + c_d \vec v_d = \vec b$$

and not all of the $a_i$ and $c_i$ are equal, meaning there's at least one $i$ such that $a_i \neq c_i$.

What happens if we subtract the two equations?

$$(a_1 - c_1) \vec v_1 + (a_2 - c_2) \vec v_2 + \ldots + (a_d - c_d) \vec v_d = \vec 0$$

Since $\vec v_1, \vec v_2, \ldots, \vec v_d$ are linearly independent, the only way this equation can hold is if all of the coefficients on the $\vec v_i$ are zero. In other words, we'd need

$$a_1 - c_1 = 0, \quad a_2 - c_2 = 0, \quad \ldots, \quad a_d - c_d = 0$$

But if that's the case, then $a_i = c_i$ for all $i$, which contradicts our assumption that not all of the $a_i$ and $c_i$ are equal.

So, it can't be the case that $\vec b$ can be written as two different linear combinations of the vectors. In other words, when the vectors $\vec v_1, \vec v_2, \ldots, \vec v_d$ are linearly independent, $\vec b$ can be written as a unique linear combination of them.

Finding Linearly Independent Subsets with the Same Span

Given a set of vectors v1,v2,,vdRn\vec v_1, \vec v_2, \ldots, \vec v_d \in \mathbb{R}^n, we’d like to find a subset of the vectors that is linearly independent and has the same span as the original set of vectors. In other words, we’d like to “drop” the vectors that are linearly dependent on the others. For example, if we have 3 vectors in R3\mathbb{R}^3 and they span a plane, we can drop one of them and still span the same plane (not just any plane).

In the example below, we can remove any one of the three vectors, and the span of the remaining two is still the same plane.

Loading...

Dropping any “unnecessary” vectors will give us the desirable property that any vector in the span of the original set of vectors can be written as a unique linear combination of the vectors in the subset. (Remember that this has a connection to finding optimal model parameters in linear regression --- this is not just an arbitrary exercise in theory.)

One way to produce a linearly independent subset is to execute the following algorithm:

given v_1, v_2, ..., v_d
initialize linearly independent set S = {v_1}
for i = 2 to d:
    if v_i is not a linear combination of the vectors in S:
        add v_i to S

The vectors we’re left with form a basis for the span of the original set of vectors. The number of vectors we’re left with is the dimension of the span of the original set of vectors.
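Here’s one way this algorithm might be sketched in Python. The function name `independent_subset` and the tolerance `tol` are my own choices; membership in the span is checked by solving a least-squares problem and testing whether the residual is (numerically) zero.

```python
import numpy as np

def independent_subset(vectors, tol=1e-10):
    """Greedily keep each vector that is not in the span of those kept so far."""
    S = [vectors[0]]
    for v in vectors[1:]:
        A = np.column_stack(S)
        # Best coefficients for writing v as a linear combination of S.
        coeffs, *_ = np.linalg.lstsq(A, v, rcond=None)
        # If even the best combination misses v, it's independent of S.
        if np.linalg.norm(A @ coeffs - v) > tol:
            S.append(v)
    return S

# The five vectors from the example below.
vectors = [np.array([3.0, 4, 0]), np.array([0.0, 1, 1]),
           np.array([-3.0, -2, 2]), np.array([6.0, 5, -3]),
           np.array([2.0, 5, 1])]
kept = independent_subset(vectors)
print(len(kept))  # 3: v1, v2, and v5 survive
```
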

Let’s evaluate this algorithm on the following set of vectors:

v1=[340],v2=[011],v3=[322],v4=[653],v5=[251]\vec v_1 = \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix}, \quad \vec v_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad \vec v_3 = \begin{bmatrix} -3 \\ -2 \\ 2 \end{bmatrix}, \quad \vec v_4 = \begin{bmatrix} 6 \\ 5 \\ -3 \end{bmatrix}, \quad \vec v_5 = \begin{bmatrix} 2 \\ 5 \\ 1 \end{bmatrix}

First, we start with S={v1}\color{orange} S = \{\vec v_1\}.

  • Iteration 1 (i=2i = 2): Is v2\vec v_2 a linear combination of the vectors in SS?
    No, since v2\vec v_2 is not a multiple of v1\vec v_1. The first components (3 in v1\vec v_1, 0 in v2\vec v_2) imply that if v2\vec v_2 were a multiple of v1\vec v_1 it’d need to be 0v10 \vec v_1, but the other components of v2\vec v_2 are non-zero.

    Outcome: Add v2\vec v_2 to SS. Now, S={v1,v2}\color{orange} S = \{\vec v_1, \vec v_2\}.

  • Iteration 2 (i=3i = 3): Is v3\vec v_3 a linear combination of the vectors in SS?
    To determine the answer, we need to try and find scalars a1a_1 and a2a_2 such that a1v1+a2v2=v3a_1 \vec v_1 + a_2 \vec v_2 = \vec v_3.

    a1[340]+a2[011]=[322]    {3a1+0a2=34a1+1a2=20a1+1a2=2a_1 \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix} + a_2 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ -2 \\ 2 \end{bmatrix} \implies \begin{cases} 3a_1 + 0a_2 = -3 \\ 4a_1 + 1a_2 = -2 \\ 0a_1 + 1a_2 = 2 \end{cases}

    The first equation implies a1=1a_1 = -1 and the third equation implies a2=2a_2 = 2. Plugging both into the second equation gives 4(1)+1(2)=24(-1) + 1(2) = -2, which is consistent. This means that v3=v1+2v2\vec v_3 = - \vec v_1 + 2 \vec v_2, so v3\vec v_3 is a linear combination of v1\vec v_1 and v2\vec v_2, and we should not add it to SS.

    Outcome: Leave SS unchanged. Now, S={v1,v2}\color{orange} S = \{\vec v_1, \vec v_2\}.

  • Iteration 3 (i=4i = 4): Is v4\vec v_4 a linear combination of the vectors in SS?
    To determine the answer, we need to try and find scalars a1a_1 and a2a_2 such that a1v1+a2v2=v4a_1 \vec v_1 + a_2 \vec v_2 = \vec v_4.

    a1[340]+a2[011]=[653]    {3a1+0a2=64a1+1a2=50a1+1a2=3a_1 \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix} + a_2 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 5 \\ -3 \end{bmatrix} \implies \begin{cases} 3a_1 + 0a_2 = 6 \\ 4a_1 + 1a_2 = 5 \\ 0a_1 + 1a_2 = -3 \end{cases}

    Similarly, we see that a1=2a_1 = 2 (from the first equation) and a2=3a_2 = -3 (from the third equation) are consistent with the second equation. This means that v4=2v13v2\vec v_4 = 2 \vec v_1 - 3 \vec v_2, so v4\vec v_4 is a linear combination of v1\vec v_1 and v2\vec v_2, and we should not add it to SS.

    Outcome: Leave SS unchanged. Now, S={v1,v2}\color{orange} S = \{\vec v_1, \vec v_2\}.

  • Iteration 4 (i=5i = 5): Is v5\vec v_5 a linear combination of the vectors in SS?
    To determine the answer, we need to try and find scalars a1a_1 and a2a_2 such that a1v1+a2v2=v5a_1 \vec v_1 + a_2 \vec v_2 = \vec v_5.

    a1[340]+a2[011]=[251]    {3a1+0a2=24a1+1a2=50a1+1a2=1a_1 \begin{bmatrix} 3 \\ 4 \\ 0 \end{bmatrix} + a_2 \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 5 \\ 1 \end{bmatrix} \implies \begin{cases} 3a_1 + 0a_2 = 2 \\ 4a_1 + 1a_2 = 5 \\ 0a_1 + 1a_2 = 1 \end{cases}

    The first equation implies a1=23a_1 = \frac{2}{3} and the third equation implies a2=1a_2 = 1. Plugging both into the second equation gives 4(23)+1(1)=11354(\frac{2}{3}) + 1(1) = \frac{11}{3} \neq 5, which means the system is inconsistent. So, v5\vec v_5 is not a linear combination of v1\vec v_1 and v2\vec v_2, and we should add it to SS.

    Outcome: Add v5\vec v_5 to SS. Now, S={v1,v2,v5}\color{orange} S = \{\vec v_1, \vec v_2, \vec v_5\}.

v1,v2,v5\vec v_1, \vec v_2, \vec v_5 are linearly independent vectors that have the same span as the original set of vectors. And since these are 3 linearly independent vectors in R3\mathbb{R}^3, their span is all of R3\mathbb{R}^3, since R3\mathbb{R}^3 is 3-dimensional and only has 3 independent directions to begin with!

Note that the subset this algorithm produces is not unique: there exist other subsets of 3 of {v1,v2,v3,v4,v5}\{\vec v_1, \vec v_2, \vec v_3, \vec v_4, \vec v_5\} that are also linearly independent and have the same span as all of {v1,v2,v3,v4,v5}\{\vec v_1, \vec v_2, \vec v_3, \vec v_4, \vec v_5\} (which is also the span of {v1,v2,v5}\{\vec v_1, \vec v_2, \vec v_5\}). If you started with v5\vec v_5, then considered v4\vec v_4, then considered v3\vec v_3, and so on, you’d end up with a subset that includes v4\vec v_4, for instance.

What is fixed, though, is how many linearly independent vectors you need to span the entire subspace that these five vectors span, and the answer to that is 3.
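This fixed number is exactly the rank of the matrix whose columns are the five vectors, which we can check with NumPy (a sketch, using the example vectors above):

```python
import numpy as np

# The five example vectors, stacked as the columns of a matrix.
V = np.array([[3.0, 0.0, -3.0,  6.0, 2.0],
              [4.0, 1.0, -2.0,  5.0, 5.0],
              [0.0, 1.0,  2.0, -3.0, 1.0]])

# The rank is the dimension of the span: any linearly independent subset
# with the same span must contain exactly this many vectors.
print(np.linalg.matrix_rank(V))  # 3
```
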

Homework 4 will have you practice this algorithm several times, though – as mentioned above – we’ll use the power of Python to handle some of this for us, soon.

Here’s one final abstract activity to think about. Answers aren’t provided since there’s a very similar question on Homework 4. But come ask us questions about it in office hours!