
3.1. Vectors and Linear Combinations

Linear algebra can be thought of as the study of vectors, matrices, and linear transformations, all of which are ideas we’ll need to use in our journey to understand machine learning. We’ll start with vectors, which are the building blocks of linear algebra.

Definition

There are many ways to define vectors, but I’ll give you the most basic and practically relevant definition of a vector for now. I’ll introduce more abstract definitions later if we need them.

A vector is an ordered list of numbers. By ordered list, I mean that the order of the numbers in the vector matters: $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ are different vectors.

In general, we’re mostly concerned with vectors in $\mathbb{R}^n$, which is the set of all vectors with $n$ components or elements, each of which is a real number. It’s possible to consider vectors with complex components (the set of all vectors with complex components is denoted $\mathbb{C}^n$), but we’ll stick to real vectors for now.

For example, a vector $\vec v$ with 3 real components is in $\mathbb{R}^3$, which we can express as $\vec v \in \mathbb{R}^3$. This is pronounced as “v is an element of R three”, or “v is in R three”.

A general vector in $\mathbb{R}^n$ can be expressed in terms of its $n$ components:

$$\vec v = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}$$

Subscripts can be used for different, sometimes conflicting purposes:

  • $v_i$ (without an arrow) typically refers to the $i$-th component of the vector $\vec v$, as in the expression above.

  • $\vec v_i$ (with an arrow) typically refers to the $i$-th vector in a collection of vectors $\vec v_1, \vec v_2, \ldots, \vec v_d$.

The meaning of the subscript depends on the context, so just be careful!

While we’ll use the definition of a vector as a list of numbers for now, I hope you’ll soon appreciate that vectors are more than just a list of numbers – they encode remarkable amounts of information and beauty.


Norm (i.e. Length or Magnitude)

In the context of physics, vectors are often described as creatures with “a magnitude and a direction”. While this is not a physics class – this is EECS 245, after all! – this interpretation has some value for us too.

To illustrate what we mean, let’s consider some concrete vectors in $\mathbb{R}^2$, since it is easy to visualize vectors in 2 dimensions on a computer screen. Suppose:

$${\color{orange}\vec u = \begin{bmatrix} 3 \\ 1 \end{bmatrix}}, \quad {\color{#3d81f6}\vec v = \begin{bmatrix} 4 \\ -6 \end{bmatrix}}$$

Then, we can visualize ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$ as arrows pointing from the origin $(0, 0)$ to the points $(3, 1)$ and $(4, -6)$ in the two dimensional Cartesian plane, respectively.

[Figure: ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$ drawn as arrows from the origin.]

The vector $\vec v = \begin{bmatrix} 4 \\ -6 \end{bmatrix}$ moves 4 units to the right and 6 units down, which we know by reading the components of the vector. In Chapter 7.3, we’ll see how to describe the direction of $\vec v$ in terms of the angle it makes with the $x$-axis (and you may remember how to calculate that angle using trigonometry).

It’s worth noting that $\vec v$ isn’t “fixed” to start at the origin – vectors don’t have fixed positions. All three vectors in the figure below are the same vector, $\vec v$.

[Figure: three copies of $\vec v$ drawn from different starting points.]

To compute the length of $\vec v$ – i.e. the distance between $(0, 0)$ and $(4, -6)$ – we should remember the Pythagorean theorem, which states that if we have a right triangle with legs of length $a$ and $b$, then the length of the hypotenuse is $\sqrt{a^2 + b^2}$. Here, that’s $\sqrt{4^2 + (-6)^2} = \sqrt{16 + 36} = \sqrt{52} = 2\sqrt{13}$.

[Figure: right triangle with legs of length 4 and 6; the hypotenuse is $\vec v$, with length $2\sqrt{13}$.]

In general, the norm (i.e. length, or magnitude) of a vector $\vec v \in \mathbb{R}^n$, written $\left\| \vec v \right\|$, is the square root of the sum of the squares of its components:

$$\left\| \vec v \right\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

Note that the norm involves a sum of squares, much like mean squared error 🤯. This connection will be made more explicit in Chapter 7, when we return to studying linear regression.

Shortly, we’ll see other norms, which describe different ways of measuring the “length” of a vector.

What may not be immediately obvious is why the Pythagorean theorem seems to extend to higher dimensions. The two dimensional case seems reasonable, but why is the length of the vector ${\color{#d81b60}\vec w = \begin{bmatrix} 6 \\ -2 \\ 3 \end{bmatrix}}$ in $\mathbb{R}^3$ equal to $\sqrt{6^2 + (-2)^2 + 3^2}$?

[Interactive figure: ${\color{#d81b60}\vec w}$ drawn in 3D, with the two right triangles described below.]

There are two right angle triangles in the picture above:

  • One triangle has legs of length 6 and 2, with a hypotenuse of $h$; this triangle is shaded ${\color{lightblue}\text{light blue}}$ above.

  • Another triangle has legs of length 3 and $h$, with a hypotenuse of $\left\| \vec w \right\|$; this triangle is shaded ${\color{#d81b60}\text{dark pink}}$ above.

To find $\left\| \vec w \right\|$, we can use the Pythagorean theorem twice. First, to find $h$:

$$h^2 = 6^2 + (-2)^2 = 36 + 4 = 40 \implies h = \sqrt{40}$$

Then, we can use the Pythagorean theorem again to find $\left\| \vec w \right\|$:

$$\left\| \vec w \right\| = \sqrt{h^2 + 3^2} = \sqrt{40 + 9} = 7 = \sqrt{6^2 + (-2)^2 + 3^2}$$

So, to find $\left\| \vec w \right\|$, we used the Pythagorean theorem twice, and ended up computing the square root of the sum of the squares of the components of the vector, which is what the definition above states.

This argument naturally extends to higher dimensions. We will do this often: build intuition in the dimensions we can visualize (two dimensions, and with the help of interactive graphics, three dimensions), and then rely on the power of abstraction to extend our understanding to higher dimensions, even when we can’t visualize. Thinking in higher dimensions is one of the key objectives of this course.

Vector norms satisfy several interesting properties, which we will introduce shortly once we have more context.
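If you want to check norm computations like these numerically, numpy’s `np.linalg.norm` computes exactly this square root of a sum of squares. A minimal sketch, using the vectors from above:

```python
import numpy as np

v = np.array([4, -6])
w = np.array([6, -2, 3])

# np.linalg.norm computes the square root of the sum of squared components.
print(np.linalg.norm(v))        # 7.211..., i.e. 2 * sqrt(13)
print(np.linalg.norm(w))        # 7.0

# The same computation, written directly from the definition.
print(np.sqrt(np.sum(w ** 2)))  # 7.0
```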


Addition and Scalar Multiplication

Vectors support two core operations: addition and scalar multiplication. These two operations are core to the study of linear algebra – so much so that vectors are sometimes defined abstractly as “things that can be added and multiplied by scalars”.

Addition

Using our examples from earlier, ${\color{orange}\vec u = \begin{bmatrix} 3 \\ 1 \end{bmatrix}}$ and ${\color{#3d81f6}\vec v = \begin{bmatrix} 4 \\ -6 \end{bmatrix}}$, we have that ${\color{orange}\vec u} + {\color{#3d81f6}\vec v} = \begin{bmatrix} 7 \\ -5 \end{bmatrix}$.

Geometrically, we can arrive at the vector $\begin{bmatrix} 7 \\ -5 \end{bmatrix}$ by drawing ${\color{orange}\vec u}$ at the origin, then placing ${\color{#3d81f6}\vec v}$ at the tip of ${\color{orange}\vec u}$.

[Figure: tip-to-tail addition of ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$.]

Vector addition is commutative, i.e. ${\color{orange}\vec u} + {\color{#3d81f6}\vec v} = {\color{#3d81f6}\vec v} + {\color{orange}\vec u}$, for any two vectors ${\color{orange}\vec u}, {\color{#3d81f6}\vec v} \in \mathbb{R}^n$. Algebraically, this should not be a surprise, since ${\color{orange}u_i} + {\color{#3d81f6}v_i} = {\color{#3d81f6}v_i} + {\color{orange}u_i}$ for all $i$.

Visually, this means that we can instead start with ${\color{#3d81f6}\vec v}$ at the origin and then draw ${\color{orange}\vec u}$ starting from the tip of ${\color{#3d81f6}\vec v}$, and we should land in the same place.

[Figure: the same sum, this time drawing ${\color{#3d81f6}\vec v}$ first and ${\color{orange}\vec u}$ at its tip.]

We cannot, however, add $\vec w = \begin{bmatrix} 6 \\ -2 \\ 3 \end{bmatrix}$ to $\vec u$, since $\vec u$ and $\vec w$ have different numbers of components.

In Python, we define vectors using numpy arrays, and addition occurs element-wise by default.
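For instance, here’s a minimal sketch with the vectors from this section:

```python
import numpy as np

u = np.array([3, 1])
v = np.array([4, -6])

# Addition is element-wise: [3 + 4, 1 + (-6)].
print(u + v)  # [ 7 -5]

# Vectors with different numbers of components can't be added;
# uncommenting the line below raises a ValueError.
w = np.array([6, -2, 3])
# u + w
```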

Scalar Multiplication

Using ${\color{#3d81f6}\vec v = \begin{bmatrix} 4 \\ -6 \end{bmatrix}}$ from earlier as an example, $3 {\color{#3d81f6}\vec v} = \begin{bmatrix} 12 \\ -18 \end{bmatrix}$. Note that I’ve deliberately defined this operation as scalar multiplication, not just “multiplication” in general, as there’s more nuance to the definition of multiplication in linear algebra.

Visually, a scalar multiple is equivalent to stretching or compressing a vector by a factor of the scalar. If the scalar is negative, the direction of the vector is reversed. Below, $-\frac{2}{3} {\color{#3d81f6}\vec v}$ points opposite to ${\color{#3d81f6}\vec v}$ and $3 {\color{#3d81f6}\vec v}$.

[Figure: ${\color{#3d81f6}\vec v}$, $3{\color{#3d81f6}\vec v}$, and $-\frac{2}{3}{\color{#3d81f6}\vec v}$ drawn from the origin.]

An important observation is that ${\color{#3d81f6}\vec v}$, $3 {\color{#3d81f6}\vec v}$, and $-\frac{2}{3} {\color{#3d81f6}\vec v}$ all lie on the same line.
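Scalar multiplication is also element-wise on numpy arrays – a quick sketch, continuing with $\vec v$:

```python
import numpy as np

v = np.array([4, -6])

# Multiplying a vector by a scalar scales each component.
print(3 * v)       # [ 12 -18]
print(-2 / 3 * v)  # [-2.66666667  4.        ]
```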


Linear Combinations

Motivation and Definition

The two operations we’ve defined – vector addition and scalar multiplication – are the building blocks of linear algebra, and are often used in conjunction. For example, if we stick with the same vectors ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$ from earlier, what might the vector $3 {\color{orange}\vec u} - \frac{1}{2} {\color{#3d81f6}\vec v}$ look like?

$$3 {\color{orange}\vec u} - \frac{1}{2} {\color{#3d81f6}\vec v} = 3 {\color{orange}\begin{bmatrix} 3 \\ 1 \end{bmatrix}} - \frac{1}{2} {\color{#3d81f6}\begin{bmatrix} 4 \\ -6 \end{bmatrix}} = \begin{bmatrix} 9 \\ 3 \end{bmatrix} - \begin{bmatrix} 2 \\ -3 \end{bmatrix} = \begin{bmatrix} 7 \\ 6 \end{bmatrix}$$

[Figure: $3{\color{orange}\vec u}$, $-\frac{1}{2}{\color{#3d81f6}\vec v}$, and their sum $\begin{bmatrix} 7 \\ 6 \end{bmatrix}$, drawn in black.]

In general, a linear combination of vectors ${\color{#d81b60}\vec v_1}, {\color{#d81b60}\vec v_2}, \ldots, {\color{#d81b60}\vec v_d} \in \mathbb{R}^n$ is any vector of the form $a_1 {\color{#d81b60}\vec v_1} + a_2 {\color{#d81b60}\vec v_2} + \cdots + a_d {\color{#d81b60}\vec v_d}$, where $a_1, a_2, \ldots, a_d$ are scalars. The vector $\begin{bmatrix} 7 \\ 6 \end{bmatrix}$, drawn in black above, is a linear combination of ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$, since it can be written in the form $3{\color{orange}\vec u} - \frac{1}{2}{\color{#3d81f6}\vec v}$. 3 and $-\frac{1}{2}$ are the scalars that the definition above refers to as $a_1$ and $a_2$, and we’ve used ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$ in place of ${\color{#d81b60}\vec v_1}$ and ${\color{#d81b60}\vec v_2}$. (I’ve tried to make the definition a bit more general – here, we’re just working with $d = 2$ vectors in $n = 2$ dimensions, but in practice $d$ and $n$ could both be much larger.)

Example in 2D

Here’s another linear combination of ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$, namely $6{\color{orange}\vec u} + 5{\color{#3d81f6}\vec v}$. Algebraically, this is:

$$6{\color{orange}\vec u} + 5{\color{#3d81f6}\vec v} = 6{\color{orange}\begin{bmatrix} 3 \\ 1 \end{bmatrix}} + 5{\color{#3d81f6}\begin{bmatrix} 4 \\ -6 \end{bmatrix}} = \begin{bmatrix} 38 \\ -24 \end{bmatrix}$$

Visually:

[Figure: $6{\color{orange}\vec u} + 5{\color{#3d81f6}\vec v} = \begin{bmatrix} 38 \\ -24 \end{bmatrix}$, drawn tip-to-tail.]

I like thinking of a linear combination as taking “a little bit of the first vector, a little bit of the second vector, etc.” and then adding them all together. (By “little bit”, I mean some amount of the vector – e.g. $6 {\color{orange}\vec u}$ is a little bit of ${\color{orange}\vec u}$.) Another useful analogy is to think of the original vectors as “building blocks” that we can use to create new vectors through addition and scalar multiplication.
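In numpy, a linear combination is written exactly as it looks on paper – scalar multiples, added together. A sketch reproducing the two combinations above:

```python
import numpy as np

u = np.array([3, 1])
v = np.array([4, -6])

# The two linear combinations from this section.
print(3 * u - 0.5 * v)  # [7. 6.]
print(6 * u + 5 * v)    # [ 38 -24]
```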

This idea, of creating new vectors by scaling and adding existing vectors, is so important that it’s essentially what our multiple linear regression problem boils down to.

In the context of our commute times example, imagine $\vec{\text{dept}}$ contains the home departure time, in hours, for each row in our dataset, and $\vec{\text{dom}}$ contains the day of the month for each row in our dataset. If we want to use these two features in a linear model to predict commute time, our problem boils down to finding the optimal coefficients $w_0$, $w_1$, and $w_2$ in a linear combination of $\vec 1$, $\vec{\text{dept}}$, and $\vec{\text{dom}}$ that best predicts commute times.

$$\text{vector of predicted commute times} = w_0 \underbrace{\begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}}_{\vec 1} + w_1 \vec{\text{dept}} + w_2 \vec{\text{dom}}$$

Think about why $\vec 1$ is necessary.
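To make the computation concrete, here’s a sketch in which the feature values and coefficients are entirely made up, purely for illustration:

```python
import numpy as np

# Hypothetical data: one entry per row of the dataset.
dept = np.array([8.5, 7.75, 9.0])  # home departure time, in hours
dom = np.array([15, 16, 17])       # day of the month

# Hypothetical coefficients; finding the optimal ones is the regression problem.
w0, w1, w2 = 140, -10, 0.5

# predictions is a linear combination of the ones vector, dept, and dom.
predictions = w0 * np.ones(3) + w1 * dept + w2 * dom
print(predictions)  # [62.5 70.5 58.5]
```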

The Three Questions

We’re going to spend a lot of time thinking about linear combinations. Specifically, given a collection of vectors, we’ll ask three questions:

  1. Can we write a given vector $\vec b$ as a linear combination of the vectors in the collection?

  2. If so, are the values of the scalars in that linear combination unique?

  3. What is the shape of the set of all possible linear combinations of the vectors in the collection?

Again, just as an example, suppose the two vectors we’re dealing with are our familiar friends:

$${\color{orange}\vec u = \begin{bmatrix} 3 \\ 1 \end{bmatrix}}, \quad {\color{#3d81f6}\vec v = \begin{bmatrix} 4 \\ -6 \end{bmatrix}}$$

These are $d = 2$ vectors in $n = 2$ dimensions. With regards to the Three Questions:

  1. Can we write $\vec b$ as a linear combination of ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$?

    If $\vec b = \begin{bmatrix} 7 \\ 6 \end{bmatrix}$, then the answer to the first question is yes, because we’ve shown that:

    $$3 {\color{orange}\vec u} - \frac{1}{2} {\color{#3d81f6}\vec v} = \begin{bmatrix} 7 \\ 6 \end{bmatrix}$$

    Similarly, if $\vec b = \begin{bmatrix} 38 \\ -24 \end{bmatrix}$, then the answer to the first question is also yes, because we’ve shown that:

    $$6 {\color{orange}\vec u} + 5 {\color{#3d81f6}\vec v} = \begin{bmatrix} 38 \\ -24 \end{bmatrix}$$

    If $\vec b$ is some other vector, the answer may be yes or no, for all we know right now.

  2. If so, are the values of the scalars on ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$ unique?

    Not sure! It’s true that $\begin{bmatrix} 7 \\ 6 \end{bmatrix} = 3 {\color{orange}\vec u} - \frac{1}{2} {\color{#3d81f6}\vec v}$, but for all I know at this point, there could be other scalars $a_1 \neq 3$ and $a_2 \neq -\frac{1}{2}$ such that:

    $$a_1 {\color{orange}\vec u} + a_2 {\color{#3d81f6}\vec v} = \begin{bmatrix} 7 \\ 6 \end{bmatrix}$$

    (As it turns out, the answer is that the values 3 and $-\frac{1}{2}$ are unique – you’ll show why this is the case in a following activity.)

  3. What is the shape of the set of all possible linear combinations of ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$?

    Also not sure! I know that $\begin{bmatrix} 7 \\ 6 \end{bmatrix}$ and $\begin{bmatrix} 38 \\ -24 \end{bmatrix}$ are both linear combinations of ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$, and presumably there are many more, but I don’t know what they are.

    (It turns out that any vector in $\mathbb{R}^2$ can be written as a linear combination of ${\color{orange}\vec u}$ and ${\color{#3d81f6}\vec v}$! Again, you’ll show this in an activity.)

We’ll more comprehensively study the “Three Questions” in Chapter 4.1. I just wanted to call them out for you here so that you know where we’re heading.
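As a numerical preview, for $d = 2$ vectors in $n = 2$ dimensions, finding the scalars for a given $\vec b$ amounts to solving a small system of equations. A sketch, using `np.linalg.solve`:

```python
import numpy as np

# Columns of A are u and v; solving A @ a = b finds scalars a1, a2
# with a1 * u + a2 * v = b.
A = np.array([[3, 4],
              [1, -6]])
b = np.array([7, 6])

print(np.linalg.solve(A, b))  # [ 3.  -0.5]
```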

Example in 3D

As a final example, let’s consider the vectors:

$${\color{#d81b60}\vec w = \begin{bmatrix} 12 \\ -4 \\ 6 \end{bmatrix}}, \quad {\color{#004d40}\vec r = \begin{bmatrix} 7 \\ 1 \\ 10 \end{bmatrix}}$$

These are $d = 2$ vectors, as before, but now in $n = 3$ dimensions. What do some of their linear combinations look like?

[Interactive figure: several linear combinations of ${\color{#d81b60}\vec w}$ and ${\color{#004d40}\vec r}$ in 3D.]
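Even without the interactive figure, we can compute a few of their linear combinations directly; the coefficients below are chosen arbitrarily:

```python
import numpy as np

w = np.array([12, -4, 6])
r = np.array([7, 1, 10])

# Two arbitrary linear combinations of w and r.
print(2 * w + r)       # [31 -7 22]
print(-1 * w + 3 * r)  # [ 9  7 24]
```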

Next, we revisit norms in more depth, including geometry and alternative norms.