Ordinary Least Squares (OLS)

Let’s start by defining the goal of our algorithm: what do we want to achieve with OLS? If we have data points in a plane (an XY-axis), we want to find the equation that fits these points as closely as possible. We measure closeness through the residuals, the vertical distances between each data point and the fitted line, and we want to minimize the sum of the squared residuals.


Calculating our OLS Algorithm - Model with 2 parameters

Now how can we calculate this? We know that a straight line can be written as $\hat{y} = a + bx$, where $\hat{y}$ is our predicted value. If the observed value is $y$, then our error is given by $y - \hat{y}$.

Substituting the line equation, the error for point $i$ is $d_i = y_i - (a + bx_i)$, and the sum of the squared errors over all points is $\sum^n_{i=1}{d_i^2} = \sum^n_{i=1}(y_i - (a + bx_i))^2$
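To make the quantity we are minimizing concrete, here is a minimal Python sketch of the sum of squared errors. The function name `sum_squared_residuals` and the two candidate lines are assumptions made for illustration; the sample points are the ones used in the example at the end of this article.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Return the sum over all points of (y_i - (a + b * x_i))^2."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Sample points (1,1), (2,3), (3,3), (4,5).
xs = [1, 2, 3, 4]
ys = [1, 3, 3, 5]

# Two candidate lines, picked by hand: a lower value means a closer fit.
print(sum_squared_residuals(xs, ys, a=0.0, b=1.0))
print(sum_squared_residuals(xs, ys, a=0.0, b=1.2))
```

The second line gives the smaller sum, so it fits these points more closely than the first. OLS is about finding the a and b for which this sum is as small as possible.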

Remembering our original goal, minimizing the sum of the squared residuals, we know that we will need a derivative, which shows us where the minimum is. Since we have 2 parameters, a and b, these will be partial derivatives. So to find the minimum, we calculate the partial derivatives with respect to a and b and set them equal to 0. Or $\frac{d}{da} = 0$ and $\frac{d}{db} = 0$

Let’s get started

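Note: The following sections are purely mathematical. Make sure to remember that $(a - b)^2 = a^2 - 2ab + b^2$ (here a and b stand for any two terms, not our parameters). A $\sum$ written without indices is shorthand for the sum over all data points.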

Calculating our Derivative $\frac{d}{da} = 0$

$\frac{d}{da}\sum(y - (a + bx))^2$

$= \sum\frac{d}{da}(y^2 - 2y(a+bx)+(a + bx)^2)$

$= \sum\frac{d}{da}(y^2 - 2ya - 2ybx + a^2 + 2abx + b^2x^2)$

$= \sum(-2y + 2a + 2bx)$

$= 2\sum^n_{i=1}(a + bx - y)$

$\rightarrow na + \sum^n_{i=1}(bx - y) = 0 \leftrightarrow \sum^n_{i=1}{y} = na + b\sum^n_{i=1}{x}$

Calculating our Derivative $\frac{d}{db} = 0$

$\frac{d}{db}\sum(y - (a + bx))^2$

$= \sum\frac{d}{db}(y^2 - 2y(a+bx)+(a + bx)^2)$

$= \sum\frac{d}{db}(y^2 - 2ya - 2ybx + a^2 + 2abx + b^2x^2)$

$\rightarrow \sum(-2yx + 2ax + 2bx^2) = 0$

$\leftrightarrow -\sum{yx} + a\sum{x} + b\sum{x^2} = 0$

$\leftrightarrow a\sum^n_{i=1}{x} + b\sum^n_{i=1}{x^2} = \sum^n_{i=1}yx$
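The two derivatives leave us with two linear equations in a and b, which we can solve directly. The sketch below is one way to do that in Python. The function name `fit_line` and the sample points are assumptions for illustration, and the elimination step (solving for b first, then substituting back) is just one of several equivalent ways to solve the system.

```python
def fit_line(xs, ys):
    """Solve the two equations above for the intercept a and the slope b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    # Eliminating a from the two equations leaves a single equation in b.
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denominator

    # Substitute b back into  sum(y) = n*a + b*sum(x).
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical points that lie exactly on the line y = 1 + 2x.
print(fit_line([0, 1, 2], [1, 3, 5]))  # (1.0, 2.0)
```

Because the sample points lie exactly on a line, the fit recovers that line's intercept and slope.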

Calculating our OLS Algorithm - Multiple Terms

Fitting a straight line is, however, the simplest case. So what do we do if we need to find more terms? Then we are better off using matrices.

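In matrix form, for the line case, this comes down to finding our parameters $x = \begin{bmatrix}a \\ b\end{bmatrix}$

so that $Ax = \begin{bmatrix}1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n\end{bmatrix}\begin{bmatrix}a \\ b\end{bmatrix}$ is close to $b = \begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_n\end{bmatrix}$

Note that in this notation $x$ is the vector of parameters and $b$ is the vector of observed values, not the data points $x_i$ and the slope b from before. With more terms, $A$ simply gets one extra column per parameter.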

Now to solve this, we can use the formula $A^TAx = A^Tb$
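As a rough illustration of how this formula extends to more terms, the sketch below builds $A^TA$ and $A^Tb$ and solves the resulting system with Gauss-Jordan elimination in plain Python. The function name `solve_normal_equations`, the pivot tolerance, and the quadratic sample data are assumptions made for this example. In practice you would more likely reach for a linear algebra library.

```python
def solve_normal_equations(A, b):
    """Solve (A^T A) x = A^T b by Gauss-Jordan elimination with partial pivoting."""
    rows, cols = len(A), len(A[0])
    # Build the augmented matrix [A^T A | A^T b].
    M = [
        [sum(A[k][i] * A[k][j] for k in range(rows)) for j in range(cols)]
        + [sum(A[k][i] * b[k] for k in range(rows))]
        for i in range(cols)
    ]
    for col in range(cols):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, cols), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("A^T A is singular; the columns of A are dependent")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row.
        for r in range(cols):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    # The matrix is now diagonal, so each parameter is a simple division.
    return [M[i][cols] / M[i][i] for i in range(cols)]


# Hypothetical data lying exactly on y = 1 + 2x + 3x^2.
xs = [0, 1, 2, 3]
ys = [1 + 2 * x + 3 * x * x for x in xs]
# Each row of A holds the terms that multiply the parameters: 1, x, x^2.
A = [[1, x, x * x] for x in xs]
coefficients = solve_normal_equations(A, ys)
print([round(c, 6) for c in coefficients])
```

The printed coefficients should be close to 1, 2 and 3, the values used to generate the data. Only the columns of `A` change when we add terms; the formula itself stays the same.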

For a proof of this formula, please refer to Hayashi, Fumio (2000). Econometrics. Princeton University Press. https://press.princeton.edu/titles/6946.html

Let’s illustrate this with an example. What if we want to find the best line through (1,1), (2,3), (3,3), (4,5)?

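Note: here $A^TA = \begin{bmatrix}4 & 10 \\ 10 & 30\end{bmatrix}$ and $A^Tb = \begin{bmatrix}12 \\ 36\end{bmatrix}$, so $(A^TA)^{-1} = \begin{bmatrix}\frac{3}{2} & \frac{-1}{2} \\ \frac{-1}{2} & \frac{1}{5}\end{bmatrix}$. The intercept is 0 since $\frac{3}{2} * 12 + \frac{-1}{2} * 36 = 0$ and the slope is $\frac{6}{5}$ since $\frac{-1}{2} * 12 + \frac{1}{5} * 36 = \frac{6}{5}$, which gives the best line $\hat{y} = \frac{6}{5}x$.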
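The arithmetic for this example can be checked with a short Python sketch. It uses exact fractions so that the results match the hand calculation. The variable names are assumptions for illustration, and the 2x2 inverse uses the standard adjugate formula.

```python
from fractions import Fraction

points = [(1, 1), (2, 3), (3, 3), (4, 5)]

# Each row of A is [1, x_i]; b holds the observed y_i.
A = [[Fraction(1), Fraction(x)] for x, _ in points]
b = [Fraction(y) for _, y in points]

# A^T A (a 2x2 matrix) and A^T b (a vector of length 2).
ata = [[sum(row[i] * row[j] for row in A) for j in range(2)] for i in range(2)]
atb = [sum(row[i] * y for row, y in zip(A, b)) for i in range(2)]

# Invert the 2x2 matrix with the adjugate formula.
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
inverse = [
    [ata[1][1] / det, -ata[0][1] / det],
    [-ata[1][0] / det, ata[0][0] / det],
]

# Multiply the inverse by A^T b to get the parameters.
intercept = inverse[0][0] * atb[0] + inverse[0][1] * atb[1]
slope = inverse[1][0] * atb[0] + inverse[1][1] * atb[1]

print(intercept, slope)  # 0 6/5
```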


Xavier Geerinck

