Review of Linear Algebra

Matrix

Def: A matrix $A \in \mathbb{R}^{m \times n}$ is a rectangular array of numbers arranged in $m$ rows and $n$ columns.

As an example application, we can create a matrix where we index the rows by companies and the columns by dates. Each entry in the matrix represents the stock price of a given company on a given date.
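For instance, here is a minimal numpy sketch of such a data matrix (the tickers, dates, and prices below are made up purely for illustration):

```python
import numpy as np

# Hypothetical data: 3 companies (rows) observed on 4 dates (columns).
# Entry [i, j] is the stock price of company i on date j.
companies = ["AAPL", "MSFT", "GOOG"]
dates = ["2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"]

prices = np.array([
    [185.6, 184.3, 181.9, 181.2],   # AAPL
    [370.9, 370.6, 368.0, 367.8],   # MSFT
    [139.6, 140.4, 138.5, 137.4],   # GOOG
])

print(prices.shape)   # (3, 4): rows = companies, columns = dates
print(prices[0, 2])   # AAPL's price on 2024-01-04
```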

Ok, so we can represent data in a matrix, but what makes it so powerful?

From a theoretical point of view, a matrix is an operation! That is, an $m \times n$ matrix $A$ represents a Linear Transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ given by $T(x) = Ax$.
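A small numpy sketch of this view, using an arbitrary $2 \times 3$ matrix to map vectors from $\mathbb{R}^3$ to $\mathbb{R}^2$ and checking linearity:

```python
import numpy as np

# A 2x3 matrix maps vectors in R^3 to vectors in R^2.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])

x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.5, 0.0])

# Applying the matrix is just matrix-vector multiplication.
print(A @ x)   # T(x) = Ax, a vector in R^2

# Linearity: T(a*x + b*y) == a*T(x) + b*T(y)
a, b = 2.0, -3.0
print(np.allclose(A @ (a * x + b * y), a * (A @ x) + b * (A @ y)))   # True
```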

Now let’s try to reconcile the two points of view. That is, what does it mean for a dataset to be a linear transformation?

Eigenvalues and Eigenvectors

The most important concepts for us are the Eigenvectors and Eigenvalues of a matrix. Recall that, due to the Characteristic Equation $\det(A - \lambda I) = 0$, every square matrix has eigenvalues and eigenvectors, though they may lie in the complex plane.
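A quick numpy sanity check of this, using a rotation matrix chosen because it has no real eigenvectors:

```python
import numpy as np

# A 90-degree rotation matrix: no real vector keeps its direction,
# so its eigenvalues are complex (they lie in the complex plane).
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Eigenvalues are the roots of the characteristic polynomial det(lambda*I - R) = 0.
char_poly = np.poly(R)         # coefficients of the characteristic polynomial
print(np.roots(char_poly))     # +1j and -1j: complex eigenvalues
print(np.linalg.eigvals(R))    # the same two complex eigenvalues, computed directly
```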

We also learn about the geometrical meaning of eigenvalues and eigenvectors.

Matrices are linear transformations which simply transform some vector $x$ into a new vector $Ax$. We note that an eigenvector is some special vector $v$ that, when transformed by the matrix, simply results in a scaled version of itself: $Av = \lambda v$. The scaling factor $\lambda$ is the eigenvalue.
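A minimal numpy check of this property, using an arbitrary small matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Each column of V is an eigenvector; eigvals[i] is its eigenvalue.
eigvals, V = np.linalg.eig(A)

for i in range(len(eigvals)):
    v = V[:, i]
    # Applying A to an eigenvector just scales it by its eigenvalue.
    print(np.allclose(A @ v, eigvals[i] * v))   # True
```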

Diagonalizable Matrices

A matrix $A$ is orthonormally diagonalizable if there exists an orthonormal matrix $Q$ (one whose columns are orthonormal, so $Q^T Q = I$) such that:

$$A = Q \Lambda Q^T$$

where $\Lambda$ is a diagonal matrix containing the eigenvalues of $A$.

Geometrical Meaning

This means that we can decompose the transformation $A$ into three simpler transformations:

  1. $Q^T$: Rotate the vector into the eigenvector basis
  2. $\Lambda$: Scale each component by the corresponding eigenvalue
  3. $Q$: Rotate back to the original basis

This decomposition is powerful because it allows us to understand and compute the effect of $A$ by breaking it down into simpler operations.
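A minimal numpy sketch of the three steps, using a small symmetric matrix so that an orthonormal $Q$ is guaranteed to exist (see the next subsection):

```python
import numpy as np

# A symmetric matrix, so it is orthonormally diagonalizable.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, Q = np.linalg.eigh(A)   # Q has orthonormal eigenvector columns
Lam = np.diag(eigvals)

x = np.array([1.0, -2.0])

# Step 1: rotate x into the eigenvector basis.
x_eig = Q.T @ x
# Step 2: scale each coordinate by its eigenvalue.
x_scaled = Lam @ x_eig
# Step 3: rotate back to the original basis.
x_out = Q @ x_scaled

print(np.allclose(x_out, A @ x))       # True: the three steps equal applying A
print(np.allclose(Q @ Lam @ Q.T, A))   # True: A == Q Lam Q^T
```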

Symmetric Matrices

It is known that all Symmetric Matrices ($A = A^T$) are orthonormally diagonalizable and have real eigenvalues.

This is the content of the Spectral Theorem.

What about General Matrices?

First of all, not all matrices are diagonalizable.

But we still want to understand the matrix through “simple” operations such as scaling.

Fortunately, there is still a tool we can apply to all matrices: the Singular Value Decomposition (SVD)! It factors any $m \times n$ matrix as $A = U \Sigma V^T$, where the columns of $U$ and $V$ are orthonormal and $\Sigma$ is diagonal with nonnegative singular values.

Comparing EVD with SVD

EVD only takes in square, diagonalizable matrices, whereas SVD can take in any matrix. The reason why EVD is still more powerful (when it applies) is because

EVD gives you a set of vectors on which applying $A$ simply scales them: $A v_i = \lambda_i v_i$.

Whereas SVD gives you a set of (right singular) vectors $v_i$ and tells you how they map and scale onto a set of (left singular) vectors $u_i$: $A v_i = \sigma_i u_i$. What happens to the leftover right singular vectors, those beyond the rank of $A$? They get mapped to the zero vector!
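A small numpy illustration of this, using a deliberately rank-deficient matrix so that some right singular vectors do get sent to zero:

```python
import numpy as np

# A rank-1 matrix in R^{2x3}: only one nonzero singular value.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

U, s, Vt = np.linalg.svd(A)   # full SVD: U is 2x2, Vt is 3x3
print(s)                      # one nonzero singular value, then ~0

# Each right singular vector v_i maps to sigma_i * u_i ...
for i in range(len(s)):
    print(np.allclose(A @ Vt[i], s[i] * U[:, i]))   # True

# ... and the remaining right singular vectors map to (numerically) zero.
for i in range(len(s), Vt.shape[0]):
    print(np.allclose(A @ Vt[i], 0.0))              # True
```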

Looking at our data matrix example, we can assume that we will have a lot more columns than rows if the columns are dates and the rows are companies. Thus, we will often use the simplified (reduced) SVD form, which keeps only the first $m$ rows of $V^T$, as it saves an enormous amount of computation and storage.
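A sketch of the difference, assuming an illustrative $50 \times 2500$ matrix of random entries:

```python
import numpy as np

rng = np.random.default_rng(0)

# Wide data matrix: 50 companies (rows), 2500 dates (columns).
m, n = 50, 2500
X = rng.standard_normal((m, n))

# Full SVD materializes a huge n x n matrix of right singular vectors.
U_full, s_full, Vt_full = np.linalg.svd(X, full_matrices=True)
print(U_full.shape, s_full.shape, Vt_full.shape)   # (50, 50) (50,) (2500, 2500)

# Reduced ("simplified") SVD keeps only the 50 rows of Vt that matter.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
print(U.shape, s.shape, Vt.shape)                  # (50, 50) (50,) (50, 2500)

# Both reconstruct X, but the reduced form is far cheaper to store and compute.
print(np.allclose(U @ np.diag(s) @ Vt, X))         # True
```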

Bringing it back

Let’s reconsider a data matrix $X \in \mathbb{R}^{m \times n}$ with $m$ companies as rows and $n$ dates as columns, where each row contains the returns of that company.

Then $\frac{1}{n} X X^T$ (assuming each row has been mean-centered) is a covariance matrix, where entry $(i, j)$ represents the covariance between companies $i$ and $j$.

An eigenvector of this matrix represents a weighted combination of companies (a portfolio or factor), and its corresponding eigenvalue measures the amount of variance explained by that factor.
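To make this concrete, here is a minimal numpy sketch using randomly generated returns (purely illustrative numbers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily returns: 5 companies (rows) over 250 dates (columns).
m, n = 5, 250
X = rng.standard_normal((m, n)) * 0.01

# Mean-center each company's returns, then form the covariance matrix.
Xc = X - X.mean(axis=1, keepdims=True)
C = (Xc @ Xc.T) / n   # m x m covariance matrix (companies x companies)

# C is symmetric, so eigh gives real eigenvalues and orthonormal eigenvectors.
eigvals, eigvecs = np.linalg.eigh(C)

# The eigenvector with the largest eigenvalue is the portfolio/factor weights
# explaining the most variance; its eigenvalue is that variance.
top_factor = eigvecs[:, -1]
explained = eigvals[-1] / eigvals.sum()
print(top_factor)   # weights across the 5 companies
print(f"fraction of variance explained: {explained:.2%}")
```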