Tuesday, September 18, 2018

Tips for linear algebra

Recently, a friend asked me about some equations in Pattern Recognition and Machine Learning, so I will share a few of them here :) In this article, we are going to work through the following two equations on page 80. (Let $\Sigma$ be the variance-covariance matrix, $\vec{u}_i$ the eigenvectors, and $\lambda_i$ the eigenvalues.)
$$ \Sigma = \sum_{i=1}^{D}\lambda_i \vec{u}_i \vec{u}_i^T \tag{2.48}$$ $$ \Sigma^{-1} = \sum_{i=1}^{D}\frac{1}{\lambda_i} \vec{u}_i \vec{u}_i^T \tag{2.49}$$

1. Prerequisite Knowledge

To understand the above two equations, we need two pieces of prerequisite knowledge.

  • Inverse matrix of a diagonal matrix
    Suppose we have an $n \times n$ diagonal matrix as follows,
$$D = \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}$$
    This matrix is *invertible* if all of the elements on the main diagonal are non-zero.
    Then the *inverse matrix* of $D$ has the *reciprocals* of those elements on the main diagonal,
    as below. $$D^{-1} = \begin{pmatrix} \frac{1}{\lambda_1} & & & 0 \\ & \frac{1}{\lambda_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{\lambda_n} \end{pmatrix}$$
  • Inverse matrix of an orthogonal matrix
    If the matrix whose inverse we need is an orthogonal matrix, we can skip the usual computation entirely. Let's say a matrix $U$ is orthogonal. Then the inverse of $U$ is simply its transpose, $U^{-1} = U^T$. (A small numerical check of both facts is sketched right after this list.)
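If you want to convince yourself of these two facts numerically, here is a minimal NumPy sketch (the sizes and diagonal entries are made-up examples, not anything from the book):

```python
import numpy as np

# Diagonal matrix: its inverse just has the reciprocals on the diagonal.
lam = np.array([2.0, 3.0, 5.0])               # hypothetical non-zero diagonal entries
D = np.diag(lam)
assert np.allclose(np.linalg.inv(D), np.diag(1.0 / lam))

# Orthogonal matrix: its inverse equals its transpose.
# QR decomposition of a random square matrix gives us an orthogonal Q to play with.
Q, _ = np.linalg.qr(np.random.randn(3, 3))
assert np.allclose(np.linalg.inv(Q), Q.T)
```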

2. Derivation of equation

From the definition of an eigenvector,

$$\Sigma \vec{u}_i = \lambda_i \vec{u}_i$$

Therefore,

$$\Sigma (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) = (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)\tag{1}$$

Since $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)$ is an orthogonal matrix (the eigenvectors of a symmetric matrix such as $\Sigma$ can be chosen orthonormal), we can apply the "inverse matrix of an orthogonal matrix" fact we discussed above: multiplying both sides of $(1)$ from the right by its transpose gives

$$\begin{eqnarray}\Sigma &=& (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^T\\ \Sigma &=& (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)\begin{pmatrix} \vec{u}_1^T\\ \vec{u}_2^T\\ \vdots\\ \vec{u}_D^T \end{pmatrix} \end{eqnarray}$$

As a result, we obtain equation $(2.48)$: $$ \Sigma = \sum_{i=1}^{D}\lambda_i \vec{u}_i \vec{u}_i^T \tag{2.48}$$
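To make this concrete, here is a rough NumPy sketch that builds a sample covariance matrix from random data, takes its eigendecomposition, and checks both equation $(1)$ and the reconstruction $(2.48)$ (the data and dimensions are invented purely for illustration):

```python
import numpy as np

# A hypothetical covariance matrix built from random data (symmetric, positive definite).
X = np.random.randn(500, 4)
Sigma = np.cov(X, rowvar=False)

# eigh returns the eigenvalues and orthonormal eigenvectors of a symmetric matrix;
# the columns of U are the eigenvectors u_i.
lam, U = np.linalg.eigh(Sigma)

# Equation (1): Sigma (u_1, ..., u_D) = (lambda_1 u_1, ..., lambda_D u_D)
assert np.allclose(Sigma @ U, U * lam)

# Equation (2.48): Sigma = sum_i lambda_i u_i u_i^T
Sigma_rebuilt = sum(l * np.outer(u, u) for l, u in zip(lam, U.T))
assert np.allclose(Sigma, Sigma_rebuilt)
```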

Equation $(1)$ can also be written as follows, $$\Sigma (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) = (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_D \end{pmatrix}$$

Multiplying both sides from the left by $\Sigma^{-1}$, the inverse of the covariance matrix,

$$\Sigma^{-1}(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_D \end{pmatrix}=(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) $$

Now it is time to apply the "inverse matrix of a diagonal matrix" fact we discussed above: multiplying both sides from the right, first by the inverse of the diagonal matrix and then by $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^T$ (using orthogonality once more), we get

$$\Sigma^{-1} =(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)\begin{pmatrix} \frac{1}{\lambda_1} & & & 0 \\ & \frac{1}{\lambda_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{\lambda_D} \end{pmatrix} \begin{pmatrix} \vec{u}_1^T\\ \vec{u}_2^T\\ \vdots\\ \vec{u}_D^T \end{pmatrix}$$

Finally, we get $(2.49)$ :) $$ \Sigma^{-1} = \sum_{i=1}^{D}\frac{1}{\lambda_i} \vec{u}_i \vec{u}_i^T \tag{2.49}$$
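The same kind of numerical check works for $(2.49)$: the inverse is rebuilt from the same eigenvectors, with each eigenvalue replaced by its reciprocal (again just a sketch with random data, which assumes $\Sigma$ has no zero eigenvalues):

```python
import numpy as np

# A hypothetical covariance matrix from random data; with 500 samples in 4 dimensions
# it is positive definite, so all eigenvalues are non-zero and Sigma is invertible.
X = np.random.randn(500, 4)
Sigma = np.cov(X, rowvar=False)
lam, U = np.linalg.eigh(Sigma)

# Equation (2.49): Sigma^{-1} = sum_i (1 / lambda_i) u_i u_i^T
Sigma_inv_rebuilt = sum((1.0 / l) * np.outer(u, u) for l, u in zip(lam, U.T))
assert np.allclose(np.linalg.inv(Sigma), Sigma_inv_rebuilt)
```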
