Tuesday, September 18, 2018

Tips for linear algebra

Recently, a friend asked me about some equations in Pattern Recognition and Machine Learning, so I will share a few of them here :) In this article, we are going to work through the following two equations on page 80. (Let $\Sigma$ be the variance-covariance matrix, $\vec{u}_i$ the eigenvectors, and $\lambda_i$ the eigenvalues.)
$$ \Sigma = \sum_{i=1}^{D}\lambda_i \vec{u}_i \vec{u}_i^T \tag{2.48}$$ $$ \Sigma^{-1} = \sum_{i=1}^{D}\frac{1}{\lambda_i} \vec{u}_i \vec{u}_i^T \tag{2.49}$$

1. Prerequisite Knowledge

To understand the above two equations, we need two pieces of prerequisite knowledge.

  • Inverse matrix of a diagonal matrix
    Suppose we have an $n \times n$ diagonal matrix as follows,
$$D = \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}$$
    This matrix is *invertible* if all of the elements on the main diagonal are non-zero.
    Then the *inverse matrix* of $D$ has the *reciprocals* of those elements on the main diagonal,
    as below. $$D^{-1} = \begin{pmatrix} \frac{1}{\lambda_1} & & & 0 \\ & \frac{1}{\lambda_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{\lambda_n} \end{pmatrix}$$
  • Inverse matrix of an orthogonal matrix
    If the matrix whose inverse we need is an orthogonal matrix, we can skip the usual computation entirely. Let's say a matrix $U$ is orthogonal. Then the inverse of $U$ is simply its transpose, $U^{-1} = U^T$. (A small numerical check of both facts is sketched right after this list.)
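If you want to convince yourself of these two facts numerically, here is a minimal NumPy sketch (the sizes and diagonal entries are made-up examples, not anything from the book):

```python
import numpy as np

# Diagonal matrix: its inverse just has the reciprocals on the diagonal.
lam = np.array([2.0, 3.0, 5.0])               # hypothetical non-zero diagonal entries
D = np.diag(lam)
assert np.allclose(np.linalg.inv(D), np.diag(1.0 / lam))

# Orthogonal matrix: its inverse equals its transpose.
# QR decomposition of a random square matrix gives us an orthogonal Q to play with.
Q, _ = np.linalg.qr(np.random.randn(3, 3))
assert np.allclose(np.linalg.inv(Q), Q.T)
```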

2. Derivation of equation

From the definition of an eigenvector,

$$\Sigma \vec{u}_i = \lambda_i \vec{u}_i$$

Therefore,

$$\Sigma (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) = (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)\tag{1}$$

Since $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)$ is an orthogonal matrix (the eigenvectors of a symmetric matrix such as $\Sigma$ can be chosen orthonormal), we can apply the "inverse matrix of an orthogonal matrix" fact we discussed above: multiplying both sides of $(1)$ from the right by its transpose gives

$$\begin{eqnarray}\Sigma &=& (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^T\\ \Sigma &=& (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)\begin{pmatrix} \vec{u}_1^T\\ \vec{u}_2^T\\ \vdots\\ \vec{u}_D^T \end{pmatrix} \end{eqnarray}$$

As a result, we obtain equation $(2.48)$: $$ \Sigma = \sum_{i=1}^{D}\lambda_i \vec{u}_i \vec{u}_i^T \tag{2.48}$$
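To make this concrete, here is a rough NumPy sketch that builds a sample covariance matrix from random data, takes its eigendecomposition, and checks both equation $(1)$ and the reconstruction $(2.48)$ (the data and dimensions are invented purely for illustration):

```python
import numpy as np

# A hypothetical covariance matrix built from random data (symmetric, positive definite).
X = np.random.randn(500, 4)
Sigma = np.cov(X, rowvar=False)

# eigh returns the eigenvalues and orthonormal eigenvectors of a symmetric matrix;
# the columns of U are the eigenvectors u_i.
lam, U = np.linalg.eigh(Sigma)

# Equation (1): Sigma (u_1, ..., u_D) = (lambda_1 u_1, ..., lambda_D u_D)
assert np.allclose(Sigma @ U, U * lam)

# Equation (2.48): Sigma = sum_i lambda_i u_i u_i^T
Sigma_rebuilt = sum(l * np.outer(u, u) for l, u in zip(lam, U.T))
assert np.allclose(Sigma, Sigma_rebuilt)
```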

Equation $(1)$ can also be written as follows, $$\Sigma (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) = (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_D \end{pmatrix}$$

Multiplying both sides from the left by $\Sigma^{-1}$, the inverse of the covariance matrix,

$$\Sigma^{-1}(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_D \end{pmatrix}=(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) $$

Now it is time to apply the "inverse matrix of a diagonal matrix" fact we discussed above: multiplying both sides from the right, first by the inverse of the diagonal matrix and then by $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^T$ (using orthogonality once more), we get

$$\Sigma^{-1} =(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)\begin{pmatrix} \frac{1}{\lambda_1} & & & 0 \\ & \frac{1}{\lambda_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{\lambda_D} \end{pmatrix} \begin{pmatrix} \vec{u}_1^T\\ \vec{u}_2^T\\ \vdots\\ \vec{u}_D^T \end{pmatrix}$$

Finally, we get $(2.49)$ :) $$ \Sigma^{-1} = \sum_{i=1}^{D}\frac{1}{\lambda_i} \vec{u}_i \vec{u}_i^T \tag{2.49}$$
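The same kind of numerical check works for $(2.49)$: the inverse is rebuilt from the same eigenvectors, with each eigenvalue replaced by its reciprocal (again just a sketch with random data, which assumes $\Sigma$ has no zero eigenvalues):

```python
import numpy as np

# A hypothetical covariance matrix from random data; with 500 samples in 4 dimensions
# it is positive definite, so all eigenvalues are non-zero and Sigma is invertible.
X = np.random.randn(500, 4)
Sigma = np.cov(X, rowvar=False)
lam, U = np.linalg.eigh(Sigma)

# Equation (2.49): Sigma^{-1} = sum_i (1 / lambda_i) u_i u_i^T
Sigma_inv_rebuilt = sum((1.0 / l) * np.outer(u, u) for l, u in zip(lam, U.T))
assert np.allclose(np.linalg.inv(Sigma), Sigma_inv_rebuilt)
```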
