Recently, a friend asked me about some equations in Pattern Recognition and Machine Learning, so I will share some of them here :) In this article, we're going to work through the following two equations on page 80. (Let $\Sigma$ be the variance-covariance matrix, $\vec{u}_i$ the eigenvectors, and $\lambda_i$ the eigenvalues.)
$$ \Sigma = \sum_{i=1}^{D}\lambda_i \vec{u}_i \vec{u}_i^T \tag{2.48}$$
$$ \Sigma^{-1} = \sum_{i=1}^{D}\frac{1}{\lambda_i} \vec{u}_i \vec{u}_i^T \tag{2.49}$$
1. Prerequisite Knowledge
To understand the above two equations, we need two pieces of prerequisite knowledge.
- Inverse matrix of a diagonal matrix

Suppose we have an $n \times n$ diagonal matrix as follows,
$$D = \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}$$
This matrix is *invertible* if all of the elements on the main diagonal are non-zero.
Then, the *inverse matrix* of $D$ has the *reciprocals* of those elements on its main diagonal,
as below. $$D^{-1} = \begin{pmatrix} \frac{1}{\lambda_1}& & & 0 \\ & \frac{1}{\lambda_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{\lambda_n} \end{pmatrix}$$
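As a quick numerical check, here is a minimal NumPy sketch; the diagonal values are arbitrary and just for illustration.

```python
import numpy as np

# A 3 x 3 diagonal matrix with non-zero entries on the main diagonal
D = np.diag([2.0, 4.0, 5.0])

# Its inverse is just the diagonal matrix of reciprocals
D_inv = np.diag(1.0 / np.diag(D))

print(np.allclose(np.linalg.inv(D), D_inv))  # True
```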
- Inverse matrix of an orthogonal matrix

If the matrix whose inverse we want is an orthogonal matrix, we can save the effort of computing the inverse directly. Let's say the matrix $U$ is orthogonal, i.e. $U^T U = I$. Then the inverse of $U$ is simply its transpose, $U^{-1} = U^T$.
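For example, here is a small NumPy sketch; a rotation matrix is used only as one convenient example of an orthogonal matrix.

```python
import numpy as np

# A 2 x 2 rotation matrix is a simple example of an orthogonal matrix
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# For an orthogonal matrix, the inverse is the transpose
print(np.allclose(np.linalg.inv(U), U.T))  # True
print(np.allclose(U.T @ U, np.eye(2)))     # True
```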
2. Derivation of the equations
From the definition of an eigenvector,
$$\Sigma \vec{u}_i = \lambda_i \vec{u}_i$$
Therefore,
$$\Sigma (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) = (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)\tag{1}$$
Since $\Sigma$ is symmetric, its eigenvectors can be chosen to be orthonormal, so $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)$ is an orthogonal matrix and we can apply the "inverse matrix of an orthogonal matrix" fact we discussed above. Multiplying both sides on the right by $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^{-1} = (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^T$,
$$\begin{eqnarray}\Sigma &=& (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^T\\ \Sigma &=& (\lambda_1\vec{u}_1, \lambda_2 \vec{u}_2, \dots, \lambda_D\vec{u}_D)\begin{pmatrix}\vec{u}_1^T\\ \vec{u}_2^T\\ \vdots\\ \vec{u}_D^T \end{pmatrix} \end{eqnarray}$$
As a result, we get equation $(2.48)$:
$$ \Sigma = \sum_{i=1}^{D}\lambda_i \vec{u}_i \vec{u}_i^T \tag{2.48}$$
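To make this concrete, here is a small numerical check of $(2.48)$ with NumPy; the matrix below is just an arbitrary symmetric positive-definite example standing in for $\Sigma$.

```python
import numpy as np

# An arbitrary symmetric positive-definite matrix standing in for Sigma
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])

# eigh returns eigenvalues and orthonormal eigenvectors (as columns of U)
lam, U = np.linalg.eigh(Sigma)

# Equation (2.48): Sigma equals the sum of lambda_i * u_i * u_i^T
Sigma_rec = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(len(lam)))

print(np.allclose(Sigma, Sigma_rec))  # True
```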
Equation $(1)$ can be expressed as below, $$\Sigma (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) = (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) \begin{pmatrix} \lambda_1& & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_D \end{pmatrix}$$
Multiplying both sides by the inverse of the covariance matrix from the left,
$$\Sigma^{-1}(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) \begin{pmatrix} \lambda_1& & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_D \end{pmatrix}=(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D) $$
Now is the time to apply the inverse of a diagonal matrix we discussed above: multiplying both sides from the right by the inverse of the diagonal matrix of eigenvalues, and then by $(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^{-1} = (\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)^T$, we obtain
$$\Sigma^{-1} =(\vec{u}_1, \vec{u}_2, \dots, \vec{u}_D)\begin{pmatrix} \frac{1}{\lambda_1}& & & 0 \\ & \frac{1}{\lambda_2} & & \\ & & \ddots & \\ 0 & & & \frac{1}{\lambda_D} \end{pmatrix} \begin{pmatrix}\vec{u}_1^T\\ \vec{u}_2^T\\ \vdots\\ \vec{u}_D^T \end{pmatrix}$$
Finally, we obtain $(2.49)$ :) $$ \Sigma^{-1} = \sum_{i=1}^{D}\frac{1}{\lambda_i} \vec{u}_i \vec{u}_i^T \tag{2.49}$$
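And the same kind of numerical check works for $(2.49)$; again, the matrix is an arbitrary illustrative example, not anything from the book.

```python
import numpy as np

# An arbitrary symmetric positive-definite matrix standing in for Sigma
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.5, 0.2],
                  [0.3, 0.2, 1.0]])

lam, U = np.linalg.eigh(Sigma)

# Equation (2.49): Sigma^{-1} equals the sum of (1/lambda_i) * u_i * u_i^T
Sigma_inv_rec = sum((1.0 / lam[i]) * np.outer(U[:, i], U[:, i]) for i in range(len(lam)))

print(np.allclose(np.linalg.inv(Sigma), Sigma_inv_rec))  # True
```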