0. Introduction to "decorrelation"
"Decorrelation" is linear transformation that transfer a vector of random variables with known covariance matrix into new random variables whose covariance is diagonal matrix, which means each random variables are uncorrelated.
1. Matrix consisting of eigenvectors of the covariance matrix
Suppose the observed data $\vec{x}$ is a $d$-dimensional vector, $\left(\begin{array}{c} x_1\\x_2\\\vdots\\ x_d \end{array}\right)$, and the variance-covariance matrix of $\vec{x}$ is $\Sigma$. Let the eigenvalues of $\Sigma$ be $\lambda_1, \lambda_2, \cdots, \lambda_d$ (with $\lambda_1 \leqq \lambda_2 \leqq \cdots \leqq \lambda_d$) and the corresponding eigenvectors be $\vec{s_1}, \vec{s_2}, \cdots, \vec{s_d}$.
By definition, $\Sigma\vec{s_i} = \lambda_i \vec{s_i}$ holds.
Since $\Sigma$ is symmetric, we can take $\left\{\vec{s_1}, \vec{s_2}, \cdots, \vec{s_d}\right\}$ as an orthonormal basis, so the matrix $S = \left(\vec{s_1}, \vec{s_2}, \cdots, \vec{s_d}\right)$ is an orthogonal matrix ($S^TS = I$). In fact, $S$ is a rotation matrix which rotates the standard basis onto the directions spanned by the eigenvectors.
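As a quick sanity check, here is a minimal numpy sketch (using a hypothetical symmetric $2\times2$ covariance matrix of my own choosing, not anything derived in this post) confirming that the eigenvector matrix of a symmetric matrix is orthogonal and that each column satisfies $\Sigma\vec{s_i} = \lambda_i \vec{s_i}$:
import numpy as np
import numpy.linalg as LA
# A hypothetical symmetric 2x2 covariance matrix
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
lam, S = LA.eig(Sigma)  # eigenvalues, and eigenvectors as columns of S
# S is orthogonal: S^T S = I
print(np.allclose(S.T @ S, np.eye(2)))  # True
# Each column satisfies Sigma s_i = lambda_i s_i
print(np.allclose(Sigma @ S, S * lam))  # True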
2. Linear transformation by $S^T$
Let us consider the linear transformation by $S^T$. Suppose $\vec{y}$ is the data obtained by applying this linear transformation to $\vec{x}$.
$$\vec{y} = S^T\vec{x}$$
The mean of $\vec{y}$, $E\{\vec{y}\}$, and the covariance matrix, $Var\{\vec{y}\}$, are calculated as below.
$$\begin{eqnarray}E\{\vec{y}\} &=& E\{S^T\vec{x}\}\\ &=& S^TE\{\vec{x}\} \\ &=& S^T\mu\ \ (\mu = E\{\vec{x}\})\end{eqnarray}$$
$$\begin{eqnarray}Var\{\vec{y}\} &=& E\{(\vec{y}-E\{\vec{y}\})(\vec{y}-E\{\vec{y}\})^T\}\\ &=& E\{(S^T\vec{x}-S^T\mu)(S^T\vec{x}-S^T\mu)^T\} \\ &=& E\{(S^T\vec{x}-S^T\mu)(\vec{x}^TS-\mu^TS)\} \\ &=& E\{S^T(\vec{x}-\mu)(\vec{x}-\mu)^TS\} \\ &=& S^TE\{(\vec{x}-\mu)(\vec{x}-\mu)^T\}S \\ &=& S^TVar\{\vec{x}\}S \\ &=& S^T\Sigma S \end{eqnarray}$$
This is exactly the diagonalization of $\Sigma$: since the columns of $S$ are eigenvectors, $\Sigma S = S\Lambda$ with $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_d)$, and because $S$ is orthogonal, $S^T\Sigma S = S^TS\Lambda = \Lambda$. Therefore,
$$Var\{y\} = \begin{pmatrix}
\lambda_1 & & & 0 \\
& \lambda_2 & & \\
& & \ddots & \\
0 & & & \lambda_d
\end{pmatrix}$$
Now we can see that the correlations between the transformed random variables are ZERO. That is why we call this linear transformation "decorrelation" :)
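To make the diagonalization concrete, here is a minimal numpy sketch (again assuming the same hypothetical symmetric covariance matrix as above) showing that $S^T\Sigma S$ indeed comes out diagonal, with the eigenvalues on the diagonal:
import numpy as np
import numpy.linalg as LA
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])
lam, S = LA.eig(Sigma)
# S^T Sigma S should equal diag(lambda_1, ..., lambda_d)
print(np.round(S.T @ Sigma @ S, 8))
print(lam)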
3. Visualization of decorrelation
So far, I have explained the logic of decorrelation. But "easier said than done": I believe the most efficient way to understand the logic is to implement it by hand and visualize it. So I will implement an example of decorrelation.
First of all, we're going to prepare observed data that are strongly correlated.
from scipy.stats import multivariate_normal
import numpy as np
import numpy.linalg as LA
import matplotlib.pyplot as plt
%matplotlib inline

# Parameters of a 2-dimensional Gaussian distribution
mean = np.array([0, 0])
var_matrix = np.array([[1, 0.8],
                       [0.8, 1]])
# Sample from the 2-dimensional Gaussian distribution
observ_data = multivariate_normal.rvs(mean=mean, cov=var_matrix,
                                      size=500, random_state=42)
x = observ_data[:, 0]
y = observ_data[:, 1]
# Plot the observed data
plt.scatter(x, y)
plt.xlim((-6, 6))
plt.ylim((-6, 6))
plt.title('Original data')
plt.show()
Now is the time to decorrelate these data :)
# Compute eigenvalues and eigenvectors
eigen_val, eigen_vec = LA.eig(var_matrix)
# Create the matrix S whose columns are the eigenvectors of the
# covariance matrix.
# Note: numpy.linalg.eig returns eigenvectors normalized to unit norm,
# stored as the columns of the returned matrix.
S_matrix = eigen_vec
# Decorrelate the observed data. In essence, this is a linear
# transformation which rotates the data :)
decorrelated_data = S_matrix.T @ observ_data.T
deco_x = decorrelated_data[0, :]
deco_y = decorrelated_data[1, :]
# Plot the decorrelated data
plt.scatter(deco_x, deco_y)
plt.xlim((-6, 6))
plt.ylim((-6, 6))
plt.title('Decorrelated data')
plt.show()
Now you can see the data have been decorrelated successfully :)
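As one extra check beyond the plot (this snippet is not in the original experiment), we can look at the empirical covariance matrix of the decorrelated data; with 500 samples the off-diagonal entries should be close to zero, and the diagonal entries should approximate the eigenvalues of var_matrix:
# Empirical covariance of the decorrelated data
# (rows of decorrelated_data are the two variables)
print(np.round(np.cov(decorrelated_data), 3))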