Thursday, July 12, 2018

Gentle introduction to Gibbs Sampling

In Bayesian machine learning, we sometimes face an intractable probability density function, and then approximation is required. In this article, I'll share with you "Gibbs sampling", which is one slick and practical approximation technique. I would be glad if you find this article informative :)

0. Gentle introduction

"Gibbs sampling" is one of "Markov chain Monte Carlo" algorithm. It's the approach to obtain some infomation, such as expectation, variance, by sampling. Suppose we have distribution $p(z_1,z_2,z_3)$ over three variables and at step $i$ of the "gibbs sampling", we have selected values $z_1^i, z_2^i,z_3^i$. We can draw $z_1^{i+1}$ from the conditional distribution which is conditioned by $z_2^i$ and $z_3^i$.

$$z_1^{i+1} \sim p(z_1 | z_2^i,z_3^i)$$$$z_2^{i+1} \sim p(z_2 | z_1^{i+1},z_3^i)$$$$z_3^{i+1} \sim p(z_3 | z_1^{i+1},z_2^{i+1})$$

By repeating this cycle, we obtain the samples $z_1^{i+1},z_2^{i+1},z_3^{i+1}$.
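
As a minimal sketch in code, one such sampler could look like the following. Here sample_z1, sample_z2 and sample_z3 are hypothetical functions that draw from the three conditional distributions above; the concrete versions for a Gaussian appear in section 2.

import numpy as np

def gibbs_sampling(z_init, sample_z1, sample_z2, sample_z3, iter_num):
    """Generic Gibbs sampler over three variables.

    sample_z1/2/3 draw from the conditional distributions
    p(z1|z2,z3), p(z2|z1,z3) and p(z3|z1,z2), respectively.
    """
    z1, z2, z3 = z_init
    samples = np.empty((iter_num, 3))
    for i in range(iter_num):
        z1 = sample_z1(z2, z3)  # z1 ~ p(z1 | z2, z3)
        z2 = sample_z2(z1, z3)  # z2 ~ p(z2 | z1, z3)
        z3 = sample_z3(z1, z2)  # z3 ~ p(z3 | z1, z2)
        samples[i] = (z1, z2, z3)
    return samples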

1. Example of a 2-dimensional Gaussian distribution

As an example of Gibbs sampling, a 2-dimensional Gaussian distribution will be approximated.

$$\begin{eqnarray}p(x_1,x_2) &=& N(x_1,x_2 | \mu, \Lambda^{-1}) \\&=& \frac{1}{2\pi|\Lambda^{-1}|^{1/2}}\exp\left\{-\frac{1}{2}\left(\begin{array}{c} x_1 - \mu_1\\ x_2 - \mu_2\end{array}\right)^T\begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{21} & \Lambda_{22} \end{pmatrix}\left(\begin{array}{c} x_1 - \mu_1\\ x_2 - \mu_2\end{array}\right)\right\}\end{eqnarray}$$

To approximate this distribution with Gibbs sampling, we have to be familiar with the conditionals $p(x_1|x_2)$ and $p(x_2|x_1)$. We can derive them with comparative ease by collecting the terms in $x_1$ of the log density:

$$\log p(x_1 | x_2) = -\frac{1}{2}\left\{\Lambda_{11}x_1^2 -2\left(\Lambda_{11}\mu_1 - \Lambda_{12}(x_2 - \mu_2)\right)x_1\right\} +\ \mathrm{const}$$

From the above quadratic form in $x_1$, we can say that $p(x_1|x_2)$ is a 1-dimensional Gaussian distribution, with $\hat{\lambda}$ denoting its precision. $$p(x_1|x_2) = N(x_1| \hat{\mu},\hat{\lambda}^{-1})$$

$$\begin{eqnarray}\hat{\lambda} &=& \Lambda_{11} \\ \hat{\mu} &=& \mu_{1} - \frac{\Lambda_{12}}{\Lambda_{11}}(x_2 -\mu_2)\end{eqnarray}$$

By the same token, $p(x_2|x_1)$ is a 1-dimensional Gaussian distribution as follows.
$$p(x_2|x_1) = N(x_2| \hat{\mu},\hat{\lambda}^{-1})$$

$$\begin{eqnarray}\hat{\lambda} &=& \Lambda_{22} \\ \hat{\mu} &=& \mu_{2} - \frac{\Lambda_{12}}{\Lambda_{22}}(x_1 - \mu_1)\end{eqnarray}$$
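
As a quick numerical sanity check (a minimal sketch, assuming the mean and covariance that will be used in the next section), these conditional parameters can be computed directly from the precision matrix:

import numpy as np
import numpy.linalg as LA

mu = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])
prec = LA.inv(cov)  # precision matrix Lambda

x_2 = 1.0  # an arbitrary conditioning value
lam_hat = prec[0, 0]                                        # precision of p(x_1 | x_2)
mu_hat = mu[0] - (prec[0, 1] / prec[0, 0]) * (x_2 - mu[1])  # conditional mean

# For this covariance (rho = 0.5, unit variances) the familiar closed form
# gives mean rho * x_2 = 0.5 and variance 1 - rho^2 = 0.75, which matches.
print(mu_hat, 1 / lam_hat)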

2. Implementation

I'll show you the example of the 2-dimensional Gaussian distribution. In the example, we assume the mean and covariance matrix of the distribution are $\mu = \left(\begin{array}{c} 0 \\ 0 \end{array}\right)$ and $\Sigma = \begin{pmatrix} 1&0.5\\0.5&1 \end{pmatrix}$, respectively.

In [1]:
import numpy as np
import numpy.linalg as LA
from scipy.stats import norm
from scipy.stats import multivariate_normal
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
# Parameters of the 2-dimensional Gaussian distribution
mean = np.array([0,0])
cov = np.array([[1,0.5],
               [0.5,1]])

x = np.linspace(-2,2,101)
y = np.linspace(-2,2,101)
xx,yy = np.meshgrid(x,y)
data = np.empty(xx.shape + (2,))
data[:,:,0] = xx
data[:,:,1] = yy

# Evaluate the pdf of the 2-dimensional Gaussian distribution on the grid
z = multivariate_normal.pdf(x=data,mean=mean, cov=cov)

# Plot 2-dim gaussian distribution to be approximated
plt.figure(figsize=(8,6))
plt.contour(xx,yy,z,levels=[0.05,0.1,0.15,0.18],colors='blue')
plt.xlabel('x_1',fontsize=12)
plt.ylabel('x_2',fontsize=12)
plt.xlim((-3,3))
plt.ylim((-3,3))
plt.title('2-dimensional Gaussian distribution to be approximated',
              fontsize=14)
plt.show()

Now it's time for Gibbs sampling :)

In [3]:
class gibbs_gauss2D():
    """
    Class for gibbs sampling from 2-dim gaussian
    """
    
    def __init__(self, mean, cov):
        """
        Mean and covariance matrix of 2-dim gaussian
        distribution to be approximated are assumed 
        to be given.
        """
        self.mean = mean
        self.cov = cov
        self.prec = LA.inv(cov)
        
    def sample_x_1(self, x_2):
        """
        Sample x_1 given x_2.
        """
        lam = self.prec[0,0]  # precision of p(x_1 | x_2)
        mu = self.mean[0] - (self.prec[0,1]/self.prec[0,0]) * (x_2 - self.mean[1])
        return norm.rvs(loc=mu, scale=np.sqrt(1/lam))
        
    def sample_x_2(self, x_1):
        """
        Sample x_2 given x_1.
        """
        lam = self.prec[1,1]  # precision of p(x_2 | x_1)
        mu = self.mean[1] - (self.prec[1,0]/self.prec[1,1]) * (x_1 - self.mean[0])
        return norm.rvs(loc=mu, scale=np.sqrt(1/lam))
In [4]:
gibbs = gibbs_gauss2D(mean=mean,cov=cov)
iter_num = 100

# Numpy array to store the result of Gibbs sampling
sample_result = np.empty((iter_num,2))
trace_result = []

# Initial value for x_2.
x_2_pre = 0.1

for i in np.arange(iter_num):

    # For the first iteration only, x_2 is given by the initial value.
    if i ==0:
        sample_result[i,0] = gibbs.sample_x_1(x_2=x_2_pre)
        trace_result.append([sample_result[i,0],x_2_pre])
    else:
        sample_result[i,0] = gibbs.sample_x_1(
                                            x_2=sample_result[i-1,1])
        trace_result.append([sample_result[i,0],sample_result[i-1,1]])
        
    sample_result[i,1] = gibbs.sample_x_2(
                                            x_1=sample_result[i,0])
    trace_result.append([sample_result[i,0], sample_result[i,1]])

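One practical remark: the first draws still depend on the arbitrary starting value x_2 = 0.1, so it is common to discard an initial burn-in period. A minimal sketch (not applied in the plots below, which keep all samples):

burn_in = 10  # number of initial draws to discard; a common heuristic, not a fixed rule
kept_samples = sample_result[burn_in:]
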
Let's visualize the result of Gibbs sampling. As you can see, the samples are close to the original 2-dimensional Gaussian distribution.

In [5]:
trace_result = np.array(trace_result)
plt.figure(figsize=(13,10))

# Plot 2-dim gaussian distribution to be approximated
plt.contour(xx,yy,z,levels=[0.05,0.1,0.15,0.18],colors='blue',alpha=0.5)

# Plot result of sampling
plt.scatter(x=sample_result[:,0],y=sample_result[:,1])
# Plot trace data
plt.plot(trace_result[:,0],trace_result[:,1],color='gray',ls=':')

# Plot setting
plt.title('Gibbs sampling of a 2-dim Gaussian distribution',fontsize=16)
plt.xlim((-3,3))
plt.ylim((-3,3))
plt.xlabel('x_1',fontsize=14)
plt.ylabel('x_2',fontsize=14)
plt.show()

From the result of Gibbs sampling, let's plot the approximate distribution.

In [6]:
plt.figure(figsize=(13,10))

# Compute mean and covariance matrix from result of gibbs sampling
app_cov = np.cov(sample_result.T)
app_mean = sample_result.mean(axis=0)

# Compute probability of approximate distribution
app_z = multivariate_normal.pdf(x=data,mean=app_mean,cov=app_cov)

# Plot 2-dim gaussian distribution to be approximated
plt.scatter(mean[0],mean[1],marker='x',color='blue',
                   label='original distribution')
plt.contour(xx,yy,z,levels=[0.05,0.1,0.15,0.18],colors='blue',alpha=0.5)

# Plot approximate distribution
plt.scatter(app_mean[0],app_mean[1],marker='x',color='red',
                    label='approximate distribution')
plt.contour(xx,yy,app_z,levels=[0.05,0.1,0.15,0.18],colors='red')

# Plot result of sampling
plt.scatter(x=sample_result[:,0],y=sample_result[:,1])
# Plot trace data
plt.plot(trace_result[:,0],trace_result[:,1],color='gray',ls=':')

# Plot setting
plt.title('Approximation of a 2-dim Gaussian distribution with Gibbs sampling',
              fontsize=16)
plt.xlabel('x_1',fontsize=14)
plt.ylabel('x_2',fontsize=14)
plt.xlim((-3,3))
plt.ylim((-3,3))
plt.legend(loc=2)
plt.show()
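
To quantify how close the approximation is, we can also print the sample mean and covariance next to the true parameters (a small check using the variables defined above):

print('true mean:\n', mean)
print('sample mean:\n', app_mean)
print('true covariance:\n', cov)
print('sample covariance:\n', app_cov)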
