Sunday, June 9, 2019

Markov Chain Monte Carlo (Metropolis algorithm)

In this article, I will share the implementation Metropolis algorithm , one of Markov Chain Monte Carlo sampling, and see the how it capture the target distribution.

0. Logic behind the scene

Let's say we wanna know probability distribution $p(z)$

$$p(z) = \frac{1}{Z_p}\tilde{p}(z)$$


Usually, $\tilde{p}(z)$ can be readily known whereas the value of $\frac{1}{Z_p} $ which is normalization term is unknown.
In Metropolis algorithm, proposal distribution $q(z)$ is required to be symmentric, which means $q(z_a|z_b) = q(z_b| z_a)$ for all values of $z_a$ and $z_b$.
Then candidate sample with state $z^{(\tau)}$ is then accepted with probability,

$$min\left(1, \frac{\tilde{p}(z)}{\tilde{p}(z^{(\tau)})}\right)$$

If the candidate sample is accepted, state $z^{(\tau + 1)} = z$ will be updated. Otherewise candidate sample is discarded, then $z^{(\tau +1)} = z^{(\tau)}$.

1. Target distribution

In this blog, target distribution is 2-dimentional gaussian distribution without normalization term like following.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal
import numpy.linalg as LA
%matplotlib inline
plt.style.use('dark_background')
In [2]:
def p_tilda(X, mean=np.zeros(2).reshape(-1,1),
                cov=np.array([[1,0.6],[0.6,1]])):
    """
    2-dimentional gaussian distribution
    without normalization term
    """
    try:
        X.shape[1]
    except IndexError:
        X = X.reshape(-1,1)
    
    return np.exp(
        -0.5 * (X-mean).T@LA.inv(cov)@(X-mean)
    ).ravel()

2. Implementation

In following program, isotropic gaussian distribution whose standard deviation is 0.2 is utilized as proposal distribution. Also I set burnin is 30. Since early sampling tend to be affected by initial value of proposal distribution, thsese are plotted with low alpha. Rejected sample doesn't have dot in plot. Accepted sample is plotted by blue line, whereas rejected line is drawn by red line.

In [12]:
# As a proposal distribution, 
# isotropic gaussian dist is used
cov_prop = 0.4 * np.eye(2)

iter_num = 130
# initial state
current_state = np.array([2,1])
# list to store samples
samples = []

plt.figure(figsize=(10,8))
burnin = 30

for i in range(iter_num):
    # sampling from proposal distribution
    X_new = current_state + np.random.multivariate_normal(
        mean=np.zeros(2), 
        cov=cov_prop
    )
    # compute ptilda
    p_curr = p_tilda(current_state)
    p_new = p_tilda(X_new)
    
    # accept or reject 
    if (p_new / p_curr) > np.random.uniform():
        samples.append(X_new)
    
        plt.plot(
            np.hstack([current_state[0], X_new[0]]),
            np.hstack([current_state[1], X_new[1]]),
            color='blue'
        )
        alpha = 0.5 if len(samples) < burnin else 1
        plt.scatter(X_new[0], X_new[1],color='white', alpha=alpha)
            
        # update state
        current_state = X_new
    else :
        plt.plot(
            np.hstack([current_state[0], X_new[0]]),
            np.hstack([current_state[1], X_new[1]]),
            color='red'
        )

# plot contour of true distribution
x = np.linspace(-4,4,101)
y = np.linspace(-4,4,101)
xx, yy = np.meshgrid(x,y)
prob = multivariate_normal.pdf(
    np.vstack([xx.ravel(),yy.ravel()]).T,
    mean=np.zeros(2),
    cov=np.array([[1,0.6],[0.6,1]])
)
plt.contour(
    xx, yy, prob.reshape(xx.shape),
    levels=[0.02],colors=['yellow'], 
)
plt.xlim(-4,4)
plt.ylim(-4,4)
plt.title('Markov Chain Monte Carlo with {} iteration'.format(iter_num)
          ,fontsize=16)
plt.show()

It looks that trainsitions by sampling from center to outside of contour is inclined to be rejected as you expected.

3. Sampling large number

Now we understand how it works, we wanna check the result with more iteration. The following is the result with 5000 iteration :)

In [186]:
# As a proposal distribution, 
# isotropic gaussian dist is used
cov_prop = 0.4 * np.eye(2)

iter_num = 5000
# samples = np.empty((iter_num+1, 2))
# initial state
current_state = np.array([2,1])
#samples[0] = np.array([2,1])
samples = []

plt.figure(figsize=(10,8))
alpha=0.5

burnin = 30

for i in range(iter_num):
    # sampling from proposal distribution
    X_new = current_state + np.random.multivariate_normal(
        mean=np.zeros(2), 
        cov=cov_prop
    )
    # compute ptilda
    p_curr = p_tilda(current_state)
    p_new = p_tilda(X_new)
    
    # accept or reject 
    if (p_new / p_curr) > np.random.uniform():
        samples.append(X_new)
        if len(samples) > burnin:
            plt.scatter(X_new[0], X_new[1],color='white', alpha=alpha)
        # update state
        current_state = X_new

# plot contour of true distribution
x = np.linspace(-4,4,101)
y = np.linspace(-4,4,101)
xx, yy = np.meshgrid(x,y)
prob = multivariate_normal.pdf(
    np.vstack([xx.ravel(),yy.ravel()]).T,
    mean=np.zeros(2),
    cov=np.array([[1,0.6],[0.6,1]])
)

plt.contour(xx, yy, prob.reshape(xx.shape),levels=[0.02],
            colors=['yellow'])
plt.xlim(-4,4)
plt.ylim(-4,4)
plt.title('Markov Chain Monte Carlo with {} iteration'.format(iter_num),
          fontsize=16)
plt.show()

Yay ! The result looks like sampling from target distribution which is represented by contour coloured with yellow. Also you can check the mean of sampling as following. It looks like that mean approaches (0, 0) which is actual mean of target distribution :)

In [198]:
# Compute mean of samples
samples = np.array(samples)
avg_sample = np.array([
    samples[:i+1].mean(axis=0) 
    for i in 
    range(samples.shape[0])
])

# plot mean
plt.figure(figsize=(10,5))
plt.plot(range(avg_sample.shape[0]), avg_sample[:,0], label='mean of x')
plt.plot(range(avg_sample.shape[0]), avg_sample[:,1], label='mean of y')
plt.xlabel('Number of samples',fontsize=14)
plt.ylabel('Mean',fontsize=14)
plt.legend(fontsize=14)
plt.show()