Tuesday, June 12, 2018

Linear regression with Bayesian inference ①

In this article I will share "Linear regression with Bayesian inference". I would be glad if you find it informative.


0. Model of linear regression

First of all, unless a model is defined, nothing can happen. Let $y_n \in \mathbb{R}$ be the output value, $x_n \in \mathbb{R}^M$ the input vector, $w \in \mathbb{R}^M$ the parameter vector, and $\varepsilon_n \in \mathbb{R}$ the noise. The linear regression model can be written as below.
$$y_n = w^Tx_n + \varepsilon_n \tag{1}$$
Now let us assume the noise $\varepsilon_n$ is generated according to a Gaussian distribution. Tentatively $\mu = 0$ here. $$\varepsilon_n \sim N(\varepsilon_n|0,\lambda^{-1}) \tag{2}$$
From $(1)$ and $(2)$, the likelihood $p(y_n|w, x_n)$ can be written as below: the mean of $y_n$ is the noise-free prediction $w^Tx_n$.
$$p(y_n|w,x_n) = N(y_n|w^Tx_n,\lambda^{-1}) \tag{3}$$
Now we want to learn the parameter $w$ from the observed data $y_n$ and $x_n$. Therefore, a prior distribution $p(w)$ should be prepared. Here we place a multivariate Gaussian distribution with mean $m$ and precision matrix $\Lambda$. $$p(w) = N(w| m,\Lambda^{-1}) \tag{4}$$
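For completeness (this factorization is a standard step, added here as a pointer to where inference will happen): combining the likelihood $(3)$ and the prior $(4)$, Bayesian inference works with the joint distribution of the parameter $w$ and all $N$ observations $Y = \{y_1, \dots, y_N\}$, which factorizes because the noise terms are independent. $$p(Y, w|X) = p(w)\prod_{n=1}^{N}p(y_n|w,x_n)$$ The posterior $p(w|Y,X)$ is proportional to this joint distribution once the data are observed.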

1. Sample from the model

After determining the model, we should check whether the model is reasonable or not. Let's sample from it, starting with the prior in equation $(4)$.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from scipy.stats import multivariate_normal
from scipy.stats import norm
In [2]:
# Required helper functions
def make_inputx(x):
    """
    Create a feature matrix whose rows are the
    powers of x (up to degree 3).

    Arguments
    ----------------------------
    x : numpy array of input values.
    """
    ones = np.ones_like(x).reshape(1, -1)
    x1 = (x**1).reshape(1, -1)
    x2 = (x**2).reshape(1, -1)
    x3 = (x**3).reshape(1, -1)

    return np.concatenate([ones, x1, x2, x3], axis=0)

def sample_prior(num_features):
    """
    Sample the parameter w from the prior.
    """
    # Tentatively, the prior mean is the zero vector
    # and the covariance is the identity matrix.
    mean = np.zeros(num_features)
    cov_matrix = np.eye(num_features)

    return multivariate_normal.rvs(mean=mean, cov=cov_matrix)

def sample_noize(num_feature):
    """
    Sample the noise epsilon for linear regression.
    (num_feature is unused; it is kept for interface symmetry.)
    """
    mean = 0
    std = 0.2  # standard deviation of the noise
    return norm.rvs(mean, std)
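As a quick sanity check on the feature matrix, here is a compact equivalent of `make_inputx` using `np.vstack` (this restatement is my own sketch, not the original code), together with the shapes it produces:

```python
import numpy as np

def make_inputx(x):
    # Equivalent to the version above: rows are [1, x, x^2, x^3].
    return np.vstack([np.ones_like(x), x, x**2, x**3])

x = np.linspace(-1, 1, 5)
X = make_inputx(x)
print(X.shape)   # (4, 5): one row per feature, one column per input point
print(X[:, 0])   # features of x = -1: [ 1. -1.  1. -1.]
```

Each column of the result is the feature vector $x_n = (1, x, x^2, x^3)^T$ of one input point, which is why `w @ X` below evaluates the polynomial at every point at once.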

With the following code, we can check the shapes of functions generated with parameters $w$ drawn from the prior distribution.

In [3]:
# Plot Function generated from prior distribution
w_sample = np.empty((4,4))
for i in range(0,4):
    w_sample[i] = sample_prior(4)

# 100 Observed value
x_obs = np.linspace(-1,1,100)

# Get features of x
x_plot = make_inputx(x_obs)

# Obtain output value
y = w_sample@x_plot

for i in range(0,4):
    plt.plot(x_obs,y[i])

Next, we will check $y_n$ with the noise $\varepsilon_n$ added. Tentatively the standard deviation is 0.2, as set in `sample_noize`.

In [4]:
# This time we use the same 'w' for both y and y_n
w = sample_prior(4)
x = np.linspace(-1,1,40)
x_input = make_inputx(x)
# Generate the same number of noise samples as x.
noize = np.array([sample_noize(4) for i in range(len(x))])
# y values of the noise-free function generated by parameter 'w',
# evaluated on the denser grid x_obs from the previous cell.
y = w@x_plot
# y_n = w^T x_n + e_n
y_n = (w@x_input) + noize
plt.scatter(x,y_n)
plt.plot(x_obs,y)
Out[4]:
[<matplotlib.lines.Line2D at 0x114a2b5f8>]
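As a final sanity check (a classical least-squares fit, not the Bayesian treatment itself), we can verify that the noisy samples $y_n$ still pin down the generating parameter $w$ reasonably well. The sketch below regenerates the data with a fixed seed so it is reproducible:

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

np.random.seed(0)  # fixed seed so the check is reproducible

def make_inputx(x):
    # Rows are [1, x, x^2, x^3], as in the cells above.
    return np.vstack([np.ones_like(x), x, x**2, x**3])

# Regenerate data exactly as in the model: w ~ N(0, I), noise std 0.2.
w = multivariate_normal.rvs(mean=np.zeros(4), cov=np.eye(4))
x = np.linspace(-1, 1, 40)
X = make_inputx(x)
y_n = w @ X + norm.rvs(0, 0.2, size=len(x))

# Ordinary least squares on the (40, 4) design matrix X.T.
w_hat, *_ = np.linalg.lstsq(X.T, y_n, rcond=None)
print(np.abs(w_hat - w).max())  # small relative to the noise scale
```

If this recovery failed badly, it would suggest a bug in the data-generating code rather than in the inference to come.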
