Linear regression with bayesian inference①¶
0. Model of linear regression¶
First of all, unless model is defined, nothing will be happen. Let yn∈R be output value, xn∈Rn be input variable, w∈Rn be parameter, εn∈R be noize. Model of linear regression can be depicted as below.
yn=wTxn+εn
Now let us assume noize ε is generated in accordance with gauss distribution. Tentatively μ=0 here.
εn~N(εn|0,λ−1)
From (1) and (2), p(yn|w,xn) can be depicted as below.
p(yn|w,xn)=N(wTxn|0,λ−1)
Now, we want to train parameter w from observed data yn and xn. Therefore, "prior distribution" p(w) should be prepared. Here we place "Multivariate gaussian distribution" with m as mean, Λ as precision matrix.
p(w)=N(w|m,Λ−1)
1. Sample from model¶
After determining model, we should check where model is preferable or not. let's sample with equation (4),
import numpy as np
import matplotlib.pyplot as plt
% matplotlib inline
from scipy.stats import multivariate_normal
from scipy.stats import norm
# Function required
def make_inputx(x):
"""
Create matrix consists of x features as
column vector.
Arguments
----------------------------
x : values of x which assume numpy.
"""
ones= np.ones_like(x).reshape(1,-1)
x1=(x**1).reshape(1,-1)
x2=(x **2).reshape(1,-1)
x3=(x **3).reshape(1,-1)
return np.concatenate([ones,x1,x2,x3],axis=0)
def sample_prior(num_features):
"""
Sample parameter w.
"""
# Tentatively, Initial parameter is zero vector.
mean = np.zeros(num_features)
coValMatrix = np.eye(num_features)
return multivariate_normal.rvs(mean=mean,cov=coValMatrix)
def sample_noize(num_feature):
"""
Sample epsilon for linear regression
"""
mean = 0
val = 0.2
return norm.rvs(mean,val)
With following code, we can check shape of function generated with parameter w by proir distribution.
# Plot Function generated from prior distribution
w_sample = np.empty((4,4))
for i in range(0,4):
w_sample[i] = sample_prior(4)
# 100 Observed value
x_obs = np.linspace(-1,1,100)
# Get features of x
x_plot = make_inputx(x_obs)
# Obtain output value
y = w_sample@x_plot
for i in range(0,4):
plt.plot(x_obs,y[i])
Next, we will check yn with noize ϵ. Tentatively variance is 2.0.
# This time we use one 'w' for y and y_n
w = sample_prior(4)
x = np.linspace(-1,1,40)
x_input = make_inputx(x)
# Generate same number of noize as x.
noize = np.array([sample_noize(4) for i in range(0,len(x))])
# y value for function generated by parameter 'w'.
y = w@x_plot
# y_n = w^t@x_n + e_n
y_n = (w@x_input) + noize
plt.scatter(x,y_n)
plt.plot(x_obs,y)
No comments:
Post a Comment