Loading [MathJax]/jax/output/HTML-CSS/jax.js

Monday, July 30, 2018

Basics of Linear Regression

I will share with you about basics of Linear Regression. It's somehow entry point to statistics. However my ulterior motive is understanding of regularization. Actually, it's gonna be too long and intricate to cope with regularization here. Thus, in this article, I will handle only basics of linear regression. In terms of regularization, I will upload very soon :)

0. Simple linear regression

First of all, we have to wrap our head around "Simple linear regression".
Suppose x be predictor variable, y be dependent variable. Then linear regression line takes form

ˆy=β0+β1x

Let experimental unit be (xi,yi) (i=1,2,,n), "Residential error" S can be denoted as following.

y^yi


One of the way to obtain "best fitting" is to invoke "least squares criterion" which says "minimize the sum of the squared residual errors".
S=ni=1{yi(β0+β1x)}2

As you can see, S is gonna be quadratic function of the β0 and β1.
Thereby we can obtain "best fitting" by computing the followings.

Sβ0=0Sβ1=0

1. Multiple linear regression

Now let's getting into "Multiple linear regression". Let predictor variable be (x1,x2,,xd), dependent variable y . Multiple linear regression line takes form

y=β0+β1x1+β2x2+βdxd


where (β0,β1,βd) are called "regression coefficient".
Let's say predictor variable is xt=(xt1,xt2,,xtd)T (t=1,2,,n), response variable is yt (t=1,2,,n),

^yt=β0+β1xt1+β2xt2++βdxtd

Now we'd like to denote this for n experimental unit. Let X be

X=(1x11a1d1x21a2d1xn1and)

ˆy be

ˆy=(y1,y2,,yn)

β be

β=(β1,β2,,βd)


We can denote, ˆy=Xβ

Suppose ϵ is ϵ=(ϵ1,ϵ2,,ϵn)T as a "residual error". y=Xβ+ϵ
Thereby,
yˆy=ϵ and
ϵt=yt^yt
Same as "simple linear regression", we will apply "least square criterion" for residual error, S=nt=1ϵ2t=nt=1(yt^yt)2=nt=1(yXβ)2=(yXβ)T(yXβ)

For the sake of best fitting, what we have to do is compute Sβ=0 S=(yXβ)T(yXβ)=(Xβy)T(Xβy)

Sβ=2XTXβ2XTy=2XT(yXβ)=0 β=(XTX)1XTy

However sometimes we can't find β due to (XTX)1 doesn't exist which is equivalent to (XTX)1 is not regular matrix. In that situation, we can apply "regularization". I will write an article about "regularization" very soon.

2. Coefficient of determination

After creating "linear regression model", you might wanna assess how fit your model is. "Coefficient of determination" is the quotient of the variances of the fitted value and observed values of dependent variable. Let Sy and Sϵ be,
Sy=1nnt=1(ytˉy)2 where ˉy is mean of y. Sϵ=1nnt=1(yt^yt)2 and Sr is
Sr=SySϵ
Then "coefficient determination " R2 can be denoted as folows,

R2=SrSy=1SϵSy

It's tribial that R is 0R21, and as bigger the coefficient determinant is, linear regression line fits well.

1 comment:

  1. Hotel & Casino in Las Vegas - MapYRO
    Harrah's 양주 출장안마 Hotel & Casino is a 34-story high-rise building in Las Vegas, Nevada, U.S.A.. 용인 출장안마 View a detailed profile of 평택 출장마사지 the structure 186727 including further 안성 출장마사지 data 의정부 출장마사지 and

    ReplyDelete