0. Simple linear regression
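First of all, we have to wrap our heads around "simple linear regression".
Let $x$ be the predictor variable and $y$ the dependent variable. The linear regression line then takes the form
$$y = \beta_0 + \beta_1 x$$
Let the experimental units be $(x_i, y_i)$ $(i = 1, 2, \cdots, n)$ and let $\hat{y}_i = \beta_0 + \beta_1 x_i$ be the fitted value for the $i$-th unit. The "residual error" of that unit is
$$y_i - \hat{y}_i$$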
One way to obtain the "best fit" is to invoke the "least squares criterion", which says: minimize the sum $S$ of the squared residual errors.
$$S = \sum_{i=1}^{n} \{ y_i - (\beta_0 + \beta_1 x_i) \}^2$$
As you can see, $S$ is a quadratic function of $\beta_0$ and $\beta_1$.
We can therefore obtain the "best fit" by solving the following.
$$\frac{\partial S}{\partial \beta_0} = 0, \qquad \frac{\partial S}{\partial \beta_1} = 0$$
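To make the minimization concrete, here is a minimal sketch in Python with NumPy. It uses the standard closed-form solution of the least squares problem, $\beta_1 = \sum_i (x_i - \bar{x})(y_i - \bar{y}) / \sum_i (x_i - \bar{x})^2$ and $\beta_0 = \bar{y} - \beta_1 \bar{x}$. The function name `fit_simple` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def fit_simple(x, y):
    """Least squares estimates (beta0, beta1) for y = beta0 + beta1 * x.

    Hypothetical helper for illustration; assumes x is not constant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    beta1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1

# Toy data lying exactly on the line y = 1 + 2x (made up for this example).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
beta0, beta1 = fit_simple(x, y)
print(beta0, beta1)
```

Because the toy data has no noise, the estimates should recover the intercept and slope of the generating line up to floating-point error.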
1. Multiple linear regression
Now let's get into "multiple linear regression". Let the predictor variables be $(x_1, x_2, \cdots, x_d)$ and the dependent variable be $y$. The multiple linear regression line takes the form
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_d x_d$$
where $(\beta_0, \beta_1, \cdots, \beta_d)$ are called the "regression coefficients".
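Let's say the predictor variables of the $t$-th experimental unit are $\vec{x}_t = (x_{t1}, x_{t2}, \cdots, x_{td})^T$ $(t = 1, 2, \cdots, n)$ and its response variable is $y_t$ $(t = 1, 2, \cdots, n)$.
Now we'd like to write this for all $n$ experimental units. Let $X$ be
$$X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1d} \\ 1 & x_{21} & \cdots & x_{2d} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \cdots & x_{nd} \end{pmatrix}$$
let $\vec{\hat{y}}$ be
$$\vec{\hat{y}} = (\hat{y}_1, \hat{y}_2, \cdots, \hat{y}_n)^T$$
and let $\vec{\beta}$ be
$$\vec{\beta} = (\beta_0, \beta_1, \cdots, \beta_d)^T$$
The leading column of ones in $X$ pairs with the intercept $\beta_0$. We can then write
$$\vec{\hat{y}} = X \vec{\beta}$$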
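As a small illustration of this matrix form, the sketch below builds $X$ by prepending a column of ones to the raw predictors and computes the fitted values as a matrix-vector product. The helper name `design_matrix`, the data, and the coefficient values are all made up for the example.

```python
import numpy as np

def design_matrix(features):
    """Prepend a column of ones so the first coefficient acts as the intercept.

    Hypothetical helper; `features` has shape (n, d), the result (n, d + 1).
    """
    features = np.asarray(features, dtype=float)
    ones = np.ones((features.shape[0], 1))
    return np.hstack([ones, features])

# Three experimental units with d = 2 predictors each (toy values).
features = np.array([[1.0, 2.0],
                     [3.0, 4.0],
                     [5.0, 6.0]])
X = design_matrix(features)

# Illustrative coefficients (beta0, beta1, beta2), not estimated from data.
beta = np.array([0.5, 1.0, -1.0])

# Fitted values: y_hat = X beta.
y_hat = X @ beta
print(X.shape, y_hat)
```

Each entry of `y_hat` equals $\beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2}$ for the corresponding row, which is the regression line evaluated one unit at a time.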
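Let $\vec{y} = (y_1, y_2, \cdots, y_n)^T$ be the observed responses and let $\vec{\epsilon} = (\epsilon_1, \epsilon_2, \cdots, \epsilon_n)^T$ be the "residual error", so that
$$\vec{y} = X \vec{\beta} + \vec{\epsilon}$$
Thereby,
$$\vec{y} - \vec{\hat{y}} = \vec{\epsilon}$$
and
$$\epsilon_t = y_t - \hat{y}_t$$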
As in simple linear regression, we apply the "least squares criterion" to the residual error:
$$S = \sum_{t=1}^{n} \epsilon_t^2 = \sum_{t=1}^{n} (y_t - \hat{y}_t)^2 = (\vec{y} - X\vec{\beta})^T (\vec{y} - X\vec{\beta})$$
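To get the best fit, what we have to do is solve $\frac{\partial S}{\partial \vec{\beta}} = \vec{0}$. Since
$$S = (\vec{y} - X\vec{\beta})^T (\vec{y} - X\vec{\beta}) = (X\vec{\beta} - \vec{y})^T (X\vec{\beta} - \vec{y})$$
we have
$$\frac{\partial S}{\partial \vec{\beta}} = 2 X^T X \vec{\beta} - 2 X^T \vec{y} = -2 X^T (\vec{y} - X\vec{\beta}) = \vec{0}$$
$$\therefore \ \vec{\beta} = (X^T X)^{-1} X^T \vec{y}$$
However, sometimes we can't find $\vec{\beta}$ because $(X^T X)^{-1}$ doesn't exist, which is equivalent to $X^T X$ not being a regular (invertible) matrix. In that situation, we can apply "regularization". I will write an article about "regularization" very soon.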
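The closed-form solution can be checked numerically. The sketch below is one possible implementation, assuming NumPy: it solves the linear system $X^T X \vec{\beta} = X^T \vec{y}$ with `np.linalg.solve` instead of forming the inverse explicitly, and compares the result with `np.linalg.lstsq`. The function name `fit_normal_equation` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def fit_normal_equation(X, y):
    """Solve (X^T X) beta = X^T y for beta.

    Hypothetical helper. np.linalg.solve raises LinAlgError when X^T X
    is exactly singular, which is the case where the inverse does not exist.
    """
    XtX = X.T @ X
    Xty = X.T @ y
    return np.linalg.solve(XtX, Xty)

# Synthetic data: n = 50 units, d = 3 predictors, small Gaussian noise.
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))
X = np.column_stack([np.ones(len(features)), features])
true_beta = np.array([1.0, 2.0, -1.0, 0.5])  # made-up coefficients
y = X @ true_beta + 0.1 * rng.normal(size=50)

beta_hat = fit_normal_equation(X, y)

# Cross-check against NumPy's least squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# At the minimizer the gradient -2 X^T (y - X beta) vanishes.
gradient = -2.0 * X.T @ (y - X @ beta_hat)
print(beta_hat)
print(np.allclose(beta_hat, beta_lstsq), np.allclose(gradient, 0.0, atol=1e-8))
```

Solving the system rather than inverting $X^T X$ is a common choice for numerical stability. It does not rescue the singular case, so that situation still needs a remedy such as regularization.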
2. Coefficient of determination
After creating a linear regression model, you might want to assess how well your model fits. The "coefficient of determination" is the ratio of the variance of the fitted values to the variance of the observed values of the dependent variable.
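Let $S_y$ and $S_\epsilon$ be
$$S_y = \frac{1}{n} \sum_{t=1}^{n} (y_t - \bar{y})^2$$
where $\bar{y}$ is the mean of $y$,
$$S_\epsilon = \frac{1}{n} \sum_{t=1}^{n} (y_t - \hat{y}_t)^2$$
and let $S_r$ be
$$S_r = S_y - S_\epsilon$$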
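Then the "coefficient of determination" $R^2$ can be written as follows:
$$R^2 = \frac{S_r}{S_y} = 1 - \frac{S_\epsilon}{S_y}$$
For a least squares fit with an intercept term, evaluated on the data it was fitted to, $0 \le R^2 \le 1$, and the larger the coefficient of determination is, the better the linear regression line fits.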
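As a quick sketch of how these quantities could be computed, the snippet below evaluates $S_y$, $S_\epsilon$ and $S_r$ with NumPy and returns $R^2 = S_r / S_y$. The function name `r_squared` and the toy data are assumptions for illustration, and the helper assumes $y$ is not constant so that $S_y > 0$.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination R^2 = S_r / S_y.

    Hypothetical helper; assumes y is not constant (S_y > 0).
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    s_y = np.mean((y - y.mean()) ** 2)   # variance of the observed values
    s_eps = np.mean((y - y_hat) ** 2)    # mean squared residual error
    s_r = s_y - s_eps                    # variance explained by the model
    return s_r / s_y

y = np.array([1.0, 2.0, 3.0, 4.0])

# Predictions that match the observations exactly leave no residual error.
perfect = r_squared(y, y)

# Always predicting the mean explains none of the variance.
baseline = r_squared(y, np.full_like(y, y.mean()))
print(perfect, baseline)
```

The two calls mark the ends of the range: a perfect fit gives $R^2 = 1$, and a model no better than predicting $\bar{y}$ gives $R^2 = 0$.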