Standard errors for multiple regression coefficients?


18

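I realize that this is a very basic question, but I can't find an answer anywhere.

I'm computing regression coefficients using either the normal equations or QR decomposition. How can I compute standard errors for each coefficient? I usually think of standard errors as being computed as:

$SE_{\bar{x}} = \frac{\sigma_{\bar{x}}}{\sqrt{n}}$

What is $\sigma_{\bar{x}}$ for each coefficient? What is the most efficient way to compute this in the context of OLS?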

Answers:


19

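When doing least squares estimation (assuming a normal random component), the regression parameter estimates are normally distributed with mean equal to the true regression parameter and covariance matrix $\Sigma = s^2 (X^TX)^{-1}$, where $s^2$ is the residual variance and $X$ is the design matrix. $X^T$ is the transpose of $X$, and $X$ is defined by the model equation $Y = X\beta + \epsilon$, with $\beta$ the regression parameters and $\epsilon$ the error term. The estimated standard deviation of a beta parameter is obtained by taking the corresponding diagonal term of $(X^TX)^{-1}$, multiplying it by the sample estimate of the residual variance, and then taking the square root. This is not a very simple calculation, but standard statistical software packages compute it for you and provide it in the output.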
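The recipe in the paragraph above can be sketched in a few lines of NumPy. This is a minimal illustration and not any particular package's implementation: the function name `ols_standard_errors` and the three-point demo data are hypothetical, and the residual variance is assumed to use the usual $n - p$ degrees of freedom.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Return (beta_hat, std_errors, cov) for OLS with design matrix X.

    Hypothetical helper: follows Sigma = s^2 (X^T X)^{-1} directly.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    xtx_inv = np.linalg.inv(X.T @ X)          # (X^T X)^{-1}
    beta_hat = xtx_inv @ (X.T @ y)            # normal-equations solution
    residuals = y - X @ beta_hat
    s2 = residuals @ residuals / (n - p)      # residual variance estimate
    cov = s2 * xtx_inv                        # covariance of beta_hat
    std_errors = np.sqrt(np.diag(cov))        # square roots of the diagonal
    return beta_hat, std_errors, cov

# Hypothetical demo data: intercept column plus one predictor.
X_demo = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y_demo = np.array([0.0, 1.0, 1.0])
beta_demo, se_demo, cov_demo = ols_standard_errors(X_demo, y_demo)
print(beta_demo, se_demo)
```

Explicitly inverting $X^TX$ keeps the sketch close to the formula; for ill-conditioned designs a factorization-based route is usually preferred.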

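On page 134 of Draper and Smith (referenced in my comment), they provide the following data for fitting by least squares a model $Y = \beta_0 + \beta_1 X + \varepsilon$, where $\varepsilon \sim N(0, I\sigma^2)$.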

                      X                      Y                    XY
                      0                     -2                     0
                      2                      0                     0
                      2                      2                     4
                      5                      1                     5
                      5                      3                    15
                      9                      1                     9
                      9                      0                     0
                      9                      0                     0
                      9                      1                     9
                     10                     -1                   -10
                    ---                     --                   ---
Sum                  60                      5                    32
Sum of  Squares     482                     21                   528

This looks like an example where the slope should be close to 0.

$X^t = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 0 & 2 & 2 & 5 & 5 & 9 & 9 & 9 & 9 & 10 \end{pmatrix}.$

So

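$X^tX = \begin{pmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{pmatrix} = \begin{pmatrix} 10 & 60 \\ 60 & 482 \end{pmatrix}$

and

$(X^tX)^{-1} = \begin{pmatrix} \frac{\sum X_i^2}{n\sum (X_i-\bar{X})^2} & \frac{-\bar{X}}{\sum (X_i-\bar{X})^2} \\ \frac{-\bar{X}}{\sum (X_i-\bar{X})^2} & \frac{1}{\sum (X_i-\bar{X})^2} \end{pmatrix} = \begin{pmatrix} \frac{482}{10(122)} & -\frac{6}{122} \\ -\frac{6}{122} & \frac{1}{122} \end{pmatrix} = \begin{pmatrix} 0.395 & -0.049 \\ -0.049 & 0.008 \end{pmatrix}$

where $\bar{X} = \sum X_i/n = 60/10 = 6$.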
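As a quick numerical check of the matrices above, the following sketch builds the design matrix from the X column of the table and compares the closed-form 2x2 inverse with a direct inversion. The variable names are illustrative only.

```python
import numpy as np

# X values from the table; the first design column is the intercept.
x = np.array([0, 2, 2, 5, 5, 9, 9, 9, 9, 10], dtype=float)
X = np.column_stack([np.ones_like(x), x])
n = len(x)

xtx = X.T @ X                                  # [[n, sum x], [sum x, sum x^2]]
x_bar = x.mean()
sxx = np.sum((x - x_bar) ** 2)                 # sum of squared deviations

# Closed-form inverse for simple linear regression.
closed_form = np.array([
    [np.sum(x ** 2) / (n * sxx), -x_bar / sxx],
    [-x_bar / sxx,               1.0 / sxx],
])
xtx_inv = np.linalg.inv(xtx)                   # direct numerical inverse

print(xtx)
print(np.round(xtx_inv, 3))
```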

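Estimate for $\beta = (X^TX)^{-1}X^TY = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = \begin{pmatrix} \bar{Y} - b_1\bar{X} \\ S_{xy}/S_{xx} \end{pmatrix}$

$b_1 = 1/61 = 0.0163$ and $b_0 = 0.5 - 0.0163(6) = 0.402$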

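From $(X^TX)^{-1}$ above, $S_{b_1} = S_e\sqrt{0.008}$ and $S_{b_0} = S_e\sqrt{0.395}$, where $S_e$ is the estimated standard deviation of the error term, $S_e = \sqrt{2.3085}$. The square roots are needed because the diagonal entries of $s^2(X^TX)^{-1}$ are variances, not standard deviations. Using the unrounded entries $1/122$ and $482/1220$, this gives $S_{b_1} \approx 0.138$ and $S_{b_0} \approx 0.955$.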
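Since the question mentions QR decomposition, here is a hedged sketch of the same worked example done through QR instead of the normal equations. It is not the method used in the answer: it relies on the identity $X^TX = R^TR$, so $(X^TX)^{-1} = R^{-1}R^{-T}$, and the variable names are illustrative.

```python
import numpy as np

# Data from the Draper and Smith table above.
x = np.array([0, 2, 2, 5, 5, 9, 9, 9, 9, 10], dtype=float)
y = np.array([-2, 0, 2, 1, 3, 1, 0, 0, 1, -1], dtype=float)
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

# Reduced QR: X = Q R with Q (n x p) and R (p x p) upper triangular.
Q, R = np.linalg.qr(X)
beta_hat = np.linalg.solve(R, Q.T @ y)         # solves R beta = Q^T y

# (X^T X)^{-1} = R^{-1} R^{-T}; R is only p x p, so inverting it is cheap.
R_inv = np.linalg.inv(R)
xtx_inv = R_inv @ R_inv.T

residuals = y - X @ beta_hat
s2 = residuals @ residuals / (n - p)           # residual variance, n - 2 here
std_errors = np.sqrt(s2 * np.diag(xtx_inv))    # [SE(b0), SE(b1)]

print("coefficients:", beta_hat)
print("residual variance:", s2)
print("standard errors:", std_errors)
```

Working from $R$ avoids forming $X^TX$ explicitly, which tends to be better conditioned numerically; for a two-parameter model like this one the difference is negligible.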

Sorry that the equations didn't carry subscripting and superscripting when I cut and pasted them. The table didn't reproduce well either because the spaces got ignored. The first string of 3 numbers corresponds to the first values of X, Y and XY, and the same for the following strings of three. After Sum come the sums for X, Y and XY respectively, and then the sums of squares for X, Y and XY respectively. The 2x2 matrices got messed up too. The values after the brackets should be in brackets underneath the numbers to the left.


2
Not meant as a plug for my book, but I go through the computations of the least squares solution in simple linear regression (Y=aX+b) and calculate the standard errors for a and b, pp. 101-103, The Essentials of Biostatistics for Physicians, Nurses, and Clinicians, Wiley 2011. A more detailed description can be found in Draper and Smith, Applied Regression Analysis, 3rd Edition, Wiley, New York, 1998, pages 126-127. In my answer that follows I will take an example from Draper and Smith.
Michael R. Chernick

8
When I started interacting with this site, Michael, I had similar feelings. With experience, they have changed. It's worthwhile knowing some TeX, and once you do, it's (almost) as fast to type it in as it is to type in anything in English. I also learned, by studying exemplary posts (such as many replies by @chl, cardinal, and other high-reputation-per-post users), that providing references, clear illustrations, and well-thought-out equations is usually highly appreciated and well received. High quality is one thing distinguishing this site from most others.
whuber

2
That is all nice, Bill, and it is nice that so many people are dedicated to giving those high-quality posts. I may use LaTeX for other purposes, like publishing papers. But I don't have the time to go to all the effort that people expect of me on this site. I am not going to invest the time just to provide service on this site.
Michael R. Chernick

4
I think the disconnect is here: "This is just one of many things about this site that requires those posting to put in extra time and effort" - @whuber and I are both saying that it, in fact, does not take extra time if you know how to do it. We don't learn TeX so that we can post on this site - we (at least I) learn TeX because it's an important skill to have as a statistician and happens to make posts much more readable on this site.
Macro

3
Like many of the people on here, yes, I work as a statistician, but I also happen to find it fun - this site is recreational for me and it's a nice bonus that others find some of my posts useful. If you find marking up your equations with TeX to be work and don't think it's worth learning then so be it, but know that some of your content will be overlooked.
Macro
Licensed under cc by-sa 3.0 with attribution required.