Back-transformation of regression coefficients


17

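I'm doing a linear regression with a transformed dependent variable. The following transformation was applied so that the assumption of normality of the residuals would hold. The untransformed dependent variable was negatively skewed, and the following transformation made it close to normal:

$Y = \sqrt{50 - Y_{orig}}$

where $Y_{orig}$ is the dependent variable on the original scale.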

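I think it makes sense to apply some transformation to the $\beta$ coefficients to work our way back to the original scale. Using the following regression equation,

$Y = \sqrt{50 - Y_{orig}} = \alpha + \beta X$

and fixing $X = 0$, we have

$\alpha = \sqrt{50 - Y_{orig}} = \sqrt{50 - \alpha_{orig}}$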

And finally,

$\alpha_{orig} = 50 - \alpha^2$

Using the same logic, I found

$\beta_{orig} = \alpha(\alpha - 2\beta) + \beta^2 + \alpha_{orig} - 50$

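Things work very well for a model with 1 or 2 predictors: the back-transformed coefficients resemble the original ones, only now I can trust the standard errors. The problem arises when I include an interaction term, such as

$Y = \alpha + X_1 \beta_{X_1} + X_2 \beta_{X_2} + X_1 X_2 \beta_{X_1 X_2}$

Then the back-transformed $\beta$s are not so close to the ones from the original scale, and I'm not sure why that happens. I'm also unsure whether the formula I found for back-transforming a beta coefficient is usable as is for the third $\beta$ (for the interaction term). Before going into crazy algebra, I thought I'd ask for advice...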


How do you define $\alpha_{orig}$ and $\beta_{orig}$?
mark999

As the values of alpha and beta on the original scale.
Dominic Comtois

1
But what does that mean?
mark999

The estimates that we'd get were the original data suited to linear regression.
Dominic Comtois

2
To me that seems like a meaningless concept. I agree with gung's answer.
mark999

Answers:


19

One problem is that you've written

$Y = \alpha + \beta X$

That is a simple deterministic (i.e. non-random) model. In that case, you could back transform the coefficients on the original scale, since it's just a matter of some simple algebra. But, in usual regression you only have $E(Y|X) = \alpha + \beta X$; you've left the error term out of your model. If transformation from $Y$ back to $Y_{orig}$ is non-linear, you may have a problem since $E(f(X)) \neq f(E(X))$, in general. I think that may have to do with the discrepancy you're seeing.

Edit: Note that if the transformation is linear, you can back transform to get estimates of the coefficients on the original scale, since expectation is linear.
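To make the $E(f(X)) \neq f(E(X))$ point concrete, here is a minimal simulation sketch. It assumes the back-transformation $Y_{orig} = 50 - Y^2$ implied by the question, and it assumes (purely for illustration) that $Y$ is normal with mean 3 and standard deviation 1 on the transformed scale; those numbers are not from the question.

```python
import random
import statistics


def back_transform(y):
    """Invert Y = sqrt(50 - Y_orig), i.e. Y_orig = 50 - Y**2."""
    return 50.0 - y ** 2


def jensen_gap(mu=3.0, sigma=1.0, n=200_000, seed=0):
    """Compare E[f(Y)] with f(E[Y]) for f = back_transform.

    mu, sigma, n and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    y = [rng.gauss(mu, sigma) for _ in range(n)]
    mean_of_back = statistics.fmean(back_transform(v) for v in y)  # E[f(Y)]
    back_of_mean = back_transform(statistics.fmean(y))             # f(E[Y])
    return mean_of_back, back_of_mean


mean_of_back, back_of_mean = jensen_gap()
gap = back_of_mean - mean_of_back
print(f"E[f(Y)] ~ {mean_of_back:.3f}, f(E[Y]) ~ {back_of_mean:.3f}, gap ~ {gap:.3f}")
```

For this particular $f$, the gap equals the variance of $Y$ (about $\sigma^2 = 1$ here), so back-transforming the mean overstates the mean on the original scale by roughly that amount.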


4
+1 for explaining why we can't back transform the betas.
gung - Reinstate Monica

15

I salute your efforts here, but you're barking up the wrong tree. You don't back transform betas. Your model holds in the transformed data world. If you want to make a prediction, for example, you back transform $\hat{y}_i$, but that's it. Of course, you can also get a prediction interval by computing the high and low limit values, and then back transform them as well, but in no case do you back transform the betas.
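A rough sketch of that workflow, under stated assumptions: the data are simulated (the true coefficients 2 and 0.3 and the noise level are made up), the model is a single-predictor least-squares fit on the transformed scale, and the interval uses a normal quantile instead of the exact t quantile to stay within the standard library. The function names `fit_ols` and `predict_original_scale` are hypothetical helpers, not part of any package.

```python
import math
import random
from statistics import NormalDist, fmean


def fit_ols(x, y):
    """Least-squares fit of y = a + b*x; returns a, b, residual SD, x mean, Sxx."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(rss / (n - 2)), x_bar, sxx


def predict_original_scale(x, y_orig, x0, level=0.95):
    """Fit on Y = sqrt(50 - Y_orig), then back transform y-hat and its limits."""
    y = [math.sqrt(50.0 - v) for v in y_orig]
    a, b, s, x_bar, sxx = fit_ols(x, y)
    n = len(x)
    y_hat = a + b * x0
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation to t
    half = z * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

    def back(v):
        return 50.0 - v ** 2

    # 50 - y**2 is decreasing for y > 0, so the limits swap order; sort them.
    lo, hi = sorted((back(y_hat - half), back(y_hat + half)))
    return back(y_hat), lo, hi


# Simulated data (assumed values, for illustration only).
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(200)]
y_orig = [50.0 - (2.0 + 0.3 * xi + rng.gauss(0, 0.3)) ** 2 for xi in x]

pred, lo, hi = predict_original_scale(x, y_orig, x0=5.0)
print(f"prediction {pred:.2f}, approx. 95% interval [{lo:.2f}, {hi:.2f}]")
```

Only the fitted value and the interval endpoints pass through the inverse transformation; the coefficients `a` and `b` stay on the transformed scale.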


1
What to make of the fact that the back-transformed coefficients get very close to the ones obtained when modelling the untransformed variable? Doesn't that allow for some inference on the original scale?
Dominic Comtois

2
I don't know, exactly. It could depend any number of things. My first guess is that you're getting lucky w/ your 1st couple of betas, but then your luck runs out. I have to agree w/ @mark999 that "the estimates that we'd get were the original data suited to linear regression" doesn't actually make any sense; I wish it did & it sort of seems to at first blush, but unfortunately it doesn't. And it doesn't license any inferences on the original scale.
gung - Reinstate Monica

1
@gung for non linear transformations (say box cox): I can back transform fitted values as well as prediction intervals, but I can't transform betas nor coefficient intervals for the betas. Is there any additional limitation I should be aware of? btw, this is a very interesting topic, where can I get a better understanding?
mugen

2
@mugen, it's hard to say what else you should be aware of. 1 thing maybe to hold in mind is that the back transformation of y-hat gives you the conditional median whereas the un-back-transformed (bleck) y-hat is the conditional mean. Other than that, this material should be covered in a good regression textbook.
gung - Reinstate Monica
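The median-versus-mean remark can be checked with a small simulation. This is only a sketch: it assumes symmetric (normal) errors on the transformed scale, a back-transformation $Y_{orig} = 50 - Y^2$ that is monotone over the range of the data, and illustrative values for the mean and standard deviation.

```python
import random
import statistics


def back_transform(y):
    """Invert Y = sqrt(50 - Y_orig), i.e. Y_orig = 50 - Y**2."""
    return 50.0 - y ** 2


# Assumed transformed-scale distribution: normal, mean 3, SD 0.5,
# so essentially all draws are positive and the transform is monotone.
mu, sigma = 3.0, 0.5
rng = random.Random(1)
y = [rng.gauss(mu, sigma) for _ in range(200_000)]
y_orig = [back_transform(v) for v in y]

back_of_fit = back_transform(mu)         # back-transformed "y-hat"
median_orig = statistics.median(y_orig)  # median on the original scale
mean_orig = statistics.fmean(y_orig)     # mean on the original scale

print(f"back-transformed fit {back_of_fit:.3f}")
print(f"original-scale median {median_orig:.3f}, mean {mean_orig:.3f}")
```

Under these assumptions the back-transformed fit lines up with the original-scale median, while the original-scale mean sits lower by about $\sigma^2$.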

3
@mugen, you're welcome. Feel free to ask more questions via the normal mechanisms (clicking ASK QUESTION); there will be more resources for answering, you will get the attention of more CVers, & the information will be better accessible for posterity.
gung - Reinstate Monica
Licensed under cc by-sa 3.0 with attribution required.