Likelihood ratio test in R

Suppose I am going to run univariate logistic regressions on several independent variables, like this:

mod.a <- glm(x ~ a, data=z, family=binomial("logit"))
mod.b <- glm(x ~ b, data=z, family=binomial("logit"))

I ran a model comparison (likelihood ratio test) with the following command to check whether the model is better than the null model:

1-pchisq(mod.a$null.deviance-mod.a$deviance, mod.a$df.null-mod.a$df.residual)
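A minimal sketch of the same computation wrapped in a helper. It assumes the built-in infert data set in place of the data frame z, which is not shown in the question, and the name lr_vs_null is hypothetical rather than part of any package. It uses lower.tail = FALSE instead of 1 - pchisq(...), which should keep the p-value from collapsing to exactly 0 when the statistic is large.

```r
# Hypothetical helper: LR test of a fitted glm against its intercept-only model,
# using only the deviance components stored in the fitted object.
lr_vs_null <- function(mod) {
  stat <- mod$null.deviance - mod$deviance          # drop in deviance from null to fitted model
  df   <- mod$df.null - mod$df.residual             # number of parameters added to the intercept
  p    <- pchisq(stat, df = df, lower.tail = FALSE) # upper-tail chi-squared probability
  c(statistic = stat, df = df, p.value = p)
}

# Illustration on the built-in infert data (an assumption; z is not available here)
mod.a <- glm(case ~ induced, data = infert, family = binomial("logit"))
res.a <- lr_vs_null(mod.a)
print(res.a)
```

The helper only reads fields that every glm fit carries, so it can be applied to mod.b in the same way.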

Then I built another model containing all the variables:

mod.c <- glm(x ~ a+b, data=z, family=binomial("logit"))

To check whether each variable is statistically significant in the multivariate model, I used the lrtest command from epicalc:

lrtest(mod.c, mod.a) ### see if variable b is statistically significant after adjusting for a
lrtest(mod.c, mod.b) ### see if variable a is statistically significant after adjusting for b

I wonder whether the pchisq method and the lrtest method are equivalent ways of doing the likelihood ratio test, since I don't know how to use lrtest for a univariate logistic model.

r logistic diagnostic

— lokheart

@Gavin thanks for reminding me that, compared with Stack Overflow, I need to spend more time "digesting" the answers here before deciding whether an answer is appropriate. Thanks again.

— lokheart

I would not recommend using waldtest from lmtest. Use the aod package for model testing; it is far more straightforward. cran.r-project.org/web/packages/aod/aod.pdf

— Mr. Nobody

epicalc has been removed (source). lmtest could be an alternative.

— Martin Thoma

Answers:

Basically, yes, provided you use the correct difference in log-likelihood:

> library(epicalc)
> model0 <- glm(case ~ induced + spontaneous, family=binomial, data=infert)
> model1 <- glm(case ~ induced, family=binomial, data=infert)
> lrtest (model0, model1)
Likelihood ratio test for MLE method 
Chi-squared 1 d.f. =  36.48675 , P value =  0 
> model1$deviance-model0$deviance
[1] 36.48675

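and not the deviance of the null model, which is the same in both cases. The number of df is the number of parameters that differ between the two nested models (here df = 1). BTW, you can look at the source code of lrtest() by typing

> lrtest

at the R prompt.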
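Since epicalc may not be installable, here is a sketch of the same comparison in base R. It assumes only the built-in infert data used in the transcript above; anova() with test = "Chisq" is the standard method for comparing nested glm fits.

```r
# Same two nested models as in the transcript above
model0 <- glm(case ~ induced + spontaneous, family = binomial, data = infert)
model1 <- glm(case ~ induced, family = binomial, data = infert)

# LR statistic: difference in residual deviance between the nested fits
lr_stat <- model1$deviance - model0$deviance
# Degrees of freedom: number of parameters by which the models differ
lr_df <- model1$df.residual - model0$df.residual
# Upper-tail probability; lower.tail = FALSE avoids computing 1 - (something close to 1)
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Built-in equivalent: analysis of deviance table for the two fits
tab <- anova(model1, model0, test = "Chisq")
print(tab)
print(c(statistic = lr_stat, df = lr_df, p.value = p_value))
```

The statistic should agree with the 36.48675 printed by lrtest() above, and the p-value comes out as a very small positive number rather than the rounded "P value = 0".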

— chl

Thanks. I just found that I can use glm(output ~ NULL, data=z, family=binomial("logit")) to create a NULL model, so I can use lrtest afterwards. FYI, and thanks again.

— lokheart

@lokheart anova(model1, model0) will work too.

— chl

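@lokheart glm(output ~ 1, data=z, family=binomial("logit")) would be a more natural null model: it says that output is explained by a constant term (the intercept). The intercept is implied in all your models, so you are testing for the effect of a after accounting for the intercept.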
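A sketch of that intercept-only comparison, assuming the built-in infert data as a stand-in for z and case ~ induced as a stand-in for output ~ a (both are illustrative substitutions, not the poster's data):

```r
# Assumption: infert stands in for the data frame z from the question
mod.a <- glm(case ~ induced, data = infert, family = binomial("logit"))

# Intercept-only (null) model for the same response and data
mod.null <- update(mod.a, . ~ 1)

# LR test of `induced` after accounting for the intercept
cmp <- anova(mod.null, mod.a, test = "Chisq")
print(cmp)

# The explicit null fit reproduces the null deviance stored in mod.a
print(all.equal(deviance(mod.null), mod.a$null.deviance, tolerance = 1e-6))
```

Fitting the null model explicitly gives the same statistic as mod.a$null.deviance - mod.a$deviance, so either route can be used.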

— Reinstate Monica - G. Simpson

Or you can do it "manually": p-value of the LR test = 1-pchisq(deviance, dof)

— Umka

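An alternative is the lmtest package, which has an lrtest() function that accepts a single model. Here is the example from ?lrtest in the lmtest package; it is for an LM, but there are methods that work with GLMs: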

> require(lmtest)
Loading required package: lmtest
Loading required package: zoo
> ## with data from Greene (1993):
> ## load data and compute lags
> data("USDistLag")
> usdl <- na.contiguous(cbind(USDistLag, lag(USDistLag, k = -1)))
> colnames(usdl) <- c("con", "gnp", "con1", "gnp1")
> fm1 <- lm(con ~ gnp + gnp1, data = usdl)
> fm2 <- lm(con ~ gnp + con1 + gnp1, data = usdl)
> ## various equivalent specifications of the LR test
>
> ## Compare two nested models
> lrtest(fm2, fm1)
Likelihood ratio test

Model 1: con ~ gnp + con1 + gnp1
Model 2: con ~ gnp + gnp1
  #Df  LogLik Df  Chisq Pr(>Chisq)    
1   5 -56.069                         
2   4 -65.871 -1 19.605  9.524e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
>
> ## with just one model provided, compare this model to a null one
> lrtest(fm2)
Likelihood ratio test

Model 1: con ~ gnp + con1 + gnp1
Model 2: con ~ 1
  #Df   LogLik Df  Chisq Pr(>Chisq)    
1   5  -56.069                         
2   2 -119.091 -3 126.04  < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
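For intuition, the quantities in such a table can be rebuilt by hand from logLik(). The sketch below uses base R only and binomial GLMs on the built-in infert data (an assumption, chosen because USDistLag ships with lmtest); it mirrors the printed columns and is not lmtest's actual implementation.

```r
# Two nested binomial GLMs on the built-in infert data (illustrative choice)
full    <- glm(case ~ induced + spontaneous, family = binomial, data = infert)
reduced <- glm(case ~ induced, family = binomial, data = infert)

ll_full    <- logLik(full)
ll_reduced <- logLik(reduced)

# "#Df" column: number of estimated parameters in each model
n_par <- c(attr(ll_full, "df"), attr(ll_reduced, "df"))
# "Chisq" column: twice the difference in log-likelihood
chisq <- 2 * (as.numeric(ll_full) - as.numeric(ll_reduced))
# "Df" column: negative when the second model is the smaller one
df_diff <- n_par[2] - n_par[1]
# "Pr(>Chisq)" column: upper-tail chi-squared probability
p_val <- pchisq(chisq, df = abs(df_diff), lower.tail = FALSE)

print(data.frame(n_par = n_par,
                 LogLik = c(as.numeric(ll_full), as.numeric(ll_reduced)),
                 Df = c(NA, df_diff),
                 Chisq = c(NA, chisq),
                 p = c(NA, p_val)))
```

For 0/1 responses such as case, the residual deviance equals minus twice the log-likelihood, so this statistic coincides with the deviance difference.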

— Reinstate Monica - G. Simpson

+1 Good to know (and it seems I had forgotten about that package).

— Chl

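@GavinSimpson This may seem silly, but how would you interpret the 'lrtest(fm2, fm1)' results? Is model 2 significantly different from model 1, so that adding the con1 variable was useful? Or is lrtest(fm2) saying that model 2 is significantly different from model 1? And which model is better?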

— Kerry

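@Kerry fm1 has a lower log-likelihood and hence a poorer fit than fm2. The LRT tells us that the degree to which fm1 is a poorer model than fm2 is unexpectedly large if the terms that differ between the models were not useful (did not help explain the response). With lrtest(fm2), fm2 is not compared with fm1 at all; as stated in the output, it is compared with con ~ 1. That model, the null model, says that the best predictor of con is the sample mean of con (the intercept/constant term).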

— Reinstate Monica - G. Simpson