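Short answer: this is not yet a full answer, but you may be interested in the following distributions, which relate to the linked question. They compare the z-test (which glm also uses) with the t-test.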
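
For reference, the two p-values compared in the code that follows can be written as below. The notation is mine rather than from the linked question: $\hat\beta$ is the fitted intercept (the log-odds), $\beta_0 = \operatorname{logit}(0.7)$ is the hypothesised value, $\Phi$ is the standard normal CDF, and $F_{t_{99}}$ is the CDF of a t-distribution with 99 degrees of freedom.

$$
z = \frac{\hat\beta - \beta_0}{\operatorname{SE}(\hat\beta)}, \qquad
p_z = 2\,\Phi\!\left(-|z|\right), \qquad
p_t = 2\,F_{t_{99}}\!\left(-|z|\right)
$$

Both tests use the same kind of statistic and differ only in the reference distribution, so for the same $z$ the t-test p-value is never smaller than the z-test p-value.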

    layout(matrix(1:2, 1, byrow = TRUE))

    # try all 101 possible outcomes (0 to 100 successes) if the true value is p = 0.7
    px <- dbinom(0:100, 100, 0.7)   # probability of each outcome
    p_model  <- rep(0, 101)         # 1 - p-value of the z-test
    p_model2 <- rep(0, 101)         # 1 - p-value of the t-test
    for (i in 0:100) {
      xi <- c(rep(1, i), rep(0, 100 - i))

      # z-test: the offset fixes the null value, so glm tests the intercept against p = 0.7
      model <- glm(xi ~ 1, offset = rep(qlogis(0.7), 100), family = "binomial")
      p_model[i + 1] <- 1 - summary(model)$coefficients[4]

      # t-test: (null value - estimate) / standard error, referred to a t-distribution with 99 df
      model2 <- glm(xi ~ 1, family = "binomial")
      coef <- summary(model2)$coefficients
      p_model2[i + 1] <- 1 - 2 * pt(-abs((qlogis(0.7) - coef[1]) / coef[2]), 99, ncp = 0)
    }

    # plotting cumulative distribution of outcomes, z-test
    outcomes <- p_model[order(p_model)]
    cdf <- cumsum(px[order(p_model)])
    plot(1 - outcomes, 1 - cdf,
         ylab = "cumulative probability",
         xlab = "calculated glm p-value",
         xlim = c(10^-4, 1), ylim = c(10^-4, 1), col = 2, cex = 0.5, log = "xy")
    lines(c(0.00001, 1), c(0.00001, 1))   # diagonal reference line
    for (i in 1:100) {
      # horizontal segments of the step function
      lines(1 - c(outcomes[i], outcomes[i + 1]), 1 - c(cdf[i + 1], cdf[i + 1]), col = 2)
      # vertical segments (disabled)
      # lines(1 - c(outcomes[i], outcomes[i]), 1 - c(cdf[i], cdf[i + 1]), col = 2)
    }
    title("probability for rejection with z-test \n as function of set alpha level")

    # plotting cumulative distribution of outcomes, t-test
    outcomes <- p_model2[order(p_model2)]
    cdf <- cumsum(px[order(p_model2)])
    plot(1 - outcomes, 1 - cdf,
         ylab = "cumulative probability",
         xlab = "calculated glm p-value",
         xlim = c(10^-4, 1), ylim = c(10^-4, 1), col = 2, cex = 0.5, log = "xy")
    lines(c(0.00001, 1), c(0.00001, 1))   # diagonal reference line
    for (i in 1:100) {
      # horizontal segments of the step function
      lines(1 - c(outcomes[i], outcomes[i + 1]), 1 - c(cdf[i + 1], cdf[i + 1]), col = 2)
      # vertical segments (disabled)
      # lines(1 - c(outcomes[i], outcomes[i]), 1 - c(cdf[i], cdf[i + 1]), col = 2)
    }
    title("probability for rejection with t-test \n as function of set alpha level")

[![z-test vs t-test][1]][1]
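There is only a small difference. Moreover, the z-test is actually better (though this may be because both the t-test and the z-test are "wrong", and the error of the z-test possibly compensates for this error).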
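
To put numbers on the comparison, here is a minimal sketch that computes the exact rejection probability of both tests at a few alpha levels. It rests on assumptions of mine: it uses the closed-form Wald statistic for an intercept-only logistic regression instead of calling glm for every outcome, it treats the degenerate outcomes (0 and 100 successes) as non-rejections, and the names `rejection`, `alphas`, `result` and `glm_matches` are purely illustrative.

```r
n  <- 100
p0 <- 0.7
k  <- 0:n
px <- dbinom(k, n, p0)                    # probability of each outcome under p = 0.7

# Closed-form Wald statistic for an intercept-only logistic regression
phat <- k / n
se   <- 1 / sqrt(n * phat * (1 - phat))   # Inf at k = 0 and k = n
z    <- (qlogis(phat) - qlogis(p0)) / se  # NaN at k = 0 and k = n
z[!is.finite(z)] <- 0                     # assumption: degenerate outcomes get p-value 1

p_z <- 2 * pnorm(-abs(z))                 # z-test p-value
p_t <- 2 * pt(-abs(z), df = n - 1)        # t-test p-value, 99 degrees of freedom

# Probability of rejecting a true null hypothesis at level alpha
rejection <- function(pval, alpha) sum(px[pval <= alpha])

alphas <- c(0.1, 0.05, 0.01, 0.001)
result <- data.frame(
  alpha  = alphas,
  z_test = sapply(alphas, function(a) rejection(p_z, a)),
  t_test = sapply(alphas, function(a) rejection(p_t, a))
)
print(result)

# Sanity check against glm for one outcome (60 successes)
xi  <- c(rep(1, 60), rep(0, 40))
fit <- summary(glm(xi ~ 1, offset = rep(qlogis(p0), n), family = "binomial"))$coefficients
glm_matches <- isTRUE(all.equal(unname(fit[1, "z value"]), z[61], tolerance = 1e-4))
print(glm_matches)
```

Because the outcome is discrete, the two columns can coincide at a given alpha. The t-test can never reject more often than the z-test here, since its p-value is never smaller for the same statistic.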
Long answer: ...