EM maximum likelihood estimation for the Weibull distribution


24

Note: I am posting a question from a former student of mine who cannot post it himself for technical reasons.

Given an iid sample $x_1,\ldots,x_n$ from a Weibull distribution with pdf

$$f_k(x) = k\,x^{k-1}\,e^{-x^k}, \qquad x > 0,$$

is there a useful missing-variable representation

$$f_k(x) = \int_{\mathcal{Z}} g_k(x,z)\,\mathrm{d}z$$

and hence an associated EM (expectation-maximization) algorithm that could be used to find the MLE of $k$, instead of straightforward numerical optimization?

2
Is there any censoring?
ocram

2
What is wrong with Newton-Raphson?
probabilityislogic

2
@probabilityislogic: nothing is wrong with it! My student would like to know whether an EM version exists, that's all.
Xi'an

1
Could you give an example of what you are looking for in a simpler, different setting, such as observing Gaussian or uniform random variables? When all the data are observed, I (and, judging from the comments, some of the other posters) fail to see how EM is relevant to your question.
ahfoss

1
@probabilityislogic I think you might as well have said, "Oh, you mean you want to use Newton-Raphson?" Weibulls are a regular family... I believe the ML solution is unique. So EM has nothing to "E" over; you are only doing the "M", and root-finding on the score equation is the best way to do that!
AdamO

Answers:


7

I think the answer is yes, if I have understood the question correctly.

Write $z_i = x_i^k$. Then an EM-type iteration, starting for example with $\hat{k} = 1$, is

  • E step: $\hat{z}_i = x_i^{\hat{k}}$

  • M step: $\hat{k} = n\left[\sum_i (\hat{z}_i - 1)\log x_i\right]^{-1}$

This is a special case (the case with no censoring and no covariates) of the iteration suggested by Aitkin and Clayton (1980) for a Weibull proportional hazards model. It can also be found in Section 6.11 of Aitkin et al (1989). A short numerical sketch of this iteration is given after the references below.

  • Aitkin, M. and Clayton, D., 1980. The fitting of exponential, Weibull and extreme value distributions to complex censored survival data using GLIM. Applied Statistics, 29(2), pp. 156-163.

  • Aitkin, M., Anderson, D., Francis, B. and Hinde, J., 1989. Statistical Modelling in GLIM. Oxford University Press, New York.
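For concreteness, here is a minimal Python sketch of the iteration above (the function name, starting value, and stopping rule are illustrative choices on my part, not prescribed by Aitkin and Clayton). At its fixed point, $n/\hat{k} = \sum_i (x_i^{\hat{k}} - 1)\log x_i$, which is exactly the score equation for $k$, so the iteration stops at the MLE:

```python
import numpy as np

def em_weibull_shape(x, k=1.0, tol=1e-10, max_iter=1000):
    """EM-type iteration for the shape k of an iid sample from the
    standard Weibull density k * x**(k-1) * exp(-x**k)."""
    x = np.asarray(x, dtype=float)
    logx = np.log(x)
    for _ in range(max_iter):
        z = x ** k                                  # E step: z_i = x_i^k
        k_new = len(x) / np.sum((z - 1.0) * logx)   # M step
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    return k

# Example: simulated data with true shape 2
x = np.random.weibull(2.0, size=10_000)
print(em_weibull_shape(x))  # should be close to 2.0
```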


Thanks, David! Treating $x_i^k$ as the missing variate never crossed my mind...!
Xi'an

7

The Weibull MLE is only solvable numerically:

Let

$$f_{\lambda,\beta}(x) = \begin{cases} \dfrac{\beta}{\lambda}\left(\dfrac{x}{\lambda}\right)^{\beta-1} e^{-(x/\lambda)^{\beta}}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

with $\beta, \lambda > 0$.

1) Likelihood function:

$$L_{\hat{x}}(\lambda,\beta) = \prod_{i=1}^N f_{\lambda,\beta}(x_i) = \prod_{i=1}^N \frac{\beta}{\lambda}\left(\frac{x_i}{\lambda}\right)^{\beta-1} e^{-(x_i/\lambda)^{\beta}} = \frac{\beta^N}{\lambda^{N\beta}}\, e^{-\sum_{i=1}^N (x_i/\lambda)^{\beta}} \prod_{i=1}^N x_i^{\beta-1}$$

Log-likelihood function:

$$\ell_{\hat{x}}(\lambda,\beta) := \ln L_{\hat{x}}(\lambda,\beta) = N\ln\beta - N\beta\ln\lambda - \sum_{i=1}^N \left(\frac{x_i}{\lambda}\right)^{\beta} + (\beta-1)\sum_{i=1}^N \ln x_i$$

2) MLE problem:

$$\max_{(\lambda,\beta)\in\mathbb{R}^2} \ell_{\hat{x}}(\lambda,\beta) \quad \text{s.t.}\quad \lambda > 0,\ \beta > 0$$

3) Maximization by setting the gradient to zero:

$$\frac{\partial \ell}{\partial \lambda} = -N\beta\frac{1}{\lambda} + \beta\sum_{i=1}^N x_i^{\beta}\frac{1}{\lambda^{\beta+1}} \overset{!}{=} 0$$

$$\frac{\partial \ell}{\partial \beta} = \frac{N}{\beta} - N\ln\lambda - \sum_{i=1}^N \ln\left(\frac{x_i}{\lambda}\right) e^{\beta\ln(x_i/\lambda)} + \sum_{i=1}^N \ln x_i \overset{!}{=} 0$$

The first condition gives

$$-N\beta\frac{1}{\lambda} + \beta\sum_{i=1}^N x_i^{\beta}\frac{1}{\lambda^{\beta+1}} = 0 \;\Leftrightarrow\; -1 + \frac{1}{N}\sum_{i=1}^N x_i^{\beta}\frac{1}{\lambda^{\beta}} = 0 \;\Leftrightarrow\; \frac{1}{N}\sum_{i=1}^N x_i^{\beta} = \lambda^{\beta}$$

$$\Rightarrow\quad \lambda^{*} = \left(\frac{1}{N}\sum_{i=1}^N x_i^{\beta}\right)^{1/\beta}$$

Plugging $\lambda^{*}$ into the second zero-gradient condition gives

$$\beta^{*} = \left[\frac{\sum_{i=1}^N x_i^{\beta^{*}}\ln x_i}{\sum_{i=1}^N x_i^{\beta^{*}}} - \overline{\ln x}\right]^{-1}, \qquad \overline{\ln x} := \frac{1}{N}\sum_{i=1}^N \ln x_i.$$

This equation is only solvable numerically, e.g. by the Newton-Raphson algorithm (a sketch follows below). $\hat{\beta}^{*}$ can then be plugged into $\lambda^{*}$ to complete the ML estimator for the Weibull distribution.
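A minimal Python sketch of that numerical step, assuming the profile equations above (the function name, starting value, and tolerances are my own illustrative choices):

```python
import numpy as np

def weibull_mle(x, beta=1.0, tol=1e-10, max_iter=100):
    """Newton-Raphson on g(beta) = s1/s0 - mean(ln x) - 1/beta = 0,
    where s0 = sum(x_i^beta) and s1 = sum(x_i^beta * ln x_i);
    then lambda* = mean(x_i^beta)**(1/beta) in closed form."""
    x = np.asarray(x, dtype=float)
    logx = np.log(x)
    mean_logx = logx.mean()
    for _ in range(max_iter):
        xb = x ** beta
        s0, s1, s2 = xb.sum(), (xb * logx).sum(), (xb * logx ** 2).sum()
        g = s1 / s0 - mean_logx - 1.0 / beta
        dg = (s2 * s0 - s1 ** 2) / s0 ** 2 + 1.0 / beta ** 2  # g'(beta)
        step = g / dg
        beta -= step
        if abs(step) < tol:
            break
    lam = np.mean(x ** beta) ** (1.0 / beta)
    return lam, beta
```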


11
Unfortunately, this does not appear to answer the question in any discernible way. The OP is very clearly aware of Newton-Raphson and related approaches. The feasibility of N-R in no way precludes the existence of a missing-variable representation or associated EM algorithm. In my estimation, the question is not concerned at all with numerical solutions, but rather is probing for insight that might become apparent if an interesting missing-variable approach were demonstrated.
cardinal

@cardinal It is one thing to say there is only a numerical solution, and another thing to show that there is only a numerical solution.
emcor

5
Dear @emcor, I think you may be misunderstanding what the question is asking. Perhaps reviewing the other answer and associated comment stream would be helpful. Cheers.
cardinal

@cardinal I agree it is not a direct answer, but these are the exact expressions for the MLEs, which can, for example, be used to verify the EM.
emcor

4

Though this is an old question, it looks like there is an answer in a paper published here: http://home.iitk.ac.in/~kundu/interval-censoring-REVISED-2.pdf

In this work the analysis of interval-censored data, with Weibull distribution as the underlying lifetime distribution has been considered. It is assumed that censoring mechanism is independent and non-informative. As expected, the maximum likelihood estimators cannot be obtained in closed form. In our simulation experiments it is observed that the Newton-Raphson method may not converge many times. An expectation maximization algorithm has been suggested to compute the maximum likelihood estimators, and it converges almost all the times.


1
Can you post a full citation for the paper at the link, in case it goes dead?
gung - Reinstate Monica

1
This is an EM algorithm, but does not do what I believe the OP wants. Rather, the E-step imputes the censored data, after which the M-step uses a fixed point algorithm with the complete data set. So the M-step is not in closed form (which I think is what the OP is looking for).
Cliff AB

1
@CliffAB: thank you for the link (+1) but indeed the EM is naturally induced in this paper by the censoring part. My former student was looking for a plain uncensored iid Weibull likelihood optimisation via EM.
Xi'an

-1

In this case the MLE and EM estimators are equivalent, since the MLE estimator is actually just a special case of the EM estimator. (I am assuming a frequentist framework in my answer; this isn't true for EM in a Bayesian context in which we're talking about MAPs.) Since there is no missing data (just an unknown parameter), the E step simply returns the log likelihood, regardless of your choice of $k^{(t)}$. The M step then maximizes the log likelihood, yielding the MLE.

EM would be applicable, for example, if you had observed data from a mixture of two Weibull distributions with parameters $k_1$ and $k_2$, but you didn't know which of these two distributions each observation came from.
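To make that concrete, here is a toy Python sketch of EM for such a two-component mixture of standard Weibulls (unit scales, no numerical safeguards; all names and defaults are illustrative assumptions). The M step reuses a weighted version of the fixed-point shape update from the first answer above, so strictly speaking it is a generalized EM:

```python
import numpy as np

def weibull_pdf(x, k):
    # standard Weibull density k * x**(k-1) * exp(-x**k)
    return k * x ** (k - 1) * np.exp(-x ** k)

def em_weibull_mixture(x, k1=0.5, k2=2.0, pi=0.5, n_iter=500):
    """Toy EM for pi * f_{k1} + (1 - pi) * f_{k2}; the missing data
    are the unobserved component labels of the observations."""
    x = np.asarray(x, dtype=float)
    logx = np.log(x)
    for _ in range(n_iter):
        # E step: responsibilities (posterior label probabilities)
        p1 = pi * weibull_pdf(x, k1)
        p2 = (1.0 - pi) * weibull_pdf(x, k2)
        w = p1 / (p1 + p2)
        # M step: update the weight, then take one weighted
        # fixed-point step per shape (generalized EM)
        pi = w.mean()
        k1 = w.sum() / np.sum(w * (x ** k1 - 1.0) * logx)
        k2 = (1.0 - w).sum() / np.sum((1.0 - w) * (x ** k2 - 1.0) * logx)
    return pi, k1, k2
```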


6
I think you may have misinterpreted the point of the question, which is: Does there exist some missing-variable interpretation from which one would obtain the given Weibull likelihood (and which would allow an EM-like algorithm to be applied)?
cardinal

4
The question statement in @Xi'an's post is quite clear. I think the reason it hasn't been answered is because any answer is likely nontrivial. (It's interesting, so I wish I had more time to think about it.) At any rate, your comment appears to betray a misunderstanding of the EM algorithm. Perhaps the following will serve as an antidote:
cardinal

6
Let $f(x) = \pi\,\varphi(x-\mu_1) + (1-\pi)\,\varphi(x-\mu_2)$, where $\varphi$ is the standard normal density function. Let $F(x) = \int_{-\infty}^{x} f(u)\,\mathrm{d}u$. With $U_1,\ldots,U_n$ iid standard uniform, take $X_i = F^{-1}(U_i)$. Then, $X_1,\ldots,X_n$ is a sample from a Gaussian mixture model. We can estimate the parameters by (brute-force) maximum likelihood. Is there any missing data in our data-generation process? No. Does it have a latent-variable representation allowing for the use of an EM algorithm? Yes, absolutely.
cardinal

4
My apologies @cardinal; I think I have misunderstood two things about your latest post. Yes, in the GMM problem you could search $\mathbb{R}^2 \times [0,1]$ via a brute-force ML approach. Also, I now see that the original problem looks for a solution that involves introducing a latent variable that allows for an EM approach to estimating the parameter $k$ in the given density $k\,x^{k-1}e^{-x^k}$. An interesting problem. Are there any examples of using EM like this in such a simple context? Most of my exposure to EM has been in the context of mixture problems and data imputation.
ahfoss

3
@ahfoss: (+1) to your latest comment. Yes! You got it. As for examples: (i) it shows up in censored data problems, (ii) classical applications like hidden Markov models, (iii) simple threshold models like probit models (e.g., imagine observing the latent $Z_i$ instead of the Bernoulli $X_i = \mathbf{1}(Z_i > \mu)$), (iv) estimating variance components in one-way random effects models (and much more complex mixed models), and (v) finding the posterior mode in a Bayesian hierarchical model. The simplest is probably (i), followed by (iii).
cardinal