Distribution of the largest fragment of a broken stick (spacings)


21

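A stick of length 1 is broken into $k+1$ fragments uniformly at random. What is the distribution of the length of the longest fragment?

More formally, let $(U_1, \ldots, U_k)$ be IID $U(0,1)$, and let $(U_{(1)}, \ldots, U_{(k)})$ be the associated order statistics, i.e. we simply order the sample in such a way that $U_{(1)} \le U_{(2)} \le \ldots \le U_{(k)}$. Let $$Z_k = \max\left(U_{(1)},\, U_{(2)} - U_{(1)},\, \ldots,\, U_{(k)} - U_{(k-1)},\, 1 - U_{(k)}\right).$$

I am interested in the distribution of $Z_k$. Moments, asymptotic results, or approximations for large $k$ are also interesting.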
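For readers who want to experiment with the definition, here is a minimal simulation sketch of $Z_k$. The function names `largest_fragment` and `simulate` are illustrative and not part of the question.

```python
import random

def largest_fragment(k, rng=random):
    """Break a unit stick at k uniform points and return the longest piece."""
    cuts = sorted(rng.random() for _ in range(k))
    edges = [0.0] + cuts + [1.0]
    # Consecutive differences of the sorted cut points are the k + 1 spacings.
    return max(b - a for a, b in zip(edges, edges[1:]))

def simulate(k, n, seed=0):
    """Draw n independent copies of Z_k with a reproducible seed."""
    rng = random.Random(seed)
    return [largest_fragment(k, rng) for _ in range(n)]

samples = simulate(k=4, n=1000)
print(min(samples), max(samples))
```

Since the $k+1$ pieces sum to 1, the longest one can never be shorter than $1/(k+1)$, which is a quick sanity check on the output.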


9
This is a well studied problem; see R. Pyke (1965), "Spacings," JRSS(B) 27:3, pp. 395-449. I'll try to come back to add some information later unless someone beats me to it. There's also a 1972 paper by the same author ("Spacings revisited") but I think what you're after is pretty much all in the first. There's some asymptotics in Devroye (1981), "Laws of the Iterated Logarithm for Order Statistics of Uniform Spacings" Ann. Probab., 9:5, 860-867.
Glen_b -Reinstate Monica

4
Those should also give some good search terms to find later work if you need it.
Glen_b -Reinstate Monica

3
This is awesome. The first reference is hard to find. For those interested, I put it on The Grand Locus.
gui11aume

Please correct the misprint: Y(k) instead of U(k).
Viktor

Thanks @Viktor! For such small things, don't hesitate to do the edit yourself (I think that it will be reviewed by other users for approval).
gui11aume

Answers:


18

With the information given by @Glen_b I could find the answer. Using the same notations as the question

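$$P(Z_k \le x) = \sum_{j=0}^{k+1} \binom{k+1}{j} (-1)^j (1 - jx)_+^k,$$

where $a_+ = a$ if $a > 0$ and $0$ otherwise. I also give the expectation and the asymptotic convergence to the Gumbel (NB: not Beta) distribution

$$E(Z_k) = \frac{1}{k+1}\sum_{i=1}^{k+1}\frac{1}{i} \sim \frac{\log(k+1)}{k+1}, \qquad P(Z_k \le x) \sim \exp\left(-e^{-(k+1)x + \log(k+1)}\right).$$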

The material of the proofs is taken from several publications linked in the references. They are somewhat lengthy, but straightforward.
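The exact distribution and the expectation stated above can be evaluated directly. The following is a minimal sketch, assuming rational arithmetic (`fractions.Fraction`) is acceptable so that the alternating sum is computed without rounding; the names `cdf_largest` and `mean_largest` are illustrative.

```python
from fractions import Fraction
from math import comb

def cdf_largest(x, k):
    """P(Z_k <= x) for a unit stick broken at k uniform points (k + 1 pieces)."""
    x = Fraction(x)
    total = Fraction(0)
    for j in range(k + 2):
        base = 1 - j * x
        if base > 0:  # (a)_+ = a if a > 0, and 0 otherwise
            total += comb(k + 1, j) * (-1) ** j * base ** k
    return total

def mean_largest(k):
    """E(Z_k) = H_{k+1} / (k + 1), with H_n the n-th harmonic number."""
    return sum(Fraction(1, i) for i in range(1, k + 2)) / (k + 1)

# For k = 1 the longest piece is max(U, 1 - U), whose CDF is 2x - 1 on [1/2, 1].
print(cdf_largest(Fraction(3, 4), 1), mean_largest(1))
```

Passing `x` as a `Fraction` keeps everything exact; a float is also accepted, in which case its exact binary value is used.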

1. Proof of the exact distribution

Let $(U_1, \ldots, U_k)$ be IID uniform random variables in the interval $(0,1)$. By ordering them, we obtain the $k$ order statistics denoted $(U_{(1)}, \ldots, U_{(k)})$. The uniform spacings are defined as $\Delta_i = U_{(i)} - U_{(i-1)}$, with $U_{(0)} = 0$ and $U_{(k+1)} = 1$. The ordered spacings are the corresponding ordered statistics $\Delta_{(1)} \le \ldots \le \Delta_{(k+1)}$. The variable of interest is $\Delta_{(k+1)}$.

For fixed $x \in (0,1)$, we define the indicator variable $\mathbf{1}_i = \mathbf{1}_{\{\Delta_i > x\}}$. By symmetry, the random vector $(\mathbf{1}_1, \ldots, \mathbf{1}_{k+1})$ is exchangeable, so the joint distribution of a subset of size $j$ is the same as the joint distribution of the first $j$. By expanding the product, we thus obtain

$$P(\Delta_{(k+1)} \le x) = E\left(\prod_{i=1}^{k+1}(1 - \mathbf{1}_i)\right) = 1 + \sum_{j=1}^{k+1}\binom{k+1}{j}(-1)^j E\left(\prod_{i=1}^{j}\mathbf{1}_i\right).$$

We will now prove that $E\left(\prod_{i=1}^{j}\mathbf{1}_i\right) = (1 - jx)_+^k$, which will establish the distribution given above. We prove this for $j = 2$, as the general case is proved similarly.

$$E\left(\prod_{i=1}^{2}\mathbf{1}_i\right) = P(\Delta_1 > x \cap \Delta_2 > x) = P(\Delta_1 > x)\,P(\Delta_2 > x \mid \Delta_1 > x).$$

If $\Delta_1 > x$, the $k$ breakpoints are in the interval $(x, 1)$. Conditionally on this event, the breakpoints are still exchangeable, so the probability that the distance between the second and the first breakpoint is greater than $x$ is the same as the probability that the distance between the first breakpoint and the left barrier (at position $x$) is greater than $x$. So

$$P(\Delta_2 > x \mid \Delta_1 > x) = P\left(\text{all points are in } (2x,1) \mid \text{all points are in } (x,1)\right), \quad \text{so} \quad P(\Delta_2 > x \cap \Delta_1 > x) = P\left(\text{all points are in } (2x,1)\right) = (1 - 2x)_+^k.$$
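The identity $E\left(\prod_{i=1}^{j}\mathbf{1}_i\right) = (1 - jx)_+^k$ can be checked numerically. Here is a minimal Monte Carlo sketch; the function name `joint_exceedance` and the values of `k`, `j`, `x`, the sample size and the seed are arbitrary choices made for illustration.

```python
import random

def joint_exceedance(k, j, x, n=100_000, seed=1):
    """Monte Carlo estimate of P(Delta_1 > x, ..., Delta_j > x)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        cuts = sorted(rng.random() for _ in range(k))
        edges = [0.0] + cuts + [1.0]
        spacings = [b - a for a, b in zip(edges, edges[1:])]
        # Count the samples in which the first j spacings all exceed x.
        if all(d > x for d in spacings[:j]):
            hits += 1
    return hits / n

k, j, x = 5, 2, 0.1
estimate = joint_exceedance(k, j, x)
exact = max(1 - j * x, 0.0) ** k  # (1 - jx)_+^k
print(estimate, exact)
```

The two numbers should agree up to Monte Carlo error, which is of order $n^{-1/2}$.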

2. Expectation

For a random variable $X$ taking values in $[0,1]$, we have

$$E(X) = \int_0^1 P(X > x)\,dx = \int_0^1 \left(1 - P(X \le x)\right)dx.$$

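Integrating the distribution of $\Delta_{(k+1)}$, we obtain

$$E(\Delta_{(k+1)}) = \frac{1}{k+1}\sum_{j=1}^{k+1}\binom{k+1}{j}\frac{(-1)^{j+1}}{j} = \frac{1}{k+1}\sum_{j=1}^{k+1}\frac{1}{j}.$$

The last equality is a classic representation of harmonic numbers $H_i = 1 + \frac{1}{2} + \ldots + \frac{1}{i}$, which we demonstrate below.

$$H_{k+1} = \int_0^1 \left(1 + x + \ldots + x^k\right)dx = \int_0^1 \frac{1 - x^{k+1}}{1 - x}\,dx.$$

With the change of variable $u = 1 - x$ and expanding the product, we obtain

$$H_{k+1} = \int_0^1 \sum_{j=1}^{k+1}\binom{k+1}{j}(-1)^{j+1}u^{j-1}\,du = \sum_{j=1}^{k+1}\binom{k+1}{j}\frac{(-1)^{j+1}}{j}.$$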
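The harmonic-number identity used for the expectation is easy to confirm with exact arithmetic. The sketch below compares both sides for small $n$; the names `harmonic` and `alternating_sum` are illustrative.

```python
from fractions import Fraction
from math import comb

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))

def alternating_sum(n):
    """Sum over j = 1..n of C(n, j) * (-1)^(j + 1) / j as an exact fraction."""
    return sum(Fraction(comb(n, j) * (-1) ** (j + 1), j) for j in range(1, n + 1))

# The two expressions agree exactly for every n tried here.
for n in range(1, 30):
    assert harmonic(n) == alternating_sum(n)
```

Exact fractions are used because the alternating sum suffers from cancellation in floating point once the binomial coefficients get large.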

3. Alternative construction of uniform spacings

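In order to obtain the asymptotic distribution of the largest fragment, we will need to exhibit a classical construction of uniform spacings as exponential variables divided by their sum. The probability density of the associated order statistics $(U_{(1)}, \ldots, U_{(k)})$ is

$$f_{U_{(1)}, \ldots, U_{(k)}}(u_{(1)}, \ldots, u_{(k)}) = k!, \quad 0 \le u_{(1)} \le \ldots \le u_{(k)} \le 1.$$

If we denote the uniform spacings $\Delta_i = U_{(i)} - U_{(i-1)}$, with $U_{(0)} = 0$, we obtain

$$f_{\Delta_1, \ldots, \Delta_k}(\delta_1, \ldots, \delta_k) = k!, \quad \delta_i \ge 0, \quad \delta_1 + \ldots + \delta_k \le 1.$$

By defining $U_{(k+1)} = 1$, we thus obtain

$$f_{\Delta_1, \ldots, \Delta_{k+1}}(\delta_1, \ldots, \delta_{k+1}) = k!, \quad \delta_1 + \ldots + \delta_{k+1} = 1.$$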

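Now, let $(X_1, \ldots, X_{k+1})$ be IID exponential random variables with mean 1, and let $S = X_1 + \ldots + X_{k+1}$. With a simple change of variable, we can see that

$$f_{X_1, \ldots, X_k, S}(x_1, \ldots, x_k, s) = e^{-s}.$$

Define $Y_i = X_i / S$, such that by a change of variable we obtain

$$f_{Y_1, \ldots, Y_k, S}(y_1, \ldots, y_k, s) = s^k e^{-s}.$$

Integrating this density with respect to $s$, we thus obtain

$$f_{Y_1, \ldots, Y_k}(y_1, \ldots, y_k) = \int_0^\infty s^k e^{-s}\,ds = k!, \quad y_i \ge 0, \quad y_1 + \ldots + y_k \le 1,$$

and thus

$$f_{Y_1, \ldots, Y_{k+1}}(y_1, \ldots, y_{k+1}) = k!, \quad y_1 + \ldots + y_{k+1} = 1.$$

So the joint distribution of $k+1$ uniform spacings on the interval $(0,1)$ is the same as the joint distribution of $k+1$ exponential random variables divided by their sum. We come to the following equivalence of distribution

$$\Delta_{(k+1)} \overset{d}{=} \frac{X_{(k+1)}}{X_1 + \ldots + X_{k+1}}.$$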
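The equivalence can be illustrated by simulating the largest fragment both ways and comparing the sample means with $E(\Delta_{(k+1)}) = H_{k+1}/(k+1)$ from section 2. This is only a sketch: the function names, the value of `k`, the sample size and the seed are arbitrary, and agreement of the means is of course weaker than equality in distribution.

```python
import random

def largest_from_uniforms(k, rng):
    """Largest spacing of k uniform breakpoints on (0, 1)."""
    cuts = sorted(rng.random() for _ in range(k))
    edges = [0.0] + cuts + [1.0]
    return max(b - a for a, b in zip(edges, edges[1:]))

def largest_from_exponentials(k, rng):
    """Largest of k + 1 mean-1 exponentials divided by their sum."""
    xs = [rng.expovariate(1.0) for _ in range(k + 1)]
    return max(xs) / sum(xs)

def mean_of(sampler, k, n=50_000, seed=2):
    rng = random.Random(seed)
    return sum(sampler(k, rng) for _ in range(n)) / n

k = 9
target = sum(1 / i for i in range(1, k + 2)) / (k + 1)  # H_{k+1} / (k + 1)
m_unif = mean_of(largest_from_uniforms, k)
m_exp = mean_of(largest_from_exponentials, k)
print(target, m_unif, m_exp)
```

The exponential construction avoids sorting, which is convenient when only the spacings (and not the breakpoints) are needed.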

4. Asymptotic distribution

Using the equivalence above, we obtain

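$$P\left((k+1)\Delta_{(k+1)} - \log(k+1) \le x\right) = P\left(X_{(k+1)} \le (x + \log(k+1))\frac{X_1 + \ldots + X_{k+1}}{k+1}\right) = P\left(X_{(k+1)} - \log(k+1) \le x + (x + \log(k+1))\,T_{k+1}\right),$$

where $T_{k+1} = \frac{X_1 + \ldots + X_{k+1}}{k+1} - 1$. This variable vanishes in probability because $E(T_{k+1}) = 0$ and $Var\left(\log(k+1)\,T_{k+1}\right) = \frac{(\log(k+1))^2}{k+1} \downarrow 0$. Asymptotically, the distribution is the same as that of $X_{(k+1)} - \log(k+1)$. Because the $X_i$ are IID, we have

$$P\left(X_{(k+1)} - \log(k+1) \le x\right) = P\left(X_1 \le x + \log(k+1)\right)^{k+1} = \left(1 - e^{-x - \log(k+1)}\right)^{k+1} = \left(1 - \frac{e^{-x}}{k+1}\right)^{k+1} \to \exp\{-e^{-x}\}.$$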

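5. Graphical overview

The figure below shows the distribution of the largest fragment for different values of $k$. For $k = 10, 20, 50$, I have also overlaid the asymptotic Gumbel distribution (thin line). The Gumbel is a very bad approximation for small values of $k$, so I omit it there to avoid overloading the picture. The Gumbel approximation is good from $k \approx 50$.

[Figure: Distribution of the largest fragment of a broken stick]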
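As a rough numerical companion to the figure, the sketch below measures the largest absolute gap between the exact distribution function and the Gumbel approximation over a grid of $x$ values. The grid size and the names `exact_cdf`, `gumbel_cdf` and `max_gap` are assumptions made for illustration; this is not the code used to draw the figure.

```python
from fractions import Fraction
from math import comb, exp, log

def exact_cdf(x, k):
    """Exact P(Z_k <= x), summed in rational arithmetic to avoid cancellation."""
    x = Fraction(x)
    total = Fraction(0)
    for j in range(k + 2):
        base = 1 - j * x
        if base > 0:
            total += comb(k + 1, j) * (-1) ** j * base ** k
    return float(total)

def gumbel_cdf(x, k):
    """Asymptotic approximation exp(-exp(-(k + 1) x + log(k + 1)))."""
    return exp(-exp(-(k + 1) * x + log(k + 1)))

def max_gap(k, grid=400):
    """Largest absolute difference between the two CDFs on a regular grid."""
    xs = [Fraction(i, grid) for i in range(1, grid + 1)]
    return max(abs(exact_cdf(x, k) - gumbel_cdf(float(x), k)) for x in xs)

gaps = {k: max_gap(k) for k in (10, 20, 50)}
print(gaps)
```

The gap is only evaluated on the grid points, so it is a lower bound on the true supremum distance.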

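6. References

The proofs above are taken from references 2 and 3. The cited literature contains many more results, such as the distribution of the ordered spacings of any rank, their limit distribution and some alternative constructions of the ordered uniform spacings. The key references are not easily accessible, so I also provide links to the full text.

  1. Bairamov et al. (2010) Limit results for ordered uniform spacings, Stat Papers, 51:1, pp 227-240
  2. Holst (1980) On the lengths of the pieces of a stick broken at random, J. Appl. Prob., 17, pp 623-634
  3. Pyke (1965) Spacings, JRSS(B) 27:3, pp. 395-449
  4. Renyi (1953) On the theory of order statistics, Acta math Hung, 4, pp 191-231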

Brilliant. By the way, is there a known asymptotics to $E(Z_k^2)$?
Amir Sagiv

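@AmirSagiv This is a good question. I had a quick look at the references and could not find it. I also cannot adapt the proof above. This made me realize that I do not know what the distribution of the square of a Gumbel is. Perhaps that would be a good place to start?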
gui11aume

1
@gui11aume Look here: mathoverflow.net/a/293381/42864
Amir Sagiv

1
@AmirSagiv This is a very good post. For some reason, I misunderstood your question and thought you were interested in the asymptotic distribution of $Z_k^2$ (even though your comment was very clear), so my comment above is not so relevant.
gui11aume

3

This is not a complete answer, but I did some quick simulations, and this is what I obtained:

[Figure: Histogram of the longest fragment]

This looks remarkably beta-ish, and this makes a bit of sense, since the order statistics of i.i.d. uniform random variables are beta distributed (wiki).

This might give some starting point to derive the resulting p.d.f..

I'll update if I get to a final closed solution.

Cheers!


One more thing: the shape of the histogram does not change much as $k$ increases, except that it gets "squashed" closer to 0.
Lima

1
Thank you for your thoughts @Lima (and welcome to Cross Validated). I think your answer can be improved. First, I would refrain from making statements without proof. If this is incorrect, you may put the people who see this thread on the wrong track. Second, I would document what you did. Without the value of k that you used nor the code, the figure does not help anybody. Finally, I would copy-edit the answer and remove everything that is not directly answering the question.
gui11aume

1
Thank you for the suggestions. They are valid beyond Stack Exchange, and I will remember to use them.
Lima

1

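I wrote up an answer to this for a conference in Siena (Italy) in 2005. The paper (2006) is available on my website (pdf). The exact distributions of all the spacings (smallest to largest) are on pages 75 and 76.

I hope to give a presentation on this topic at the RSS Conference in Manchester (England) in September 2016.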


2
Welcome to the site. We are trying to build a permanent repository of high-quality statistical information in the form of questions & answers. Thus, we're wary of link-only answers, due to linkrot. Can you post a full citation & a summary of the information at the link, in case it goes dead? Also, please don't sign your posts here. Every post has a link to your userpage where you can post that information.
gung - Reinstate Monica
Licensed under cc by-sa 3.0 with attribution required.