With the information given by @Glen_b I could find the answer. Using the same notation as in the question,
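$$P(Z_k \le x) = \sum_{j=0}^{k+1} \binom{k+1}{j} (-1)^j (1 - jx)_+^k,$$
where $a_+ = a$ if $a > 0$ and $0$ otherwise. I also give the expectation and the asymptotic convergence to the Gumbel (NB: not Beta) distribution
$$E(Z_k) = \frac{1}{k+1}\sum_{i=1}^{k+1}\frac{1}{i} \sim \frac{\log(k+1)}{k+1}, \qquad P(Z_k \le x) \sim \exp\left(-e^{-(k+1)x + \log(k+1)}\right).$$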
The material of the proofs is taken from several publications linked in the references. They are somewhat lengthy, but straightforward.
1. Proof of the exact distribution
Let $(U_1,\ldots,U_k)$ be IID uniform random variables in the interval $(0,1)$. By ordering them, we obtain the $k$ order statistics denoted $(U_{(1)},\ldots,U_{(k)})$. The uniform spacings are defined as $\Delta_i = U_{(i)} - U_{(i-1)}$ for $i = 1,\ldots,k+1$, with $U_{(0)} = 0$ and $U_{(k+1)} = 1$. The ordered spacings are the corresponding order statistics $\Delta_{(1)} \le \ldots \le \Delta_{(k+1)}$. The variable of interest is $\Delta_{(k+1)}$.
For fixed $x \in (0,1)$, we define the indicator variable $\mathbf{1}_i = \mathbf{1}\{\Delta_i > x\}$. By symmetry, the random vector $(\mathbf{1}_1,\ldots,\mathbf{1}_{k+1})$ is exchangeable, so the joint distribution of a subset of size $j$ is the same as the joint distribution of the first $j$. By expanding the product, we thus obtain
$$P(\Delta_{(k+1)} \le x) = E\left(\prod_{i=1}^{k+1}(1 - \mathbf{1}_i)\right) = 1 + \sum_{j=1}^{k+1}\binom{k+1}{j}(-1)^j E\left(\prod_{i=1}^{j}\mathbf{1}_i\right).$$
We will now prove that $E\left(\prod_{i=1}^{j}\mathbf{1}_i\right) = (1-jx)_+^k$, which will establish the distribution given above. We prove this for $j=2$, as the general case is proved similarly.
$$E\left(\prod_{i=1}^{2}\mathbf{1}_i\right) = P(\Delta_1 > x \cap \Delta_2 > x) = P(\Delta_1 > x)\,P(\Delta_2 > x \mid \Delta_1 > x).$$
If $\Delta_1 > x$, the $k$ breakpoints are in the interval $(x,1)$. Conditionally on this event, the breakpoints are still exchangeable, so the probability that the distance between the second and the first breakpoint is greater than $x$ is the same as the probability that the distance between the first breakpoint and the left barrier (at position $x$) is greater than $x$. So
$$P(\Delta_2 > x \mid \Delta_1 > x) = P\big(\text{all points are in } (2x,1) \mid \text{all points are in } (x,1)\big), \quad\text{so}\quad P(\Delta_2 > x \cap \Delta_1 > x) = P\big(\text{all points are in } (2x,1)\big) = (1-2x)_+^k.$$
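As a sanity check on the formula, here is a minimal Python sketch that evaluates the alternating sum directly and compares it with a simulation of the stick-breaking experiment. The function names, the number of draws and the seed are my own choices for illustration and do not come from the references. The alternating sum can also lose floating-point precision for large $k$, so treat this as a check for moderate $k$ only.

```python
import math
import random


def largest_fragment_cdf(x, k):
    """Evaluate P(Z_k <= x) with the alternating sum, for k breakpoints."""
    total = 0.0
    for j in range(k + 2):
        base = 1.0 - j * x
        if base <= 0.0:
            # (1 - jx)_+ is zero for this j and for all larger j
            break
        total += math.comb(k + 1, j) * (-1) ** j * base ** k
    return total


def simulate_largest_fragment(k, rng):
    """Break the unit stick at k uniform points and return the longest piece."""
    cuts = sorted(rng.random() for _ in range(k))
    points = [0.0] + cuts + [1.0]
    return max(b - a for a, b in zip(points, points[1:]))


def empirical_cdf(x, k, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(Z_k <= x); n_draws and seed are arbitrary."""
    rng = random.Random(seed)
    hits = sum(simulate_largest_fragment(k, rng) <= x for _ in range(n_draws))
    return hits / n_draws
```

For $k=1$ the largest piece is $\max(U, 1-U)$, whose distribution function is $2x-1$ on $(1/2, 1)$, and `largest_fragment_cdf(0.75, 1)` agrees with that value.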
2. Expectation
For a random variable $X$ taking values in $[0,1]$, we have
$$E(X) = \int_0^1 P(X > x)\,dx = 1 - \int_0^1 P(X \le x)\,dx.$$
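Integrating the distribution of $\Delta_{(k+1)}$ term by term, the $j=0$ term cancels the leading $1$ and each remaining term contributes $\int_0^1 (1-jx)_+^k\,dx = \frac{1}{j(k+1)}$, so we obtain
$$E(\Delta_{(k+1)}) = \frac{1}{k+1}\sum_{j=1}^{k+1}\binom{k+1}{j}\frac{(-1)^{j+1}}{j} = \frac{1}{k+1}\sum_{j=1}^{k+1}\frac{1}{j}.$$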
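The last equality is a classic representation of harmonic numbers $H_i = 1 + \frac{1}{2} + \ldots + \frac{1}{i}$, which we demonstrate below.
$$H_{k+1} = \int_0^1 \left(1 + x + \ldots + x^k\right)dx = \int_0^1 \frac{1 - x^{k+1}}{1-x}\,dx.$$
With the change of variable $u = 1-x$ and expanding $(1-u)^{k+1}$ with the binomial theorem, we obtain
$$H_{k+1} = \int_0^1 \sum_{j=1}^{k+1}\binom{k+1}{j}(-1)^{j+1}u^{j-1}\,du = \sum_{j=1}^{k+1}\binom{k+1}{j}\frac{(-1)^{j+1}}{j}.$$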
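The identity between the alternating binomial sum and the harmonic number can be checked in exact rational arithmetic. The sketch below uses Python's `fractions` module; the function names are mine and purely illustrative.

```python
from fractions import Fraction
from math import comb


def alternating_binomial_sum(n):
    """Sum over j = 1..n of C(n, j) * (-1)^(j+1) / j, as an exact fraction."""
    return sum(Fraction((-1) ** (j + 1) * comb(n, j), j) for j in range(1, n + 1))


def harmonic_number(n):
    """H_n = 1 + 1/2 + ... + 1/n, as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def expected_largest_fragment(k):
    """E(Z_k) = H_{k+1} / (k+1) for a stick broken at k uniform points."""
    return harmonic_number(k + 1) / (k + 1)
```

For instance, `expected_largest_fragment(2)` returns `Fraction(11, 18)`, the expected length of the longest of three pieces.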
3. Alternative construction of uniform spacings
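In order to obtain the asymptotic distribution of the largest fragment, we will need to exhibit a classical construction of uniform spacings as exponential variables divided by their sum. The probability density of the order statistics $(U_{(1)},\ldots,U_{(k)})$ is
$$f_{U_{(1)},\ldots,U_{(k)}}(u_{(1)},\ldots,u_{(k)}) = k!, \qquad 0 \le u_{(1)} \le \ldots \le u_{(k)} \le 1.$$
If we denote the uniform spacings $\Delta_i = U_{(i)} - U_{(i-1)}$, with $U_{(0)} = 0$, we obtain
$$f_{\Delta_1,\ldots,\Delta_k}(\delta_1,\ldots,\delta_k) = k!, \qquad \delta_i \ge 0,\ \ \delta_1 + \ldots + \delta_k \le 1.$$
By defining $U_{(k+1)} = 1$, we thus obtain
$$f_{\Delta_1,\ldots,\Delta_{k+1}}(\delta_1,\ldots,\delta_{k+1}) = k!, \qquad \delta_1 + \ldots + \delta_{k+1} = 1.$$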
Now, let $(X_1,\ldots,X_{k+1})$ be IID exponential random variables with mean 1, and let $S = X_1 + \ldots + X_{k+1}$. With a simple change of variable, we can see that
$$f_{X_1,\ldots,X_k,S}(x_1,\ldots,x_k,s) = e^{-s}.$$
Define $Y_i = X_i/S$, such that by a change of variable we obtain
$$f_{Y_1,\ldots,Y_k,S}(y_1,\ldots,y_k,s) = s^k e^{-s}.$$
Integrating this density with respect to $s$, we thus obtain
$$f_{Y_1,\ldots,Y_k}(y_1,\ldots,y_k) = \int_0^\infty s^k e^{-s}\,ds = k!, \quad 0 \le y_1 + \ldots + y_k \le 1, \quad\text{and thus}\quad f_{Y_1,\ldots,Y_{k+1}}(y_1,\ldots,y_{k+1}) = k!, \quad y_1 + \ldots + y_{k+1} = 1.$$
So the joint distribution of $k+1$ uniform spacings on the interval $(0,1)$ is the same as the joint distribution of $k+1$ exponential random variables divided by their sum. We come to the following equivalence of distribution
$$\Delta_{(k+1)} \equiv \frac{X_{(k+1)}}{X_1 + \ldots + X_{k+1}}.$$
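The equivalence can be illustrated by simulating both constructions and comparing a summary statistic. The Python sketch below compares the sample means of the largest uniform spacing and of the largest normalized exponential; the function names, sample size and seed are arbitrary choices of mine. Matching means only make the equivalence plausible, they do not prove it.

```python
import random


def largest_uniform_spacing(k, rng):
    """Largest gap between 0, k sorted uniform points, and 1."""
    cuts = sorted(rng.random() for _ in range(k))
    points = [0.0] + cuts + [1.0]
    return max(b - a for a, b in zip(points, points[1:]))


def largest_normalized_exponential(k, rng):
    """Largest of k+1 IID mean-1 exponentials divided by their sum."""
    xs = [rng.expovariate(1.0) for _ in range(k + 1)]
    return max(xs) / sum(xs)


def compare_means(k, n_draws=20000, seed=1):
    """Return the sample means of both constructions for the same k."""
    rng = random.Random(seed)
    mean_spacing = sum(largest_uniform_spacing(k, rng) for _ in range(n_draws)) / n_draws
    mean_exponential = sum(largest_normalized_exponential(k, rng) for _ in range(n_draws)) / n_draws
    return mean_spacing, mean_exponential
```

Both sample means should be close to $H_{k+1}/(k+1)$, the expectation derived in section 2.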
4. Asymptotic distribution
Using the equivalence above, we obtain
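$$P\big((k+1)\Delta_{(k+1)} - \log(k+1) \le x\big) = P\left(X_{(k+1)} \le (x + \log(k+1))\,\frac{X_1 + \ldots + X_{k+1}}{k+1}\right) = P\big(X_{(k+1)} - \log(k+1) \le x + (x + \log(k+1))\,T_{k+1}\big),$$
where $T_{k+1} = \frac{X_1 + \ldots + X_{k+1}}{k+1} - 1$. The term $(x + \log(k+1))\,T_{k+1}$ vanishes in probability because $E(T_{k+1}) = 0$ and $\mathrm{Var}\big(\log(k+1)\,T_{k+1}\big) = \frac{(\log(k+1))^2}{k+1} \to 0$. Asymptotically, the distribution is therefore the same as that of $X_{(k+1)} - \log(k+1)$. Because the $X_i$ are IID, we have
$$P\big(X_{(k+1)} - \log(k+1) \le x\big) = P\big(X_1 \le x + \log(k+1)\big)^{k+1} = \left(1 - e^{-x - \log(k+1)}\right)^{k+1} = \left(1 - \frac{e^{-x}}{k+1}\right)^{k+1} \sim \exp\{-e^{-x}\}.$$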
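To get a feeling for how fast the Gumbel limit is approached, one can compare the exact distribution function with the approximation $\exp(-e^{-(k+1)x + \log(k+1)})$ on a grid. The Python sketch below does this; the function names and the grid size are my own assumptions. I also added a guard returning 0 for $x < 1/(k+1)$, which is exact because the largest of $k+1$ pieces is at least $1/(k+1)$ and which avoids cancellation error in the alternating sum.

```python
import math


def exact_cdf(x, k):
    """Exact P(Z_k <= x) from the alternating sum."""
    if x < 1.0 / (k + 1):
        # The largest of k+1 pieces summing to 1 is at least 1/(k+1).
        return 0.0
    total = 0.0
    for j in range(k + 2):
        base = 1.0 - j * x
        if base <= 0.0:
            break  # (1 - jx)_+ vanishes from here on
        total += math.comb(k + 1, j) * (-1) ** j * base ** k
    return total


def gumbel_cdf(x, k):
    """Asymptotic Gumbel approximation of P(Z_k <= x)."""
    return math.exp(-math.exp(-(k + 1) * x + math.log(k + 1)))


def max_abs_gap(k, n_grid=200):
    """Largest |exact - Gumbel| over an evenly spaced grid of x in (0, 1)."""
    xs = [i / n_grid for i in range(1, n_grid)]
    return max(abs(exact_cdf(x, k) - gumbel_cdf(x, k)) for x in xs)
```

Calling `max_abs_gap` for a few values of $k$, say 1, 5 and 50, shows the gap shrinking as $k$ grows, though the convergence is slow.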
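5. Graphical overview
The figure below shows the distribution of the largest fragment for different values of $k$. For $k = 10, 20, 50$, I have also overlaid the asymptotic Gumbel distribution (thin line). The Gumbel is a very bad approximation for small values of $k$, so I omit it there to avoid overloading the picture. The Gumbel approximation is good from $k \approx 50$.
6. References
The proofs above are taken from references 2 and 3. The cited literature contains many more results, such as the distribution of the ordered spacings of any rank, their limit distribution, and some alternative constructions of the ordered uniform spacings. The key references are not easily accessible, so I also provide links to the full text.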
- Bairamov et al. (2010) Limit results for ordered uniform spacings, Stat Papers, 51:1, pp 227-240
- Holst (1980) On the lengths of the pieces of a stick broken at random, J. Appl. Prob., 17, pp 623-634
- Pyke (1965) Spacings, JRSS(B) 27:3, pp. 395-449
- Renyi (1953) On the theory of order statistics, Acta math Hung, 4, pp 191-231