Maximum gap between samples drawn without replacement from a discrete uniform distribution



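This question is related to research in my lab on robot coverage.

Randomly draw $n$ numbers from the set $\{1,2,\ldots,m\}$ without replacement and sort them in ascending order, where $1\le n\le m$.

From this sorted list of numbers $\{a_{(1)},a_{(2)},\ldots,a_{(n)}\}$, generate the differences between consecutive numbers and the boundaries: $g=\{a_{(1)},a_{(2)}-a_{(1)},\ldots,a_{(n)}-a_{(n-1)},m+1-a_{(n)}\}$. This gives $n+1$ gaps.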
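As a concrete illustration of this gap construction, here is a minimal Python sketch. The helper name `gaps` and the example values `m = 10`, `n = 3` are assumptions chosen for illustration; padding the sorted sample with the boundaries 0 and $m+1$ is one convenient way to obtain all $n+1$ gaps at once.

```python
import random

def gaps(sample, m):
    """Return the n+1 gaps of a sample drawn from {1, ..., m}.

    The sorted sample is padded with the boundaries 0 and m+1, so the first
    gap is a_(1) and the last gap is m+1-a_(n).
    """
    padded = [0] + sorted(sample) + [m + 1]
    return [hi - lo for lo, hi in zip(padded, padded[1:])]

# Hypothetical example values.
m, n = 10, 3
sample = random.sample(range(1, m + 1), n)  # draw without replacement
g = gaps(sample, m)
print(sorted(sample), g, max(g))
```

With this convention the gaps always sum to $m+1$, which is a quick sanity check on any implementation.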

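What is the distribution of the maximum gap?

$$P(\max(g)=k)=P(k;m,n)=?$$

This can be framed using order statistics: $P(g_{(n+1)}=k)=P(k;m,n)=?$

See the link for the distribution of gaps, but this question asks for the distribution of the maximum gap.

I would be satisfied with the mean value $E[g_{(n+1)}]$.

If $n=m$, all gaps have size 1. If $n+1=m$, there is one gap of size 2, and it has $n+1$ possible locations. The maximum gap size is $m-n+1$, and this gap can be placed before or after any of the $n$ numbers, for a total of $n+1$ possible positions. The smallest maximum gap size is $\left\lceil\frac{m-n}{n+1}\right\rceil$. Define the probability of any given combination as $T=\binom{m}{n}^{-1}$.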
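For small $m$ and $n$, the boundary cases can be checked by enumerating every $n$-subset. The following is a brute-force sketch, not part of the original post; the function name `max_gap_counts` and the example sizes are assumptions.

```python
from collections import Counter
from itertools import combinations

def max_gap_counts(m, n):
    """Count the n-subsets of {1..m} by their maximum gap.

    All n+1 gaps are used, including the boundary gaps a_(1) and m+1-a_(n).
    """
    counts = Counter()
    for subset in combinations(range(1, m + 1), n):
        padded = (0,) + subset + (m + 1,)
        counts[max(b - a for a, b in zip(padded, padded[1:]))] += 1
    return counts

print(max_gap_counts(5, 5))          # n = m
print(max_gap_counts(6, 5))          # m = n + 1
counts = max_gap_counts(8, 3)
largest = max(counts)
print(largest, counts[largest])      # largest gap and number of subsets attaining it
```

Dividing each count by $\binom{m}{n}$ (that is, multiplying by $T$) turns the counts into probabilities. For $m=8$, $n=3$ the largest gap is $m-n+1=6$ and it is attained by $n+1=4$ subsets, matching the $T(n+1)$ probability stated for that case.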

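I have partially solved the probability mass function:

$$P(g_{(n+1)}=k)=P(k;m,n)=\begin{cases}
0 & k<\left\lceil\frac{m-n}{n+1}\right\rceil\\
1 & k=\left\lceil\frac{m-n}{n+1}\right\rceil\\
1 & k=1 \text{ (occurs when } m=n\text{)}\\
T(n+1) & k=2 \text{ (occurs when } m=n+1\text{)}\\
T(n+1) & k=\left\lceil\frac{m-(n-1)}{n}\right\rceil\\
? & \left\lceil\frac{m-(n-1)}{n}\right\rceil\le k\le m-n+1\\
T(n+1) & k=m-n+1\\
0 & k>m-n+1
\end{cases}\tag{1}$$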

Current work (1): The equation for the first gap, $a_{(1)}$, is straightforward:

$$P(a_{(1)}=k)=P(k;m,n)=\frac{1}{\binom{m}{n}}\sum_{k=1}^{m-n+1}\binom{m-k-1}{n-1}$$

The expected value has a simple form: $E[P(a_{(1)})]=\frac{1}{\binom{m}{n}}\sum_{k=1}^{m-n+1}\binom{m-k-1}{n-1}k=\frac{m-n}{1+n}$. By symmetry, I expect all $n$ gaps to have this distribution. Perhaps the solution could be found by drawing from this distribution $n$ times.

Current work (2): it is easy to run Monte Carlo simulations.

(* Largest gap of one random sample, including the boundary gaps at 0 and m+1 *)
simMaxGap[m_, n_] := Max[Differences[Sort[Join[RandomSample[Range[m], n], {0, m + 1}]]]];
m = 1000; n = 1; trials = 100000;
(* Smoothed histogram of the simulated maximum gaps *)
SmoothHistogram[Table[simMaxGap[m, n], {trials}], Filling -> Axis,
 Frame -> {True, True, False, False},
 FrameLabel -> {"k (Max gap)", "Probability"},
 PlotLabel -> StringForm["m=``,n=``,smooth histogram of maximum gap for `` trials", m, n, trials]]
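For readers without Mathematica, the same Monte Carlo experiment can be sketched in Python. This is an illustrative port rather than the original code: the names `sim_max_gap` and `estimate_pmf`, the fixed seed, and the smaller example sizes are all assumptions, and it estimates the probability mass function directly instead of drawing a smoothed histogram.

```python
import random
from collections import Counter

def sim_max_gap(m, n, rng=random):
    """Largest of the n+1 gaps for one random n-subset of {1..m}."""
    a = sorted(rng.sample(range(1, m + 1), n))
    padded = [0] + a + [m + 1]
    return max(hi - lo for lo, hi in zip(padded, padded[1:]))

def estimate_pmf(m, n, trials, seed=0):
    """Empirical probability of each maximum gap size over `trials` samples."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    counts = Counter(sim_max_gap(m, n, rng) for _ in range(trials))
    return {k: c / trials for k, c in sorted(counts.items())}

# Hypothetical example sizes, smaller than the m = 1000 used above.
pmf = estimate_pmf(m=20, n=4, trials=20000)
mean = sum(k * p for k, p in pmf.items())
print(pmf)
print("estimated mean maximum gap:", mean)
```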

With these conditions you must have n<=m. I think you want g={a_(1), a_(2)-a_(1),..., a_(n)-a_(n-1)}. Does randomly select mean selecting each number with probability 1/m on the first draw? Since you do not replace the probability would be 1/(m-1) on the second and so on down to 1 on the mth draw if n=m. If n<m this would stop earlier with the last draw having probability 1/(m-(n-1)) on the nth draw.
Michael R. Chernick

Your original description of g made no sense, because (I believe) you transposed two of the subscripts. Please verify that my edit conforms with your intention: in particular, please confirm that you mean for there to be n gaps, of which a(1) is the first.
whuber

@gung I think this is research, rather than self-study
Glen_b -Reinstate Monica

I think your minimum and maximum gap sizes should be 1 and $m-n+1$. The minimum gap size occurs when consecutive integers are chosen, and the maximum gap size occurs when you select $m$ and the first $n-1$ integers $1,\ldots,n-1$ (or 1 and $m-n+2,\ldots,m$).
probabilityislogic

Thank you Michael Chernick and probabilityislogic, your corrections have been made. Thank you @whuber for making the correction!
AaronBecker

Answers:



Let $f(g;n,m)$ be the chance that the minimum, $a_{(1)}$, equals $g$; that is, the sample consists of $g$ and an $(n-1)$-subset of $\{g+1,g+2,\ldots,m\}$. There are $\binom{m-g}{n-1}$ such subsets out of the $\binom{m}{n}$ equally likely subsets, whence

$$\Pr(a_{(1)}=g)=f(g;n,m)=\frac{\binom{m-g}{n-1}}{\binom{m}{n}}.$$

Adding $f(k;n,m)$ for all possible values of $k$ greater than $g$ yields the survival function

$$\Pr(a_{(1)}>g)=Q(g;n,m)=\frac{(m-g)\binom{m-g-1}{n-1}}{n\binom{m}{n}}.$$
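The closed form for $Q$ can be checked against the defining sum of $f$ using exact rational arithmetic. The sketch below is an illustration under assumptions: the Python function names `f` and `Q` mirror the symbols above, the guard for $g\ge m$ is added only to keep `math.comb` away from negative arguments, and the example sizes are arbitrary.

```python
from fractions import Fraction
from math import comb

def f(g, n, m):
    """Pr(a_(1) = g) for an n-subset of {1..m}, valid for 1 <= g <= m."""
    return Fraction(comb(m - g, n - 1), comb(m, n))

def Q(g, n, m):
    """Pr(a_(1) > g), using the closed form; zero once g reaches m."""
    if g >= m:
        return Fraction(0)
    return Fraction((m - g) * comb(m - g - 1, n - 1), n * comb(m, n))

# Hypothetical example sizes.
m, n = 12, 4
for g in range(0, m + 1):
    tail = sum(f(k, n, m) for k in range(g + 1, m + 1))
    assert tail == Q(g, n, m)  # closed form agrees with the summed tail
print([str(Q(g, n, m)) for g in range(0, 5)])
```

The agreement follows from the hockey-stick identity $\sum_{k>g}\binom{m-k}{n-1}=\binom{m-g}{n}$, and $\frac{m-g}{n}\binom{m-g-1}{n-1}=\binom{m-g}{n}$.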

Let $G_{n,m}$ be the random variable given by the largest gap:

$$G_{n,m}=\max\left(a_{(1)},a_{(2)}-a_{(1)},\ldots,a_{(n)}-a_{(n-1)}\right).$$

(This responds to the question as originally framed, before it was modified to include a gap between $a_{(n)}$ and $m$.) We will compute its survival function

$$P(g;n,m)=\Pr(G_{n,m}>g),$$

from which the entire distribution of $G_{n,m}$ is readily derived. The method is a dynamic program beginning with $n=1$, for which it is obvious that

$$P(g;1,m)=\Pr(G_{1,m}>g)=\frac{m-g}{m},\quad g=0,1,\ldots,m.\tag{1}$$

For larger $n>1$, note that the event $G_{n,m}>g$ is the disjoint union of the event

$$a_{(1)}>g,$$

for which the very first gap exceeds $g$, and the $g$ separate events

$$a_{(1)}=k\text{ and }G_{n-1,m-k}>g,\quad k=1,2,\ldots,g,$$

for which the first gap equals $k$ and a gap greater than $g$ occurs later in the sample. The Law of Total Probability asserts the probabilities of these events add, whence

$$P(g;n,m)=Q(g;n,m)+\sum_{k=1}^{g}f(k;n,m)\,P(g;n-1,m-k).\tag{2}$$

Fixing $g$ and laying out a two-way array indexed by $i=1,2,\ldots,n$ and $j=1,2,\ldots,m$, we may compute $P(g;n,m)$ by using $(1)$ to fill in its first row and $(2)$ to fill in each successive row using $O(gm)$ operations per row. Consequently the table can be completed in $O(gmn)$ operations and all tables for $g=1$ through $g=m-n+1$ can be constructed in $O(m^3n)$ operations.
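The table-filling procedure can be sketched in Python as follows. This is one possible reading of the recursion, not the code used to produce the figures: the function name `survival_max_gap`, the use of exact `Fraction` arithmetic, and the cap of the inner sum at $k\le j-i+1$ (beyond which $f$ is zero) are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb

def survival_max_gap(g, n, m):
    """Pr(G_{n,m} > g), where G is the largest of the n gaps
    a_(1), a_(2)-a_(1), ..., a_(n)-a_(n-1) (no gap after a_(n))."""

    def f(k, i, j):
        # Pr(first gap = k) for an i-subset of {1..j}
        return Fraction(comb(j - k, i - 1), comb(j, i))

    def Q(i, j):
        # Pr(first gap > g) for an i-subset of {1..j}
        if j <= g:
            return Fraction(0)
        return Fraction((j - g) * comb(j - g - 1, i - 1), i * comb(j, i))

    # P[i][j] holds Pr(G_{i,j} > g); entries with j < i are never used.
    P = [[Fraction(0)] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        P[1][j] = Fraction(max(j - g, 0), j)            # first row, equation (1)
    for i in range(2, n + 1):
        for j in range(i, m + 1):
            total = Q(i, j)                             # first gap already exceeds g
            for k in range(1, min(g, j - i + 1) + 1):   # first gap equals k <= g
                total += f(k, i, j) * P[i - 1][j - k]   # equation (2)
            P[i][j] = total
    return P[n][m]

# Hypothetical example sizes.
m, n = 16, 4
survival = [survival_max_gap(g, n, m) for g in range(0, m - n + 2)]
print([float(p) for p in survival])
```

Differencing consecutive entries of `survival` gives the probability mass function, since $\Pr(G_{n,m}=g)=P(g-1;n,m)-P(g;n,m)$.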

Figure

These graphs show the survival function $g\to P(g;n,64)$ for $n=1,2,4,8,16,32,64$. As $n$ increases, the graph moves to the left, corresponding to the decreasing chances of large gaps.

Closed formulas for $P(g;n,m)$ can be obtained in many special cases, especially for large $n$, but I have not been able to obtain a closed formula that applies to all $g,n,m$. Good approximations are readily available by replacing this problem with the analogous problem for continuous uniform variables.

Finally, the expectation of $G_{n,m}$ is obtained by summing its survival function starting at $g=0$:

$$E(G_{n,m})=\sum_{g=0}^{m-n+1}P(g;n,m).$$
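The identity between the expectation and the summed survival function can be illustrated by brute force on a small case. In this sketch the survival probabilities are obtained by enumerating all subsets rather than by the dynamic program, purely to keep the example self-contained; the function names and example sizes are assumptions.

```python
from fractions import Fraction
from itertools import combinations

def max_gaps(n, m):
    """Largest of the n gaps a_(1), a_(2)-a_(1), ..., a_(n)-a_(n-1)
    for every n-subset of {1..m}."""
    values = []
    for subset in combinations(range(1, m + 1), n):
        padded = (0,) + subset
        values.append(max(b - a for a, b in zip(padded, padded[1:])))
    return values

def expectation_from_survival(n, m):
    """Sum Pr(G > g) over g = 0, ..., m-n+1."""
    values = max_gaps(n, m)
    total = len(values)
    return sum(Fraction(sum(v > g for v in values), total)
               for g in range(0, m - n + 2))

def expectation_direct(n, m):
    """Plain average of the maximum gap over all subsets."""
    values = max_gaps(n, m)
    return Fraction(sum(values), len(values))

# Hypothetical example sizes.
n, m = 3, 8
print(expectation_from_survival(n, m), expectation_direct(n, m))
```

Both routes give the same rational number because $G_{n,m}$ takes positive integer values no larger than $m-n+1$, so $E(G_{n,m})=\sum_{g\ge 0}\Pr(G_{n,m}>g)$ with all later terms equal to zero.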

Figure 2: contour plot of expectation

This contour plot of the expectation shows contours at $2,4,6,\ldots,32$, graduating from dark to light.


Suggestion: in the line "Let $G_{n,m}$ be the random variable given by the largest gap:", please add the last gap of $m+1-a_n$. Your expectation plot matches my Monte Carlo simulation.
AaronBecker
Licensed under cc by-sa 3.0 with attribution required.