Robustness of splitting juntas



We say that a Boolean function $f:\{0,1\}^n\to\{0,1\}$ is a $k$-junta if $f$ has at most $k$ influencing variables.

Let $f:\{0,1\}^n\to\{0,1\}$ be a $2k$-junta. Denote the variables of $f$ by $x_1,x_2,\ldots,x_n$. Fix

$$S_1=\{x_1,x_2,\ldots,x_{n/2}\},\qquad S_2=\{x_{n/2+1},x_{n/2+2},\ldots,x_n\}.$$

Clearly, there exists $S\in\{S_1,S_2\}$ such that $S$ contains at most $k$ of the influencing variables of $f$.

Now let $\epsilon>0$, and assume that $f:\{0,1\}^n\to\{0,1\}$ is $\epsilon$-far from every $2k$-junta (i.e., one has to change at least an $\epsilon$ fraction of the values of $f$ in order to make it a $2k$-junta). Can we make a "robust" version of the statement above? That is, is there a universal constant $c$ and a set $S\in\{S_1,S_2\}$ such that $f$ is $\frac{\epsilon}{c}$-far from every function that has at most $k$ influencing variables in $S$?

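Note: In the original formulation of the question, $c$ was fixed as $2$. Neal's example shows that such a value of $c$ does not suffice. However, since in property testing we are usually not too concerned with constants, I relaxed the condition a bit.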


Can you clarify your terms? Is a variable "influencing" unless the value of $f$ is always independent of the variable? Does "change a value of $f$" mean, change one of the values $f(x)$ for some particular $x$?
Neal Young

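Sure. A variable $x_i$ is influencing if there exists an $n$-bit string $y$ such that $f(y)\neq f(y')$, where $y'$ is the string $y$ with its $i$-th coordinate flipped. Changing a value of $f$ means making a change in its truth table.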
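As a concrete reading of this definition, here is a minimal brute-force sketch in Python. It assumes $f$ is given as a function on bit tuples; the names `influencing_variables`, `is_junta`, and the example `xor_02` are hypothetical and only for illustration. It runs in time exponential in $n$, so it is meant for tiny examples only.

```python
from itertools import product

def influencing_variables(f, n):
    """Indices i for which some input y has f(y) != f(y with bit i flipped)."""
    influencing = set()
    for y in product((0, 1), repeat=n):
        for i in range(n):
            if i in influencing:
                continue  # already witnessed, no need to test again
            y_flipped = y[:i] + (1 - y[i],) + y[i + 1:]
            if f(y) != f(y_flipped):
                influencing.add(i)
    return influencing

def is_junta(f, n, k):
    """True iff f has at most k influencing variables."""
    return len(influencing_variables(f, n)) <= k

# Hypothetical example: XOR of coordinates 0 and 2 among 4 variables.
def xor_02(y):
    return y[0] ^ y[2]

print(influencing_variables(xor_02, 4))  # {0, 2}
```

Here `xor_02` is a 2-junta but not a 1-junta, and with $S_1$ as coordinates 0 and 1 and $S_2$ as coordinates 2 and 3, each half contains exactly one influencing variable.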

Answers:


17

The answer is “yes”. The proof is by contradiction.

For notational convenience, let us denote the first $n/2$ variables by $x$ and the second $n/2$ variables by $y$. Suppose that $f(x,y)$ is $\delta$-close to a function $f_1(x,y)$ which depends only on $k$ coordinates of $x$. Denote its influential coordinates by $T_1$. Similarly, suppose that $f(x,y)$ is $\delta$-close to a function $f_2(x,y)$ which depends only on $k$ coordinates of $y$. Denote its influential coordinates by $T_2$. We need to prove that $f$ is $4\delta$-close to a $2k$-junta $\tilde f(x,y)$.

Let us say that $(x_1,y_1)\sim(x_2,y_2)$ if $x_1$ and $x_2$ agree on all coordinates in $T_1$ and $y_1$ and $y_2$ agree on all coordinates in $T_2$. We choose uniformly at random a representative from each equivalence class. Let $(\bar x,\bar y)$ be the representative for the class of $(x,y)$. Define $\tilde f$ as follows:

$$\tilde f(x,y)=f(\bar x,\bar y).$$

It is obvious that $\tilde f$ is a $2k$-junta (it depends only on variables in $T_1\cup T_2$). We shall prove that it is at distance at most $4\delta$ from $f$ in expectation.

We want to prove that

$$\mathbb{E}_{\tilde f}\Big[\Pr_{x,y}\big(\tilde f(x,y)\neq f(x,y)\big)\Big]=\Pr\big(f(\bar x,\bar y)\neq f(x,y)\big)\le 4\delta,$$

where $x$ and $y$ are chosen uniformly at random. Consider a random vector $\tilde x$ obtained from $x$ by keeping all bits in $T_1$ and randomly flipping all bits not in $T_1$, and a vector $\tilde y$ defined similarly. Note that

$$\Pr\big(\tilde f(x,y)\neq f(x,y)\big)=\Pr\big(f(\bar x,\bar y)\neq f(x,y)\big)=\Pr\big(f(\tilde x,\tilde y)\neq f(x,y)\big).$$

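We have

$$\Pr\big(f(x,y)\neq f(\tilde x,y)\big)\le\Pr\big(f(x,y)\neq f_1(x,y)\big)+\Pr\big(f_1(x,y)\neq f_1(\tilde x,y)\big)+\Pr\big(f_1(\tilde x,y)\neq f(\tilde x,y)\big)\le\delta+0+\delta=2\delta.$$

The middle term is $0$ because $f_1$ depends on $x$ only through the coordinates in $T_1$, which $\tilde x$ shares with $x$. The last term is at most $\delta$ because $(\tilde x,y)$ is itself uniformly distributed.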

Similarly, $\Pr\big(f(\tilde x,y)\neq f(\tilde x,\tilde y)\big)\le 2\delta$. We have

$$\Pr\big(f(\bar x,\bar y)\neq f(x,y)\big)\le 4\delta.$$
QED
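For readers who want the final inequality spelled out: it follows from one more triangle inequality applied to the two $2\delta$ bounds, using the same $\tilde x$ and $\tilde y$ as in the proof. The second bound relies on $(\tilde x,y)$ being uniformly distributed, so that the argument with $f_2$ and $T_2$ applies to it unchanged.

$$\begin{aligned}
\Pr\big(f(x,y)\neq f(\tilde x,\tilde y)\big)
&\le \Pr\big(f(x,y)\neq f(\tilde x,y)\big)+\Pr\big(f(\tilde x,y)\neq f(\tilde x,\tilde y)\big)\\
&\le 2\delta+2\delta=4\delta.
\end{aligned}$$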

It is easy to "derandomize" this proof. For every $(x,y)$, let $\tilde f(x,y)=1$ if $f(x',y')=1$ for most $(x',y')$ in the equivalence class of $(x,y)$, and $\tilde f(x,y)=0$ otherwise.
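The majority construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $f$ is a function on bit tuples, `T` lists the coordinates in $T_1\cup T_2$, and the names `majority_junta`, `distance`, and the example `f` are hypothetical. Since the majority vote is the best function that is constant on each equivalence class, its distance to $f$ is no larger than the expected distance of the random-representative version.

```python
from collections import defaultdict
from itertools import product

def majority_junta(f, n, T):
    """Return f_tilde, which outputs the majority value of f over all inputs
    that agree with the argument on the coordinates listed in T (ties give 0)."""
    counts = defaultdict(lambda: [0, 0])  # class key -> [#zeros, #ones]
    for x in product((0, 1), repeat=n):
        key = tuple(x[i] for i in T)
        counts[key][f(x)] += 1
    table = {key: int(ones > zeros) for key, (zeros, ones) in counts.items()}
    return lambda x: table[tuple(x[i] for i in T)]

def distance(f, g, n):
    """Fraction of the 2^n inputs on which f and g differ."""
    return sum(f(x) != g(x) for x in product((0, 1), repeat=n)) / 2 ** n

# Hypothetical example: n = 4, T1 = {0}, T2 = {2};
# f is x0 XOR x2 with a single corrupted truth-table entry.
def f(x):
    return 1 if x == (1, 1, 1, 1) else x[0] ^ x[2]

f_tilde = majority_junta(f, 4, [0, 2])
print(distance(f, f_tilde, 4))  # 0.0625
```

In this example the majority vote recovers $x_0\oplus x_2$ exactly, so the only disagreement is the corrupted entry, one input out of sixteen.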


12

The smallest $c$ that the bound holds for is $c=\frac{1}{\sqrt{2}-1}\approx 2.41$.

Lemmas 1 and 2 show that the bound holds for this c. Lemma 3 shows that this bound is tight.

(In comparison, Juri's elegant probabilistic argument gives c=4.)

Let $c=\frac{1}{\sqrt{2}-1}$. Lemma 1 gives the upper bound for $k=0$.

Lemma 1: If $f$ is $\epsilon_g$-near a function $g$ that has no influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has no influencing variables in $S_1$, then $f$ is $\epsilon$-near a constant function, where $\epsilon\le c\,(\epsilon_g+\epsilon_h)/2$.

Proof. Let $\epsilon$ be the distance from $f$ to a constant function. Suppose for contradiction that $\epsilon$ does not satisfy the claimed inequality. Let $y=(x_1,x_2,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$ and write $f$, $g$, and $h$ as $f(y,z)$, $g(y,z)$ and $h(y,z)$, so $g(y,z)$ is independent of $z$ and $h(y,z)$ is independent of $y$.

(I find it helpful to visualize f as the edge-labeling of the complete bipartite graph with vertex sets {y} and {z}, where g gives a vertex-labeling of {y}, and h gives a vertex-labeling of {z}.)

Let $g_0$ be the fraction of pairs $(y,z)$ such that $g(y,z)=0$. Let $g_1=1-g_0$ be the fraction of pairs such that $g(y,z)=1$. Likewise let $h_0$ be the fraction of pairs such that $h(y,z)=0$, and let $h_1$ be the fraction of pairs such that $h(y,z)=1$.

Without loss of generality, assume that, for any pair such that $g(y,z)=h(y,z)$, it also holds that $f(y,z)=g(y,z)=h(y,z)$. (Otherwise, toggling the value of $f(y,z)$ decreases both $\epsilon_g$ and $\epsilon_h$ by $1/2^n$, while decreasing $\epsilon$ by at most $1/2^n$, so the resulting function is still a counterexample.) Say any such pair is "in agreement".

The distance from $f$ to $g$ plus the distance from $f$ to $h$ is the fraction of $(y,z)$ pairs that are not in agreement. That is, $\epsilon_g+\epsilon_h=g_0h_1+g_1h_0$.

The distance from $f$ to the all-zero function is at most $1-g_0h_0$.

The distance from $f$ to the all-ones function is at most $1-g_1h_1$.

Further, the distance from $f$ to the nearest constant function is at most $1/2$.

Thus, the ratio $\epsilon/(\epsilon_g+\epsilon_h)$ is at most

$$\frac{\min\left(1/2,\;1-g_0h_0,\;1-g_1h_1\right)}{g_0h_1+g_1h_0},$$

where $g_0,h_0\in[0,1]$ and $g_1=1-g_0$ and $h_1=1-h_0$.

By calculation, this ratio is at most $\frac{1}{2(\sqrt{2}-1)}=c/2$. QED
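The "by calculation" step can be sanity-checked numerically. The following Python sketch evaluates the ratio on a grid over $(g_0,h_0)\in[0,1]^2$ and compares the largest value found with $c/2$. It is a check, not a proof; the names `ratio` and `grid_max` and the grid resolution are assumptions made for illustration, and the degenerate points where the denominator vanishes (where $g=h$ is constant and $\epsilon=0$) are treated as ratio $0$.

```python
import math

C = 1 / (math.sqrt(2) - 1)  # the constant c from the answer, about 2.41

def ratio(g0, h0):
    """Upper bound on eps / (eps_g + eps_h) from the proof of Lemma 1."""
    g1, h1 = 1 - g0, 1 - h0
    denom = g0 * h1 + g1 * h0
    if denom == 0:
        return 0.0  # g = h is constant here, so f is constant and eps = 0
    return min(0.5, 1 - g0 * h0, 1 - g1 * h1) / denom

def grid_max(steps=200):
    """Largest ratio over a (steps+1) x (steps+1) grid on [0,1]^2."""
    return max(ratio(i / steps, j / steps)
               for i in range(steps + 1) for j in range(steps + 1))

best = grid_max()
print(best, C / 2)  # the grid maximum stays just below c/2
```

The bound $c/2$ is approached near $g_0=h_0=1/\sqrt{2}$, which is the point used by the tight example in Lemma 3.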

Lemma 2 extends Lemma 1 to general $k$ by arguing pointwise, over every possible setting of the $2k$ influencing variables. Recall that $c=\frac{1}{\sqrt{2}-1}$.

Lemma 2: Fix any $k$. If $f$ is $\epsilon_g$-near a function $g$ that has $k$ influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has $k$ influencing variables in $S_1$, then $f$ is $\epsilon$-near a function $\hat f$ that has at most $2k$ influencing variables, where $\epsilon\le c\,(\epsilon_g+\epsilon_h)/2$.

Proof. Express $f$ as $f(a,y,b,z)$ where $(a,y)$ contains the variables in $S_1$, with $a$ containing those that influence $h$, while $(b,z)$ contains the variables in $S_2$, with $b$ containing those that influence $g$. So $g(a,y,b,z)$ is independent of $z$, and $h(a,y,b,z)$ is independent of $y$.

For each fixed value of $a$ and $b$, define $F_{ab}(y,z)=f(a,y,b,z)$, and define $G_{ab}$ and $H_{ab}$ similarly from $g$ and $h$ respectively. Let $\epsilon^g_{ab}$ be the distance from $F_{ab}$ to $G_{ab}$ (restricted to $(y,z)$ pairs). Likewise let $\epsilon^h_{ab}$ be the distance from $F_{ab}$ to $H_{ab}$.

By Lemma 1, there exists a constant $c_{ab}$ such that the distance (call it $\epsilon_{ab}$) from $F_{ab}$ to the constant function $c_{ab}$ is at most $c\,(\epsilon^h_{ab}+\epsilon^g_{ab})/2$. Define $\hat f(a,y,b,z)=c_{ab}$.

Clearly $\hat f$ depends only on $a$ and $b$ (and thus on at most $2k$ variables).

Let $\epsilon_{\hat f}$ be the average, over the $(a,b)$ pairs, of the $\epsilon_{ab}$'s, so that the distance from $f$ to $\hat f$ is $\epsilon_{\hat f}$.

Likewise, the distances from $f$ to $g$ and from $f$ to $h$ (that is, $\epsilon_g$ and $\epsilon_h$) are the averages, over the $(a,b)$ pairs, of, respectively, $\epsilon^g_{ab}$ and $\epsilon^h_{ab}$.

Since $\epsilon_{ab}\le c\,(\epsilon^h_{ab}+\epsilon^g_{ab})/2$ for all $a,b$, it follows that $\epsilon_{\hat f}\le c\,(\epsilon_g+\epsilon_h)/2$. QED

Lemma 3 shows that the constant c above is the best you can hope for (even for k=0 and ϵ=0.5).

Lemma 3: There exists f such that f is (0.5/c)-near two functions g and h, where g has no influencing variables in S2 and h has no influencing variables in S1, and f is 0.5-far from every constant function.

Proof. Let $y$ and $z$ be $x$ restricted to, respectively, $S_1$ and $S_2$. That is, $y=(x_1,\ldots,x_{n/2})$ and $z=(x_{n/2+1},\ldots,x_n)$.

Identify each possible $y$ with a unique element of $[N]$, where $N=2^{n/2}$. Likewise, identify each possible $z$ with a unique element of $[N]$. Thus, we think of $f$ as a function from $[N]\times[N]$ to $\{0,1\}$.

Define $f(y,z)$ to be $1$ iff $\max(y,z)\ge\sqrt{\tfrac12}\,N$.

By calculation, the fraction of $f$'s values that are zero is $\left(\sqrt{\tfrac12}\right)^2=\tfrac12$, so both constant functions have distance $\tfrac12$ to $f$.

Define $g(y,z)$ to be $1$ iff $y\ge\sqrt{\tfrac12}\,N$. Then $g$ has no influencing variables in $S_2$. The distance from $f$ to $g$ is the fraction of pairs $(y,z)$ such that $y<\sqrt{\tfrac12}\,N$ and $z\ge\sqrt{\tfrac12}\,N$. By calculation, this is at most $\sqrt{\tfrac12}\left(1-\sqrt{\tfrac12}\right)=0.5/c$.

Similarly, the distance from $f$ to $h$, where $h(y,z)=1$ iff $z\ge\sqrt{\tfrac12}\,N$, is at most $0.5/c$.

QED
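The construction in Lemma 3 can be tried on a small instance. The Python sketch below is an illustration under assumptions: it identifies $[N]$ with $0,\ldots,N-1$, and the name `lemma3_distances` is hypothetical. Because $N/\sqrt{2}$ is not an integer, the distances to the constant functions are only approximately $1/2$ for finite $N$, while the distances to $g$ and $h$ stay at most $0.5/c$.

```python
import math

C = 1 / (math.sqrt(2) - 1)  # the constant c from the answer, about 2.41

def lemma3_distances(N):
    """Brute-force distances from f to the constants, to g, and to h,
    with [N] identified with 0..N-1 (an assumption of this sketch)."""
    thr = N / math.sqrt(2)
    f = lambda y, z: int(max(y, z) >= thr)
    g = lambda y, z: int(y >= thr)  # ignores z: no influencing variables in S2
    h = lambda y, z: int(z >= thr)  # ignores y: no influencing variables in S1
    pairs = [(y, z) for y in range(N) for z in range(N)]
    total = len(pairs)
    return {
        "to_zero": sum(f(y, z) != 0 for y, z in pairs) / total,
        "to_one": sum(f(y, z) != 1 for y, z in pairs) / total,
        "to_g": sum(f(y, z) != g(y, z) for y, z in pairs) / total,
        "to_h": sum(f(y, z) != h(y, z) for y, z in pairs) / total,
    }

d = lemma3_distances(64)  # corresponds to n = 12
print(d, 0.5 / C)
```

For $N=64$ there are $46$ values below the threshold, so the distance to $g$ is $46\cdot 18/64^2\approx 0.202$, slightly below $0.5/c\approx 0.207$, and the distances to the two constant functions are about $0.483$ and $0.517$.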


First of all, thanks Neal! This indeed sums it up for $k=0$, and sheds some light on the general problem. However in the case of $k=0$ the problem is a bit degenerate (as $2k=k$), so I'm more curious regarding the case of $k\ge 1$. I didn't manage to extend this claim for $k>0$, so if you have an idea on how to do it, I'd appreciate it. If it simplifies the problem, then the exact constants are not crucial; that is, $\epsilon/2$-far can be replaced by $\epsilon/c$-far, for some universal constant $c$.

I've edited it to add the extension to general k. And Yuri's argument below gives a slightly looser factor with an elegant probabilistic argument.
Neal Young

Sincere thanks Neal! This line of reasoning is quite enlightening.
Licensed under cc by-sa 3.0 with attribution required.