Why does inversion of the covariance matrix yield partial correlations between random variables?


32

I heard that partial correlations between random variables can be found by inverting the covariance matrix and taking the appropriate cells from the resulting precision matrix (this fact is mentioned at http://en.wikipedia.org/wiki/Partial_correlation, but without a proof).

Why is that the case?


1
If what you want is the partial correlation in a cell controlled for all the other variables, the last paragraph here may shed some light.
ttnphns

Answers:


34

Given a multivariate random variable $(X_1, X_2, \ldots, X_n)$ with nondegenerate covariance matrix $C = (\gamma_{ij}) = (\operatorname{Cov}(X_i, X_j))$, the set of all real linear combinations of the $X_i$ forms an $n$-dimensional real vector space with basis $E = (X_1, X_2, \ldots, X_n)$ and a nondegenerate inner product given by

$$\langle X_i, X_j \rangle = \gamma_{ij} \, .$$

Its dual basis with respect to this inner product, $E^* = (X_1^*, X_2^*, \ldots, X_n^*)$, is uniquely defined by the relationships

$$\langle X_i^*, X_j \rangle = \delta_{ij} \, ,$$

the Kronecker delta (equal to $1$ when $i = j$ and $0$ otherwise).

The dual basis is of interest here because the partial correlation of $X_i$ and $X_j$ is obtained as the correlation between the part of $X_i$ that is left after projecting it onto the space spanned by all the other vectors (let us simply call it its "residual", $X_i^\circ$) and the comparable part of $X_j$, its residual $X_j^\circ$. Yet $X_i^*$ is a vector that is orthogonal to all vectors besides $X_i$ and has positive inner product with $X_i$, whence $X_i^\circ$ must be some non-negative multiple of $X_i^*$, and likewise for $X_j$. Let us therefore write

$$X_i^\circ = \lambda_i X_i^*, \quad X_j^\circ = \lambda_j X_j^*$$

for positive real numbers $\lambda_i$ and $\lambda_j$.

The partial correlation is the normalized inner product of the residuals, which is unchanged by rescaling:

$$\rho_{ij} = \frac{\langle X_i^\circ, X_j^\circ \rangle}{\sqrt{\langle X_i^\circ, X_i^\circ \rangle \, \langle X_j^\circ, X_j^\circ \rangle}} = \frac{\lambda_i \lambda_j \langle X_i^*, X_j^* \rangle}{\sqrt{\lambda_i^2 \langle X_i^*, X_i^* \rangle \, \lambda_j^2 \langle X_j^*, X_j^* \rangle}} = \frac{\langle X_i^*, X_j^* \rangle}{\sqrt{\langle X_i^*, X_i^* \rangle \, \langle X_j^*, X_j^* \rangle}} \, .$$

(In either case, the partial correlation will be zero whenever the residuals are orthogonal, whether or not they are nonzero.)

We need to find the inner products of the dual basis elements. To this end, expand the dual basis elements in terms of the original basis $E$:

$$X_i^* = \sum_{j=1}^n \beta_{ij} X_j \, .$$

Then, by definition,

$$\delta_{ik} = \langle X_i^*, X_k \rangle = \sum_{j=1}^n \beta_{ij} \langle X_j, X_k \rangle = \sum_{j=1}^n \beta_{ij} \gamma_{jk} \, .$$

In matrix notation, with $I = (\delta_{ij})$ the identity matrix and $B = (\beta_{ij})$ the change-of-basis matrix, this reads

$$I = BC \, .$$

That is, $B = C^{-1}$, which is exactly what the Wikipedia article is asserting. Moreover, $\langle X_i^*, X_j^* \rangle = \sum_k \beta_{jk} \langle X_i^*, X_k \rangle = \sum_k \beta_{jk} \delta_{ik} = \beta_{ji} = \beta_{ij}$ (by the symmetry of $B = C^{-1}$), so the previous formula for the partial correlation gives

$$\rho_{ij} = \frac{\beta_{ij}}{\sqrt{\beta_{ii} \beta_{jj}}} = \frac{C^{-1}_{ij}}{\sqrt{C^{-1}_{ii} C^{-1}_{jj}}} \, .$$
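To see this concretely, here is a minimal numerical sketch (assuming numpy; the random covariance matrix and the indices $i, j$ are arbitrary illustrative choices). It compares the precision-matrix formula with the correlation of the residuals as defined above, computing all inner products $\langle X_a, X_b \rangle = \gamma_{ab}$ directly from $C$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random nondegenerate covariance matrix (illustrative; any SPD matrix works).
A = rng.normal(size=(5, 5))
C = A @ A.T                      # symmetric positive definite
B = np.linalg.inv(C)             # B = C^{-1}, the precision matrix

i, j = 0, 1
rho_from_precision = B[i, j] / np.sqrt(B[i, i] * B[j, j])

def residual_coefs(k, C):
    """Coefficients of X_k's residual after projecting onto all other X's,
    computed from the inner products <X_a, X_b> = C[a, b]."""
    others = [m for m in range(C.shape[0]) if m != k]
    # Projection coefficients b solve the normal equations of the Gram matrix.
    b = np.linalg.solve(C[np.ix_(others, others)], C[others, k])
    coefs = np.zeros(C.shape[0])
    coefs[k] = 1.0
    coefs[others] = -b
    return coefs

ri = residual_coefs(i, C)
rj = residual_coefs(j, C)
rho_from_residuals = (ri @ C @ rj) / np.sqrt((ri @ C @ ri) * (rj @ C @ rj))

print(rho_from_precision, rho_from_residuals)  # the two values agree
```

Because it works with the covariance matrix itself rather than sampled data, the agreement is exact up to floating-point error.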

3
+1, great answer. But why do you call this dual basis "dual basis with respect to this inner product" -- what does "with respect to this inner product" exactly mean? It seems that you use the term "dual basis" as defined here mathworld.wolfram.com/DualVectorSpace.html in the second paragraph ("Given a vector space basis v1,...,vn for V there exists a dual basis...") or here en.wikipedia.org/wiki/Dual_basis, and it's independent of any scalar product.
amoeba says Reinstate Monica

3
@amoeba There are two kinds of duals. The (natural) dual of any vector space $V$ over a field $R$ is the set of linear functions $\phi: V \to R$, called $V^*$. There is no canonical way to identify $V$ with $V^*$, even though they have the same dimension when $V$ is finite-dimensional. Any inner product $\gamma$ corresponds to such a map $g: V \to V^*$, and vice versa, via
$g(v)(w) = \gamma(v, w)$.
(Nondegeneracy of $\gamma$ ensures $g$ is a vector space isomorphism.) This gives a way to view elements of $V$ as if they were elements of the dual $V^*$ -- but it depends on $\gamma$.
whuber

3
@mpettis Those dots were hard to notice. I have replaced them with small open circles to make the notation easier to read. Thanks for pointing this out.
whuber

4
@Andy Ron Christensen's Plane Answers to Complex Questions might be the sort of thing you are looking for. Unfortunately, his approach makes (IMHO) undue reliance on coordinate arguments and calculations. In the original introduction (see p. xiii), Christensen explains that's for pedagogical reasons.
whuber

3
@whuber, Your proof is awesome. I wonder whether any book or article contains such a proof, so that I can cite it.
Harry

12

Here is a proof with just matrix calculations.

I appreciate the answer by whuber. It is very insightful about the math behind the scenes. However, it is still not so trivial how to use his answer to obtain the minus sign in the formula stated in the Wikipedia article (Partial_correlation#Using_matrix_inversion):

$$\rho_{X_i X_j \cdot V \setminus \{X_i, X_j\}} = -\frac{p_{ij}}{\sqrt{p_{ii} p_{jj}}} \, .$$

To get this minus sign, here is a different proof I found in "Graphical Models" (Lauritzen 1996, page 130). It is done simply by some matrix calculations.

The key is the following matrix identity:

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} E^{-1} & -E^{-1} G \\ -F E^{-1} & D^{-1} + F E^{-1} G \end{pmatrix},$$

where $E = A - B D^{-1} C$, $F = D^{-1} C$, and $G = B D^{-1}$.
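This identity is easy to verify numerically; the following is a small sketch (assuming numpy; the matrix and the block sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5))
M = M @ M.T + 5.0 * np.eye(5)        # a well-conditioned symmetric matrix

A, B = M[:2, :2], M[:2, 2:]          # 2x2 and 2x3 blocks
C, D = M[2:, :2], M[2:, 2:]          # 3x2 and 3x3 blocks

Dinv = np.linalg.inv(D)
E = A - B @ Dinv @ C                 # Schur complement of D
F = Dinv @ C
G = B @ Dinv
Einv = np.linalg.inv(E)

block_inverse = np.block([[Einv,      -Einv @ G],
                          [-F @ Einv, Dinv + F @ Einv @ G]])

assert np.allclose(block_inverse, np.linalg.inv(M))
```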

Write down the covariance matrix as

$$\Omega = \begin{pmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{pmatrix},$$

where $\Omega_{11}$ is the covariance matrix of $(X_i, X_j)$ and $\Omega_{22}$ is the covariance matrix of the remaining variables $V \setminus \{X_i, X_j\}$.

Let $P = \Omega^{-1}$. Similarly, write down $P$ as

$$P = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}.$$

By the key matrix identity,

$$P_{11}^{-1} = \Omega_{11} - \Omega_{12} \Omega_{22}^{-1} \Omega_{21} \, .$$

We also know that $\Omega_{11} - \Omega_{12} \Omega_{22}^{-1} \Omega_{21}$ is the covariance matrix of $(X_i, X_j) \mid V \setminus \{X_i, X_j\}$ (see Multivariate_normal_distribution#Conditional_distributions). The partial correlation is therefore

$$\rho_{X_i X_j \cdot V \setminus \{X_i, X_j\}} = \frac{[P_{11}^{-1}]_{12}}{\sqrt{[P_{11}^{-1}]_{11} \, [P_{11}^{-1}]_{22}}} \, .$$

I use the notation that the $(k, l)$th entry of the matrix $M$ is denoted by $[M]_{kl}$.

By the simple inversion formula for a 2-by-2 matrix,

$$\begin{pmatrix} [P_{11}^{-1}]_{11} & [P_{11}^{-1}]_{12} \\ [P_{11}^{-1}]_{21} & [P_{11}^{-1}]_{22} \end{pmatrix} = P_{11}^{-1} = \frac{1}{\det P_{11}} \begin{pmatrix} [P_{11}]_{22} & -[P_{11}]_{12} \\ -[P_{11}]_{21} & [P_{11}]_{11} \end{pmatrix}.$$

Therefore,

$$\rho_{X_i X_j \cdot V \setminus \{X_i, X_j\}} = \frac{[P_{11}^{-1}]_{12}}{\sqrt{[P_{11}^{-1}]_{11} \, [P_{11}^{-1}]_{22}}} = \frac{-\frac{1}{\det P_{11}} [P_{11}]_{12}}{\sqrt{\frac{1}{\det P_{11}} [P_{11}]_{22} \cdot \frac{1}{\det P_{11}} [P_{11}]_{11}}} = -\frac{[P_{11}]_{12}}{\sqrt{[P_{11}]_{11} \, [P_{11}]_{22}}} \, ,$$
which is exactly what the Wikipedia article is asserting.
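For a concrete check of the minus sign, here is a minimal numerical sketch (assuming numpy; the covariance matrix and the pair of indices are arbitrary illustrative choices). It compares the precision-matrix formula with the correlation taken from the conditional covariance $\Omega_{11} - \Omega_{12} \Omega_{22}^{-1} \Omega_{21}$ used in the proof:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.normal(size=(n, n))
Omega = A @ A.T                  # illustrative covariance matrix
P = np.linalg.inv(Omega)         # precision matrix

i, j = 0, 1
rho_formula = -P[i, j] / np.sqrt(P[i, i] * P[j, j])

# Conditional covariance of (X_i, X_j) given the rest:
# Omega_11 - Omega_12 Omega_22^{-1} Omega_21, as in the proof above.
rest = [m for m in range(n) if m not in (i, j)]
O11 = Omega[np.ix_([i, j], [i, j])]
O12 = Omega[np.ix_([i, j], rest)]
O22 = Omega[np.ix_(rest, rest)]
cond_cov = O11 - O12 @ np.linalg.solve(O22, O12.T)

rho_conditional = cond_cov[0, 1] / np.sqrt(cond_cov[0, 0] * cond_cov[1, 1])
print(rho_formula, rho_conditional)  # agree, including the minus sign
```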

If we let $i = j$, then $\rho_{ii \cdot V \setminus \{X_i\}} = -1$. How do we interpret those diagonal elements in the precision matrix?
Jason

Good point. The formula should only be valid for $i \neq j$. From the proof, the minus sign comes from the 2-by-2 matrix inversion; it would not appear if $i = j$.
Po C.

So the diagonal numbers can't be associated with partial correlation. What do they represent? They are not just inverses of the variances, are they?
Jason

This formula is valid for $i \neq j$. It is meaningless for $i = j$.
Po C.

4

Note that the sign of the answer actually depends on how you define partial correlation. There is a difference between regressing $X_i$ and $X_j$ on the other $n - 1$ variables separately vs. regressing $X_i$ and $X_j$ on the other $n - 2$ variables together. Under the second definition, let the correlation between the residuals $\epsilon_i$ and $\epsilon_j$ be $\rho$. Then the partial correlation of the two under the first definition (further regressing $\epsilon_i$ on $\epsilon_j$ and vice versa) is $-\rho$.

This explains the confusion in the comments above, as well as on Wikipedia. The second definition is used universally from what I can tell, so there should be a negative sign.
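The sign flip between the two definitions can be checked numerically. Below is a minimal sketch (assuming numpy, with an arbitrary random covariance matrix); `residual_corr` is just a hypothetical helper for this illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.normal(size=(n, n))
C = A @ A.T                      # illustrative covariance matrix
P = np.linalg.inv(C)
i, j = 0, 1

def residual_corr(i, j, C, exclude_each_other):
    """Correlation of the residuals of X_i and X_j after regressing each
    on its control set, using the inner products <X_a, X_b> = C[a, b]."""
    n = C.shape[0]
    def resid(k, controls):
        b = np.linalg.solve(C[np.ix_(controls, controls)], C[controls, k])
        c = np.zeros(n)
        c[k] = 1.0
        c[controls] = -b
        return c
    if exclude_each_other:                    # other n-2 variables, together
        ctrl = [m for m in range(n) if m not in (i, j)]
        ri, rj = resid(i, ctrl), resid(j, ctrl)
    else:                                     # other n-1 variables, separately
        ri = resid(i, [m for m in range(n) if m != i])
        rj = resid(j, [m for m in range(n) if m != j])
    return (ri @ C @ rj) / np.sqrt((ri @ C @ ri) * (rj @ C @ rj))

rho_second = residual_corr(i, j, C, exclude_each_other=True)
rho_first = residual_corr(i, j, C, exclude_each_other=False)
print(rho_second, rho_first)                   # equal magnitudes, opposite signs
print(-P[i, j] / np.sqrt(P[i, i] * P[j, j]))   # matches the second definition
```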

I originally posted an edit to the other answer, but made a mistake - sorry about that!

Licensed under cc by-sa 3.0 with attribution required.