Any linear model can be written $Y = \mu + \sigma G$, where $G$ has the standard normal distribution on $\mathbb{R}^n$ and $\mu$ is assumed to belong to a linear subspace $W$ of $\mathbb{R}^n$. In your case $W = \operatorname{Im}(X)$.
Let $[1] \subset W$ be the one-dimensional linear subspace generated by the vector $(1,1,\ldots,1)$. Taking $U = [1]$ below, $R^2$ is closely related to the classical Fisher statistic
$$F = \frac{\|P_Z Y\|^2/(m-\ell)}{\|P_W^\perp Y\|^2/(n-m)}$$
for the hypothesis test of $H_0\colon \{\mu \in U\}$, where $U \subset W$ is a linear subspace, $Z = U^\perp \cap W$ is the orthogonal complement of $U$ in $W$, $m = \dim(W)$ and $\ell = \dim(U)$ (so $m = p$ and $\ell = 1$ in your situation).
Indeed,
$$\frac{\|P_Z Y\|^2}{\|P_W^\perp Y\|^2} = \frac{R^2}{1-R^2}$$
because the definition of $R^2$ is
$$R^2 = \frac{\|P_Z Y\|^2}{\|P_U^\perp Y\|^2} = 1 - \frac{\|P_W^\perp Y\|^2}{\|P_U^\perp Y\|^2}.$$
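As a quick sanity check, here is a minimal NumPy sketch that builds the three projections explicitly and verifies these identities numerically. The design matrix, coefficients and sample size are made-up illustrative values, and `proj` is a hypothetical helper, not something from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 4                                   # illustrative sizes (assumed)
# W = Im(X); the first column of ones guarantees [1] is contained in W
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([1.0, 0.5, -0.3, 0.0]) + rng.normal(size=n)

def proj(A):
    """Orthogonal projector onto the column space of A (hypothetical helper)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

P_W = proj(X)                  # projector onto W
P_U = proj(np.ones((n, 1)))    # projector onto U = [1]
P_Z = P_W - P_U                # projector onto Z, the complement of U in W
m, ell = p, 1

ss_model = np.sum((P_Z @ Y) ** 2)        # ||P_Z Y||^2
ss_resid = np.sum((Y - P_W @ Y) ** 2)    # ||P_W^perp Y||^2
ss_total = np.sum((Y - P_U @ Y) ** 2)    # ||P_U^perp Y||^2

r2 = ss_model / ss_total
F = (ss_model / (m - ell)) / (ss_resid / (n - m))

print(r2, 1 - ss_resid / ss_total)       # two expressions for R^2
print(F, r2 / (1 - r2) * (n - m) / (m - ell))
```

Each pair of printed values should agree up to floating-point error.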
Obviously $P_Z Y = P_Z \mu + \sigma P_Z G$ and, because $\mu \in W$, $P_W^\perp Y = \sigma P_W^\perp G$.
When $H_0\colon \{\mu \in U\}$ is true, $P_Z \mu = 0$ and therefore
$$F = \frac{\|P_Z G\|^2/(m-\ell)}{\|P_W^\perp G\|^2/(n-m)} \sim F_{m-\ell,\,n-m}$$
has the Fisher $F_{m-\ell,\,n-m}$ distribution. Consequently, from the classical relation between the Fisher distribution and the Beta distribution,
$R^2 \sim \mathcal{B}\left(\frac{m-\ell}{2}, \frac{n-m}{2}\right)$.
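The null distribution can be checked by simulation. The sketch below is a rough Monte Carlo check under assumed values of $n$, $p$, $\sigma$ and an arbitrary design matrix. It compares the empirical mean and variance of $R^2$ with those of a Beta distribution with shape parameters $(m-\ell)/2$ and $(n-m)/2$, the parameterization that follows from $\|P_Z G\|^2 \sim \chi^2_{m-\ell}$ and $\|P_W^\perp G\|^2 \sim \chi^2_{n-m}$ being independent.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 20, 3, 2.0          # illustrative values (assumed)
m, ell = p, 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Q, _ = np.linalg.qr(X)            # orthonormal basis of W = Im(X)
mu = 5.0 * np.ones(n)             # mu lies in U = [1], so H0 holds

n_sim = 20000
Y = mu + sigma * rng.normal(size=(n_sim, n))    # one simulated sample per row
fitted = (Y @ Q) @ Q.T                          # P_W Y, row by row
ss_resid = np.sum((Y - fitted) ** 2, axis=1)    # ||P_W^perp Y||^2
ss_total = np.sum((Y - Y.mean(axis=1, keepdims=True)) ** 2, axis=1)  # ||P_U^perp Y||^2
r2 = 1.0 - ss_resid / ss_total

# Moments of Beta(a, b) with a = (m - ell)/2 and b = (n - m)/2
a, b = (m - ell) / 2, (n - m) / 2
beta_mean = a / (a + b)
beta_var = a * b / ((a + b) ** 2 * (a + b + 1))

print("mean:", r2.mean(), "vs", beta_mean)
print("var: ", r2.var(), "vs", beta_var)
```

Note that neither $\sigma$ nor the constant value of $\mu$ affects the result, as expected from the derivation.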
In the general situation we have to deal with $P_Z Y = P_Z \mu + \sigma P_Z G$ when $P_Z \mu \neq 0$. In this general case one has $\|P_Z Y\|^2 \sim \sigma^2 \chi^2_{m-\ell}(\lambda)$, the noncentral $\chi^2$ distribution with $m-\ell$ degrees of freedom and noncentrality parameter $\lambda = \frac{\|P_Z \mu\|^2}{\sigma^2}$, and then $F \sim F_{m-\ell,\,n-m}(\lambda)$ (noncentral Fisher distribution). This is the classical result used to compute power of $F$-tests.
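To illustrate the power computation, here is a small sketch using SciPy's `scipy.stats.f` and `scipy.stats.ncf`. The function name `f_test_power` and the numerical inputs are assumptions for illustration only. SciPy's noncentrality parameter `nc` is the sum of squared standardized means, which matches $\lambda = \|P_Z\mu\|^2/\sigma^2$ as defined above.

```python
from scipy import stats

def f_test_power(lam, dfn, dfd, alpha=0.05):
    """Power of the level-alpha F-test when F ~ noncentral F(dfn, dfd, lam).

    Hypothetical helper: dfn = m - ell, dfd = n - m, lam = ||P_Z mu||^2 / sigma^2.
    """
    crit = stats.f.ppf(1 - alpha, dfn, dfd)     # rejection threshold under H0
    return stats.ncf.sf(crit, dfn, dfd, lam)    # P(F > crit) under the alternative

# Illustrative values: m - ell = 3, n - m = 26, lambda = 10
power = f_test_power(lam=10.0, dfn=3, dfd=26)
print(power)
```

The power increases with $\lambda$, so larger $\|P_Z\mu\|$ or smaller $\sigma$ makes the test more likely to reject.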
The classical relation between the Fisher distribution and the Beta distribution holds in the noncentral situation too. Finally, $R^2$ has the noncentral beta distribution with "shape parameters" $\frac{m-\ell}{2}$ and $\frac{n-m}{2}$ and noncentrality parameter $\lambda$. I think the moments are available in the literature, but they are possibly quite complicated.
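Finally, let us write down $P_Z \mu$. Note that $P_Z = P_W - P_U$. One has $P_U \mu = \bar\mu 1$ when $U = [1]$, where $\bar\mu$ is the mean of the components of $\mu$ and $1 = (1,1,\ldots,1)$, and $P_W \mu = \mu$. Hence $P_Z \mu = \mu - \bar\mu 1$, where here $\mu = X\beta$ for the unknown parameter vector $\beta$.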
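In practice, this means $\lambda$ can be computed directly from $X$, $\beta$ and $\sigma$ by centering $X\beta$. The sketch below does this for made-up values of $X$, $\beta$ and $\sigma$ (all assumptions). It then checks by simulation that $\|P_Z Y\|^2/\sigma^2$ has mean $(m-\ell)+\lambda$, the mean of a noncentral $\chi^2_{m-\ell}(\lambda)$ variable.

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 25, 1.5                               # illustrative values (assumed)
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([2.0, 1.0, -0.5])                # made-up "true" coefficients
m, ell = X.shape[1], 1

mu = X @ beta
pz_mu = mu - mu.mean()                           # P_Z mu = mu - mean(mu) * 1
lam = np.sum(pz_mu ** 2) / sigma ** 2            # noncentrality parameter

# Monte Carlo check of E[ ||P_Z Y||^2 / sigma^2 ] = (m - ell) + lambda
Q, _ = np.linalg.qr(X)                           # orthonormal basis of W
Y = mu + sigma * rng.normal(size=(50000, n))     # one simulated sample per row
PwY = (Y @ Q) @ Q.T                              # P_W Y
PzY = PwY - Y.mean(axis=1, keepdims=True)        # P_W Y - P_U Y
stat = np.sum(PzY ** 2, axis=1) / sigma ** 2
expected_mean = (m - ell) + lam

print("lambda:", lam)
print("simulated mean:", stat.mean(), "vs", expected_mean)
```

Only the non-intercept part of $\beta$ matters here: changing the intercept shifts $\mu$ along $1$ and leaves $\mu - \bar\mu 1$ unchanged.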