I will show another possible solution, that is quite widely applicable, and with todays R software, quite easy to implement. That is the saddlepoint density approximation, which ought to be wider known!
For terminology about the gamma distribution, I will follow https://en.wikipedia.org/wiki/Gamma_distribution with the shape/scale parametrization, k is shape parameter and θ is scale. For the saddlepoint approximation I will follow Ronald W Butler: "Saddlepoint approximations with applications" (Cambridge UP). The saddlepoint approximation is explained here: How does saddlepoint approximation work?
here I will show how it is used in this application.
Let X be a random variable with existing momentgenerating function
M(s)=EesX
which must exist for
s in some open interval that contains zero. Then define the cumulant generating function by
K(s)=logM(s)
It is known that
EX=K′(0),Var(X)=K′′(0). The saddlepoint equation is
K′(s^)=x
which implicitely defines
s as a function of
x (which must be in the range of
X). We write this implicitely defined function as
s^(x). Note that the saddlepoint equation always has exactly one solution, because the cumulant function is convex.
Then the saddlepoint approximation to the density f of X is given by
f^(x)=12πK′′(s^)−−−−−−−√exp(K(s^)−s^x)
This approximate density function is not guaranteed to integrate to 1, so is the unnormalized saddlepoint approximation. We could integrate it numerically and the renormalize to get a better approximation. But this approximation is guaranteed to be non-negative.
Now let X1,X2,…,Xn be independent gamma random variables, where Xi has the distribution with parameters (ki,θi). Then the cumulant generating function is
K(s)=−∑i=1nkiln(1−θis)
defined for
s<1/max(θ1,θ2,…,θn).
The first derivative is
K′(s)=∑i=1nkiθi1−θis
and the second derivative is
K′′(s)=∑i=1nkiθ2i(1−θis)2.
In the following I will give some
R
code calculating this, and will use the parameter values
n=3,
k=(1,2,3),
θ=(1,2,3). Note that the following
R
code uses a new argument in the uniroot function introduced in R 3.1, so will not run in older R's.
shape <- 1:3 #ki
scale <- 1:3 # thetai
# For this case, we get expectation=14, variance=36
make_cumgenfun <- function(shape, scale) {
# we return list(shape, scale, K, K', K'')
n <- length(shape)
m <- length(scale)
stopifnot( n == m, shape > 0, scale > 0 )
return( list( shape=shape, scale=scale,
Vectorize(function(s) {-sum(shape * log(1-scale * s) ) }),
Vectorize(function(s) {sum((shape*scale)/(1-s*scale))}) ,
Vectorize(function(s) { sum(shape*scale*scale/(1-s*scale)) })) )
}
solve_speq <- function(x, cumgenfun) {
# Returns saddle point!
shape <- cumgenfun[[1]]
scale <- cumgenfun[[2]]
Kd <- cumgenfun[[4]]
uniroot(function(s) Kd(s)-x,lower=-100,
upper = 0.3333,
extendInt = "upX")$root
}
make_fhat <- function(shape, scale) {
cgf1 <- make_cumgenfun(shape, scale)
K <- cgf1[[3]]
Kd <- cgf1[[4]]
Kdd <- cgf1[[5]]
# Function finding fhat for one specific x:
fhat0 <- function(x) {
# Solve saddlepoint equation:
s <- solve_speq(x, cgf1)
# Calculating saddlepoint density value:
(1/sqrt(2*pi*Kdd(s)))*exp(K(s)-s*x)
}
# Returning a vectorized version:
return(Vectorize(fhat0))
} #end make_fhat
fhat <- make_fhat(shape, scale)
plot(fhat, from=0.01, to=40, col="red", main="unnormalized saddlepoint approximation\nto sum of three gamma variables")
resulting in the following plot:
I will leave the normalized saddlepoint approximation as an exercise.