ARIMA vs ARMA on the differenced series

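In R (2.15.2) I fitted an ARIMA(3,1,3) to a time series once, and an ARMA(3,3) to the once-differenced time series once. The fitted parameters differ, which I attributed to the fitting method used in ARIMA.

Also, fitting an ARIMA(3,0,3) to the same data as the ARMA(3,3) does not produce identical parameters, no matter which fitting method I use.

I am interested in identifying where the difference comes from, and with which parameters (if any) I can fit the ARIMA so that it gives the same coefficients as the ARMA fit.

Sample code to demonstrate: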

library(tseries)
set.seed(2)
#getting a time series manually
x<-c(1,2,1)
e<-c(0,0.3,-0.2)
n<-45
AR<-c(0.5,-0.4,-0.1)
MA<-c(0.4,0.3,-0.2)
for(i in 4:n){
tt<-rnorm(1)
t<-x[length(x)]+tt+x[i-1]*AR[1]+x[i-2]*AR[2]+x[i-3]*AR[3]+e[i-1]*MA[1]+e[i-2]*MA[2]+e[i-3]*MA[3]
x<-c(x,t)
e<-c(e,tt)
}
par(mfrow=c(2,1))
plot(x)
plot(diff(x,1))

#fitting different versions. What I would like to get is fit1 with ARIMA()
fit1<-arma(diff(x,1,lag=1),c(3,3),include.intercept=F)
fit2<-arima(x,c(3,1,3),include.mean=F)
fit3<-arima(diff(x,1),c(3,0,3),include.mean=F)
fit4<-arima(x,c(3,1,3),method="CSS",include.mean=F)
fit5<-arima(diff(x,1),c(3,0,3),method="CSS",include.mean=F)

cbind(fit1$coe,fit2$coe,fit3$coe,fit4$coe,fit5$coe)

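Edit: Using the conditional sum of squares comes pretty close, but is not quite there. Thanks for the hint about fit1!

Edit 2: I do not think this is a duplicate. Points 2 and 3 address different problems than mine, and even if I override the initialisation mentioned in point 1 with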

fit4<-arima(x,c(3,1,3),method="CSS",include.mean=F,init=fit1$coe)

I still get different coefficients.

— user
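As a rough illustration of what the init argument in Edit 2 does, the sketch below fits the same CSS model from two different starting vectors. The simulated series dx_demo and the starting values c(0.4, 0.2) are assumptions made for this example only; they are not the data from the question.

```r
# Illustrative only: a simulated ARMA(1,1) series stands in for diff(x).
set.seed(42)
dx_demo <- as.numeric(arima.sim(model = list(ar = 0.5, ma = 0.3), n = 300))

# Same model and method; only the optimiser's starting point differs.
fit_zero <- arima(dx_demo, order = c(1, 0, 1), method = "CSS",
                  include.mean = FALSE)                      # default start: zeros
fit_init <- arima(dx_demo, order = c(1, 0, 1), method = "CSS",
                  include.mean = FALSE, init = c(0.4, 0.2))  # user-supplied start

# The two columns are usually very close but need not match exactly.
print(cbind(zero_start = coef(fit_zero), custom_start = coef(fit_init)))
```

Supplying init only moves the starting point; the objective function and the optimiser are still those of stats::arima, so a remaining gap to tseries::arma is not surprising.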

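fit1 has only 1 MA & 1 AR parameter: did you mean fit1<-arma(diff(x,1,lag=1),c(3,3),include.intercept=F)?

— Scortchi - Reinstate Monica

I suppose there are slight differences in the fitting algorithms even if you specify minimization of the conditional sum of squared errors. The help page for arima mentions the n.cond argument, which gives the number of observations at the start of the series to ignore when computing it. (What's wrong with using maximum likelihood anyway?)

— Scortchi - Reinstate Monica

AFAIK n.cond just leaves the first few observations out of the fit. It did not help me. There is nothing wrong with ML at all; I would just like to understand the differences.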

— user1965813
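To make the n.cond discussion concrete, here is a small sketch on a simulated AR(1) series (an assumption for illustration, not the series from the question). For a pure AR model the CSS minimiser is ordinary least squares on the retained observations, so the effect of n.cond can be checked by hand.

```r
set.seed(7)
y_demo <- as.numeric(arima.sim(model = list(ar = 0.6), n = 100))

# CSS fits: default conditioning (first p = 1 value) vs. first 10 values ignored.
fit_default <- arima(y_demo, order = c(1, 0, 0), method = "CSS",
                     include.mean = FALSE)
fit_ncond10 <- arima(y_demo, order = c(1, 0, 0), method = "CSS",
                     include.mean = FALSE, n.cond = 10)

# Hand-computed least-squares slope using only y[11], ..., y[100] as responses.
resp <- y_demo[11:100]
lag1 <- y_demo[10:99]
phi_manual <- sum(resp * lag1) / sum(lag1^2)

print(c(default = unname(coef(fit_default)),
        n.cond10 = unname(coef(fit_ncond10)),
        manual = phi_manual))
```

In this sketch n.cond only changes which residuals enter the sum of squares, which is consistent with the comment above that it did not remove the discrepancy.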

Duplicate? stats.stackexchange.com/a/32799/159

— Rob Hyndman

Answers:

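Three minor differences between tseries::arma and stats::arima lead to slightly different results when an ARMA model is fitted to the differenced series with tseries::arma and an ARIMA model is fitted to the original series with stats::arima.

Starting values of the coefficients: stats::arima sets the initial AR and MA coefficients to zero, while tseries::arma uses the procedure described in Hannan and Rissanen (1982) to obtain initial values of the coefficients.
Scale of the objective function: the objective function in tseries::arma returns the value of the conditional sum of squares, RSS; stats::arima returns 0.5*log(RSS/(n-ncond)).
Optimization algorithm: by default, Nelder-Mead is used in tseries::arma, while stats::arima uses the BFGS algorithm.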
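The effect of the last two points can be sketched without either package's internals. The example below is a toy, assuming a zero-mean AR(2) and a hand-written objective css_ar2 (a hypothetical helper, not code from tseries or stats): the raw RSS is minimised with Nelder-Mead, and 0.5*log(RSS/n) with BFGS. Both objectives have the same minimiser in theory, but the optimisers stop at slightly different points.

```r
set.seed(1)
y_demo <- as.numeric(arima.sim(model = list(ar = c(0.5, -0.3)), n = 200))

# Conditional sum of squares for a zero-mean AR(2),
# conditioning on the first two observations.
css_ar2 <- function(phi, y) {
  z <- embed(y, 3)  # columns: y_t, y_{t-1}, y_{t-2}
  sum((z[, 1] - phi[1] * z[, 2] - phi[2] * z[, 3])^2)
}
n_used <- length(y_demo) - 2

# Raw RSS with Nelder-Mead (the scale and optimiser described for tseries::arma).
fit_rss <- optim(c(0, 0), function(phi) css_ar2(phi, y_demo),
                 method = "Nelder-Mead")
# 0.5 * log(RSS / n) with BFGS (the scale and optimiser described for stats::arima).
fit_log <- optim(c(0, 0), function(phi) 0.5 * log(css_ar2(phi, y_demo) / n_used),
                 method = "BFGS")

# Exact least-squares solution as a reference point.
z <- embed(y_demo, 3)
phi_ols <- unname(coef(lm(z[, 1] ~ z[, 2:3] - 1)))

print(rbind(nelder_mead_rss = fit_rss$par,
            bfgs_log = fit_log$par,
            exact_ols = phi_ols))
```

The rows typically agree to a few decimal places only, which is the same order of discrepancy seen between the two fitting functions.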

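The last one can be changed through the argument optim.method of stats::arima, but the other two require modifying the code. Below, I show an abridged version of the source code of stats::arima (the minimal code needed for this particular model), modified so that the three issues mentioned above are handled in the same way as in tseries::arma. After addressing these issues, the same result as in tseries::arma is obtained.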
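As a quick sketch of the optim.method switch alone, assuming a simulated ARMA(1,1) series in place of the differenced data from the question:

```r
set.seed(3)
dx_demo <- as.numeric(arima.sim(model = list(ar = 0.5, ma = 0.3), n = 300))

# Same CSS model; only the optimiser passed on to optim() changes.
fit_bfgs <- arima(dx_demo, order = c(1, 0, 1), method = "CSS",
                  include.mean = FALSE)                       # default: BFGS
fit_nm   <- arima(dx_demo, order = c(1, 0, 1), method = "CSS",
                  include.mean = FALSE, optim.method = "Nelder-Mead")

print(cbind(BFGS = coef(fit_bfgs), NelderMead = coef(fit_nm)))
```

This changes only the third of the three points, so the result should not be expected to reproduce tseries::arma by itself.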

Minimal version of stats::arima (with the changes mentioned above):

# objective function, conditional sum of squares
# adapted from "armaCSS" in stats::arima
armaCSS <- function(p, x, arma, ncond)
{
  # this does nothing, except returning the vector of coefficients as a list
  trarma <- .Call(stats:::C_ARIMA_transPars, p, arma, FALSE)
  res <- .Call(stats:::C_ARIMA_CSS, x, arma, trarma[[1L]], trarma[[2L]], as.integer(ncond), FALSE)
  # return the conditional sum of squares instead of 0.5*log(res);
  # strictly, res is the CSS divided by n-ncond, but that is not relevant here
  #0.5 * log(res)
  res
}
# initial values of coefficients  
# adapted from function "arma.init" within tseries::arma
arma.init <- function(dx, max.order, lag.ar=NULL, lag.ma=NULL)
{
  n <- length(dx)
  k <- round(1.1*log(n))
  e <- as.vector(na.omit(drop(ar.ols(dx, order.max = k, aic = FALSE, demean = FALSE, intercept = FALSE)$resid)))
  ee <- embed(e, max.order+1)
  xx <- embed(dx[-(1:k)], max.order+1)
  return(lm(xx[,1]~xx[,lag.ar+1]+ee[,lag.ma+1]-1)$coef)
}
# modified version of stats::arima
modified.arima <- function(x, order, seasonal, init)
{
  n <- length(x)
  arma <- as.integer(c(order[-2L], seasonal$order[-2L], seasonal$period, order[2L], seasonal$order[2L]))
  narma <- sum(arma[1L:4L])
  ncond <- order[2L] + seasonal$order[2L] * seasonal$period
  ncond1 <- order[1L] + seasonal$period * seasonal$order[1L]
  ncond <- as.integer(ncond + ncond1)
  optim(init, armaCSS, method = "Nelder-Mead", hessian = TRUE, x=x, arma=arma, ncond=ncond)$par
}

Now, compare both procedures and check that they yield the same result (this requires the series x generated by the OP in the body of the question).

Using the initial values chosen in tseries::arma:

dx <- diff(x)
fit1 <- arma(dx, order=c(3,3), include.intercept=FALSE)
coef(fit1)
#         ar1         ar2         ar3         ma1         ma2         ma3 
#  0.33139827  0.80013071 -0.45177254  0.67331027 -0.14600320 -0.08931003 
init <- arma.init(diff(x), 3, 1:3, 1:3)
fit2.coef <- modified.arima(x, order=c(3,1,3), seasonal=list(order=c(0,0,0), period=1), init=init)
fit2.coef
# xx[, lag.ar + 1]1 xx[, lag.ar + 1]2 xx[, lag.ar + 1]3 ee[, lag.ma + 1]1 
#        0.33139827        0.80013071       -0.45177254        0.67331027 
# ee[, lag.ma + 1]2 ee[, lag.ma + 1]3 
#       -0.14600320       -0.08931003 
all.equal(coef(fit1), fit2.coef, check.attributes=FALSE)
# [1] TRUE

Using the initial values chosen in stats::arima (zeros):

fit3 <- arma(dx, order=c(3,3), include.intercept=FALSE, coef=rep(0,6))
coef(fit3)
#         ar1         ar2         ar3         ma1         ma2         ma3 
#  0.33176424  0.79999112 -0.45215742  0.67304072 -0.14592152 -0.08900624 
init <- rep(0, 6)
fit4.coef <- modified.arima(x, order=c(3,1,3), seasonal=list(order=c(0,0,0), period=1), init=init)
fit4.coef
# [1]  0.33176424  0.79999112 -0.45215742  0.67304072 -0.14592152 -0.08900624
all.equal(coef(fit3), fit4.coef, check.attributes=FALSE)
# [1] TRUE

— javlacalle

Great work. Thank you very much! I added a tolerance so that I could compare both of your solutions with the plain arima function, and everything worked like a charm. Thanks a lot!

— user1965813
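A tolerance-based comparison of this kind could look like the sketch below. The two vectors are the coefficients printed in the answer above (Hannan-Rissanen start and zero start), used here as stand-ins; the tolerance of 1e-2 is an assumption, not the value the commenter used.

```r
# Coefficients as printed in the answer above.
coef_hr   <- c(ar1 = 0.33139827, ar2 = 0.80013071, ar3 = -0.45177254,
               ma1 = 0.67331027, ma2 = -0.14600320, ma3 = -0.08931003)
coef_zero <- c(ar1 = 0.33176424, ar2 = 0.79999112, ar3 = -0.45215742,
               ma1 = 0.67304072, ma2 = -0.14592152, ma3 = -0.08900624)

# Default tolerance (about 1.5e-8) flags the vectors as different ...
strict <- isTRUE(all.equal(coef_hr, coef_zero, check.attributes = FALSE))
# ... while a looser tolerance treats them as equal.
loose  <- isTRUE(all.equal(coef_hr, coef_zero, tolerance = 1e-2,
                           check.attributes = FALSE))

print(c(strict = strict, loose = loose))
```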

As far as I can tell, the difference is entirely due to the MA terms. That is, if I fit your data with AR terms only, the ARMA of the differenced series and the ARIMA agree.

— Wayne
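The AR-only case can be checked with a small sketch. It uses a simulated integrated AR(2) series (an assumption, not the question's data) and swaps tseries::arma for a plain least-squares regression on the differenced series, which is the exact CSS minimiser when there are no MA terms.

```r
set.seed(11)
# Integrated AR(2): cumulative sum of a stationary AR(2) series.
x_demo  <- cumsum(as.numeric(arima.sim(model = list(ar = c(0.5, -0.3)), n = 200)))
dx_demo <- diff(x_demo)

# ARIMA(2,1,0) on the levels and AR(2) on the differences, both by CSS.
fit_arima <- arima(x_demo, order = c(2, 1, 0), method = "CSS")
fit_diff  <- arima(dx_demo, order = c(2, 0, 0), method = "CSS",
                   include.mean = FALSE)

# Least squares on lagged differences, no intercept.
z <- embed(dx_demo, 3)  # columns: dx_t, dx_{t-1}, dx_{t-2}
phi_ols <- unname(coef(lm(z[, 1] ~ z[, 2:3] - 1)))

print(rbind(arima_d1 = coef(fit_arima),
            arima_on_diff = coef(fit_diff),
            ols_on_diff = phi_ols))
```

With no MA terms the objective is quadratic in the coefficients, so starting values and optimiser choice matter much less, which fits the observation in the comment above.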