How many lags should be used in the Ljung-Box test of a time series?


20

After an ARMA model is fit to a time series, it is common to check the residuals via the Ljung-Box portmanteau test (among other tests). The Ljung-Box test returns a p value. It has a parameter, h, which is the number of lags to be tested. Some texts recommend using h = 20; others recommend h = ln(n); most do not say what h to use.

Rather than using a single value for h, suppose that I do the Ljung-Box test for all h < 50 and then pick the h which gives the minimum p value. Is that approach reasonable? What are the advantages and disadvantages? (One obvious disadvantage is increased computation time, but that is not a problem here.) Is there literature on this?

To elaborate slightly: if the test gives p > 0.05 for all h, then the time series (residuals) passes the test. My question concerns how to interpret the test if p < 0.05 for some values of h and not for others.
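To make the scanning idea concrete, here is a minimal R sketch (the simulated series and the ARMA(1,1) order are placeholders for illustration only; in practice res would be the residuals of your fitted model):

```r
# Sketch of the scan: run the Ljung-Box test on the residuals for every h < 50
# and inspect the whole p-value profile.
set.seed(1)
y   <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 300)
fit <- arima(y, order = c(1, 0, 1), include.mean = FALSE)
res <- residuals(fit)
k   <- 2                          # number of estimated ARMA coefficients (p + q)
hs  <- (k + 1):49                 # lag must exceed fitdf for the test to make sense
pvals <- sapply(hs, function(h)
  Box.test(res, lag = h, type = "Ljung-Box", fitdf = k)$p.value)
hs[which.min(pvals)]              # the h giving the minimum p-value
```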


1
@user2875, I deleted my answer. The fact is that for large h the test is unreliable, so the answer really depends on which h gives p < 0.05. Also, what is the exact value of p? If you lower the threshold to 0.01, does the test result change? Personally, when faced with conflicting hypotheses, I look for other indicators of whether the model is good or not. How well does the model fit? How does it compare with alternative models? Do the alternative models have the same problems? Does the test reject the null because of other violations?
mpiktas

1
@mpiktas, the Ljung-Box test is based on a statistic whose distribution is asymptotically (as h becomes large) chi-squared. However, as h becomes large relative to n, the power of the test drops to 0. Hence one wants to choose h large enough that the distribution is close to chi-squared, yet small enough that the test has useful power. (I don't know what the risk of a false negative is when h is small.)
user2875

@user2875, this is the third time you have changed the question. First you asked about the strategy of picking the h with the smallest p value, then about how to interpret the test if p < 0.05 for some values of h, and now about which h is optimal. All three questions have different answers, and the answers may even differ depending on the context of the particular problem.
mpiktas

@mpiktas, the questions are all the same; they are just different ways of looking at it. (If p > 0.05 for all h, then I know how to interpret the smallest p; and if I knew the optimal h, I would not worry about picking the smallest p.)
user2875

Answers:


9

The answer definitely depends on: what are you actually trying to use the test for?

The common reason is: to be more or less confident about the joint statistical significance of the null hypothesis of no autocorrelation up to lag h (alternatively, assuming that you have something close to a weak white noise), and to build a parsimonious model having as few parameters as possible.

Usually time series data have a natural seasonal pattern, so a practical rule of thumb is to set h to twice this value. Another candidate is the forecasting horizon, if you use the model for forecasting. Finally, if you find significant departures at the later lags, try to think about corrections (they could be due to some seasonal effect, or the data may not have been corrected for outliers).

Rather than using a single value for h, suppose that I do the Ljung-Box test for all h < 50 and then pick the h which gives the minimum p value.

It is a joint significance test, so if the choice of h is data-driven, why should I care about some small (occasional?) departures at any lag less than h, assuming of course that h is much less than n (because of the power issue you mentioned)? Seeking a simple yet relevant model, I suggest the information criteria as described below.

My question concerns how to interpret the test if p < 0.05 for some values of h and not for others.

So it will depend on how far from the present the departure happens. Disadvantages of distant departures: more parameters to estimate, fewer degrees of freedom, worse predictive power of the model.

Try estimating a model that includes MA and/or AR terms at the lag where the departure occurs, and additionally check one of the information criteria (AIC or BIC, depending on the sample size); this will give you more insight into which model is more parsimonious (see the sketch below). Any out-of-sample prediction exercise is also welcome here.
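A rough illustration of that suggestion (the candidate orders and the simulated series are placeholders; this is a sketch, not a prescription):

```r
# Compare a few candidate ARMA specifications by AIC and BIC.
y <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 300)   # placeholder series
orders <- list(c(1, 0, 0), c(2, 0, 0), c(1, 0, 1), c(2, 0, 1))
fits   <- lapply(orders, function(ord) arima(y, order = ord))
n <- length(y)
data.frame(order = sapply(orders, paste, collapse = ","),
           AIC   = sapply(fits, AIC),
           BIC   = sapply(fits, function(f) AIC(f, k = log(n))))  # BIC via k = log(n)
```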


+1, this is what I was trying to express but couldn't. :)
mpiktas

8

Assume that we specify a simple AR(1) model, with all the usual properties,

$$y_t = \beta y_{t-1} + u_t$$

Denote the theoretical covariance of the error term by

$$\gamma_j \equiv E(u_t u_{t-j})$$

If we could observe the error term, then the sample autocorrelation of the error term would be defined as

$$\tilde \rho_j \equiv \frac{\tilde \gamma_j}{\tilde \gamma_0}$$

where

$$\tilde \gamma_j \equiv \frac{1}{n}\sum_{t=j+1}^{n} u_t u_{t-j},\qquad j=0,1,2,\ldots$$

But in practice we do not observe the error term. So the sample autocorrelation related to the error term is estimated using the residuals from estimation, as

$$\hat \gamma_j \equiv \frac{1}{n}\sum_{t=j+1}^{n} \hat u_t \hat u_{t-j},\qquad j=0,1,2,\ldots$$

The Box-Pierce Q-statistic (the Ljung-Box Q is an asymptotically neutral, scaled version of it) is

$$Q_{BP} = n\sum_{j=1}^{p}\hat \rho_j^2 = \sum_{j=1}^{p}\left[\sqrt{n}\,\hat \rho_j\right]^2 \;\xrightarrow{d}\;???\;\chi^2(p)$$

Our issue is exactly whether $Q_{BP}$ can be said to have an asymptotic chi-square distribution (under the null of no autocorrelation in the error term) in this model. For this to happen, each and every one of the $\sqrt{n}\,\hat \rho_j$ must be asymptotically standard Normal. A way to check this is to examine whether $\sqrt{n}\,\hat \rho_j$ has the same asymptotic distribution as $\sqrt{n}\,\tilde \rho_j$ (which is constructed using the true errors, and so has the desired asymptotic behavior under the null).

We have that

$$\hat u_t = y_t - \hat \beta y_{t-1} = u_t - (\hat \beta - \beta)y_{t-1}$$

where $\hat \beta$ is a consistent estimator. So

$$\hat \gamma_j \equiv \frac{1}{n}\sum_{t=j+1}^{n}\Big[u_t - (\hat \beta - \beta)y_{t-1}\Big]\Big[u_{t-j} - (\hat \beta - \beta)y_{t-j-1}\Big]$$

$$= \tilde \gamma_j - \frac{1}{n}\sum_{t=j+1}^{n}(\hat \beta - \beta)\Big[u_t y_{t-j-1} + u_{t-j}y_{t-1}\Big] + \frac{1}{n}\sum_{t=j+1}^{n}(\hat \beta - \beta)^2 y_{t-1}y_{t-j-1}$$

The sample is assumed to be stationary and ergodic, and moments are assumed to exist up to the required order. Since the estimator $\hat \beta$ is consistent, this is enough for the two sums to go to zero. So we conclude

$$\hat \gamma_j \xrightarrow{p} \tilde \gamma_j$$

This implies that

$$\hat \rho_j \xrightarrow{p} \tilde \rho_j \xrightarrow{p} \rho_j$$

But this does not automatically guarantee that $\sqrt{n}\,\hat \rho_j$ converges to $\sqrt{n}\,\tilde \rho_j$ (in distribution) (note that the continuous mapping theorem does not apply here, because the transformation applied to the random variables depends on $n$). For this to happen, we need

$$\sqrt{n}\,\hat \gamma_j \xrightarrow{d} \sqrt{n}\,\tilde \gamma_j$$

(the denominator $\gamma_0$, whether tilde or hat, converges to the variance of the error term in both cases, so it is neutral to our issue).

We have

$$\sqrt{n}\,\hat \gamma_j = \sqrt{n}\,\tilde \gamma_j - \frac{1}{n}\sum_{t=j+1}^{n}\sqrt{n}(\hat \beta - \beta)\Big[u_t y_{t-j-1} + u_{t-j}y_{t-1}\Big] + \frac{1}{n}\sum_{t=j+1}^{n}\sqrt{n}(\hat \beta - \beta)^2 y_{t-1}y_{t-j-1}$$

So the question is: do these two sums, multiplied now by $\sqrt{n}$, go to zero in probability, so that we will be left with $\sqrt{n}\,\hat \gamma_j = \sqrt{n}\,\tilde \gamma_j$ asymptotically?

For the second sum we have

$$\frac{1}{n}\sum_{t=j+1}^{n}\sqrt{n}(\hat \beta - \beta)^2 y_{t-1}y_{t-j-1} = \frac{1}{n}\sum_{t=j+1}^{n}\Big[\sqrt{n}(\hat \beta - \beta)\Big]\Big[(\hat \beta - \beta)y_{t-1}y_{t-j-1}\Big]$$

Since $\big[\sqrt{n}(\hat \beta - \beta)\big]$ converges to a random variable, and $\hat \beta$ is consistent, this will go to zero.

For the first sum, here too we have that $\big[\sqrt{n}(\hat \beta - \beta)\big]$ converges to a random variable, and so we have that

$$\frac{1}{n}\sum_{t=j+1}^{n}\Big[u_t y_{t-j-1} + u_{t-j}y_{t-1}\Big] \xrightarrow{p} E\big[u_t y_{t-j-1}\big] + E\big[u_{t-j}y_{t-1}\big]$$

The first expected value, $E\big[u_t y_{t-j-1}\big]$, is zero by the assumptions of the standard AR(1) model. But the second expected value is not, since the dependent variable depends on past errors.

So $\sqrt{n}\,\hat \rho_j$ won't have the same asymptotic distribution as $\sqrt{n}\,\tilde \rho_j$. But the asymptotic distribution of the latter is standard Normal, which is the one leading to a chi-squared distribution when squaring the r.v.

Therefore we conclude, that in a pure time series model, the Box-Pierce Q and the Ljung-Box Q statistic cannot be said to have an asymptotic chi-square distribution, so the test loses its asymptotic justification.

This happens because the right-hand side variable (here the lag of the dependent variable) by design is not strictly exogenous to the error term, and we have found that such strict exogeneity is required for the BP/LB Q-statistic to have the postulated asymptotic distribution.

Here the right-hand-side variable is only "predetermined", and the Breusch-Godfrey test is then valid. (For the full set of conditions required for an asymptotically valid test, see Hayashi 2000, pp. 146-149.)
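A small simulation sketch (not part of the derivation above, with arbitrary choices of sample size, AR coefficient, and lag) that one could run to look at this issue empirically:

```r
# Monte Carlo: empirical rejection rate of the Ljung-Box test at the 5% level
# on AR(1) residuals, with and without the usual degrees-of-freedom adjustment.
set.seed(123)
R <- 2000; n <- 200; h <- 10
p_raw <- p_adj <- numeric(R)
for (r in 1:R) {
  y   <- arima.sim(model = list(ar = 0.5), n = n)
  fit <- arima(y, order = c(1, 0, 0), include.mean = FALSE)
  res <- residuals(fit)
  p_raw[r] <- Box.test(res, lag = h, type = "Ljung-Box")$p.value
  p_adj[r] <- Box.test(res, lag = h, type = "Ljung-Box", fitdf = 1)$p.value
}
mean(p_raw < 0.05)   # empirical size without any adjustment
mean(p_adj < 0.05)   # empirical size with fitdf = 1 (one estimated AR coefficient)
```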


1
You wrote "But the second expected value is not, since the dependent variable depends on past errors." That's called strict exogeneity. I agree that it's a strong assumption, and you can build AR(p) framework without it, just by using weak exogeneity. This the reason why Breusch-Godfrey test is better in some sense: if the null is not true, then B-L loses power. B-G is based on weak exogeneity. Both tests are not good for some common econometric, applications, see e.g. this Stata's presentation, p. 4/44.
Aksakal

3
@Aksakal Thanks for the reference. The point exactly is that without strict exogeneity, the Box-Pierce/Ljung-Box statistics do not have an asymptotic chi-square distribution; this is what the mathematics above shows. Weak exogeneity (which holds in the above model) is not enough for them. This is exactly what the presentation you link to says on p. 3/44.
Alecos Papadopoulos

2
@AlecosPapadopoulos, an amazing post!!! Among the few best ones I have encountered here at Cross Validated. I just wish it would not disappear in this long thread and many users would find and benefit from it in the future.
Richard Hardy

3

Before you zero in on the "right" h (which appears to be more of an opinion than a hard rule), make sure the "lag" is correctly defined.

http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm

Quoting the section below Issue 4 in the above link:

"....The p-values shown for the Ljung-Box statistic plot are incorrect because the degrees of freedom used to calculate the p-values are lag instead of lag - (p+q). That is, the procedure being used does NOT take into account the fact that the residuals are from a fitted model. And YES, at least one R core developer knows this...."

Edit (01/23/2011): Here's an article by Burns that might help:

http://lib.stat.cmu.edu/S/Spoetry/Working/ljungbox.pdf
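In base R, the usual way to apply that degrees-of-freedom correction is the fitdf argument of Box.test. A small sketch (the simulated series and ARMA(1,1) order are only for illustration):

```r
# Ljung-Box test on ARMA residuals with the df correction: fitdf = p + q,
# so the p-value is based on lag - (p + q) degrees of freedom.
set.seed(2)
y <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 300)
p <- 1; q <- 1
fit <- arima(y, order = c(p, 0, q), include.mean = FALSE)
Box.test(residuals(fit), lag = 20, type = "Ljung-Box", fitdf = p + q)
```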


@bill_080, the OP does not mention R, and the help page for Box.test in R mentions the correction and has an argument to allow for it, although you need to supply it manually.
mpiktas

@mpiktas, Oops, you're right. I assumed this was an R question. As for the second part of your comment, there are several R packages that use Ljung-Box stats. So, it's a good idea to make sure the user understands what the package's "lag" means.
bill_080

Thanks--I am using R, but the question is a general one. Just to be safe, I was doing the test with the LjungBox function in the portes package, as well as Box.test.
user2875

2

The thread "Testing for autocorrelation: Ljung-Box versus Breusch-Godfrey" shows that the Ljung-Box test is essentially inapplicable in the case of an autoregressive model. It also shows that Breusch-Godfrey test should be used instead. That limits the relevance of your question and the answers (although the answers may include some generally good points).


The trouble with the LB test arises when autoregressive models have other regressors, i.e. ARMAX, not ARMA, models. The OP explicitly states ARMA, not ARMAX, in the question. Hence, I think that your answer is incorrect.
Aksakal

@Aksakal, I clearly see from Alecos Papadopoulos' answer (and the comments under it) in the above-mentioned thread that the Ljung-Box test is inapplicable in both cases, i.e. pure AR/ARMA and ARX/ARMAX. Therefore, I cannot agree with you.
Richard Hardy

Alecos Papadopoulos's answer is good, but incomplete. It points out the Ljung-Box test's assumption of strict exogeneity, but it fails to mention that if you're fine with the assumption, then the L-B test is OK to use. The B-G test, which he and I favor over L-B, relies on weak exogeneity. It's better to use tests with weaker assumptions in general, of course. However, even the B-G test's assumptions are too strong in many cases.
Aksakal

@Aksakal, The setting of this question is quite definite -- it considers residuals from an ARMA model. The important thing here is, L-B does not work (as shown explicitly in Alecos' post in this as well as the above-cited thread) while the B-G test does work. Of course, things can happen in other settings (even the B-G test's assumptions are too strong in many cases) -- but that is not the concern in this thread. Also, I did not get what the assumption is in your statement "if you're fine with the assumption, then L-B test is Ok to use". Is that supposed to invalidate Alecos' point?
Richard Hardy

1

Escanciano and Lobato constructed a portmanteau test with automatic, data-driven lag selection based on the Box-Pierce test and its refinements (which include the Ljung-Box test).

The gist of their approach is to combine the AIC and BIC criteria -- common in the identification and estimation of ARMA models -- to select the optimal number of lags to be used. In the introduction of their paper they suggest that, intuitively, "tests conducted using the BIC criterion are able to properly control for type I error and are more powerful when serial correlation is present in the first order". Instead, tests based on AIC are more powerful against high-order serial correlation. Their procedure thus chooses a BIC-type lag selection when autocorrelations seem to be small and present only at low order, and an AIC-type lag selection otherwise.

The test is implemented in the R package vrtest (see function Auto.Q).
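A minimal usage sketch, assuming the Auto.Q interface (a series plus an upper bound on the candidate lags) as documented in vrtest; the series here is just simulated noise:

```r
# install.packages("vrtest")   # if not already installed
library(vrtest)
set.seed(3)
e <- rnorm(500)        # a residual-like series under the null of no autocorrelation
Auto.Q(e, lags = 10)   # automatic portmanteau test; 'lags' is the upper bound considered
```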


1

The two most common settings are $\min(20, T-1)$ and $\ln T$, where $T$ is the length of the series, as you correctly noted.

The first one is supposed to come from the authoritative book by Box, Jenkins, and Reinsel, Time Series Analysis: Forecasting and Control, 3rd ed., Englewood Cliffs, NJ: Prentice Hall, 1994. However, here's all they say about the lags on p. 314: [image: excerpt from Box, Jenkins, and Reinsel, p. 314]

It's not a strong argument or suggestion by any means, yet people keep repeating it from one place to another.

The second setting for a lag is from Tsay, R. S. Analysis of Financial Time Series, 2nd ed., Hoboken, NJ: John Wiley & Sons, Inc., 2005; here's what he wrote on p. 33:

Several values of m are often used. Simulation studies suggest that the choice of m ≈ ln(T ) provides better power performance.

This is a somewhat stronger argument, but there's no description of what kind of study was done, so I wouldn't take it at face value. He also warns about seasonality:

This general rule needs modification in analysis of seasonal time series for which autocorrelations with lags at multiples of the seasonality are more important.

Summarizing, if you just need to plug some lag into the test and move on, then you can use either of these settings, and that's fine, because that's what most practitioners do. We're either lazy or, more likely, don't have time for this stuff. Otherwise, you'd have to conduct your own research on the power and properties of the statistics for the series that you deal with.
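For reference, the two settings discussed above amount to something like this in R (the series length is illustrative; the resulting value is what gets passed to the test as the lag):

```r
T_len  <- 150                 # length of the series (illustrative)
h_bjr  <- min(20, T_len - 1)  # the min(20, T - 1) convention
h_tsay <- round(log(T_len))   # Tsay's m ~ ln(T) suggestion
c(h_bjr, h_tsay)
```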

UPDATE.

Here's my answer to Richard Hardy's comment and his answer, which refers to another thread on CV started by him. You can see that the exposition in the accepted (by Richard Hardy himself) answer in that thread is clearly based on an ARMAX model, i.e. a model with exogenous regressors $x_t$:

$$y_t = x_t\beta + \phi(L)y_t + u_t$$

However, the OP did not indicate that he's doing ARMAX; to the contrary, he explicitly mentions ARMA:

After an ARMA model is fit to a time series, it is common to check the residuals via the Ljung-Box portmanteau test

One of the first papers that pointed to a potential issue with the LB test was Dezhbakhsh, Hashem (1990), "The Inappropriate Use of Serial Correlation Tests in Dynamic Linear Models," Review of Economics and Statistics, 72, 126-132. Here's the excerpt from the paper:

[image: excerpt from Dezhbakhsh (1990)]

As you can see, he doesn't object to using the LB test for pure time series models such as ARMA. See also the discussion in the manual of the standard econometrics tool EViews:

If the series represents the residuals from ARIMA estimation, the appropriate degrees of freedom should be adjusted to represent the number of autocorrelations less the number of AR and MA terms previously estimated. Note also that some care should be taken in interpreting the results of a Ljung-Box test applied to the residuals from an ARMAX specification (see Dezhbaksh, 1990, for simulation evidence on the finite sample performance of the test in this setting)

Yes, you have to be careful with ARMAX models and the LB test, but you can't make a blanket statement that the LB test is always wrong for all autoregressive series.

UPDATE 2

Alecos Papadopoulos's answer shows why the Ljung-Box test requires the strict exogeneity assumption. He doesn't show it in his post, but the Breusch-Godfrey test (another alternative test) requires only weak exogeneity, which is better, of course. This is what Greene, Econometric Analysis, 7th ed., says on the differences between the tests, p. 923:

The essential difference between the Godfrey–Breusch and the Box–Pierce tests is the use of partial correlations (controlling for X and the other variables) in the former and simple correlations in the latter. Under the null hypothesis, there is no autocorrelation in εt , and no correlation between xt and εs in any event, so the two tests are asymptotically equivalent. On the other hand, because it does not condition on xt , the Box–Pierce test is less powerful than the LM test when the null hypothesis is false, as intuition might suggest.
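To make the "partial correlations, controlling for the regressors" idea concrete, here is a rough, hand-rolled Breusch-Godfrey-style check for an AR(1) fit; this is only a sketch, with zero-padded initial lags and arbitrary simulation settings, and in practice one would use a packaged implementation:

```r
# Auxiliary-regression (LM) version of a Breusch-Godfrey-style test on AR(1) residuals:
# regress the residuals on the model regressor (lagged y) and h lagged residuals,
# then use n * R^2, which is asymptotically chi-square(h) under the null.
set.seed(4)
n <- 300; h <- 4
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))
fit <- arima(y, order = c(1, 0, 0), include.mean = FALSE)
e <- as.numeric(residuals(fit))
ylag  <- c(0, y[-n])                                           # lagged regressor, zero-padded
elags <- sapply(1:h, function(j) c(rep(0, j), e[1:(n - j)]))   # lagged residuals, zero-padded
aux <- lm(e ~ ylag + elags)
LM  <- n * summary(aux)$r.squared
pchisq(LM, df = h, lower.tail = FALSE)                         # p-value of the LM test
```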


I suppose that you decided to answer the question as it was bumped to the top of the active threads by my recent answer. Curiously, I argue that the test is inappropriate in the setting under consideration, making the whole thread problematic and the answers in it especially so. Do you think it is good practice to post yet another answer that ignores this problem without even mentioning it (just like all the previous answers do)? Or do you think my answer does not make sense (which would justify posting an answer like yours)?
Richard Hardy

Thank you for an update! I am not an expert, but the argumentation by Alecos Papadopoulos in "Testing for autocorrelation: Ljung-Box versus Breusch-Godfrey" and in the comments under his answer suggests that Ljung-Box is indeed inapplicable on residuals from pure ARMA (as well as ARMAX) models. If the wording is confusing, check the maths there, it seems fine. I think this is a very interesting and important question, so I would really like to find agreement between all of us here.
Richard Hardy

0

... h should be as small as possible to preserve whatever power the LB test may have under the circumstances. As h increases, the power drops. The LB test is a dreadfully weak test; you must have a lot of samples; n must be ~> 100 to be meaningful. Unfortunately, I have never seen a better test. But perhaps one exists. Anyone know of one?

Paul3nt


0

There's no correct answer to this that works in all situations; for the reasons others have given, it will depend on your data.

That said, after trying to figure out how to reproduce a result from Stata in R, I can tell you that by default the Stata implementation uses $\min(n/2 - 2, 40)$: either half the number of data points minus 2, or 40, whichever is smaller.

All defaults are wrong, of course, and this will definitely be wrong in some situations. In many situations, this might not be a bad place to start.
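A tiny helper expressing that default (the floor for odd n is an assumption about how a non-integer n/2 would be handled):

```r
# Stata-style default lag for the portmanteau test, as described above:
# half the number of observations minus 2, capped at 40.
stata_default_lag <- function(n) min(floor(n / 2) - 2, 40)
stata_default_lag(60)    # 28
stata_default_lag(200)   # 40
```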


0

Let me suggest our R package hwwntest. It implements wavelet-based white noise tests that do not require any tuning parameters and have good statistical size and power.

Additionally, I recently found "Thoughts on the Ljung-Box test," an excellent discussion of the topic by Rob Hyndman.

Update: Considering the alternative discussion in this thread regarding ARMAX, another incentive to look at hwwntest is the availability of a theoretical power function for one of the tests against an alternative hypothesis of an ARMA(p,q) model.
