Restricted Empirical Likelihood Estimation for Time Series Autoregressive Models

Mahdieh Bayati; S.K. Ghoreishi; Jingjing Wu

doi:10.2991/jsta.d.210121.001

<Previous Article In Issue

Download article (PDF)

Next Article In Issue>

Volume 20, Issue 1, March 2021, Pages 11 - 20

Restricted Empirical Likelihood Estimation for Time Series Autoregressive Models

Authors

Mahdieh Bayati¹, S.K. Ghoreishi²^{, *}, Jingjing Wu³

¹Department of Statistics, Science and Research Branch, Azad University, I. R. Iran

²Department of Statistics, Faculty of Sciences, University of Qom, I. R. Iran

³Department of Mathematics and Statistics, University of Calgary, Canada

^*Corresponding author. Email: atty_ghoreishi@yahoo.com

Corresponding Author

S.K. Ghoreishi

Received 3 June 2020, Accepted 18 January 2021, Available Online 3 February 2021.

DOI: 10.2991/jsta.d.210121.001 How to use a DOI?
Keywords: Autoregressive models; EM algorithm; Empirical likelihood; Estimating equations
Abstract: In this paper, we first illustrate the restricted empirical likelihood function, as an alternative to the usual empirical likelihood. Then, we use this quasi-empirical likelihood function as a basis for Bayesian analysis of AR( $r$ ) time series models. The efficiency of both the posterior computation algorithm, when the estimating equations are linear functions of the parameters, and the EM algorithm for estimating hyper-parameters is an appealing property of our proposed approach. Moreover, the competitive finite-sample performance of this proposed method is illustrated via both simulation study and analysis of a real dataset.
Copyright: © 2021 The Authors. Published by Atlantis Press B.V.
Open Access: This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

The empirical likelihood method is one of the most useful statistical tools that allows us to refer to under some regularity conditions. Initially, this statistical method was introduced by Thomas and Grunkemeier [1] and subsequently further developed by Owen [2,3].

Nowadays, many statisticians use this statistical method in different fields of applications. Owen [2,3] have used the empirical likelihood ratio statistic to test the assumptions assumed by nonparametric statistics. They have shown that this statistic has an asymptotic chi-square distribution. Moreover, they have used this statistic to construct a confidence region and to perform a hypothesis testing of parameters. Asymptotic properties and some necessary corrections of this statistic have been studied by DiCiccio and Romano [4] and Hall and Scala [5]. Furthermore, Qin and Lawless [6] has demonstrated that in the nonparametric scheme, the empirical likelihood, along with estimating equations, fits the data well. For more details, see Newey and Smith [7], Chen and Cui, [8,9]), and Qin and Lowless [6].

In many statistical methods, estimating equations are commonly used to estimate the parameters of assumed model. Among them, the most well-known ones include the maximum likelihood (ML) method, the least square method, and the moment method. In this regard, the empirical likelihood approach, by applying unbiasedness constraint on estimating functions, leads to the maximum empirical likelihood (MEL) estimation, which is obtained by maximizing the empirical likelihood function, under the unbiasedness imposed on estimating equations involving the unknown probability $p1,p2,…,pn.$ The works from Chen et al. [9], Hjort et al. [10], Tang and Leng [11], Leng and Tang [12], and Chang et al. [13] have shown that empirical likelihood estimators perform well when the dimension $p$ of parameters and the number $r$ of estimating equations grow slower than the sample size $n$ , i.e. $p=o(n)$ and $r=o(n).$

Tang and Leng [11], Leng and Tang [12], and Chang and Chen [13] have investigated the properties of empirical maximal likelihood estimators for high-dimensional data with sparse parameter vector. Practically, this is done by incorporating some suitable penalty functions. However, there are some challenges in employing empirical likelihood for large-scale data. Tsao [14] has shown that for a large enough $n$ and fixed $p,$ the probability of the corresponding confidence region containing true parameter values can be substantially smaller than its nominal value, which leads to an under-coverage problem.

In the meantime, the use of empirical likelihood method for time series data analysis has also been a topic of interest for researchers. For more details, interested readers are referred to Mykland [15], Kitamura [16], Chan and Ling [17], and Chan and Liu [18]. Monti [19] has developed the empirical likelihood method for the analysis of stationary time series. The author has used the M-estimator method, introduced by Whittle [21], for periodogram ordinates in time series models. Later, Yau [21] has extended the results in Monti [19] to long memory time series. Nordman and Lahiri [22] has argued that Monti's results were based on the assumption of normal distribution for the error term, and thus has extended the classical empirical likelihood to the one based on spectral distribution via Fourier transforms. Their results apply to both short and long memory dependencies; see Nordman and Lahiri [24] for details.

To the best of our knowledge, all the studies on empirical likelihood for time series rely on the unbiasedness property of estimating equations. But it seems that the unbiasedness condition alone, when data contains outliers, will produce misleading estimates for the parameters of interest (Bayati et al. [24]). In other words, in the usual empirical likelihood approach, features such as the dispersion of estimator functional, which depends on the dispersion of data itself, are not included in the statistical inference process. Therefore, it is necessary to use a pseudo-empirical likelihood that is a squaring function of the estimating equations. In this paper, for the first time we introduce the restricted empirical likelihood (REL) for the AR( $r$ ) time series models and present its Bayesian analysis. To our best knowledge, Bayati et al. [24] is the only one that has also used the idea of REL. Nevertheless, Bayati et al. [24] and this paper are very different in the sense that Bayati et al. [24] has studied the REL for independent observations while this paper investigates the REL for the more sophisticated case of dependent observations specifically from autoregressive models. In this paper, we not only prove the asymptotic normality of REL estimator for general models, but also give the detailed Bayesian analysis of AR( $r$ ) time series models using the REL and as well develop efficient algorithm for computing posterior and estimating hyper-parameters.

This article is organized as follows. After a glance at the empirical likelihood approach, we will illustrate the REL methodology in Section 2. The Bayesian analysis using the REL is discussed in Section 3. Bayesian analysis of AR( $r$ ) time series models, using the REL, are given in Section 4. A simulation study and an application to a real-world dataset are embedded in Section 5.

2. PRELIMINARIES

In this section, we first review the concept of empirical likelihood and then introduce the REL which is the main method proposed in this paper. The notations and definitions are provided along with other content.

2.1. Empirical Likelihood

Suppose $x=(x1,x2,…,xn)′∈Rn$ are $n$ independent observations obtained from the unknown distribution function $f$ and they relate to the vector of parameters $ϕ=(ϕ1,...,ϕr)′.$ Moreover, we assume that $gj(x;ϕ)$ , $j=1,…,r$ , are some estimating functions to estimate $ϕ$ and they satisfy the unbiasedness conditions

$EP{gj(x;ϕ)}=∑i=1npigj(xi;ϕ)=0,j=1,…,r,$ (1)

where

$P={p1,…,pn}.$ Moreover, let

$g(x;ϕ)=g1(x;ϕ),...,gr(x;ϕ)′$ .

In the usual empirical likelihood approach, the estimates of probabilities $pi$ 's are obtained by maximizing the empirical likelihood function

$L(ϕ)∝∏i=1npi$ (2)

under conditions (1) and

$∑i=1npi=1$ . Equivalently,

$L(ϕ)=supP∏i=1npi:pi>0,∑i=1npi=1,∑i=1npig(xi;ϕ)=0.$ (3)

So, it is straightforward to see that

$pi∝11+λ(ϕ)′g(xi;ϕ),$

where

$λ(ϕ)=λ1(ϕ),λ2(ϕ),…,λr(ϕ)′$ is the vector of Lagrange multipliers (see Owen [2] and Qin and Lawless [6]). So, the empirical likelihood function is given as

$L(ϕ)∝∏i=1n11+λ(ϕ)′g(xi;ϕ).$ (4)

By using (4), the MEL estimator of $ϕ$ is given by

$ϕ̂EL=argminϕmaxλ∑i=1nlog1+λ(ϕ)′g(xi;ϕ).$

There is a modified two-layer coordinate descent algorithm to compute $ϕ̂EL$ (Tang and Wu [25]).

2.2. Restricted Empirical Likelihood

In previous subsection, the empirical likelihood function is given by (4). Now, by inequality $a′b≤a′a+b′b2=∥a∥2+∥b∥22$ for any arbitrary vectors $a$ and $b$ , we have

$∏i=1n11+∥λ(ϕ)∥2+∥g(xi;ϕ)∥22≤∏i=1n11+λ(ϕ)′g(xi;ϕ).$ (5)

We hope that the value of $ϕ$ that maximizes the left side of (5) approaches to the maximizing value of $ϕ$ for the right side. The left side of (5) can be rewritten as

$∏i=1n11+∥λ(ϕ)∥2+∥g(xi;ϕ)∥22=∏i=1n2w(ϕ)11+w(ϕ)∥g(xi;ϕ)∥2=[2w(ϕ)]n∏i=1n11+w(ϕ)∥g(xi;ϕ)∥2,$ (6)

where

$w(ϕ)=12+∥λ(ϕ)∥2$ . Therefore, the pseudo-empirical maximum likelihood estimator (MLE) of

$ϕ$ is given by

$ϕ̂=argminϕ∈Φmaxw∈ℝ+n∑i=1nlog1+w(ϕ)∥g(xi;ϕ)∥2−nlogw(ϕ).$

Indeed, the above equation is equivalent to the case that the function $∑i=1nlog1+w(ϕ)∥g(xi;ϕ)∥2$ is optimized under the penalty function $−logw(ϕ).$ It gives us the idea that we can extend the expression (6) to a general form given by

$e−nH(w,ϕ)∏i=1n11+w′g∗(xi;ϕ),$ (7)

where

$w=(w1,…,wr)′$ and

$g∗(x;ϕ)=g12(x;ϕ),…,gr2(x;ϕ)′.$ Here, we assume

$H(w,ϕ)$ plays the role of a penalty function and define our REL, an alternative to the usual empirical likelihood, as

$LR(w,ϕ)=∏i=1n11+w′g∗(xi;ϕ).$ (8)

Practically, the REL $LR(w,ϕ)$ has two appealing properties in contrast to $L(ϕ)$ given in (4). First, we can assume in $LR(w,ϕ)$ that the elements of vector $w$ are independent of $ϕ$ , while in $L(ϕ)$ dependency of $λ$ on $θ$ is a basic assumption. This means that $LR(w,ϕ)$ can be considered as a semi-parametric version of the usual EL function. Second, by assuming sparsity of vectors $ϕ$ and $w$ (to reduce the number of parameters and the number of the estimating equations respectively) and $−logREL$ being convex, we can use the REL method for high-dimensional data. For more details on the appealing properties of the REL $LR(w,ϕ)$ , readers are referred to Bayati et al. [24]. With these appealing properties, we propose the REL estimator of $ϕ$ as given by

$(ϕ̂REL,ŵREL)=argminϕ∈ℝr,w∈ℝ+r∑i=1nlog1+wTg∗(xi;ϕ)+nH(w,ϕ).$

Under some regularity conditions on $g∗$ and $H$ , the REL estimator $ϕ̂REL$ has an asymptotically multivariate normal distribution. This result is presented in following theorem with details. Its proof is straightforward and thus omitted to save space.

Theorem 2.1.

Under some well-known regularity conditions on $g∗$ and $H$ , i.e., $∂2g∗(x;ϕ)∂ϕi∂ϕj<∞$ and $∂2H(w,ϕ)∂ϕi∂ϕj<∞,$ when $x1,x2,…,xn$ are i.i.d., it holds that

$n(ϕ̂−ϕ)→Nr0,I−1(ϕ),$

where

$I(ϕ)=E∂2log1+wTg∗(x;ϕ)+H(w,ϕ)∂ϕi∂ϕji,j=1,…,r$

and

$w$ is replaced with its estimate

$ŵREL.$

Proof.

Since $ϕ̂$ is the maximizer of $∑i=1nlog1+wTg∗(xi;ϕ)+nH(w,ϕ),$ so we define

$Tn(ϕ)=1n∑i=1nlog1+wTg∗(xi;ϕ)+nH(w,ϕ)=1n∑i=1nlog1+wTg∗(xi;ϕ)+H(w,ϕ).$

It is obvious that $ϕ̂$ is the MLE when $Tn(ϕ)$ is log-likelihood function normalized by $1n.$ Therefore, the proof is the same as asymptotic normality of MLEs. Moreover, under the given regularity conditions, MLE $ϕ̂$ is constant, i.e., $ϕ̂→ϕ$ as $n→∞.$

In the next section, we focus on Bayesian analysis of $ϕ,$ considering the REL as the usual likelihood function.

3. BAYESIAN ANALYSIS WITH REL

Our main purpose in this paper is the Bayesian analysis of $ϕ$ , viewing the REL as playing the role of a regular likelihood function. Based on this idea, we hope that the Bayesian estimation of $ϕ$ can provide an acceptable approximation of the true values. For simplicity, we assume that the penalty function $H(w,ϕ)$ is an additive function, i.e., $H(w,ϕ)=H1(w)+H2(ϕ),$ and moreover assume $e−nH(w,ϕ)$ is proportional to a multivariate density. Consequently, we assume that

$π1(w)∝e−nH1(w)andπ2(ϕ)∝e−nH2(ϕ),$ (9)

are the corresponding prior densities. We have the following two points to make on the densities.

We assume in (9) that the two vectors $w$ and $ϕ$ are independent. This assumption is often considered for simplicity and computational purposes.
Unlike the usual way, the priors $π1$ and $π2$ are dependent of $n.$ This is often assumed in Bayesian statistical analysis; for more details see Bhattacharya et al. [26] and Ghoreishi [27]. However, the functions $H1$ and $H2$ can also be chosen in such a way that the effect of sample size is ignorable.

Based on the above assumptions, using (8) and (9) along with observations $D={x1,⋯,xn},$ the joint posterior density of $(w,ϕ)$ is

$πw,ϕD∝LR(w,ϕ)π1(w)π2(ϕ).$ (10)

Given the estimating functions $g∗$ and the priors $π1$ and $π2$ , one can estimate the desired parameters using the Bayesian approach. In practice, the marginal posterior distribution of parameters often do not have a closed form, therefore any inference is derived using samples via MCMC methods. However, in many statistical models in which the estimating functions are usually linear of the parameters, the conditional posterior distributions have the closed form and they are easy to sample using Gibbs sampling method. In the next section, we discuss this issue with more details for autoregressive models of order $r$ , i.e., AR( $r$ ).

4. REL FOR AR( $r$ ) MODELS

4.1. Basic Concepts

Autoregressive models show a random process where the output variable is a linear function of the same variable. Specifically, the AR( $r$ ) model is defined as

$Xt=ϕ1Xt−1+ϕ2Xt−2+⋯+ϕrXt−r+ϵt,t=r+1,…,n,$ (11)

where

$ϕ1,ϕ2,…,ϕr$ are the model parameters, and

$ϵt$ 's are white noises with mean 0 and constant variance

$σ2$ . In our empirical approach, we assume the distribution of

$ϵt$ 's is totally unknown. In practice, there are several methods to estimate the model coefficients. A popular approach is Yule-Walker's system of equations, i.e.,

$∑t=r+1nXt−jXt−ϕ1Xt−1−ϕ2Xt−2−⋯−ϕrXt−r=0,j=1,2,…,r,$

where

$gjt(X,ϕ)=Xt−jXt−ϕ1Xt−1−ϕ2Xt−2−⋯−ϕrXt−r,j=1,2,…,r$ (12)

are the estimating equations. Later in our Bayesian approach, we will use the weighed Yule-Walker's system of equations as our estimating equations.

As we see, the estimating equations (12) are linear functions of parameters $ϕ1,ϕ2,…,ϕr.$ Therefore, we hope that the posterior conditional distributions have a closed form when we use the REL function instead of a regular likelihood function. Moreover, in the autoregressive model (11), the conditional distribution of $(xr+1,…,xn)$ given $(x1,…,xr)$ is

$f(xr+1,…,xn|ϕ,x1,…,xr)=∏t=r+1nf(xt|ϕ,xt−1,…,xt−r)=∏t=r+1npt.$

where

$pt:=f(xt|ϕ,xt−1,xt−2,…,xt−r)$ . On the other hand, the corresponding likelihood function is

$L(ϕ)=∏t=r+1npt,$ (13)

Since the probabilities $pt$ 's are practically unknown, we try to estimate them here using our REL approach. We assume that the observation $xt$ is weighted with its corresponding conditional density function value $pt$ as the weight. This means that up to a normalizing constant, the estimating equations (12) satisfy

$EP{gjt(x;ϕ)}=∑t=r+1nptgj(x,ϕ)=∑t=r+1nptxt−jxt−ϕ1xt−1−ϕ2xt−2−⋯−ϕrxt−r=0,$ (14)

where

$P={pr+1,pr+2,…,pn}.$ Finally, combining (8), (13) and (14) gives the corresponding REL as

$LR(ϕ,w)=∏t=r+1n11+∑j=1rwjxt−j2xt−ϕ1xt−1−ϕ2xt−2−⋯−ϕrxt−r2.$ (15)

4.2. Posterior Conditional Distributions

Without loss of generality, for Bayesian analysis using the REL (15), we assume $wj,ϕj$ and $sj$ are i.i.d. with the following common priors

$wj~π1=inv.gamma(n,β0),ϕj~π2=N(0,σ02),sj~f(uj)=Exp(1),$ (16)

where the additional variables

$sr+1,sr+2,…,sn$ (the argument

$ur+1,…,un$ in their respective distribution functions) appeared in equality

$LR(ϕ,w)=∏t=r+1n11+∑j=1rwjxt−j2xt−ϕ1xt−1−ϕ2xt−2−⋯−ϕrxt−r2=∫0∞⋯∫0∞e−∑t=r+1nut1+∑j=1rwjxt−j2xt−∑ν=1rϕνxt−ν2dur+1⋯dun.$

Then the conditional posterior distributions have closed forms given by

$ut~Exp2+∑j=1rwjxt−j2xt−ϕ1xt−1−⋯−ϕrxt−r2,t=r+1,…,n,wj~gIG(2∑t=r+1nutxt−j2xt−ϕ1xt−1−⋯−ϕrxt−r2+1,2β0,−n),j=1,…,r,ϕl~Nμl,σl2,l=1,…,r,$ (17)

where gIG stands for the generalized inverse Gaussian distribution and

$μl=∑t=r+1n∑j=1rutwjxt−j2xtxt−l−∑t=r+1n∑j=1rutwjxt−j2∑k≠lrxt−lxt−kϕk∑t=r+1n∑j=1rutwjxt−j2xt−l2+12β0,σl=2∑t=r+1n∑j=1rutwjxt−j2xt−l2+1β0−0.5.$

4.3. EM Algorithm for Estimating Hyper-parameters

One of the major challenges in sampling from the conditional posterior densities (17) is that the hyper-parameters $β0$ and $σ02$ are unknown in practice. However, there are several methods to estimate these quantities. In this subsection, we introduce an EM algorithm for this purpose.

From (15) and (16), the marginal density of the observations, $x$ , is obtained as

$f(x|β0,σ02,x1,…,xr)=∫⋯∫e−∑t=r+1nst1+∑j=1rwjxt−j2xt−∑ν=1rϕνxt−ν2×1σ0r2e−12σ02∑j=1rϕj2×∏j=1rβ0nwjn+1e−β0wjdsdwdϕ.$ (18)

Now consider the logarithm of the integrand in (18) as a function of $β0$ and $σ02$ given by

$I(β0,σ02)=−∑t=r+1nst1+∑j=1rwjxt−j2xt−∑ν=1rϕνxt−ν2−r2logσ0−12σ02∑j=1rϕj2+nrlogβ0−(n+1)∑j=1rlogwj−∑j=1rβ0wj.$ (19)

Our proposed EM algorithm for estimating $β0$ and $σ0$ has the following steps:

Select the initial estimates $β0(0)$ ( $>0$ ) and $σ02(0)$ ( $>0$ ) and start with $k=1$ .
Given $β0(k)$ ( $>0$ ) and $σ02(k)$ ( $>0$ ), produce a sample of size $N$ from the conditional distributions (17).
E-step: Calculate the averaged $I(β0,σ02)$ given by (19) using the generated samples in Step 2. Here, it should be noted that it is enough to calculate the mean of that expression in $I(β0,σ02)$ which only depend on $β0$ and $σ02.$
M-step: Update $β0$ and $σ02$ by maximizing the following two quantities with respect to $β0$ and $σ02$ respectively:
$Q3(σ02)=r4logσ02+12σ02(k)∑j=1rEσ02,β0(k)ϕj2,Q4(β0)=nrlogβ0−(n+1)∑j=1rEσ02(k),β0(k)logwj−β0∑j=1rEσ02(k),β0(k)1wj.$

The updated estimates at time $(k+1)$ are obtained as
$σ02(k+1)=1r∑j=1rEσ02(k),β0(k)ϕj2,β0(k+1)=nc∑j=1rEσ02(k),β0(k)1wj.$

Repeat Steps 2-4 until the algorithm converges.

4.4. Model Evaluation Criteria

In practice, there are several criteria in literature for evaluating the efficiency of time series models. One of those is the mean squared prediction error defined as $MSPE=1n∑t=1n(xt−x̂t)2.$ In this article, we will use a similar criterion to BIC. This criterion, for AR( $r$ ) model, is constructed based on the REL function, abbreviated as EBIC, defined as

$EBIC=rlogn+∑t=1nlog1+∑j=1rwjxt−j2xt−ϕ̂1xt−1−ϕ̂2xt−2−⋯−ϕ̂rxt−r2.$

In practice, a small EBIC value confirms the adequacy of the model, whereas a large value indicates the inadequacy of the model.

5. SIMULATION STUDY AND APPLICATION

In this section, we first carry out a simulation study to confirm our theoretical results and then apply the proposed methods to a real dataset.

5.1. Simulation Study

Our simulation study focuses on the following autoregressive model of order 2:

$Xt=ϕ1Xt−1+ϕ2Xt−2+ϵt,t=3,4,…,n.$

For this model, we assume several scenarios for $ϕ=(ϕ1,ϕ2)$ and $n.$ In order for the AR(2) model to look like a nonparametric model, we assume that $ϵt$ 's are generated from the following a little more complex distribution (a combination of normal and Cauchy distributions)

$ϵi~0.5N(0,2)+0.5C(0,1).$

Here, we run the corresponding Gibbs sampling scheme (12) for two sample sizes $n=50$ and $500.$ With the generated data under each scenario, we try to fit the following three models:

Model I:
$Xt=ϕ1Xt−1+ϵt$
with estimating equation $g(X,ϕ1)=Xt−1(Xt−ϕ1Xt−1).$
Model II:
$Xt=ϕ2Xt−2+ϵt$
with estimating equation $g(X,ϕ2)=Xt−2(Xt−ϕ1Xt−2).$
Model III:
$Xt=ϕ1Xt−1+ϕ2Xt−2+ϵt$
with estimating equations $g1(X,ϕ1,ϕ2)=Xt−1(Xt−ϕ1Xt−1−ϕ2Xt−2)$ and $g2(X,ϕ1,ϕ2)=Xt−2(Xt−ϕ1Xt−1−ϕ2Xt−2).$

Moreover, for simplicity, we assume $w:=w1=w2$ . We ran the sampling scheme (17) in order to generate $5000$ samples from the marginal distributions of $ϕ1$ and $ϕ2,$ and each time we fit Models I–III to the data. In addition to the estimates of $(ϕ1,ϕ2)$ and their standard errors, the MSPE and EBIC criteria are also computed. The numerical results are presented in Table 1. As we can see from it, Model III fits the data much better than the other two competitor models.

		$(ϕ1,ϕ2)$	Method	$ϕ̂1$	$ϕ̂2$	MSPE	EBIC
		(−0.5, −0.8)	REL	−0.319(0.127)	-	50.941	10.693

	n = 50		EL	−0.221(0.191)	-	54.627
		(0.5, −0.8)	REL	−0.355(0.108)	-	34.713	10.261

Model I			EL	−0.312(0.123)	-	39.101
		(−0.5, −0.8)	REL	−0.302(0.168)	-	58.121	14.180

	n = 500		EL	−0.223(0.172)	-	61.001
		(0.5, −0.8)	REL	0.368(0.143)	-	33.975	13.990

			EL	0.311(0.144)	-	42.214

		(−0.5, −0.8)	REL	-	−0.146(0.221)	55.914	8.573

	n = 50		EL	-	−0.121(0.243)	67.423
		(0.5, −0.8)	REL	-	−0.417(0.197)	39.128	7.857

Model II			EL	-	0.314(0.202)	44.121
		(−0.5, −0.8)	REL	-	−0.586(0.200)	56.842	10.000

	n = 500		EL	-	−0.372(0.213)	71.001
		(0.5, −0.8)	REL	-	−0.222(0.209)	28.613	8.156

			EL	-	0.199(0.266)	37.813

		(−0.5, −0.8)	REL	−0.547(0.032)	−0.767(0.047)	20.542	4.705

	n = 50		EL	−0.570(0.035)	−0.726(0.052)	22.117
		(0.5, −0.8)	REL	0.546(0.041)	−0.642(0.032)	21.222	4.440

Model III			EL	0.561(0.043)	−0.689(0.033)	22.473
		(−0.5, −0.8)	REL	−0.505(0.012)	−0.805(0.014)	17.124	6.805

	n = 500		EL	−0.515(0.013)	−0.821(0.017)	17.921
		(0.5, −0.8)	REL	0.491(0.011)	−0.798(0.011)	16.978	6.508

			EL	0.484(0.012)	−0.811(0.012)	17.008

Table 1

The REL and EL estimates of

$(ϕ1,ϕ2)$ under various models (and their standard errors inside the parentheses) and the associated MSPE and EBIC.

5.2. Application

In this subsection, we analyze a real dataset on the Total Index of Consumer Goods and Services in urban areas of Iran for the period 1990–2017, extracted by The Central Bank of the Islamic Republic of Iran. The data are shown in Table 2 which demonstrates the presence of time dependency. A primary analysis reveals that an AR(2) model fits the data well after removing the trend. For this purpose, we first estimate the trend using a cubic polynomial given by

$xt=α+βt3+ϵt,ϵt=ϕ1ϵt−1+ϕ2ϵt−2+ut,$ (20)

where the error terms

$ut$ 's have an unknown distribution with mean 0 and variance

$σ2$ . With the cubic polynomial, we obtain the estimates

$α̂=−0.639$ and

$β̂=0.004$ , by fitting a linear regression model. Kolmogorov–Smirnove test statistic for

$ϵt$ is equal to

$0.233$ with

$Sig=0.000.$ This shows that the residuals do not have a normal distribution. Since the size of the data in not large enough, there is no strong evidence that

$ϵ̂t=xt−α̂−β̂t3$

follow any particular distribution. Thus, we will use our proposed REL method to fit a time series model to

$ϵ̂t$ 's. Using the posterior conditional distributions (17), the Bayesian estimates of

$ϕ1$ and

$ϕ2$ in model (20) are

$ϕ̂1=1.335$ and

$ϕ̂2=−0.465$ respectively. The results indicate that the two-level model (20) fits the data well with the determination coefficient

$R2=0.996,MSPE=3.96,$ and

$EBIC=6.327.$

Year	1990	1991	1992	1993	1994	1995	1996	1997	1998	1999	2000	2001	2002	2003
$xt$	1.00	1.20	1.50	1.80	2.50	3.70	4.50	5.30	6.30	7.50	8.50	9.40	10.90	12.60
$xt̂$	-	-	1.14	1.51	1.87	2.79	4.22	4.92	5.85	7.09	8.55	9.70	10.86	12.92
Year	2004	2005	2006	2007	2008	2009	2010	2011	2012	2013	2014	2015	2016	2017
$xt$	14.50	16.10	18.00	21.30	26.70	29.50	33.20	40.30	52.60	70.90	81.90	91.70	100.00	109.6
$xt̂$	15.04	17.39	19.31	21.85	26.19	32.76	34.96	39.65	48.56	62.91	82.94	90.53	100.02	108.18

Table 2

Total Index of Consumer Goods and Services and their fitted values in urban areas of Iran from 1990 to 2017.

6. CONCLUSION

In this paper, a Bayesian method using the REL has been proposed for fitting the AR( $r$ ) time series model. An useful algorithm for computing the posterior has been developed for this approach. Moreover, the competitive performance of the proposed method has been clearly demonstrated via both simulation studies and a real data application. It is important to note that this method can be easily applied to other general linear models.

REFERENCES

1.D.R. Thomas and G.L. Grunkemeier, J. Am. Stat. Assoc., Vol. 70, 1975, pp. 865-871.

2.A. Owen, Biometrika, Vol. 75, 1988, pp. 237-249.

3.A. Owen, Ann. Stat., Vol. 18, 1990, pp. 90-120.

4.T.J. DiCiccio and J.P. Romano, Biometrika, Vol. 76, 1989, pp. 447-456.

5.P. Hall and B. La Scala, Internat. Statist. Review, Vol. 58, 1990, pp. 109-127.

6.J. Qin and J. Lawless, Ann. Stat., Vol. 22, 1994, pp. 300-325.

7.W.K. Newey and R.J. Smith, Econometrica, Vol. 72, 2004, pp. 219-255.

8.S.X. Chen and H. Cui, Biometrika, Vol. 93, 2007, pp. 215-220.

9.S.X. Chen, L. Peng, and Y.L. Qin, Biometrika, Vol. 96, 2009, pp. 711-722.

10.N.L. Hjort, I.W. McKeague, and I.V. Keilegom, Ann. Stat., Vol. 37, 2009, pp. 1079-1111.

11.C.Y. Tang and C. Leng, Biometrika, Vol. 97, 2010, pp. 905-920.

12.C. Leng and C.Y. Tang, Biometrika, Vol. 99, 2012, pp. 703-716.

13.V.J. Chang and X. Chen, Econometrics, Vol. 185, 2015, pp. 283-304.

14.M. Tsao, Ann. Stat., Vol. 32, 2004, pp. 1215-1221.

15.P.A. Mykland, Ann. Stat., Vol. 23, 1995, pp. 396-421.

16.Y. Kitamura, Ann. Stat., Vol. 25, 1997, pp. 2084-2102.

17.N.H. Chan and S. Ling, Econ. Theory, Vol. 22, 2006, pp. 402-428.

18.N.H. Chan and L. Liu, Electron. J. Stat., Vol. 40, 2010, pp. 221-238.

19.A.C. Monti, Biometrika, Vol. 84, 1997, pp. 395-405.

20.P. Whittle, Arkiv Matematik, Vol. 2, 1953, pp. 423-434.

21.C.Y. Yau, J. Time Ser. Anal., Vol. 33, 2012, pp. 269-275.

22.D.J. Nordman and S.N. Lahiri, Ann. Stat., Vol. 34, 2006, pp. 3019-3050.

23.D.J. Nordman and S.N. Lahiri, Plan. Inference, Vol. 155, 2014, pp. 1-18.

24.M. Bayati, S.K. Ghoreishi, and J. Wu, Comput. Stat., 2020.

25.C.Y. Tang and T.T. Wu, J. Stat. Comput. Simul., Vol. 84, 2014, pp. 1917-1930.

26.A. Bhattacharya, D. Pati, N.S. Pillai, and D.B. Dunson, Dirichlet-Laplace priors for optimal shrinkage, 2014. arXiv:1401.5398v1

27.S.K. Ghoreishi, J. Stat. Theory Appl., Vol. 16, 2017, pp. 53-64.

<Previous Article In Issue

Download article (PDF)

Next Article In Issue>

Journal: Journal of Statistical Theory and Applications
Volume-Issue: 20 - 1
Pages: 11 - 20
Publication Date: 2021/02/03
ISSN (Online): 2214-1766
ISSN (Print): 1538-7887
DOI: 10.2991/jsta.d.210121.001 How to use a DOI?
Open Access: This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

Cite this article

ris enw bib

TY  - JOUR
AU  - Mahdieh Bayati
AU  - S.K. Ghoreishi
AU  - Jingjing Wu
PY  - 2021
DA  - 2021/02/03
TI  - Restricted Empirical Likelihood Estimation for Time Series Autoregressive Models
JO  - Journal of Statistical Theory and Applications
SP  - 11
EP  - 20
VL  - 20
IS  - 1
SN  - 2214-1766
UR  - https://doi.org/10.2991/jsta.d.210121.001
DO  - 10.2991/jsta.d.210121.001
ID  - Bayati2021
ER  -

download .riscopy to clipboard

Journal of Statistical Theory and Applications

Restricted Empirical Likelihood Estimation for Time Series Autoregressive Models

1. INTRODUCTION

2. PRELIMINARIES

2.1. Empirical Likelihood

2.2. Restricted Empirical Likelihood

Theorem 2.1.

Proof.

3. BAYESIAN ANALYSIS WITH REL

4. REL FOR AR(r<mml:math display="inline" id="im91" xmlns:mml="http://www.w3.org/1998/Math/MathML"><mml:mi>r</mml:mi></mml:math>) MODELS

4.1. Basic Concepts

4.2. Posterior Conditional Distributions

4.3. EM Algorithm for Estimating Hyper-parameters

4.4. Model Evaluation Criteria

5. SIMULATION STUDY AND APPLICATION

5.1. Simulation Study

5.2. Application

6. CONCLUSION

REFERENCES

Cite this article

4. REL FOR AR( $r$ ) MODELS