Restricted Empirical Likelihood Estimation for Time Series Autoregressive Models
- DOI: 10.2991/jsta.d.210121.001
- Autoregressive models; EM algorithm; Empirical likelihood; Estimating equations
In this paper, we first illustrate the restricted empirical likelihood function as an alternative to the usual empirical likelihood. Then, we use this quasi-empirical likelihood function as a basis for Bayesian analysis of AR($p$) time series models. An appealing property of our proposed approach is the efficiency of both the posterior computation algorithm, when the estimating equations are linear functions of the parameters, and the EM algorithm for estimating hyper-parameters. Moreover, the competitive finite-sample performance of the proposed method is illustrated via both a simulation study and the analysis of a real dataset.
- © 2021 The Authors. Published by Atlantis Press B.V.
- Open Access
- This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
The empirical likelihood method is one of the most useful statistical tools for likelihood-type inference without a fully specified parametric model, valid under some regularity conditions. This statistical method was initially introduced by Thomas and Grunkemeier and subsequently further developed by Owen [2,3].
Nowadays, many statisticians use this method in different fields of application. Owen [2,3] used the empirical likelihood ratio statistic to test the assumptions made in nonparametric statistics, showed that this statistic has an asymptotic chi-square distribution, and used it to construct confidence regions and to test hypotheses about parameters. Asymptotic properties and some necessary corrections of this statistic have been studied by DiCiccio and Romano and by Hall and La Scala. Furthermore, Qin and Lawless have demonstrated that in the nonparametric scheme, the empirical likelihood, along with estimating equations, fits the data well. For more details, see Newey and Smith, Chen and Cui [8,9], and Qin and Lawless.
In many statistical methods, estimating equations are commonly used to estimate the parameters of an assumed model. The most well-known of these include the maximum likelihood (ML) method, the least squares method, and the method of moments. In this regard, the empirical likelihood approach, by applying an unbiasedness constraint on the estimating functions, leads to the maximum empirical likelihood (MEL) estimator, which is obtained by maximizing the empirical likelihood function under the unbiasedness constraints imposed on the estimating equations involving the unknown probabilities. The works of Chen et al., Hjort et al., Tang and Leng, Leng and Tang, and Chang et al. have shown that empirical likelihood estimators perform well when the dimension of the parameters and the number of estimating equations grow more slowly than the sample size.
Tang and Leng, Leng and Tang, and Chang and Chen have investigated the properties of maximum empirical likelihood estimators for high-dimensional data with a sparse parameter vector. Practically, this is done by incorporating suitable penalty functions. However, there are some challenges in employing empirical likelihood for large-scale data. Tsao has shown that, for a fixed sample size and a large enough parameter dimension, the probability that the corresponding confidence region contains the true parameter values can be substantially smaller than its nominal value, which leads to an under-coverage problem.
In the meantime, the use of the empirical likelihood method for time series data analysis has also been a topic of interest for researchers. For more details, interested readers are referred to Mykland, Kitamura, Chan and Ling, and Chan and Liu. Monti has developed the empirical likelihood method for the analysis of stationary time series. The author used the M-estimator method, introduced by Whittle, for periodogram ordinates in time series models. Later, Yau extended the results of Monti to long memory time series. Nordman and Lahiri have argued that Monti's results were based on the assumption of a normal distribution for the error term, and have thus extended the classical empirical likelihood to one based on the spectral distribution via Fourier transforms. Their results apply to both short and long memory dependencies; see Nordman and Lahiri for details.
To the best of our knowledge, all the studies on empirical likelihood for time series rely on the unbiasedness property of estimating equations. However, when the data contain outliers, the unbiasedness condition alone can produce misleading estimates of the parameters of interest (Bayati et al.). In other words, in the usual empirical likelihood approach, features such as the dispersion of the estimating functions, which depends on the dispersion of the data itself, are not included in the statistical inference process. Therefore, it is necessary to use a pseudo-empirical likelihood that involves the squares of the estimating equations. In this paper, we introduce for the first time the restricted empirical likelihood (REL) for AR($p$) time series models and present its Bayesian analysis. To the best of our knowledge, Bayati et al. is the only other work that has used the idea of REL. Nevertheless, the two works are very different: Bayati et al. studied the REL for independent observations, while this paper investigates the REL for the more sophisticated case of dependent observations, specifically from autoregressive models. In this paper, we not only prove the asymptotic normality of the REL estimator for general models, but also give a detailed Bayesian analysis of AR($p$) time series models using the REL and develop efficient algorithms for computing the posterior and estimating hyper-parameters.
This article is organized as follows. After a glance at the empirical likelihood approach, we illustrate the REL methodology in Section 2. The Bayesian analysis using the REL is discussed in Section 3. The Bayesian analysis of AR($p$) time series models using the REL is given in Section 4. A simulation study and an application to a real-world dataset are presented in Section 5.
In this section, we first review the concept of empirical likelihood and then introduce the REL which is the main method proposed in this paper. The notations and definitions are provided along with other content.
2.1. Empirical Likelihood
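Suppose $x_1, \ldots, x_n$ are independent observations obtained from the unknown distribution function $F$ and they relate to the vector of parameters $\theta = (\theta_1, \ldots, \theta_p)^\top$. Moreover, we assume that $g_j(x, \theta)$, $j = 1, \ldots, r$, are some estimating functions to estimate $\theta$ and they satisfy the unbiasedness conditions
$$E_F\left[g_j(x, \theta)\right] = 0, \quad j = 1, \ldots, r. \tag{1}$$
In the usual empirical likelihood approach, the estimates of the probabilities $p_i$ are obtained by maximizing the empirical likelihood function
$$\prod_{i=1}^{n} p_i \quad \text{subject to} \quad p_i \ge 0, \quad \sum_{i=1}^{n} p_i = 1, \quad \sum_{i=1}^{n} p_i\, g(x_i, \theta) = 0, \tag{2}$$
where $g = (g_1, \ldots, g_r)^\top$. So, it is straightforward to see that
$$p_i = \frac{1}{n\left(1 + \lambda^\top g(x_i, \theta)\right)}, \quad i = 1, \ldots, n, \tag{3}$$
where $\lambda$ is the vector of Lagrange multipliers (see Owen and Qin and Lawless). So, the empirical likelihood function is given as
$$L_{EL}(\theta) = \prod_{i=1}^{n} \frac{1}{n\left(1 + \lambda^\top g(x_i, \theta)\right)}. \tag{4}$$
By using (4), the MEL estimator of $\theta$ is given by
$$\hat{\theta}_{MEL} = \arg\max_{\theta} L_{EL}(\theta). \tag{5}$$
There is a modified two-layer coordinate descent algorithm to compute $\hat{\theta}_{MEL}$ (Tang and Wu).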
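As a minimal sketch of how the empirical likelihood weights can be computed for a fixed parameter value, the following Python snippet treats the scalar-mean case with the single estimating function g(x, theta) = x - theta. It uses the standard form of the weights, p_i = 1 / (n (1 + lambda g_i)), where the Lagrange multiplier lambda solves sum_i g_i / (1 + lambda g_i) = 0. The function name `el_weights`, the Newton iteration, and the step-halving safeguard are illustrative assumptions; this is not the two-layer coordinate descent algorithm cited above.

```python
import numpy as np

def el_weights(x, theta, n_iter=50, tol=1e-10):
    """Empirical likelihood weights for g(x, theta) = x - theta (scalar case).

    Hypothetical helper for illustration only.
    """
    g = np.asarray(x, dtype=float) - theta
    n = g.size
    # A solution with positive weights exists only if 0 is inside the range of g.
    if g.min() >= 0 or g.max() <= 0:
        raise ValueError("theta must lie strictly inside the range of the data")
    lam = 0.0
    for _ in range(n_iter):
        denom = 1.0 + lam * g
        score = np.sum(g / denom)            # equation to be solved for lam
        slope = -np.sum((g / denom) ** 2)    # its derivative (always negative)
        step = score / slope
        new_lam = lam - step                 # Newton update
        # Halve the step until every weight stays positive.
        while np.any(1.0 + new_lam * g <= 0):
            step /= 2.0
            new_lam = lam - step
        lam = new_lam
        if abs(step) < tol:
            break
    p = 1.0 / (n * (1.0 + lam * g))
    return p, lam
```

When theta equals the sample mean, the multiplier is zero and all weights reduce to 1/n; moving theta away from the mean tilts the weights so that the weighted estimating equation still holds.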
2.2. Restricted Empirical Likelihood
In previous subsection, the empirical likelihood function is given by (4). Now, by inequality for any arbitrary vectors and , we have
Indeed, the above equation is equivalent to the case that the function is optimized under the penalty function It gives us the idea that we can extend the expression (6) to a general form given by
Practically, the REL has two appealing properties in contrast to the EL function given in (4). First, in the REL we can assume that the elements of the Lagrange multiplier vector $\lambda$ are independent of the parameter vector $\theta$, while in the EL function the dependency of $\lambda$ on $\theta$ is a basic assumption. This means that the REL can be considered as a semi-parametric version of the usual EL function. Second, by assuming sparsity of the vectors $\theta$ and $\lambda$ (to reduce the number of parameters and the number of estimating equations, respectively) and convexity of the penalty function, we can use the REL method for high-dimensional data. For more details on the appealing properties of the REL, readers are referred to Bayati et al. With these appealing properties, we propose the REL estimator of $\theta$ as given by
Under some regularity conditions on and , the REL estimator has an asymptotically multivariate normal distribution. This result is presented in following theorem with details. Its proof is straightforward and thus omitted to save space.
Under some well-known regularity conditions on and , i.e., and when are i.i.d., it holds thatwhere and is replaced with its estimate
Since is the maximizer of so we define
It is obvious that is the MLE when is log-likelihood function normalized by Therefore, the proof is the same as that of the asymptotic normality of MLEs. Moreover, under the given regularity conditions, the MLE is consistent, i.e., as
In the next section, we focus on the Bayesian analysis of $\theta$, treating the REL as the usual likelihood function.
3. BAYESIAN ANALYSIS WITH REL
Our main purpose in this paper is the Bayesian analysis of $\theta$, with the REL playing the role of a regular likelihood function. Based on this idea, we hope that the Bayesian estimate of $\theta$ can provide an acceptable approximation of the true values. For simplicity, we assume that the penalty function is an additive function, i.e., and moreover assume is proportional to a multivariate density. Consequently, we assume that
We assume in (9) that the two vectors $\theta$ and $\lambda$ are independent. This assumption is often made for simplicity and computational convenience.
Unlike the usual way, the priors in (9) depend on the sample size. This is often assumed in Bayesian statistical analysis; for more details see Bhattacharya et al. and Ghoreishi. However, the prior functions can also be chosen in such a way that the effect of the sample size is negligible.
Given the estimating functions and the priors, one can estimate the desired parameters using the Bayesian approach. In practice, the marginal posterior distributions of the parameters often do not have a closed form; therefore, inference is based on samples drawn via MCMC methods. However, in many statistical models in which the estimating functions are linear in the parameters, the conditional posterior distributions have closed forms and are easy to sample from using the Gibbs sampling method. In the next section, we discuss this issue in more detail for autoregressive models of order $p$, i.e., AR($p$).
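To illustrate why closed-form conditional posteriors make Gibbs sampling easy, here is a minimal sketch for a linear model. As a loudly stated assumption, it uses an ordinary Gaussian likelihood with a normal prior on the coefficients and an inverse-gamma prior on the error variance as a stand-in; it is not the REL-based posterior of this paper. The function name `gibbs_linear` and the hyper-parameter defaults `tau2`, `a`, and `b` are hypothetical.

```python
import numpy as np

def gibbs_linear(y, X, n_draws=2000, tau2=100.0, a=2.0, b=2.0, seed=0):
    """Gibbs sampler for y = X beta + eps, eps ~ N(0, sigma2 I).

    Assumed priors (illustrative): beta ~ N(0, tau2 I), sigma2 ~ InvGamma(a, b).
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    sigma2 = 1.0
    betas = np.empty((n_draws, k))
    sigma2s = np.empty(n_draws)
    for s in range(n_draws):
        # beta | sigma2, y is multivariate normal (closed form).
        V = np.linalg.inv(XtX / sigma2 + np.eye(k) / tau2)
        V = (V + V.T) / 2.0                      # enforce symmetry numerically
        m = V @ Xty / sigma2
        beta = rng.multivariate_normal(m, V)
        # sigma2 | beta, y is inverse-gamma (closed form).
        resid = y - X @ beta
        shape = a + n / 2.0
        rate = b + resid @ resid / 2.0
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)
        betas[s], sigma2s[s] = beta, sigma2
    return betas, sigma2s
```

Each sweep draws one block from its exact conditional distribution, so no tuning of proposal distributions is needed; posterior summaries are taken from the draws after discarding an initial burn-in.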
4. REL FOR AR($p$) MODELS
4.1. Basic Concepts
Autoregressive models describe a random process in which the output variable depends linearly on its own previous values. Specifically, the AR($p$) model is defined as
As we see, the estimating equations (12) are linear functions of parameters Therefore, we hope that the posterior conditional distributions have a closed form when we use the REL function instead of a regular likelihood function. Moreover, in the autoregressive model (11), the conditional distribution of given iswhere . On the other hand, the corresponding likelihood function is
Since the probabilities are practically unknown, we estimate them here using our REL approach. We assume that each observation is weighted by the value of its corresponding conditional density function. This means that, up to a normalizing constant, the estimating equations (12) satisfy
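The linearity of autoregressive estimating equations in the parameters can be made concrete with a short sketch. Because the explicit forms of (11) and (12) are not reproduced here, the code assumes a zero-mean model y_t = phi_1 y_{t-1} + ... + phi_p y_{t-p} + e_t and least-squares-type estimating functions g_j = y_{t-j} (y_t - sum_k phi_k y_{t-k}); the paper's own equations may differ. All function names are hypothetical.

```python
import numpy as np

def lagged_design(y, p):
    """Return the targets y_t (t >= p) and the matrix of their p lagged values."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Column j-1 holds y_{t-j} for t = p, ..., n-1.
    X = np.column_stack([y[p - j: n - j] for j in range(1, p + 1)])
    return y[p:], X

def estimating_functions(y, phi):
    """Row t, column j-1: y_{t-j} * (y_t - sum_k phi_k y_{t-k}); linear in phi."""
    phi = np.asarray(phi, dtype=float)
    target, X = lagged_design(y, phi.size)
    resid = target - X @ phi
    return X * resid[:, None]

def solve_estimating_equations(y, p):
    """Solve sum_t g_j(y_t, phi) = 0 for phi; a linear system since g is linear."""
    target, X = lagged_design(y, p)
    return np.linalg.solve(X.T @ X, X.T @ target)
```

Because the estimating functions are linear in phi, setting their sum to zero gives a p-by-p linear system, which is the same structure that leads to closed-form conditional distributions in a Gibbs scheme.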
4.2. Posterior Conditional Distributions
Without loss of generality, for Bayesian analysis using the REL (15), we assume and are i.i.d. with the following common priors
Then the conditional posterior distributions have closed forms given by
4.3. EM Algorithm for Estimating Hyper-parameters
One of the major challenges in sampling from the conditional posterior densities (17) is that the hyper-parameters are unknown in practice. However, there are several methods to estimate these quantities. In this subsection, we introduce an EM algorithm for this purpose.
Now consider the logarithm of the integrand in (18) as a function of the hyper-parameters, given by
Our proposed EM algorithm for estimating the hyper-parameters has the following steps:
1. Select initial estimates of the hyper-parameters and initialize the iteration counter.
2. Given the current estimates of the hyper-parameters, draw a sample from the conditional distributions (17).
3. E-step: Calculate the average of (19) over the samples generated in Step 2. Note that it is enough to average only those terms of the expression that depend on the hyper-parameters.
4. M-step: Update the hyper-parameters by maximizing the following two quantities, one with respect to each hyper-parameter:
   The updated estimates at the next iteration are obtained as
5. Repeat Steps 2-4 until the algorithm converges.
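The steps above follow the general pattern of a Monte Carlo EM algorithm, which can be sketched on a toy problem. As an explicit assumption, the model below is not the paper's: it takes y_j ~ N(theta_j, 1) with theta_j ~ N(0, tau2) and estimates the single hyper-parameter tau2. The function name `mcem_tau2` and its defaults are hypothetical.

```python
import numpy as np

def mcem_tau2(y, tau2_init=1.0, n_iter=200, n_samples=500, tol=1e-4, seed=0):
    """Monte Carlo EM for tau2 in the toy model y_j ~ N(theta_j, 1), theta_j ~ N(0, tau2)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    tau2 = tau2_init                                   # Step 1: initial estimate
    for _ in range(n_iter):
        # Step 2: sample theta_j | y_j, tau2 ~ N(shrink * y_j, shrink).
        shrink = tau2 / (1.0 + tau2)
        theta = rng.normal(shrink * y, np.sqrt(shrink), size=(n_samples, y.size))
        # Step 3 (E-step): average only the term that depends on tau2.
        mean_sq = np.mean(theta ** 2)
        # Step 4 (M-step): maximize -0.5 * log(tau2) - mean_sq / (2 * tau2) over tau2.
        new_tau2 = mean_sq
        converged = abs(new_tau2 - tau2) < tol
        tau2 = new_tau2
        if converged:                                  # Step 5: stop at convergence
            break
    return tau2
```

In this toy model the exact maximizer of the marginal likelihood is mean(y^2) - 1 (when positive), so the output of the sketch can be checked directly; with Monte Carlo noise the iterates fluctuate slightly around that value rather than settling exactly.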
4.4. Model Evaluation Criteria
In practice, there are several criteria in the literature for evaluating the efficiency of time series models. One of these is the mean squared prediction error (MSPE), the average squared difference between the observed values and their predictions. In this article, we also use a criterion similar to the BIC. This criterion, for the AR($p$) model, is constructed from the REL function, abbreviated as EBIC, and defined as
In practice, a small EBIC value confirms the adequacy of the model, whereas a large value indicates the inadequacy of the model.
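For orientation, the two kinds of criteria can be sketched as follows. The MSPE is computed as the mean of squared prediction errors. Since the EBIC formula is not reproduced here, the second function uses the classical Gaussian BIC, n log(MSPE) + k log(n), purely as a stand-in to show the "fit plus complexity penalty" structure; it is an assumption and not the REL-based EBIC. Both function names are hypothetical.

```python
import numpy as np

def mspe(y_true, y_pred):
    """Mean squared prediction error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def gaussian_bic(y_true, y_pred, n_params):
    """Classical Gaussian BIC (stand-in, not the EBIC): n*log(MSPE) + k*log(n)."""
    n = np.asarray(y_true).size
    return float(n * np.log(mspe(y_true, y_pred)) + n_params * np.log(n))
```

As with the EBIC, smaller values of such a criterion favor a model, and the penalty term discourages adding lags that do not improve the fit.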
5. SIMULATION STUDY AND APPLICATION
In this section, we first carry out a simulation study to confirm our theoretical results and then apply the proposed methods to a real dataset.
5.1. Simulation Study
Our simulation study focuses on the following autoregressive model of order 2:
For this model, we assume several scenarios for the two autoregressive coefficients. In order for the AR(2) model to resemble a nonparametric model, we assume that the error terms are generated from the following slightly more complex distribution (a combination of normal and Cauchy distributions)
Here, we run the corresponding Gibbs sampling scheme (17) for two sample sizes, $n = 50$ and $n = 500$. With the generated data under each scenario, we try to fit the following three models:
Model I:with estimating equation
Model II:with estimating equation
Model III:with estimating equations and
Moreover, for simplicity, we assume . We ran the sampling scheme (17) in order to generate samples from the marginal distributions of and and each time we fit Models I–III to the data. In addition to the estimates of and their standard errors, the MSPE and EBIC criteria are also computed. The numerical results are presented in Table 1. As we can see from it, Model III fits the data much better than the other two competitor models.
|n = 50||EL||−0.221(0.191)||-||54.627|
|n = 500||EL||−0.223(0.172)||-||61.001|
|n = 50||EL||-||−0.121(0.243)||67.423|
|n = 500||EL||-||−0.372(0.213)||71.001|
|n = 50||EL||−0.570(0.035)||−0.726(0.052)||22.117|
|n = 500||EL||−0.515(0.013)||−0.821(0.017)||17.921|
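The data-generating step of the simulation in Section 5.1 can be sketched as follows. The exact mixture of normal and Cauchy distributions is not reproduced here, so the mixing weight `cauchy_weight`, the use of standard normal and standard Cauchy components, the burn-in length, and the function name `simulate_ar2` are all assumptions made for illustration.

```python
import numpy as np

def simulate_ar2(n, phi1, phi2, cauchy_weight=0.1, burn_in=200, seed=0):
    """Simulate an AR(2) series whose errors follow a normal/Cauchy mixture.

    Each error is standard Cauchy with probability cauchy_weight (assumed value)
    and standard normal otherwise.
    """
    rng = np.random.default_rng(seed)
    total = n + burn_in
    is_cauchy = rng.random(total) < cauchy_weight
    eps = np.where(is_cauchy,
                   rng.standard_cauchy(total),
                   rng.standard_normal(total))
    y = np.zeros(total)
    for t in range(2, total):
        y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + eps[t]
    # Discard the burn-in so the start-up values have little influence.
    return y[burn_in:]
```

For example, `simulate_ar2(50, 0.5, -0.3)` and `simulate_ar2(500, 0.5, -0.3)` would produce series of the two sample sizes used above; the coefficient values in this call are placeholders, not the scenarios of the study.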
In this subsection, we analyze a real dataset on the Total Index of Consumer Goods and Services in urban areas of Iran for the period 1990–2017, extracted from the Central Bank of the Islamic Republic of Iran. The data are shown in Table 2, which demonstrates the presence of time dependency. A preliminary analysis reveals that an AR(2) model fits the data well after removing the trend. For this purpose, we first estimate the trend by fitting a cubic polynomial in time.
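The detrending step can be sketched with an ordinary least-squares cubic fit. The function name `detrend_cubic` and the use of the observation index 0, 1, 2, ... as the time variable are assumptions; the fitted coefficients of the actual dataset are not reproduced here.

```python
import numpy as np

def detrend_cubic(y):
    """Fit a cubic polynomial in the time index and return residuals and coefficients."""
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size, dtype=float)    # time index 0, 1, ..., n-1 (assumed)
    coefs = np.polyfit(t, y, deg=3)       # highest-degree coefficient first
    trend = np.polyval(coefs, t)
    return y - trend, coefs
```

The residual series returned by this function is what would then be modeled as an AR(2) process.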
In this paper, a Bayesian method using the REL has been proposed for fitting the AR($p$) time series model. A useful algorithm for computing the posterior has been developed for this approach. Moreover, the competitive performance of the proposed method has been demonstrated via both simulation studies and a real data application. It is important to note that this method can be easily applied to other general linear models.
Cite this article
Mahdieh Bayati, S.K. Ghoreishi, Jingjing Wu. Restricted Empirical Likelihood Estimation for Time Series Autoregressive Models. Journal of Statistical Theory and Applications, 20(1), 11-20, 2021 (published 2021/02/03). ISSN 2214-1766. https://doi.org/10.2991/jsta.d.210121.001