On Seemingly Unrelated Regression Model with Skew Error

Omid Akhgari; Mousa Golalizadeh

doi:10.2991/jsta.d.210126.002

Volume 20, Issue 1, March 2021, Pages 97 - 110

Authors

Omid Akhgari¹, Mousa Golalizadeh²,*

¹Department of Statistics, Amin University, Tehran, Iran

²Department of Statistics, Tarbiat Modares University, Tehran, Iran

*Corresponding author. Email: golalizadeh@modares.ac.ir

Received 7 October 2018, Accepted 5 January 2021, Available Online 8 February 2021.

DOI: 10.2991/jsta.d.210126.002
Keywords: Seemingly unrelated regression; Endogenous variable; Exogenous variable; Skew-normal distribution
Abstract: Invoking a single causal relationship to explain the dependency between variables is sometimes inappropriate, particularly in some economic problems. Instead, two jointly related equations, in which one of the explanatory variables is endogenous, can represent the actual inter-relationship among the variables. Such models are called simultaneous equation models, of which the seemingly unrelated regression (SUR) model is a special case. Substantial progress has been made on statistical inference for the parameters of these models when the errors follow a normal distribution, but less research has been devoted to the case in which the distribution of the errors is asymmetric. In this paper, statistical inference on the parameters of the SUR model, assuming a skew-normal density for the errors, is tackled. Moreover, the results of the study are compared with those of other naive methodologies. The proposed model is used to analyze the income and expenditure of Iranian rural households in the year 2009.
Copyright: © 2021 The Authors. Published by Atlantis Press B.V.
Open Access: This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

Most linear regression models rely on the relationship between a dependent variable and one or more explanatory variables. The main objective in treating these models is to estimate and predict the average value of the dependent variable given some explanatory variables. But in many cases, particularly in some economic problems, the causal relationship represented by a single equation is not appropriate. The main drawback of such single-equation models is that not only does the response variable depend on the explanatory variables, but the response variable also determines some of the explanatory variables. Generally, it can be argued that there are simultaneous or two-sided relationships between the response and some of the explanatory variables in these cases. Hence, separating the variables into explanatory and dependent ones does not make sense in real-life circumstances. In these situations, the number of equations will naturally be more than one; precisely, there is an equation for every endogenous or dependent variable. Generally, following Haavelmo [1], when the dependent variable of a particular model is also an explanatory variable, one should use simultaneous equations models (SEMs). A particular case of these models is called the seemingly unrelated regression (SUR) model.

Zellner [2] pioneered the estimation of the parameters of the SUR model using the generalized least squares method. Work on the frequentist approach to such models has been relatively limited, whereas much research has followed the Bayesian approach. The application of the Bayesian approach to the SUR model was first proposed by Zellner [3]. Afterward, other methods for estimating the parameters were used, including the maximum likelihood method [4] and the Bayesian moment and direct Monte Carlo methods [5]. MCMC applications to the SUR model have appeared in many studies under various assumptions; see, for example, Percy [6], Chib and Greenberg [7], and Smith and Kohn [8]. More recently, Zellner and Ando [5] and Zellner et al. [9] investigated the estimation of the parameters in the SUR model using a hierarchical Bayes approach through direct Monte Carlo and importance sampling techniques.

Another important aspect of SUR models that is worth studying is the type of distribution assumed for the error term. It is quite common to assume a normal density. However, there are numerous examples in which the empirical distribution of the variables exhibits an asymmetric structure, so the normal distribution can no longer be used. In these situations, some transformations may be applied to make the distribution of the data approximately normal. However, such transformations have their own drawbacks, including bias of the estimator [10]. Using asymmetric distributions that share some characteristics of the normal distribution has recently received significant attention in the literature. The skew-normal distribution is one of the important distributions proposed to tackle the asymmetry of data. Historically, the univariate skew-normal distribution was advocated by Azzalini [11]. Then, Azzalini and Dalla Valle [12] proposed the multivariate skew-normal distribution. Azzalini and Capitanio [13] further studied the properties of this density. Several generalizations of this distribution have been presented by Balakrishnan [14], Genton [15], Gupta et al. [16], and Arellano-Valle et al. [17]. More recently, Azzalini and Regoli [18] investigated some other properties of the skew-symmetric distribution. As a new line of research, we consider the SUR model in which the error follows the skew-normal distribution. The estimation of the parameters by maximum likelihood is also treated. Intensive simulation studies are conducted to evaluate the proposed methods. An application of the model to real-life data is also given.

The present paper is organized as follows. A brief review of the SUR model is presented in Section 2. A likelihood-based approach to estimating the parameters of the SUR model with skew-normal errors is discussed in Section 3. The simulation study, as well as the analysis of real data on the income and expenditure of Iranian rural households in the year 2009, is presented in Section 4. General conclusions are provided at the end. The proofs of some of the results are given in the Appendix.

2. SUR MODEL

Suppose $X_t$ is an $n\times k_t$ matrix of explanatory variables and $\beta_t$ is a column vector of parameters of length $k_t$. Furthermore, suppose there are $g$ equations corresponding to $g$ endogenous variables, each a column vector of length $n$, denoted by $y_1,\dots,y_g$. Hence, the $t$-th equation of a linear simultaneous system can be written as

$$y_t = X_t\beta_t + u_t,\qquad t=1,\dots,g, \tag{2.1}$$

where

$$E(u_{ti})=0,\qquad \mathrm{Var}(u_{ti})=\sigma_{tt},\qquad \mathrm{Cov}(u_{ti},u_{si})=\sigma_{ts},\qquad t,s=1,2,\dots,g,\quad i=1,2,\dots,n. \tag{2.2}$$

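Let the $g$-vectors $y_{i\bullet}$ and $u_{i\bullet}$ consist of $y_{ti}$ and $u_{ti}$, respectively, stacked vertically over $t$ for fixed $i$. Accordingly, the $k$-vector $\beta_\bullet$ is formed by stacking the $\beta_t$ vertically. Then, the matrix $X_{i\bullet}$ is of dimension $g\times k$, where $k=\sum_{t=1}^{g}k_t$. In fact, it is a block-diagonal matrix whose diagonal blocks $X_{ti}$, again for fixed $i$, are of dimension $1\times k_t$. In short, the notation can be summarized as follows:

$$
y_{i\bullet}=\begin{pmatrix} y_{1i}\\ \vdots\\ y_{gi}\end{pmatrix}_{g\times 1},\quad
u_{i\bullet}=\begin{pmatrix} u_{1i}\\ \vdots\\ u_{gi}\end{pmatrix}_{g\times 1},\quad
\beta_\bullet=\begin{pmatrix} \beta_1\\ \vdots\\ \beta_k\end{pmatrix}_{k\times 1},\quad
X_{i\bullet}=\begin{pmatrix} X_{1i} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & X_{gi}\end{pmatrix}_{g\times k}.
\tag{2.3}
$$

Based on this notation, model (2.1) can be rewritten as

$$y_{i\bullet}=X_{i\bullet}\beta_\bullet+u_{i\bullet},\qquad i=1,\dots,n. \tag{2.4}$$

Note that, as a common assumption, we now consider $u_{i\bullet}\sim N(0_g,\Sigma)$, where $\Sigma=\{\sigma_{ts}\}_{g\times g}$.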

In the present study, we aim to estimate the parameters of this SUR model. This can be achieved via many parametric and nonparametric estimation procedures, including 2SLS¹, 3SLS², GMM³, LIML⁴, and FIML⁵; see Anderson and Rubin [19], Theil [20], and Davidson and MacKinnon [21]. In this paper we focus on FIML under normal and skew-normal error assumptions. Moreover, a number of important statistical features pertaining to these models are provided.

Based upon the information provided so far, we can write down the likelihood function to estimate the parameters. As is common, we work with the logarithm of the likelihood, which we denote by $l(\beta_\bullet,\Sigma)$. It is given by

$$
l(\beta_\bullet,\Sigma)=-\frac{ng}{2}\log 2\pi-\frac{n}{2}\log|\Sigma|-\frac{1}{2}\sum_{t=1}^{n}\left[(y_{t\bullet}-X_{t\bullet}\beta_\bullet)^T\Sigma^{-1}(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right],
\tag{2.5}
$$

and should be maximized to obtain the FIML estimators. It is quite straightforward to show (see, e.g. Anderson and Rubin [19]) that the maximum likelihood estimators of the parameters are given by

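$$
\hat\beta_\bullet=\left[\sum_{t=1}^{n}X_{t\bullet}^T\Sigma^{-1}X_{t\bullet}\right]^{-1}\sum_{t=1}^{n}X_{t\bullet}^T\Sigma^{-1}y_{t\bullet},
\qquad
\hat\Sigma=\frac{1}{n}\sum_{t=1}^{n}\left[(y_{t\bullet}-X_{t\bullet}\beta_\bullet)(y_{t\bullet}-X_{t\bullet}\beta_\bullet)^T\right].
\tag{2.6}
$$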

Moreover, via invoking a simple computation, it can be shown that

$$
\mathrm{Var}(\hat\beta_\bullet)=\left[\sum_{t=1}^{n}X_{t\bullet}^T\Sigma^{-1}X_{t\bullet}\right]^{-1}.
\tag{2.7}
$$
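The two expressions in (2.6) depend on each other, so in practice they are usually evaluated by alternating between them. The following Python sketch illustrates one way to do this under stated assumptions: the function names `fit_sur_normal` and `stack_design`, the array layout (`X` of shape n×g×k holding the block-diagonal matrices, `y` of shape n×g), the identity starting value for Σ, and the fixed number of iterations are all illustrative choices and are not taken from the paper.

```python
import numpy as np

def stack_design(blocks):
    """Build X of shape (n, g, k) from a list of g arrays of shape (n, k_t).

    X[t] is the block-diagonal g x k matrix for observation t, as in (2.3).
    """
    n = blocks[0].shape[0]
    k = sum(b.shape[1] for b in blocks)
    X = np.zeros((n, len(blocks), k))
    col = 0
    for eq, b in enumerate(blocks):
        X[:, eq, col:col + b.shape[1]] = b
        col += b.shape[1]
    return X

def fit_sur_normal(X, y, n_iter=50):
    """Alternate the two updates in (2.6); return beta, Sigma and (2.7)."""
    n, g, k = X.shape
    Sigma = np.eye(g)                      # assumed starting value
    for _ in range(n_iter):
        Sinv = np.linalg.inv(Sigma)
        # sum_t X_t^T Sigma^{-1} X_t  and  sum_t X_t^T Sigma^{-1} y_t
        A = np.einsum('tgi,gh,thj->ij', X, Sinv, X)
        b = np.einsum('tgi,gh,th->i', X, Sinv, y)
        beta = np.linalg.solve(A, b)
        resid = y - np.einsum('tgi,i->tg', X, beta)
        Sigma = resid.T @ resid / n        # second expression in (2.6)
    Sinv = np.linalg.inv(Sigma)
    A = np.einsum('tgi,gh,thj->ij', X, Sinv, X)
    return beta, Sigma, np.linalg.inv(A)   # last term is (2.7)
```

With two equations that each contain an intercept and two regressors, `stack_design` would be called with two n×3 arrays, and `fit_sur_normal` would return a coefficient vector of length 6.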

So far, the estimators have been calculated under the assumption of normality for the error. However, if the distribution of the errors is asymmetric, specifically skew-normal, then obtaining the estimators is not as straightforward as above. To treat this case, we first briefly review the skew-normal distribution in the next section. Then, the FIML estimators of the parameters are obtained under this assumption, while the model includes endogenous variables.

3. SUR MODELED WITH SKEW-NORMAL DISTRIBUTION

We first recall the definition and a few key properties of the skew-normal distribution, as given by Azzalini and Dalla Valle [12]. A $k$-dimensional random variable $Z$ follows the multivariate skew-normal distribution if it is continuous with density function

$$2\phi_k(z;\Psi)\,\Phi(\lambda^T z),\qquad z\in\mathbb{R}^k, \tag{3.1}$$

where $\phi_k(z;\Psi)$ is the $k$-dimensional normal density with zero mean vector and full-rank correlation matrix $\Psi$, $\Phi(\cdot)$ is the cumulative distribution function of the univariate standard normal (its argument $\lambda^T z$ is a scalar), and $\lambda$ is a $k$-dimensional column vector of constants. In short form, it is common to write $Z\sim SN_k(0_k,\Psi,\lambda)$.

The parameter $\lambda$ plays a key role in representing the main features of the density in (3.1). Since it controls the skewness of the density, it is usually referred to as the shape parameter or the skewness control. The density is skewed to the right (left) for positive (negative) values of $\lambda$. When $\lambda=0$, the density (3.1) reduces to that of $N(0_k,\Psi)$, where $0_q$ denotes a zero vector of length $q$.
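As an illustration of the density (3.1), the sketch below evaluates it and draws samples from $SN_k(0_k,\Psi,\lambda)$. The sampler relies on a standard sign-flip representation of the skew-normal law, which is an assumption brought in here and is not described in the text above: with $\delta=\Psi\lambda/\sqrt{1+\lambda^T\Psi\lambda}$, a $(k+1)$-dimensional normal vector $(X_0,X)$ with unit variance for $X_0$, covariance $\Psi$ for $X$ and cross-covariance $\delta$ yields $Z=X$ when $X_0>0$ and $Z=-X$ otherwise. The function names are hypothetical.

```python
import math
import numpy as np

def norm_cdf(x):
    """Univariate standard normal CDF, vectorised via math.erf."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(x, float) / math.sqrt(2.0)))

def sn_density(z, Psi, lam):
    """Density (3.1), 2 * phi_k(z; Psi) * Phi(lam^T z), for rows of z (m, k)."""
    z = np.atleast_2d(z)
    k = Psi.shape[0]
    quad = np.einsum('mi,ij,mj->m', z, np.linalg.inv(Psi), z)
    phi = np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** k * np.linalg.det(Psi))
    return 2.0 * phi * norm_cdf(z @ lam)

def sn_sample(n, Psi, lam, rng):
    """Draw n vectors from SN_k(0, Psi, lam) by the sign-flip representation."""
    k = Psi.shape[0]
    delta = Psi @ lam / np.sqrt(1.0 + lam @ Psi @ lam)
    cov = np.block([[np.ones((1, 1)), delta[None, :]],
                    [delta[:, None], Psi]])
    draws = rng.multivariate_normal(np.zeros(k + 1), cov, size=n)
    sign = np.where(draws[:, 0] > 0, 1.0, -1.0)   # flip when the latent draw is negative
    return draws[:, 1:] * sign[:, None]
```

Setting `lam` to the zero vector makes `sn_density` return the ordinary normal density, which mirrors the reduction to $N(0_k,\Psi)$ noted above.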

Location and scale parameters can also be added to the skew-normal density of $Z$ given in (3.1). Let us write

$$Y=\xi+\omega Z, \tag{3.2}$$

where $\xi=(\xi_1,\dots,\xi_k)^T$ and $\omega=\mathrm{diag}(\omega_1,\dots,\omega_k)$ are location and scale parameters, respectively. Note that the components of $\omega$ are assumed to be all positive. The density function of $Y$ is then given by

$$2\phi_k(y-\xi;\Omega)\,\Phi\!\left(\lambda^T\omega^{-1}(y-\xi)\right), \tag{3.3}$$

where $\Omega=\omega\Psi\omega^T=\omega\Psi\omega$ represents the covariance matrix of $Y$. We use the standard notation $Y\sim SN_k(\xi,\Omega,\lambda)$ to indicate that $Y$ follows the density function in (3.3). To give a general graphical view of this density, we provide some plots for particular values of the parameters in (3.3). Figure 1 shows the contour plots of a bivariate skew-normal density and the histogram of each of its variables. We are now in a position to concentrate on the estimators in a SUR model under the skew-normal distribution for the error term. Consider the model (2.4), with the index $i$ changed to $t$, where

$$u_{t\bullet}=(u_{t1},\dots,u_{tg})^T\sim SN_g(0_g,\Sigma,\lambda),\qquad t=1,\dots,n. \tag{3.4}$$

Now, suppose one is interested in estimating the parameters of this model by maximum likelihood. Then the corresponding log-likelihood function, say $\ell=l(\lambda,\beta_\bullet,\Sigma)$, which is given by

$$
\begin{aligned}
l=l(\lambda,\beta_\bullet,\Sigma)={}&n\log 2-\frac{ng}{2}\log(2\pi)-\frac{n}{2}\log|\Sigma|\\
&-\frac{1}{2}\sum_{t=1}^{n}\left[(y_{t\bullet}-X_{t\bullet}\beta_\bullet)^T\Sigma^{-1}(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right]+\sum_{t=1}^{n}\log\left[\Phi_1(\lambda^T\Sigma^{-1/2}u_{t\bullet})\right],
\end{aligned}
\tag{3.5}
$$

needs to be maximized. If we regard $\eta=\Sigma^{-1/2}\lambda$ as a new parameter, instead of $\lambda$, the parameters in (3.5) split in the following sense: for fixed $\beta$ and $\eta$, maximization of $l$ with respect to $\Sigma$ is equivalent to maximizing the analogous function for the normal density for fixed $\beta$, which has a well-known solution (see, e.g., Mardia et al. [22]) given by

$$
\hat\Sigma(\beta_\bullet)=V(\beta_\bullet)=\frac{1}{n}\sum_{t=1}^{n}(y_{t\bullet}-X_{t\bullet}\beta_\bullet)(y_{t\bullet}-X_{t\bullet}\beta_\bullet)^T.
\tag{3.6}
$$

By substituting this estimate into the expression in (3.5), one obtains

$$
l^*(\eta,\beta_\bullet)=C-\frac{n}{2}\log|V(\beta_\bullet)|-\frac{ng}{2}+\sum_{t=1}^{n}\zeta_0(\eta^Tu_{t\bullet}),
\tag{3.7}
$$

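where $\zeta_0(x)=\log(2\Phi(x))$ and $\Phi$ is the $N(0,1)$ distribution function. Now, to get the estimators of the remaining parameters, one needs to maximize $l^*(\eta,\beta_\bullet)$, which is in fact the profile likelihood function [23], with respect to $\eta$ and $\beta_\bullet$. To do so, the partial derivatives of $l^*(\eta,\beta_\bullet)$ with respect to $\eta$ and $\beta_\bullet$ can be written, respectively, as

$$
\begin{aligned}
\frac{\partial l^*(\eta,\beta_\bullet)}{\partial\eta}
&=\sum_{t=1}^{n}u_{t\bullet}\,\zeta_1(\eta^Tu_{t\bullet})
=\sum_{t=1}^{n}(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\,\zeta_1[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)],\\
\frac{\partial l^*(\eta,\beta_\bullet)}{\partial\beta_\bullet}
&=-\frac{n}{2}\frac{\partial\log|V(\beta_\bullet)|}{\partial\beta_\bullet}-\sum_{t=1}^{n}X_{t\bullet}^T\eta\,\zeta_1[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)]\\
&=-\frac{n}{2}\begin{pmatrix}
\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}\right)\\
\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_2}\right)\\
\vdots\\
\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}\right)
\end{pmatrix}
-\sum_{t=1}^{n}X_{t\bullet}^T\eta\,\zeta_1[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)],
\end{aligned}
\tag{3.8}
$$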

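where $\zeta_1(x)=\phi(x)/\Phi(x)$. As seen, closed-form solutions (estimators) cannot be derived from the equations in (3.8). Hence, numerical maximization procedures need to be implemented for this purpose. There is an extensive literature on such numerical computations; see, for example, Robert and Casella [24]. A common approach is to follow the quasi-Newton algorithm. To do so, we need the second derivatives of the expression in (3.7). They are given as follows:

$$
\begin{aligned}
\frac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\eta\,\partial\eta^T}
&=\sum_{t=1}^{n}(y_{t\bullet}-X_{t\bullet}\beta_\bullet)(y_{t\bullet}-X_{t\bullet}\beta_\bullet)^T\zeta_2[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)],\\
\frac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\beta_\bullet^T\,\partial\beta_\bullet}
&=-\frac{n}{2}\begin{pmatrix}
\operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_1^2}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_2\partial\beta_1}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_k\partial\beta_1}\right)\\
\vdots & \vdots & \ddots & \vdots\\
\operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_1\partial\beta_k}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_2\partial\beta_k}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_k^2}\right)
\end{pmatrix}\\
&\quad+\frac{n}{2}\begin{pmatrix}
\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}V^{-1}\frac{\partial V}{\partial\beta_1}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_2}V^{-1}\frac{\partial V}{\partial\beta_1}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}V^{-1}\frac{\partial V}{\partial\beta_1}\right)\\
\vdots & \vdots & \ddots & \vdots\\
\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}V^{-1}\frac{\partial V}{\partial\beta_k}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_2}V^{-1}\frac{\partial V}{\partial\beta_k}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}V^{-1}\frac{\partial V}{\partial\beta_k}\right)
\end{pmatrix}\\
&\quad+\sum_{t=1}^{n}X_{t\bullet}^T\eta\,\eta^TX_{t\bullet}\,\zeta_2[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)],\\
\frac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\beta_\bullet^T\,\partial\eta}
&=-\sum_{t=1}^{n}\left\{X_{t\bullet}^T\eta\,(y_{t\bullet}-X_{t\bullet}\beta_\bullet)^T\zeta_2[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)]+X_{t\bullet}^T\zeta_1[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)]\right\},\\
\frac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\eta^T\,\partial\beta_\bullet}
&=\left(\frac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\beta_\bullet^T\,\partial\eta}\right)^T,
\end{aligned}
\tag{3.9}
$$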

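where $\zeta_2(x)=-\zeta_1(x)[x+\zeta_1(x)]$. If $\Upsilon$ is the parameter of interest, then, using the gradient and Hessian of the function $f$ in which this parameter appears, the quasi-Newton algorithm is applied as

$$
\Upsilon^{(k+1)}=\Upsilon^{(k)}-\left[(\nabla^2f)^{(k)}\right]^{-1}(\nabla f)^{(k)},
\tag{3.10}
$$

where the superscripts indicate the value of the estimator at the corresponding iteration and (ignoring the superscript)

$$
\nabla f=\begin{pmatrix}\dfrac{\partial l^*(\eta,\beta_\bullet)}{\partial\eta}\\[2mm] \dfrac{\partial l^*(\eta,\beta_\bullet)}{\partial\beta_\bullet}\end{pmatrix},\qquad
\nabla^2f=\begin{pmatrix}
\dfrac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\beta_\bullet\,\partial\beta_\bullet^T} & \dfrac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\beta_\bullet^T\,\partial\eta}\\[2mm]
\dfrac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\eta^T\,\partial\beta_\bullet} & \dfrac{\partial^2 l^*(\eta,\beta_\bullet)}{\partial\eta\,\partial\eta^T}
\end{pmatrix}.
\tag{3.11}
$$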
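To make the update (3.10) concrete, the following sketch applies it to the $\eta$-block only, holding $\beta_\bullet$ (and hence the residuals $u_{t\bullet}$) fixed. It uses the first lines of (3.8) and (3.9) as gradient and Hessian. This is a simplified illustration rather than the full joint iteration over $(\eta,\beta_\bullet)$; the function names, the clipping of $\Phi$ away from zero, and the step-halving safeguard are assumptions added for numerical robustness.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def Phi(x):
    """Standard normal CDF, clipped away from 0 so logs and ratios stay finite."""
    cdf = 0.5 * (1.0 + _erf(np.asarray(x, float) / math.sqrt(2.0)))
    return np.clip(cdf, 1e-300, 1.0)

def zeta0(x):
    return np.log(2.0 * Phi(x))

def zeta1(x):
    x = np.asarray(x, float)
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi) / Phi(x)

def zeta2(x):
    z1 = zeta1(x)
    return -z1 * (np.asarray(x, float) + z1)

def newton_eta(u, n_iter=50, tol=1e-10):
    """Maximise sum_t zeta0(eta^T u_t) over eta for fixed residuals u (n, g)."""
    eta = np.zeros(u.shape[1])
    obj = zeta0(u @ eta).sum()
    for _ in range(n_iter):
        s = u @ eta
        grad = u.T @ zeta1(s)                     # first line of (3.8)
        if np.linalg.norm(grad) < tol:
            break
        hess = (u * zeta2(s)[:, None]).T @ u      # first line of (3.9)
        step = np.linalg.solve(hess, grad)
        scale = 1.0
        while scale > 1e-8:                       # assumed safeguard: step halving
            cand = eta - scale * step             # update (3.10)
            cand_obj = zeta0(u @ cand).sum()
            if cand_obj >= obj - 1e-9:
                eta, obj = cand, cand_obj
                break
            scale *= 0.5
    return eta
```

Because $\zeta_2(x)<0$ for all $x$, the Hessian block in $\eta$ is negative definite whenever the residuals span $\mathbb{R}^g$, so each full step points uphill; the halving loop only guards against overshooting.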

We conduct some simulation studies using model (2.1) along with normal and skew-normal distributions in the following section. Moreover, we investigate the application of these methods in real-life data.

4. SIMULATION STUDIES AND APPLICATION

Here, we outline our simulation study to evaluate the performance of parameter estimation for the SUR models given in Section 2. Suppose we have the following model:

$$
\begin{aligned}
y_1&=\beta_0+\beta_1z_1+\beta_2x_1+u_1,\\
y_2&=\gamma_0+\gamma_1z_1+\gamma_2x_2+u_2.
\end{aligned}
\tag{4.1}
$$

To complete the specification of this model, we need to specify a distribution for $(u_1,u_2)^T$. To start, let us assume $u=(u_1,u_2)^T\sim N(0_2,\Sigma)$, where $y_1$ and $y_2$ are endogenous variables and $z_1$, $x_1$, and $x_2$ are exogenous variables. To compare this model with an alternative, we also consider the case in which $u=(u_1,u_2)^T\sim SN(0_2,\Sigma,\lambda)$.

We fix the parameters in our simulation studies as $\beta_\bullet=(6,-3,-4,9,3,-2)^T$, $\lambda=(2,3)^T$, and

$$
\Sigma=\begin{pmatrix}12 & -2\\ -2 & 11\end{pmatrix}.
\tag{4.2}
$$

To initiate our simulation studies, we take the sample size equal to 1000, so that the two equations in (4.1) give 2000 observations in total. We then generate data 1000 times from the skew-normal distribution. Thereafter, the model is fitted by both maximum likelihood approaches (under the normal and skew-normal assumptions) as described in the previous sections. In particular, the parameters are estimated using either (2.6) or (3.10), depending on the distribution considered for the errors in the model.
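One replication of this design could be generated along the following lines. This is only a sketch: the text does not say how the exogenous variables were drawn, so independent standard normal draws are assumed here, and the errors are produced by scaling a standardized skew-normal vector as in (3.2) using a sign-flip representation (also an assumption). The function name `simulate_sur_sn` and the seed are hypothetical.

```python
import numpy as np

def simulate_sur_sn(n, coef1, coef2, Sigma, lam, rng):
    """Simulate (y1, y2) from model (4.1) with SN_2(0, Sigma, lam) errors."""
    omega = np.sqrt(np.diag(Sigma))              # scale parameters, as in (3.2)
    Psi = Sigma / np.outer(omega, omega)         # implied correlation matrix
    delta = Psi @ lam / np.sqrt(1.0 + lam @ Psi @ lam)
    cov = np.block([[np.ones((1, 1)), delta[None, :]],
                    [delta[:, None], Psi]])
    draws = rng.multivariate_normal(np.zeros(3), cov, size=n)
    # Sign-flip representation: negate the draw when the latent variable is negative.
    z_sn = draws[:, 1:] * np.where(draws[:, 0] > 0, 1.0, -1.0)[:, None]
    u = z_sn * omega                             # u = omega Z
    # Assumption: exogenous variables are independent standard normal draws.
    z1, x1, x2 = rng.normal(size=(3, n))
    y1 = coef1[0] + coef1[1] * z1 + coef1[2] * x1 + u[:, 0]
    y2 = coef2[0] + coef2[1] * z1 + coef2[2] * x2 + u[:, 1]
    return {"y1": y1, "y2": y2, "z1": z1, "x1": x1, "x2": x2, "u": u}

rng = np.random.default_rng(2021)
Sigma = np.array([[12.0, -2.0], [-2.0, 11.0]])   # (4.2)
data = simulate_sur_sn(1000, (6.0, -3.0, -4.0), (9.0, 3.0, -2.0),
                       Sigma, np.array([2.0, 3.0]), rng)
```

Repeating the call with fresh random draws 1000 times and fitting both models to each data set would mimic the replication scheme described above.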

The results of our simulation studies for both the normal and skew-normal cases are given in Table 1. The table is partitioned into two parts: the three left-hand columns relate to the normal assumption for the error term, and the three right-hand columns relate to the skew-normal assumption. The distributions are indicated by N (normal) and SN (skew-normal). The table reports the estimate, standard deviation (SD), and effect size (ES) of each parameter; the ES entries coincide, up to rounding, with the absolute difference between the estimate and the true parameter value.

	N-ML			SN-ML
Parameter	Estimate	SD	ES	Estimate	SD	ES
β0	9.552	0.602	3.552	5.736	0.313	0.264
β1	−3.001	0.018	0.001	−3.003	0.011	0.003
β2	−4.002	0.067	0.002	−3.982	0.023	0.018
γ0	18.11	0.751	9.11	8.711	0.451	0.289
γ1	3.007	0.063	0.007	3.007	0.029	0.007
γ2	−2.013	0.068	0.013	−1.969	0.037	0.031
σ11	20.55	4.849	8.55	12.47	1.059	0.474
σ22	41.42	5.543	30.42	11.25	3.377	0.258
σ12	−9.57	1.414	7.577	−1.72	1.175	0.28
λ1	–	–	–	2.210	0.691	0.210
λ2	–	–	–	3.211	1.080	0.211

Table 1

Results of the SUR model fitted under the normal (N-ML) and skew-normal (SN-ML) assumptions.

Based on the results in Table 1, the estimates of β1, β2, γ1, and γ2 have small ES in both cases. The ES for the intercepts is high regardless of which distribution is considered for the error term; however, it is higher in the normal model than in the skew-normal case. Overall, the estimates in the SN case are closer to the true parameter values used to generate the data. In general, when the response variables follow a skew-normal distribution in the SUR model, the method relying on the skew-normal density for the error leads to more accurate estimation than the one relying on the normal density.
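The ES columns of Table 1 can be checked against the true values fixed in the simulation design. The short script below does this under the assumption that ES denotes the absolute deviation of the estimate from the true value; the dictionary names are illustrative, and all numbers are copied from Table 1 and from the design values stated earlier.

```python
# True values from the simulation design (beta, lambda and Sigma in (4.2)).
true_values = {
    "beta0": 6.0, "beta1": -3.0, "beta2": -4.0,
    "gamma0": 9.0, "gamma1": 3.0, "gamma2": -2.0,
    "sigma11": 12.0, "sigma22": 11.0, "sigma12": -2.0,
    "lambda1": 2.0, "lambda2": 3.0,
}

# parameter: (N-ML estimate, N-ML ES, SN-ML estimate, SN-ML ES); None = not reported
table1 = {
    "beta0": (9.552, 3.552, 5.736, 0.264),
    "beta1": (-3.001, 0.001, -3.003, 0.003),
    "beta2": (-4.002, 0.002, -3.982, 0.018),
    "gamma0": (18.11, 9.11, 8.711, 0.289),
    "gamma1": (3.007, 0.007, 3.007, 0.007),
    "gamma2": (-2.013, 0.013, -1.969, 0.031),
    "sigma11": (20.55, 8.55, 12.47, 0.474),
    "sigma22": (41.42, 30.42, 11.25, 0.258),
    "sigma12": (-9.57, 7.577, -1.72, 0.28),
    "lambda1": (None, None, 2.210, 0.210),
    "lambda2": (None, None, 3.211, 0.211),
}

gaps = []
for name, (n_est, n_es, sn_est, sn_es) in table1.items():
    truth = true_values[name]
    if n_est is not None:
        gaps.append(abs(abs(n_est - truth) - n_es))
    gaps.append(abs(abs(sn_est - truth) - sn_es))

# Largest disagreement between |estimate - truth| and the reported ES.
max_gap = max(gaps)
print(f"largest gap between |estimate - truth| and reported ES: {max_gap:.3f}")
```

The remaining small gaps are what one would expect from the estimates being rounded to fewer digits than the ES entries.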

The likelihood ratio test of the null hypothesis λ=0 can be used as a criterion for deciding whether or not the skew-normal distribution should be considered. The test statistic is given by

$$
2\{\ell(\hat\beta_\bullet,\hat\Sigma,\hat\lambda)-\ell(\hat\mu,\hat\Omega,0)\},
\tag{4.3}
$$

where $\hat\beta_\bullet$, $\hat\Sigma$, and $\hat\lambda$ denote the MLEs under the assumption of skew-normality (abbreviated as SN-ML) and $\hat\mu$ and $\hat\Omega$ are the MLEs under the assumption of normality (abbreviated as N-ML) for the errors. Following Casella and Berger [25], the statistic (4.3) asymptotically follows a $\chi^2_{df}$ distribution under the null hypothesis, where df is the difference between the dimensions of the parameter under the alternative and null hypotheses. The log-likelihood and the AIC criterion for both methods appear in Table 2. As can be seen, the log-likelihood for SN-ML is higher than that of N-ML. Moreover, the AIC for SN-ML is lower than that of N-ML. Therefore, SN-ML outperforms N-ML in this study, which means that, in comparison with N-ML, using the skew-normal density for the error term in the SUR model (3.10) improves the accuracy and bias of the estimators. Here, the likelihood ratio test statistic was $\mathrm{LRT}=2\{\ell(\hat\beta_\bullet,\hat\Sigma,\hat\lambda)-\ell(\hat\mu,\hat\Omega,0)\}=119.48$ with df=2. Hence, the test is significant at the 0.05 level, and we conclude that the skewness parameter λ is not zero. This supports our initial assumption of a skew-normal distribution for the error terms.
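For reference, the p-value attached to a reported LRT statistic can be computed without any statistical library when df is even, because the chi-square upper tail then has a closed form; for df=2 it is simply exp(−x/2). The sketch below uses only the statistics reported in the text (119.48 for the simulation and 59.3 for the household data analysed later); the function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

p_simulation = chi2_sf_even_df(119.48, 2)   # statistic reported for the simulation
p_real_data = chi2_sf_even_df(59.3, 2)      # statistic reported for the household data

print(p_simulation < 0.05, p_real_data < 0.05)
```

Both p-values fall far below 0.05, in line with the conclusion that the hypothesis λ=0 is rejected.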

Criteria	N-ML	SN-ML
AIC	13143.32	13035.198
Log likelihood	−6560.66	−6508.599

Table 2

Criteria for comparing the two methods of parameter estimation.

We were interested in applying the proposed model to real-life data. To do this, we used the Iranian rural household income and expenditure data collected in the year 2009, which include 13345 families from 32 provinces. The main goal here is to survey the effects of some variables on the income and expenditure of Iranian rural households. In this study, these two variables are considered as endogenous variables and the other covariates are treated as exogenous. Based on a general view and on consultation with experts at the Statistical Center of Iran, the following SUR model was used to express the inter-relationship between rural household income and expenditure in Iran:

$$
\begin{aligned}
GH&=\beta_0+\sum_{i=1}^{4}\beta_{C_i}C_i+\sum_{i=1}^{5}\beta_{B_i}B_i+\beta_AA+\epsilon_1,\\
D&=\gamma_0+\sum_{i=1}^{4}\gamma_{D_i}D_i+\epsilon_2.
\end{aligned}
\tag{4.4}
$$

A general description of the variables considered is provided in Table 3. Figures 2–4 present a graphical display of two important variables.

Variable Names	Abbreviation Signs	Variable Type	Coding
Households expenditure	GH	Quantitative	–
Households income	D	Quantitative	–
Family size	C1	Quantitative	–
Number of literate in household	C2	Quantitative	–
Number of employees in household	C3	Quantitative	–
Number of people with income	C4	Quantitative	–
Age	A	Quantitative	–
Floor area	B1	Quantitative	–
Private car	B2	Qualitative	1: Use, 0: Nonuse
Internet	B3	Qualitative	1: Use, 0: Nonuse
Gas	B4	Qualitative	1: Use, 0: Nonuse
Mobile	B5	Qualitative	1: Use, 0: Nonuse
Agriculture self-employment income	D1	Quantitative	–
Nonagriculture self-employment income	D2	Quantitative	–
Miscellaneous income	D3	Quantitative	–
Non-monetary other incomes	D4	Quantitative	–

Table 3

Description of variables utilized in model (4.4).

To initiate the analysis, the validity of the normality assumption for the response variables should be tested. We used the Kolmogorov–Smirnov (KS) test for this purpose. The KS test was significant with p-value <0.05, rejecting the null hypothesis of normality. For a visual inspection of the density, the Q-Q plots of household income and expenditure are also drawn in Figure 5. They show a departure from the univariate normal distribution for both variables. The contour plot in Figure 5 also demonstrates a departure from the bivariate normal distribution. It can be argued that some transformation, such as the logarithm, would be appropriate to make the density normal. However, the income variable includes some negative values, so this transformation cannot be used. Instead, we preferred to use the skew-normal distribution for the errors and attempted to model rural household income and expenditure in Iran on this basis. Nonetheless, to have a baseline for further comparison, the normal distribution was also considered for the errors in this example.

The results of fitting the aforementioned models to our example appear in Table 4. As seen, it includes three panels. The first (second) panel shows the results for the first (second) equation of the model (4.4). Confining ourselves to those parameter estimates that are significant at the 5% level, the results for the normal and skew-normal densities are provided in both panels. The last panel shows the estimates of the components of the covariance matrix and of the shape parameters. A test was conducted to check whether or not the skewness parameter (λ) is equal to zero. This led to $\mathrm{LRT}=2\{\ell(\hat\beta_\bullet,\hat\Sigma,\hat\lambda)-\ell(\hat\mu,\hat\Omega,0)\}=-24385.1-(-24444.4)=59.3$ with df=2. Since the test was significant at the 0.05 level, we conclude that the skewness parameter is not zero and that using the skew-normal MLE is more effective than the normal MLE.

	Estimate		Std. error
Parameter	N-ML	SN-ML	N-ML	SN-ML
β0	−1.50	−1.34	0.047	0.006
βC1	0.036	0.040	0.009	0.001
βC2	0.082	0.046	0.010	0.002
βC3	0.106	0.059	0.011	0.004
βC4	0.061	−0.024	0.014	0.004
βB1	0.003	0.003	0.0006	0.0001
βB2	0.004	0.002	0.0002	0.0005
βB3	0.649	0.531	0.024	0.013
βB4	0.689	0.490	0.051	0.032
βB5	0.064	0.031	0.018	0.0085
βA	0.276	0.137	0.025	0.0064
γ0	0	−0.103	0.025	0.0038
γD1	0.544	0.499	0.013	0.0039
γD2	0.503	0.487	0.013	0.0039
γD3	0.412	0.352	0.013	0.0039
γD4	0.033	0.030	0.013	0.0038
σ11	0.656	0.051	0.011	0.009
σ21	0.084	0.009	0.009	0.001
σ22	0.315	0.018	0.008	0.004
λ1	–	1.181	–	0.104
λ2	–	0.869	–	0.097

Table 4

Results of fitting the seemingly unrelated regression (SUR) model in (4.4) under the skew-normal and normal distributional assumptions for the Iranian rural household income and expenditure data of the year 2009.

Based on the results in the first panel of Table 4, the use of facilities (including the Internet, gas, and mobile phones) has a direct effect on household expenditure in Iran; in other words, using these facilities can increase household expenditure. It is also seen that family size, the number of literate members, the number of employees, the number of people with income in the household, and age are directly linked to household expenditure. Moreover, regarding the second panel of Table 4, agriculture self-employment income, non-agriculture self-employment income, miscellaneous income, and non-monetary other incomes have a direct effect on household income.

5. CONCLUSION

Data with an asymmetric histogram are encountered in many examples. Considering a skew-normal distribution is usually one way to construct a model for such data. The problem becomes harder if SEMs have to be taken into account. Confining ourselves to the SUR model, which is a particular case of an SEM, we discussed a method for estimating the parameters of this model when the response variables follow the skew-normal distribution. The performance of the proposed method was compared with an alternative in which the normal density is incorrectly assumed for the error. We then applied the methods discussed in this paper to real data. The results showed the superiority of our approach over methods relying on the normal distribution for the error. There is still room to extend the model in this paper. One possible option is to investigate the performance of the Bayesian approach for the SUR model with a skew-normal assumption for the error term. Moreover, it is worth studying how other skew distributions, such as the skew-t density, perform in SUR models.

CONFLICTS OF INTEREST

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding Statement

This work received support from the Center of Excellence in Analysis of Spatio Temporal Correlated Data at Tarbiat Modares University.

APPENDIX

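Theorem: For any fixed $(p\times p)$ matrix $A>0$, the function

$$
f(\Sigma)=|\Sigma|^{-n/2}\exp\left\{-\tfrac{1}{2}\operatorname{tr}\Sigma^{-1}A\right\}
\tag{5.1}
$$

is maximized over $\Sigma>0$ by $\Sigma=n^{-1}A$, and so $f(n^{-1}A)=|n^{-1}A|^{-n/2}e^{-np/2}$.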
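A brief sketch of why this holds may be helpful. It is given here only as an informal outline, treating the entries of $T=\Sigma^{-1}$ as free variables and using the concavity of $\log|T|$ in $T$; it is not the proof given in the cited reference.

```latex
\begin{aligned}
\log f(\Sigma) &= -\tfrac{n}{2}\log|\Sigma| - \tfrac12\operatorname{tr}(\Sigma^{-1}A)
  = \tfrac{n}{2}\log|T| - \tfrac12\operatorname{tr}(TA), \qquad T=\Sigma^{-1},\\
\frac{\partial \log f}{\partial T} &= \tfrac{n}{2}\,T^{-1} - \tfrac12 A = 0
  \;\Longrightarrow\; \hat\Sigma = T^{-1} = n^{-1}A,\\
f(n^{-1}A) &= |n^{-1}A|^{-n/2}\exp\!\left\{-\tfrac12\operatorname{tr}(nA^{-1}A)\right\}
  = |n^{-1}A|^{-n/2}e^{-np/2}.
\end{aligned}
```

Since $\tfrac{n}{2}\log|T|-\tfrac12\operatorname{tr}(TA)$ is concave in $T$, the stationary point is the maximizer, and $\operatorname{tr}(nI_p)=np$ gives the stated value.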

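In Equation (3.8), $\partial V/\partial\beta_j$ is determined as follows.

Suppose $A_i=y_{ti}-X_{ti}\beta_i^*$ corresponds to the $t$-th observation from the $i$-th equation, where $\beta_i^*$ is a $k_i$-vector and $X_{ti}^T$ is a $k_i$-vector. Also consider

$$
A_t^*=\begin{pmatrix}
A_1 & A_2 & \cdots & A_g\\
0 & A_2 & \cdots & A_g\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & A_g
\end{pmatrix}_{g\times g},\qquad
\beta_\bullet=\begin{pmatrix}\beta_1^*\\ \vdots\\ \beta_g^*\end{pmatrix}_{k\times 1},\qquad
X_{t\bullet}=\begin{pmatrix}
X_{t1} & 0 & \cdots & 0\\
0 & X_{t2} & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & X_{tg}
\end{pmatrix}_{g\times k},
\tag{5.2}
$$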

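where $k=\sum_{i=1}^{g}k_i$. Here, the main goal is to get the derivative of $V$ with respect to the $j$-th parameter of $\beta_\bullet$, that is, $\beta_j$ (for $j=1,\dots,k$). Therefore, we define a $k$-vector $a$ whose $j$-th element is 1 and whose other elements are all zero. Similarly, we define a $g$-vector $b$ whose $i$-th element is 1 and whose other elements are zero, where $i$ is the equation in which $\beta_j$ appears. Since $X^*_{t(j)}$ appears only in the $i$-th equation, we define:

$$
a=\begin{pmatrix}0\\ \vdots\\ 1\\ \vdots\\ 0\end{pmatrix}_{k\times 1},\qquad
b=\begin{pmatrix}0\\ \vdots\\ 1\\ \vdots\\ 0\end{pmatrix}_{g\times 1}.
\tag{5.3}
$$

Hence, $X^*_{t(j)}=b^TX_{t\bullet}a$, where $X^*_{t(j)}$ is the variable corresponding to $\beta_j$. The last step in determining the derivative of $V$ is to set the matrix $C_{tj}$ as

$$
C_{tj}=\begin{pmatrix}
0 & 0 & \cdots & 0\\
0 & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
-X^*_{t(j)} & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & 0
\end{pmatrix}_{g\times g}A_t^*.
\tag{5.4}
$$

As can be seen, the entry in the first column and $i$-th row of the matrix premultiplying $A_t^*$ in (5.4) is equal to $-X^*_{t(j)}$. Finally, summing over all observations, the corresponding derivative is given by

$$
\frac{\partial V}{\partial\beta_j}=\frac{1}{n}\sum_{t=1}^{n}\left(C_{tj}+C_{tj}^T\right).
\tag{5.5}
$$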
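The construction in (5.2) to (5.5) can be checked numerically against a finite-difference derivative of $V(\beta_\bullet)$ from (3.6). The sketch below builds $C_{tj}$ literally as the product in (5.4). The function names, the zero-based indices `j` and `i`, and the array layout (`X` of shape n×g×k, `y` of shape n×g) are illustrative assumptions.

```python
import numpy as np

def residuals(X, y, beta):
    """Residual vectors y_t - X_t beta for all t; X is (n, g, k), y is (n, g)."""
    return y - np.einsum('tgi,i->tg', X, beta)

def V_of_beta(X, y, beta):
    """V(beta) as in (3.6)."""
    r = residuals(X, y, beta)
    return r.T @ r / X.shape[0]

def dV_dbeta_j(X, y, beta, j, i):
    """Derivative (5.5) for beta_j belonging to equation i (both zero-based)."""
    n, g, k = X.shape
    r = residuals(X, y, beta)
    a = np.zeros(k)
    a[j] = 1.0                                   # selector vectors from (5.3)
    b = np.zeros(g)
    b[i] = 1.0
    total = np.zeros((g, g))
    for t in range(n):
        x_star = b @ X[t] @ a                    # X*_{t(j)} = b^T X_t a
        A_star = np.triu(np.tile(r[t], (g, 1)))  # upper-triangular A*_t of (5.2)
        E = np.zeros((g, g))
        E[i, 0] = -x_star                        # first factor in (5.4)
        C = E @ A_star
        total += C + C.T
    return total / n
```

Because $V$ is quadratic in $\beta_\bullet$, a central finite difference of `V_of_beta` should agree with `dV_dbeta_j` up to rounding error.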

As a general rule, the Hessian matrix is required if one is interested in utilizing the quasi-Newton algorithm. The relevant derivatives to construct such a matrix are as follows:

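$$
\begin{aligned}
\frac{\partial^2 l^*}{\partial\beta_\bullet^T\,\partial\eta}
&=\frac{\partial}{\partial\beta_\bullet^T}\left[\sum_{t=1}^{n}(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\,\zeta_1\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)\right]\\
&=\sum_{t=1}^{n}\left[\frac{\partial}{\partial\beta_\bullet^T}\,y_{t\bullet}\,\zeta_1\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)-\frac{\partial}{\partial\beta_\bullet^T}\,X_{t\bullet}\beta_\bullet\,\zeta_1\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)\right]\\
&=\sum_{t=1}^{n}\left[-X_{t\bullet}^T\eta\,y_{t\bullet}^T\,\zeta_2\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)-X_{t\bullet}^T\zeta_1\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)+X_{t\bullet}^T\eta\,\beta_\bullet^TX_{t\bullet}^T\,\zeta_2\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)\right]\\
&=-\sum_{t=1}^{n}\left[X_{t\bullet}^T\eta\,(y_{t\bullet}-X_{t\bullet}\beta_\bullet)^T\zeta_2\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)+X_{t\bullet}^T\zeta_1\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)\right].
\end{aligned}
\tag{5.6}
$$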

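Notice that we used the property $\left(\frac{\partial^2\ell^*}{\partial\eta^T\partial\beta_\bullet}\right)^T=\frac{\partial^2\ell^*}{\partial\beta_\bullet^T\partial\eta}$. The second derivative of $\ell^*$ with respect to $\eta$ is straightforward. However, the computation of $\frac{\partial^2\ell^*}{\partial\beta_\bullet\partial\beta_\bullet^T}$ is more involved. To obtain this derivative, we applied formula (5.5) to get:

$$
\begin{aligned}
\frac{\partial^2 l^*}{\partial\beta_\bullet\,\partial\beta_\bullet^T}
&=\frac{\partial}{\partial\beta_\bullet}\left\{-\frac{n}{2}\begin{pmatrix}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}\right)\\ \vdots\\ \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}\right)\end{pmatrix}^T-\sum_{t=1}^{n}\eta^TX_{t\bullet}\,\zeta_1[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)]\right\}\\
&=-\frac{n}{2}\frac{\partial}{\partial\beta_\bullet}\begin{pmatrix}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}\right)\\ \vdots\\ \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}\right)\end{pmatrix}^T+\sum_{t=1}^{n}X_{t\bullet}^T\eta\,\eta^TX_{t\bullet}\,\zeta_2\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)\\
&=-\frac{n}{2}\begin{pmatrix}
\frac{\partial}{\partial\beta_1}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}\right) & \frac{\partial}{\partial\beta_2}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}\right) & \cdots & \frac{\partial}{\partial\beta_k}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}\right)\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial}{\partial\beta_1}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}\right) & \frac{\partial}{\partial\beta_2}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}\right) & \cdots & \frac{\partial}{\partial\beta_k}\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}\right)
\end{pmatrix}+\sum_{t=1}^{n}X_{t\bullet}^T\eta\,\eta^TX_{t\bullet}\,\zeta_2\!\left(\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)\right)\\
&=-\frac{n}{2}\begin{pmatrix}
\operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_1^2}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_2\partial\beta_1}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_k\partial\beta_1}\right)\\
\vdots & \vdots & \ddots & \vdots\\
\operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_1\partial\beta_k}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_2\partial\beta_k}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial^2V}{\partial\beta_k^2}\right)
\end{pmatrix}\\
&\quad+\frac{n}{2}\begin{pmatrix}
\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}V^{-1}\frac{\partial V}{\partial\beta_1}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_2}V^{-1}\frac{\partial V}{\partial\beta_1}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}V^{-1}\frac{\partial V}{\partial\beta_1}\right)\\
\vdots & \vdots & \ddots & \vdots\\
\operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_1}V^{-1}\frac{\partial V}{\partial\beta_k}\right) & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_2}V^{-1}\frac{\partial V}{\partial\beta_k}\right) & \cdots & \operatorname{tr}\!\left(V^{-1}\frac{\partial V}{\partial\beta_k}V^{-1}\frac{\partial V}{\partial\beta_k}\right)
\end{pmatrix}\\
&\quad+\sum_{t=1}^{n}X_{t\bullet}^T\eta\,\eta^TX_{t\bullet}\,\zeta_2[\eta^T(y_{t\bullet}-X_{t\bullet}\beta_\bullet)].
\end{aligned}
\tag{5.7}
$$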

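In obtaining (5.7), we employed the following equality, in which $F$ is a non-singular matrix:

$$
\frac{\partial^2\log|F|}{\partial x_i\,\partial x_j}
=\frac{\partial\operatorname{tr}\!\left(F^{-1}\frac{\partial F}{\partial x_j}\right)}{\partial x_i}
=\operatorname{tr}\!\left(F^{-1}\frac{\partial^2F}{\partial x_i\,\partial x_j}\right)-\operatorname{tr}\!\left(F^{-1}\frac{\partial F}{\partial x_i}F^{-1}\frac{\partial F}{\partial x_j}\right).
\tag{5.8}
$$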

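The components of the second matrix in the last expression of (5.7) are determined using (5.5). Assuming $\beta_j$ is a member of the $i$-th equation in the SUR model, we have:

$$
B=C_{tj}+C_{tj}^T=\begin{pmatrix}
0 & \cdots & -X^*_{t(j)}A_1 & \cdots & 0\\
0 & \cdots & -X^*_{t(j)}A_2 & \cdots & 0\\
\vdots & \ddots & \vdots & \ddots & \vdots\\
-X^*_{t(j)}A_1 & \cdots & -2X^*_{t(j)}A_i & \cdots & -X^*_{t(j)}A_g\\
\vdots & \ddots & \vdots & \ddots & \vdots\\
0 & \cdots & -X^*_{t(j)}A_g & \cdots & 0
\end{pmatrix},
\tag{5.9}
$$

where all entries are zero except those in the $i$-th row and column. For the main diagonal of the required matrix, $\beta_j$ is a member of the $i$-th equation, and so

$$
\frac{\partial^2V}{\partial\beta_j^2}=\begin{pmatrix}
0 & \cdots & 0 & \cdots & 0\\
\vdots & \ddots & \vdots & \ddots & \vdots\\
0 & \cdots & 2X^*_{t(j)} & \cdots & 0\\
\vdots & \ddots & \vdots & \ddots & \vdots\\
0 & \cdots & 0 & \cdots & 0
\end{pmatrix},\qquad j=1,\dots,k.
\tag{5.10}
$$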

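If both $\beta_j$ and $\beta_l$ are members of the $i$-th equation in a SUR model, then we have:

$$
\frac{\partial^2V}{\partial\beta_j\,\partial\beta_l}=\begin{pmatrix}
0 & \cdots & 0 & \cdots & 0\\
\vdots & \ddots & \vdots & \ddots & \vdots\\
0 & \cdots & 2X^*_{t(j)}X^*_{t(l)} & \cdots & 0\\
\vdots & \ddots & \vdots & \ddots & \vdots\\
0 & \cdots & 0 & \cdots & 0
\end{pmatrix},\qquad j,l=1,\dots,k,
\tag{5.11}
$$

where all entries are zero except the element in the $(i,i)$ position. If $\beta_j$ is a member of the $i$-th equation and $\beta_l$ is a member of the $m$-th equation, where $i\neq m$, then we have:

$$
\frac{\partial^2V}{\partial\beta_j\,\partial\beta_l}=\begin{pmatrix}
0 & \cdots & \cdots & 0 & \cdots & 0\\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots\\
0 & \cdots & \cdots & X^*_{t(j)}X^*_{t(l)} & \cdots & 0\\
0 & X^*_{t(j)}X^*_{t(l)} & \cdots & \cdots & \cdots & 0\\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots\\
0 & \cdots & \cdots & 0 & \cdots & 0
\end{pmatrix},\qquad j,l=1,\dots,k,
\tag{5.12}
$$

where all entries are zero except the $(i,m)$-th and $(m,i)$-th components.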

Footnotes

¹ Two-Stage Least Squares

² Three-Stage Least Squares

³ Generalized Method of Moments

⁴ Limited Information Maximum Likelihood

⁵ Full Information Maximum Likelihood

REFERENCES

1.T. Haavelmo, Econometrica, Vol. 11, 1943, pp. 1-12.

2.A. Zellner, J. Am. Stat. Assoc., Vol. 58, 1963, pp. 977-992.

3.A. Zellner, An Introduction to Bayesian Inference in Econometrics, John Wiley and Sons, New York, NY, USA, 1971.

4.D.A.S. Fraser, M. Rekkas, and A. Wong, J. Econom., Vol. 127, 2005, pp. 17-33.

5.A. Zellner and T. Ando, Bayesian Anal., Vol. 5, 2010, pp. 65-96.

6.D.F. Percy, J. R. Stat. Soc. Ser. B, Vol. 54, 1992, pp. 243-252.

7.S. Chib and E. Greenberg, J. Econom., Vol. 68, 1995, pp. 339-360.

8.M. Smith and R. Kohn, J. Econom., Vol. 98, 2000, pp. 257-282.

9.A. Zellner, T. Ando, N. Basturk, L. Hoogerheide, and H. Van Dijk, Econom. Rev., Vol. 33, 2014, pp. 3-35.

10.D. Warton and F. Hui, Ecol. Soc. Am., Vol. 92, 2011, pp. 3-10.

11.A. Azzalini, Statistica, Vol. 46, 1986, pp. 199-208.

12.A. Azzalini and A. Dalla Valle, Biometrika, Vol. 83, 1996, pp. 715-726.

13.A. Azzalini and A. Capitanio, J. R. Stat. Soc. B, Vol. 61, 1999, pp. 579-602.

14.N. Balakrishnan, Test, Vol. 11, 2002, pp. 37-39.

15.M.G. Genton, Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality, Chapman and Hall, Boca Raton, FL, USA, 2004.

16.A.K. Gupta, G. González-Farías, and J.A.A. Domínguez-Molina, J. Multivar. Anal., Vol. 89, 2004, pp. 181-190.

17.R.B. Arellano-Valle, M.A. Cortes, and H.W. Gomez, Commun. Stat. Theory Methods, Vol. 39, 2010, pp. 912-922.

18.A. Azzalini and G. Regoli, Ann. Inst. Stat. Math., Vol. 64, 2012, pp. 857-879.

19.T.W. Anderson and H. Rubin, Ann. Math. Stat., Vol. 20, 1949, pp. 46-63.

20.H. Theil, Bull. Int. Stat. Inst., Vol. 34, 1954, pp. 122-129.

21.R. Davidson and J.G. MacKinnon, Estimation and Inference in Econometrics, Oxford University Press, New York, NY, USA, 1993.

22.K.V. Mardia, J.T. Kent, and J.M. Bibby, Multivariate Analysis, Academic Press, New York, NY, USA, 1979.

23.A. Raue, C. Kreutz, T. Maiwald, J. Bachmann, M. Schilling, U. Klingmuller, and J. Timmer, Bioinformatics, Vol. 25, 2009, pp. 1923-1929.

24.C. Robert and G. Casella, Monte Carlo Statistical Methods, Springer-Verlag, New York, NY, USA, 2004.

25.G. Casella and R. Berger, Statistical Inference, second ed., Duxbury Press, Pacific Grove, CA, USA, 2002.

Journal: Journal of Statistical Theory and Applications
Volume-Issue: 20 - 1
Pages: 97 - 110
Publication Date: 2021/02/08
ISSN (Online): 2214-1766
ISSN (Print): 1538-7887