Journal of Statistical Theory and Applications

Volume 17, Issue 4, December 2018, Pages 636 - 646

Efficient Estimator of Parameters of a Multivariate Geometric Distribution

Authors
U. J. Dixit1, *, S. Annapurna2
1Department of Statistics, University of Mumbai, Mumbai, India
2Department of Statistics, St. Xaviers College (Autonomous), Mumbai, India
*Corresponding author. Email: ullllhasdixit@yahoo.com.in

Received 10 March 2017, Accepted 27 December 2017, Available Online 31 December 2018.
DOI
10.2991/jsta.2018.17.4.6
Keywords
Multivariate geometric distribution; Maximum likelihood estimator; Uniformly minimum variance unbiased estimator; modified MLE
Abstract

The maximum likelihood estimator (MLE) and the uniformly minimum variance unbiased estimator (UMVUE) of the parameters of a multivariate geometric distribution (MGD) are derived. A modification of the MLE (the modified MLE) is derived, for which the bias is reduced; the mean square error (MSE) of the modified MLE is less than the MSE of the MLE. Variances of the estimators and the corresponding generalized variance (GV) are obtained, and the MLE and modified MLE are shown to be consistent. A comparison of the GVs shows that the modified MLE is more efficient than the UMVUE. In the final section an application to actual data is discussed.

Copyright
© 2018 The Authors. Published by Atlantis Press SARL.
Open Access
This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).

1. INTRODUCTION

It is appropriate and convenient to measure the lifetimes of devices such as on/off switches, bulbs, and airplane engines on a discrete scale. Discrete random variables also help in studying lifetimes such as the incubation period of diseases like AIDS, the remission time of cancers, and the time to failure of engineering systems (see [1]). Discrete multivariate distributions are therefore useful for lifetime data, and the multivariate geometric distribution (MGD) is vital in reliability analysis. Various models of the bivariate geometric distribution (BGD) have been proposed to study lifetimes of devices. Downton [2] described a model for a BGD which arises from a shock model with two components: suppose the shocks suffered by the components form a population in which proportions p1 and p2 of the shocks affect the first and second component, respectively, without causing failure, while a proportion 1 - p1 - p2 of the shocks lead to failure of both components. Let X be the number of shocks to component 1 prior to the first failure and Y the number of shocks to component 2 prior to the first failure. The joint probability generating function of (X, Y) is given by

$$\pi_{X,Y}(t_1,t_2)=(1-p_1-p_2)\left(1-p_1t_1-p_2t_2\right)^{-1}$$

The corresponding joint probability mass function of (X,Y) is

$$P(X=x,Y=y)=\binom{x+y}{x}p_1^{x}p_2^{y}p_3,\qquad x=0,1,2,\ldots;\; y=0,1,2,\ldots;\; 0<p_1<1;\; 0<p_2<1;\; p_3=1-p_1-p_2,$$
and $P(X=x,Y=y)=0$ otherwise.

Hare Krishna and Pundir [3] have obtained the MLE and Bayes estimators of the parameters of this BGD. Dixit and Annapurna [4] further obtained the UMVUEs and compared the MLE and UMVUE on the basis of their mean square errors (MSEs). Phatak and Sreehari [5] introduced a version of the bivariate geometric distribution as a stochastic model for the distribution of good and marginally good items produced by a production unit. Marshall and Olkin [6] constructed a BGD based on a sequence of Bernoulli trials in which X was defined as the number of trials required for the rth occurrence of an event A and Y as the number of trials required for the sth occurrence of an event B. In such Bernoulli trials each trial has 4 possible outcomes (0,0), (0,1), (1,0) and (1,1).

For a single trial we define $p_{ij}$ as the probability of the outcome $(i,j)$, $i,j=0,1$.

Let $p_{00}+p_{01}=p_{0+}$, $p_{10}+p_{11}=p_{1+}$, $p_{00}+p_{10}=p_{+0}$, $p_{01}+p_{11}=p_{+1}$. Marginally X and Y have negative binomial distributions: X follows NB$(r,p_{1+})$ and Y follows NB$(s,p_{+1})$.

The marginal probability functions are of the usual form but the joint probability function of (X,Y) is quite complicated. Reference may be made to Marshall and Olkin ([6], Eq. 7.2).

Along the same lines, Gultekin and Bairamov [7] constructed a trivariate geometric distribution and the corresponding multivariate extension. Srivastava and Bagchi [8] introduced a multivariate version of the geometric distribution and obtained certain characterizations. Vasudeva and Srilakshminarayana [9] established some properties of the MGD and also obtained a characterization assuming it to follow the power series distribution. Sreehari and Vasudeva [10] have given a characterization of the MGD based on conditional distributions. Esary [11] studied properties of MGDs generated by a cumulative damage process. In this paper we look at another form of the MGD and estimate its parameters by a new approach that reduces the bias.

Consider a system which comprises k components C1, C2, …, Ck. The system is so designed that at any given time at most one component functions, and the system functions as long as some component functions. Initially the system functions because C1 functions. When C1 fails, C2 starts functioning from the next trial, keeping the system functioning, and the system continues to function in this manner until Ck fails. Let pi be the probability that component Ci fails at any given trial, and let Xi denote the trial at which component Ci fails, i = 1, 2, …, k.

The joint probability mass function of (X1,X2, …, Xk) is given as

$$P(X_1=x_1,X_2=x_2,\ldots,X_k=x_k)=(1-p_1)^{x_1-1}p_1(1-p_2)^{x_2-x_1-1}p_2\cdots(1-p_k)^{x_k-x_{k-1}-1}p_k\tag{1}$$
for $1\le x_1<x_2<\cdots<x_k$ and $0<p_i<1$, $i=1,\ldots,k$; the probability is 0 otherwise.

The probability generating function (pgf) is given as

$$P_{X_1X_2\ldots X_k}(t_1,t_2,\ldots,t_k)=\frac{p_1p_2\cdots p_k\,t_1t_2^2\cdots t_k^k}{\left[1-(1-p_1)t_1t_2\cdots t_k\right]\left[1-(1-p_2)t_2t_3\cdots t_k\right]\cdots\left[1-(1-p_k)t_k\right]}\tag{2}$$
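Since each increment $X_i-X_{i-1}$ is an independent geometric variable on $\{1,2,\ldots\}$, samples from Eq. (1) are easy to draw. A minimal Python sketch (the function name `rmgd` and the parameter values are ours, chosen only for illustration):

```python
import random

def rmgd(p, rng=random):
    """Draw one observation (x_1, ..., x_k) from the MGD in Eq. (1):
    x_i - x_{i-1} is an independent geometric(p_i) variable on {1, 2, ...}."""
    x, total = [], 0
    for pi in p:
        g = 1                          # geometric waiting time, at least 1 trial
        while rng.random() >= pi:
            g += 1
        total += g
        x.append(total)
    return tuple(x)

random.seed(1)
sample = [rmgd((0.3, 0.5, 0.4)) for _ in range(20000)]
# sanity check: E[X_i] = sum_{j<=i} 1/p_j, here about 3.33, 5.33, 7.83
means = [sum(obs[i] for obs in sample) / len(sample) for i in range(3)]
print(means)
```

Every draw satisfies $x_1 < x_2 < x_3$ by construction, and the sample means agree with the theoretical expectations.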

In this paper we obtain the UMVUE and MLE of the parameters in Eq. (1) and of their functions.

2. UNIFORMLY MINIMUM VARIANCE UNBIASED ESTIMATOR (UMVUE)

Here we obtain the UMVUE of the parameters as well as of functions of the parameters. We consider the trivariate case, which has three parameters p1, p2 and p3, where pi denotes the probability that the ith component in the system fails.

Consider the case where k = 3. Eqs. (1) and (2) become

$$P(X_1=x_1,X_2=x_2,X_3=x_3)=(1-p_1)^{x_1-1}p_1(1-p_2)^{x_2-x_1-1}p_2(1-p_3)^{x_3-x_2-1}p_3$$
for $1\le x_1<x_2<x_3$ and $0<p_i<1$, $i=1,2,3$; the probability is 0 otherwise, and
$$P_{X_1X_2X_3}(t_1,t_2,t_3)=\frac{p_1p_2p_3\,t_1t_2^2t_3^3}{\left[1-(1-p_1)t_1t_2t_3\right]\left[1-(1-p_2)t_2t_3\right]\left[1-(1-p_3)t_3\right]}$$

The pgf of $S_1=\sum_{i=1}^{n}X_{1i}$, $S_2=\sum_{i=1}^{n}X_{2i}$ and $S_3=\sum_{i=1}^{n}X_{3i}$ is

$$P_{S_1S_2S_3}(t_1,t_2,t_3)=\left\{\frac{p_1p_2p_3\,t_1t_2^2t_3^3}{\left[1-(1-p_1)t_1t_2t_3\right]\left[1-(1-p_2)t_2t_3\right]\left[1-(1-p_3)t_3\right]}\right\}^{n}\tag{7}$$

Hence the pmf of $(S_1,S_2,S_3)$ is the coefficient of $t_1^{s_1}t_2^{s_2}t_3^{s_3}$ in Eq. (7) and is given as

$$P(s_1,s_2,s_3)=\binom{s_3-s_2-1}{s_3-s_2-n}\binom{s_2-s_1-1}{s_2-s_1-n}\binom{s_1-1}{s_1-n}p_1^{n}p_2^{n}p_3^{n}(1-p_1)^{s_1-n}(1-p_2)^{s_2-s_1-n}(1-p_3)^{s_3-s_2-n}$$
for $n\le s_1<s_2<s_3<\infty$ and $0<p_i<1$, $i=1,2,3$.

Theorem 2.1.

The UMVUE of $p_1^{a_1}p_2^{a_2}p_3^{a_3}(1-p_1)^{b_1}(1-p_2)^{b_2}(1-p_3)^{b_3}$ is

$$\frac{\binom{s_1-b_1-a_1-1}{s_1-b_1-n}\binom{s_2-s_1-b_2-a_2-1}{s_2-s_1-b_2-n}\binom{s_3-s_2-b_3-a_3-1}{s_3-s_2-b_3-n}}{\binom{s_1-1}{s_1-n}\binom{s_2-s_1-1}{s_2-s_1-n}\binom{s_3-s_2-1}{s_3-s_2-n}}$$

Proof.

The joint distribution of $(X_{11},X_{21},X_{31}),(X_{12},X_{22},X_{32}),\ldots,(X_{1n},X_{2n},X_{3n})$ belongs to the exponential family, and $(S_1,S_2,S_3)$ is sufficient and complete for Eq. (1). Hence, by using the Rao-Blackwell theorem, we can obtain the UMVUE of $p_1^{a_1}p_2^{a_2}p_3^{a_3}(1-p_1)^{b_1}(1-p_2)^{b_2}(1-p_3)^{b_3}$.

Let $\phi(s_1,s_2,s_3)$ be the UMVUE of $p_1^{a_1}p_2^{a_2}p_3^{a_3}(1-p_1)^{b_1}(1-p_2)^{b_2}(1-p_3)^{b_3}$. Then

$$E\left(\phi(S_1,S_2,S_3)\right)=\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\phi(s_1,s_2,s_3)\binom{s_3-s_2-1}{s_3-s_2-n}\binom{s_2-s_1-1}{s_2-s_1-n}\binom{s_1-1}{s_1-n}\,p_1^{n}p_2^{n}p_3^{n}(1-p_1)^{s_1-n}(1-p_2)^{s_2-s_1-n}(1-p_3)^{s_3-s_2-n}=p_1^{a_1}p_2^{a_2}p_3^{a_3}(1-p_1)^{b_1}(1-p_2)^{b_2}(1-p_3)^{b_3}$$

Hence

$$\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\phi(s_1,s_2,s_3)\binom{s_3-s_2-1}{s_3-s_2-n}\binom{s_2-s_1-1}{s_2-s_1-n}\binom{s_1-1}{s_1-n}\,p_1^{n-a_1}p_2^{n-a_2}p_3^{n-a_3}(1-p_1)^{s_1-n-b_1}(1-p_2)^{s_2-s_1-n-b_2}(1-p_3)^{s_3-s_2-n-b_3}=1$$

Therefore

$$\phi(s_1,s_2,s_3)=\frac{\binom{s_1-b_1-a_1-1}{s_1-b_1-n}\binom{s_2-s_1-b_2-a_2-1}{s_2-s_1-b_2-n}\binom{s_3-s_2-b_3-a_3-1}{s_3-s_2-b_3-n}}{\binom{s_1-1}{s_1-n}\binom{s_2-s_1-1}{s_2-s_1-n}\binom{s_3-s_2-1}{s_3-s_2-n}}$$

Particular Cases

  1. $a_1=a_2=a_3=1$ and $b_1=b_2=b_3=0$:

    $$\widehat{p_1p_2p_3}=\frac{\binom{s_1-2}{s_1-n}\binom{s_2-s_1-2}{s_2-s_1-n}\binom{s_3-s_2-2}{s_3-s_2-n}}{\binom{s_1-1}{s_1-n}\binom{s_2-s_1-1}{s_2-s_1-n}\binom{s_3-s_2-1}{s_3-s_2-n}}=\frac{(n-1)^3}{(s_1-1)(s_2-s_1-1)(s_3-s_2-1)}$$

  2. $a_1=1$ and $a_2=a_3=b_1=b_2=b_3=0$:

    $$\hat p_1=\frac{\binom{s_1-2}{s_1-n}}{\binom{s_1-1}{s_1-n}}=\frac{n-1}{s_1-1}$$

  3. $a_2=1$ and $a_1=a_3=b_1=b_2=b_3=0$:

    $$\hat p_2=\frac{\binom{s_2-s_1-2}{s_2-s_1-n}}{\binom{s_2-s_1-1}{s_2-s_1-n}}=\frac{n-1}{s_2-s_1-1}$$

  4. $a_3=1$ and $a_1=a_2=b_1=b_2=b_3=0$:

    $$\hat p_3=\frac{\binom{s_3-s_2-2}{s_3-s_2-n}}{\binom{s_3-s_2-1}{s_3-s_2-n}}=\frac{n-1}{s_3-s_2-1}$$

  5. $a_1=b_1=1$ and $a_2=a_3=b_2=b_3=0$:

    $$\widehat{p_1(1-p_1)}=\frac{\binom{s_1-3}{s_1-n-1}}{\binom{s_1-1}{s_1-n}}=\frac{(n-1)(s_1-n)}{(s_1-1)(s_1-2)}$$

    Similarly it is possible to obtain the UMVUE of $p_1^{a_1}p_2^{a_2}p_3^{a_3}(1-p_1)^{b_1}(1-p_2)^{b_2}(1-p_3)^{b_3}$ for other combinations of $a_1,a_2,a_3,b_1,b_2$ and $b_3$.

Theorem 2.2.

The UMVUE in the multivariate case of $\prod_{i=1}^{k}p_i^{a_i}(1-p_i)^{b_i}$ is

$$\prod_{i=1}^{k}\frac{\binom{s_i-s_{i-1}-b_i-a_i-1}{s_i-s_{i-1}-b_i-n}}{\binom{s_i-s_{i-1}-1}{s_i-s_{i-1}-n}},\qquad s_i=\sum_{j=1}^{n}X_{ij},\quad s_0=0.$$

The proof is similar to that of Theorem 2.1.
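Theorem 2.2 is easy to check numerically, since Python's `math.comb` evaluates the binomial coefficients exactly. The sketch below (the helper name `umvue_product` and the totals $s=(50,95,130)$, $n=10$ are our own choices for illustration) confirms that the general product formula reduces to the particular cases listed under Theorem 2.1:

```python
from math import comb

def umvue_product(s, n, a, b):
    """UMVUE of prod_i p_i^{a_i} (1-p_i)^{b_i} from Theorem 2.2, with s_0 = 0."""
    est, prev = 1.0, 0
    for si, ai, bi in zip(s, a, b):
        d = si - prev                     # d_i = s_i - s_{i-1}
        est *= comb(d - bi - ai - 1, d - bi - n) / comb(d - 1, d - n)
        prev = si
    return est

n, s = 10, (50, 95, 130)
# particular case 2: a=(1,0,0), b=(0,0,0) reduces to (n-1)/(s_1-1)
print(umvue_product(s, n, (1, 0, 0), (0, 0, 0)), (n - 1) / (s[0] - 1))
# particular case 5: a=b=(1,0,0) gives (n-1)(s_1-n)/((s_1-1)(s_1-2))
print(umvue_product(s, n, (1, 0, 0), (1, 0, 0)),
      (n - 1) * (s[0] - n) / ((s[0] - 1) * (s[0] - 2)))
```

Each printed pair agrees, as the algebra in Theorem 2.1 predicts.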

3. MAXIMUM LIKELIHOOD ESTIMATOR (MLE)

In the earlier section we obtained an estimator based on the criteria of unbiasedness and minimum variance. We now turn to another very popular principle, the method of maximum likelihood, to obtain estimators of the functions of the parameters. These will be compared with the corresponding UMVUEs in order to study their efficiency. The likelihood function based on n systems put on test under identical conditions is

$$L=\prod_{i=1}^{n}(1-p_1)^{x_{1i}-1}p_1(1-p_2)^{x_{2i}-x_{1i}-1}p_2\cdots(1-p_k)^{x_{ki}-x_{k-1,i}-1}p_k=(1-p_1)^{s_1-n}p_1^{n}(1-p_2)^{s_2-s_1-n}p_2^{n}\cdots(1-p_k)^{s_k-s_{k-1}-n}p_k^{n}$$

Taking logarithms and differentiating with respect to $p_i$, the MLE of $p_i$, $i=1,2,\ldots,k$, is obtained as

$$\hat p_i=\frac{n}{s_i-s_{i-1}},\qquad s_0=0$$

By the invariance property, the MLE of $\prod_{i=1}^{k}p_i^{a_i}(1-p_i)^{b_i}$ is

$$\widehat{\prod_{i=1}^{k}p_i^{a_i}(1-p_i)^{b_i}}=\prod_{i=1}^{k}\left(\frac{n}{s_i-s_{i-1}}\right)^{a_i}\left(\frac{s_i-s_{i-1}-n}{s_i-s_{i-1}}\right)^{b_i}$$
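The invariance-based MLE of the same product can be sketched in the same way (again, the helper name and the totals are illustrative choices of ours, not from the paper):

```python
def mle_product(s, n, a, b):
    """MLE of prod_i p_i^{a_i} (1-p_i)^{b_i} via the invariance property, s_0 = 0."""
    est, prev = 1.0, 0
    for si, ai, bi in zip(s, a, b):
        d = si - prev                      # d_i = s_i - s_{i-1}
        est *= (n / d) ** ai * ((d - n) / d) ** bi
        prev = si
    return est

n, s = 10, (50, 95, 130)
print(mle_product(s, n, (1, 0, 0), (0, 0, 0)))   # n/s_1 = 0.2
print(mle_product(s, n, (1, 1, 1), (0, 0, 0)))   # (10/50)(10/45)(10/35)
```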

4. MODIFIED MAXIMUM LIKELIHOOD ESTIMATOR (MODIFIED MLE)

In the earlier two sections we applied two procedures to obtain estimators, and either could be good. We now try to improve on the MLE by reducing its bias, and thus derive a modified estimator, the modified MLE; we further show that this modified MLE is better than the UMVUE. To derive a modification that reduces the bias and the MSE of the MLE of $p_i$, $i=1,2,\ldots,k$, we apply the two-variable Taylor series expansion to

$$\hat p_i=\phi(s_{i-1},s_i)=\frac{n}{s_i-s_{i-1}},\qquad s_0=0,\quad i=1,2,\ldots,k\tag{21}$$

$$\phi(s_{i-1},s_i)=\phi(\mu_{s_{i-1}},\mu_{s_i})+(s_{i-1}-\mu_{s_{i-1}})\left.\frac{\partial\phi}{\partial s_{i-1}}\right|_{s_j=\mu_{s_j}}+(s_i-\mu_{s_i})\left.\frac{\partial\phi}{\partial s_i}\right|_{s_j=\mu_{s_j}}+\frac{(s_{i-1}-\mu_{s_{i-1}})^2}{2!}\left.\frac{\partial^2\phi}{\partial s_{i-1}^2}\right|_{s_j=\mu_{s_j}}+\frac{(s_i-\mu_{s_i})^2}{2!}\left.\frac{\partial^2\phi}{\partial s_i^2}\right|_{s_j=\mu_{s_j}}+\frac{2(s_{i-1}-\mu_{s_{i-1}})(s_i-\mu_{s_i})}{2!}\left.\frac{\partial^2\phi}{\partial s_{i-1}\,\partial s_i}\right|_{s_j=\mu_{s_j}}+\cdots\tag{22}$$
where $j=1,2,\ldots,k$ and
$$\mu_{s_i}=E(S_i)=\sum_{j=1}^{i}\frac{n}{p_j},\qquad i=1,2,\ldots,k.$$

On substituting Eq. (21) into Eq. (22) we obtain

$$\hat p_i=p_i+\left(s_{i-1}-\sum_{j=1}^{i-1}\frac{n}{p_j}\right)\frac{p_i^2}{n}-\left(s_i-\sum_{j=1}^{i}\frac{n}{p_j}\right)\frac{p_i^2}{n}+\frac{1}{2!}\left(s_{i-1}-\sum_{j=1}^{i-1}\frac{n}{p_j}\right)^2\frac{2p_i^3}{n^2}+\frac{1}{2!}\left(s_i-\sum_{j=1}^{i}\frac{n}{p_j}\right)^2\frac{2p_i^3}{n^2}-\frac{2}{2!}\left(s_i-\sum_{j=1}^{i}\frac{n}{p_j}\right)\left(s_{i-1}-\sum_{j=1}^{i-1}\frac{n}{p_j}\right)\frac{2p_i^3}{n^2}+\cdots\tag{24}$$

On taking expectation of Eq. (24) we obtain

$$E(\hat p_i)=p_i\left(1+\frac{1}{n}\right)-\frac{p_i^2}{n}+O\left(\frac{1}{n^2}\right)=p_i\left(1+\frac{1-p_i}{n}\right)+O\left(\frac{1}{n^2}\right)\tag{25}$$

We observe that $\hat p_i$ is an asymptotically unbiased estimator of $p_i$.

Consider a linear function of the MLE of pi given as

$$\tilde p_i=\alpha\hat p_i+\beta,\qquad\text{where }\alpha\text{ and }\beta\text{ are constants.}$$

Hence
$$E(\tilde p_i)=\alpha E(\hat p_i)+\beta=\alpha p_i\left(1+\frac{1}{n}\right)-\frac{\alpha p_i^2}{n}+\beta+O\left(\frac{1}{n^2}\right)$$

Setting the coefficient of $p_i$, namely $\alpha\left(1+\frac{1}{n}\right)$, equal to 1 and the constant term $\beta$ equal to zero, we obtain approximate equality between $E(\tilde p_i)$ and $p_i$. This gives $\alpha=\frac{n}{n+1}$ and $\beta=0$.

Therefore we obtain a modified MLE of pi

$$\tilde p_i=\frac{n^2}{(n+1)(s_i-s_{i-1})},\qquad i=1,2,\ldots,k,\quad s_0=0.$$

Since

$$\tilde p_i=\frac{n}{n+1}\,\hat p_i,$$
$$E(\tilde p_i-p_i)^2=\left(\frac{n}{n+1}\right)^2\left[E(\hat p_i-p_i)^2+\frac{p_i^2}{n^2}-\frac{2p_i\,E(\hat p_i-p_i)}{n}\right]$$
and, substituting the bias $E(\hat p_i-p_i)=\frac{p_i(1-p_i)}{n}+O\left(\frac{1}{n^2}\right)$ from Eq. (25),
$$E(\tilde p_i-p_i)^2=\left(\frac{n}{n+1}\right)^2\left[E(\hat p_i-p_i)^2-\frac{p_i^2(1-2p_i)}{n^2}\right]+O\left(\frac{1}{n^3}\right)\tag{28}$$

Thus

$$E(\tilde p_i-p_i)^2<E(\hat p_i-p_i)^2\tag{29}$$

Thus the MSE of the modified MLE $\tilde p_i$ is less than the mean square error of the corresponding MLE $\hat p_i$, $i=1,2,\ldots,k$.
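The claimed MSE ordering can be probed by simulation. The Monte Carlo sketch below (the sample size, seed and parameter values are our own choices) evaluates the MLE $n/(s_i-s_{i-1})$ and the modified MLE $n^2/((n+1)(s_i-s_{i-1}))$ on common simulated samples, which makes the comparison quite stable:

```python
import random

def draw_totals(p, n, rng):
    """Totals s_i = sum over n simulated systems of X_i, for the MGD of Eq. (1)."""
    s, total = [], 0
    for pi in p:
        for _ in range(n):
            total += 1                      # each geometric(pi) variable is >= 1
            while rng.random() >= pi:
                total += 1
        s.append(total)
    return s

rng = random.Random(7)
p, n, reps = (0.3, 0.5, 0.4), 10, 20000
mse_mle, mse_mod = [0.0] * 3, [0.0] * 3
for _ in range(reps):
    s, prev = draw_totals(p, n, rng), 0
    for i, si in enumerate(s):
        d, prev = si - prev, si
        mse_mle[i] += (n / d - p[i]) ** 2
        mse_mod[i] += (n * n / ((n + 1) * d) - p[i]) ** 2
mse_mle = [m / reps for m in mse_mle]
mse_mod = [m / reps for m in mse_mod]
print(mse_mle)
print(mse_mod)
```

For each component the simulated MSE of the modified MLE comes out below that of the MLE, in line with the inequality above.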

5. CONSISTENCY OF MODIFIED MLE

If we collect a large number of observations, we obtain more information about the unknown parameter and can construct an estimator $T(X)$ with a small MSE. We call $T(X)$ a consistent estimator if $\lim_{n\to\infty}\mathrm{MSE}(T(X))=0$.

Theorem 5.1.

$\hat p_1,\hat p_2,\ldots,\hat p_k$ are consistent estimators, where $\hat p_i$ is the MLE of $p_i$, $i=1,2,\ldots,k$:

$$\hat p_i=\frac{n}{s_i-s_{i-1}},\qquad s_0=0$$

Proof.

From Eq. (25) it is clear that $E(\hat p_i)$ tends to $p_i$ as $n$ tends to $\infty$, $i=1,2,\ldots,k$. We shall prove by the method of induction that $V(\hat p_i)$ tends to 0 as $n$ tends to $\infty$.

Let k = 1

$$\hat p_1=\frac{n}{s_1}$$

Applying the Taylor series expansion

$$\hat p_1=p_1-\left(s_1-\frac{n}{p_1}\right)\frac{p_1^2}{n}+\frac{1}{2!}\left(s_1-\frac{n}{p_1}\right)^2\frac{2p_1^3}{n^2}+\cdots$$

On taking expectation of $(\hat p_1-p_1)^2$ we obtain

$$E(\hat p_1-p_1)^2=E\left(s_1-\frac{n}{p_1}\right)^2\frac{p_1^4}{n^2}+E\left(s_1-\frac{n}{p_1}\right)^4\frac{p_1^6}{n^4}+O\left(\frac{1}{n^2}\right)$$
$$E(\hat p_1-p_1)^2=\frac{n(1-p_1)}{p_1^2}\cdot\frac{p_1^4}{n^2}+\frac{n(1-p_1)\left[3(1-p_1)(n+2)+p_1^2\right]}{p_1^4}\cdot\frac{p_1^6}{n^4}+O\left(\frac{1}{n^2}\right)$$

Hence $V(\hat p_1)\le E(\hat p_1-p_1)^2$ tends to 0 as $n$ tends to $\infty$. Thus $\hat p_1$ is a consistent estimator of $p_1$.
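This can be illustrated numerically: a Monte Carlo estimate of $E(\hat p_1-p_1)^2$ shrinks roughly like the leading term $p_1^2(1-p_1)/n$ as $n$ grows. A rough sketch (sample sizes, replication count and seed are our own choices):

```python
import random

def mse_p1_hat(p1, n, reps, rng):
    """Monte Carlo MSE of the MLE p1_hat = n/s1 for a single component."""
    acc = 0.0
    for _ in range(reps):
        s1 = 0
        for _ in range(n):            # s1 = sum of n geometric(p1) on {1,2,...}
            s1 += 1
            while rng.random() >= p1:
                s1 += 1
        acc += (n / s1 - p1) ** 2
    return acc / reps

rng = random.Random(3)
p1 = 0.4
mse = {n: mse_p1_hat(p1, n, 5000, rng) for n in (10, 40, 160)}
for n, m in mse.items():
    # leading term of the expansion above: p1^2 (1 - p1) / n
    print(n, m, p1 ** 2 * (1 - p1) / n)
```

The simulated MSE decreases with n and approaches the leading term, consistent with $V(\hat p_1)\to0$.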

Consider k = 2

Here

$$\hat p_2=\frac{n}{s_2-s_1}$$

Applying the two-variable Taylor series expansion we obtain

$$E(\hat p_2-p_2)^2=E\left(s_1-\frac{n}{p_1}\right)^2\frac{p_2^4}{n^2}+E\left(s_2-\frac{n}{p_1}-\frac{n}{p_2}\right)^2\frac{p_2^4}{n^2}-2E\left[\left(s_1-\frac{n}{p_1}\right)\left(s_2-\frac{n}{p_1}-\frac{n}{p_2}\right)\right]\frac{p_2^4}{n^2}+O\left(\frac{1}{n^2}\right)$$
Using $V(S_2)=\frac{n(1-p_1)}{p_1^2}+\frac{n(1-p_2)}{p_2^2}$ and $\mathrm{Cov}(S_1,S_2)=V(S_1)=\frac{n(1-p_1)}{p_1^2}$,
$$E(\hat p_2-p_2)^2=\left[\frac{n(1-p_1)}{p_1^2}+\frac{n(1-p_1)}{p_1^2}+\frac{n(1-p_2)}{p_2^2}-\frac{2n(1-p_1)}{p_1^2}\right]\frac{p_2^4}{n^2}+O\left(\frac{1}{n^2}\right)=\frac{p_2^2(1-p_2)}{n}+O\left(\frac{1}{n^2}\right)$$

Hence $V(\hat p_2)\le E(\hat p_2-p_2)^2$ tends to 0 as $n$ tends to $\infty$. Thus $\hat p_2$ is a consistent estimator of $p_2$.

Assume $\hat p_{i-1}$ is a consistent estimator of $p_{i-1}$. To prove that $\hat p_i$ is a consistent estimator of $p_i$, we need to show that $V(\hat p_i)$ tends to 0 as $n$ tends to $\infty$, $i=2,3,\ldots,k$. Now

$$\hat p_i=\hat p_{i-1}+\frac{n(2s_{i-1}-s_i-s_{i-2})}{(s_i-s_{i-1})(s_{i-1}-s_{i-2})}$$

To obtain $E(\hat p_i-\hat p_{i-1})^2$ we apply the Taylor series expansion to $\left[\frac{n(2s_{i-1}-s_i-s_{i-2})}{(s_i-s_{i-1})(s_{i-1}-s_{i-2})}\right]^2$ and then take the expectation. Hence

$$E\left[\frac{n(2s_{i-1}-s_i-s_{i-2})}{(s_i-s_{i-1})(s_{i-1}-s_{i-2})}\right]^2=(p_i-p_{i-1})^2+O\left(\frac{1}{n}\right)$$

$$E(\hat p_i-\hat p_{i-1})^2=(p_i-p_{i-1})^2+O\left(\frac{1}{n}\right)$$

Thus, as $n$ tends to $\infty$, $E(\hat p_i-\hat p_{i-1})^2$ tends to $(p_i-p_{i-1})^2$.

We now consider

$$V(\hat p_i-\hat p_{i-1})=E(\hat p_i-\hat p_{i-1})^2-\left[E(\hat p_i-\hat p_{i-1})\right]^2=E(\hat p_i-\hat p_{i-1})^2-\left[E(\hat p_i)-E(\hat p_{i-1})\right]^2\tag{39}$$

Thus from Eq. (39), and since each $\hat p_i$ is asymptotically unbiased, so that $E(\hat p_i)-E(\hat p_{i-1})$ tends to $p_i-p_{i-1}$, we can conclude that $V(\hat p_i-\hat p_{i-1})$ tends to 0 as $n$ tends to $\infty$.

But

$$V(\hat p_i-\hat p_{i-1})=V(\hat p_i)+V(\hat p_{i-1})-2\,\mathrm{Cov}(\hat p_{i-1},\hat p_i)$$

The covariance $\mathrm{Cov}(\hat p_{i-1},\hat p_i)$ can also be shown to tend to 0 as $n$ tends to $\infty$ by applying the Taylor series expansion.

Thus, since $V(\hat p_i-\hat p_{i-1})$, $\mathrm{Cov}(\hat p_{i-1},\hat p_i)$ and $V(\hat p_{i-1})$ all tend to 0 as $n$ tends to $\infty$, we conclude that $V(\hat p_i)$ also tends to 0. Hence $\hat p_i$, the MLE of $p_i$, is a consistent estimator of $p_i$, $i=1,2,\ldots,k$.

Note: The MSE of the modified MLE is less than the MSE of the MLE, from Eq. (29). Hence from Eq. (28) we can also conclude that the modified MLE $\tilde p_i$ is a consistent estimator of $p_i$, $i=1,2,\ldots,k$.

6. CONCLUSION AND COMPARISON OF ESTIMATORS

We observed in the earlier section that the modified MLE is an improvement over the MLE. We now have two estimators, the UMVUE and the modified MLE, and our objective is to compare them with respect to efficiency. We make a comparative study of the two based on the determinant of the variance-covariance matrix, also called the generalised variance (GV), in the trivariate case k = 3. The variances and covariances of the UMVUE and the modified MLE of the parameters can be obtained as below.

The variance of the UMVUE of $p_i$, $i=1,2,3$ (with $s_0=0$), is

$$\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\left[\frac{n-1}{s_i-s_{i-1}-1}\right]^2P(s_1,s_2,s_3)-p_i^2$$

The covariance of the UMVUEs of $p_i$ and $p_j$, $i,j=1,2,3$, $i\ne j$ (with $s_0=0$), is

$$\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\frac{n-1}{s_i-s_{i-1}-1}\cdot\frac{n-1}{s_j-s_{j-1}-1}\,P(s_1,s_2,s_3)-p_i\,p_j$$

The variance of the modified MLE of $p_i$, $i=1,2,3$ (with $s_0=0$), when k = 3 is

$$\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\left[\frac{n^2}{(n+1)(s_i-s_{i-1})}\right]^2P(s_1,s_2,s_3)-\left[\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\frac{n^2}{(n+1)(s_i-s_{i-1})}\,P(s_1,s_2,s_3)\right]^2$$

The covariance of the modified MLEs of $p_i$ and $p_j$, $i,j=1,2,3$, $i\ne j$ (with $s_0=0$), is

$$\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\frac{n^2}{(n+1)(s_i-s_{i-1})}\cdot\frac{n^2}{(n+1)(s_j-s_{j-1})}\,P(s_1,s_2,s_3)-\left[\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\frac{n^2}{(n+1)(s_i-s_{i-1})}\,P(s_1,s_2,s_3)\right]\left[\sum_{s_1=n}^{\infty}\sum_{s_2=s_1+n}^{\infty}\sum_{s_3=s_2+n}^{\infty}\frac{n^2}{(n+1)(s_j-s_{j-1})}\,P(s_1,s_2,s_3)\right]$$

The determinants of the variance-covariance matrices of the two estimators are computed and compared in the graphs below for a range of values of the parameters p1, p2 and p3.

It can be observed from the graphs in Figs. 1 and 2 that the generalised variance of the modified MLE is less than the corresponding GV of the UMVUE for values of p1, p2 and p3 ranging from 0.1 to 0.9. Thus we obtain a new and better estimator, the modified MLE, which is consistent, an improvement over the MLE, and also more efficient than the UMVUE.

Figure 1

The generalised variance of the modified MLE

Figure 2

The generalised variance of the UMVUE
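The exact triple sums above must be truncated in practice; a cheaper cross-check is to approximate both generalized variances by Monte Carlo over simulated samples. In the sketch below (all helper names, the seed and the parameter values are our own choices), the determinant for the modified MLE comes out smaller, in line with the comparison above:

```python
import random

def draw_totals(p, n, rng):
    """s_i = total of X_i over n simulated systems from the MGD of Eq. (1)."""
    s, total = [], 0
    for pi in p:
        for _ in range(n):
            total += 1                      # each geometric(pi) variable is >= 1
            while rng.random() >= pi:
                total += 1
        s.append(total)
    return s

def gen_variance(estimates):
    """Determinant of the 3x3 sample covariance matrix, expanded by minors."""
    m = len(estimates)
    mu = [sum(e[i] for e in estimates) / m for i in range(3)]
    c = [[sum((e[i] - mu[i]) * (e[j] - mu[j]) for e in estimates) / m
          for j in range(3)] for i in range(3)]
    return (c[0][0] * (c[1][1] * c[2][2] - c[1][2] * c[2][1])
            - c[0][1] * (c[1][0] * c[2][2] - c[1][2] * c[2][0])
            + c[0][2] * (c[1][0] * c[2][1] - c[1][1] * c[2][0]))

rng = random.Random(11)
p, n, reps = (0.3, 0.5, 0.4), 10, 20000
umvue, modmle = [], []
for _ in range(reps):
    s, prev, u, mdl = draw_totals(p, n, rng), 0, [], []
    for si in s:
        d, prev = si - prev, si
        u.append((n - 1) / (d - 1))         # UMVUE of p_i
        mdl.append(n * n / ((n + 1) * d))   # modified MLE of p_i
    umvue.append(u)
    modmle.append(mdl)
print(gen_variance(modmle), gen_variance(umvue))
```

Because both estimators are evaluated on the same draws, the two determinants are strongly correlated and the ordering is stable across seeds.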

7. AN EXAMPLE FOR k = 3

A game of cricket has been considered. When a batsman is out, he is replaced by another batsman on the next ball and the game continues; when the replacement is declared out, another batsman is sent in, and so on. Consider the 2016 season of cricket's Indian Premier League, 'IPL 2016'. A total of 17 matches were played by the winning team, Sunrisers Hyderabad, of which 15 were suitable for our study. We have recorded the following details.

Let

Xi denote the ball at which the first batsman is out in the ith match,

Yi denote the ball at which the batsman who replaces the first batsman is out in the ith match, and

Zi denote the ball at which the batsman who replaces the second batsman is out in the ith match, i = 1, 2, …, 15.

Match No i Xi Yi Zi
1 25 29 35
2 7 16 27
3 4 26 38
4 31 32 52
5 4 18 20
6 11 49 52
7 17 26 42
8 33 38 61
9 14 51 56
10 30 54 56
11 8 9 20
12 16 41 50
13 10 31 61
14 4 10 23
15 25 30 53
Table 1

IPL 2016: Team Sunrisers Hyderabad

Hence $p_1=P(\text{first player is out})$, $p_2=P(\text{second player is out})$ and $p_3=P(\text{third player is out})$.

Thus for n = 15 we obtain s1 = 239, s2 = 460 and s3 = 646. The UMVUE, MLE and modified MLE of the parameters are shown in Table 2.

p1 p2 p3
UMVUE 0.05882 0.063636 0.075676
MLE 0.06276 0.06787 0.0806
Modified MLE 0.05884 0.063631 0.075605
Table 2

Values of MLE, UMVUE and modified MLE for parameters

From the IPL 2016 example, it is observed that the UMVUE, MLE and modified MLE estimates of p1, p2 and p3 are very close to each other.
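The entries of Table 2 can be reproduced directly from Table 1; a short script (our own, with the data keyed in from Table 1):

```python
# Table 1 data: (X_i, Y_i, Z_i) for the 15 matches
data = [(25, 29, 35), (7, 16, 27), (4, 26, 38), (31, 32, 52), (4, 18, 20),
        (11, 49, 52), (17, 26, 42), (33, 38, 61), (14, 51, 56), (30, 54, 56),
        (8, 9, 20), (16, 41, 50), (10, 31, 61), (4, 10, 23), (25, 30, 53)]
n = len(data)                                          # 15 matches
s = [sum(row[i] for row in data) for i in range(3)]    # totals s1, s2, s3
print(s)  # [239, 460, 646]
prev = 0
for si in s:
    d = si - prev
    prev = si
    print(round((n - 1) / (d - 1), 6),                 # UMVUE
          round(n / d, 5),                             # MLE
          round(n * n / ((n + 1) * d), 6))             # modified MLE
```

The printed values match Table 2 up to the rounding used there.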

ACKNOWLEDGEMENT

We are thankful to the referees and the editor for their valuable comments and constructive suggestions, which have improved this manuscript.

REFERENCES

1. J. Li, Modeling with Bivariate Geometric Distribution, Ph.D. dissertation, Department of Mathematical Sciences, New Jersey Institute of Technology (NJIT), and Department of Mathematics and Computer Science, Rutgers, The State University of New Jersey-Newark, 2010.
2. F. Downton, J. R. Statist. Soc. B, Vol. 32, 1970, pp. 408-417.
3.H. Krishna and P.S. Pundir, Commun. Statist. Theor. Meth., Vol. 38, 2009, pp. 1079-1093.
4.U.J. Dixit and S. Annapurna, J. Stat. Theory Appl., Vol. 14, 2015, pp. 324-349.
5.A.G. Phatak and M. Sreehari, J. Indian Stat. Assoc., Vol. 19, 1981, pp. 141-146.
6.A.W. Marshall and I. Olkin, J. Amer. Statist. Assoc., Vol. 80, 1985, pp. 332-338.
7.O.E. Gultekin and I. Bairamov, EGE Univ. J. Fac. Sci., Vol. 37, No. 1, 2013, pp. 1-18.
8.R.C. Srivastava and K.S.N. Bagchi, J. Indian Stat. Assoc., Vol. 23, 1985, pp. 27-33.
9.R. Vasudeva and G. Srilakshminarayana, Some Properties of Trivariate and Multivariate Geometric Distribution, in Paper presented at the International Conference on Multivariate Statistical Methods in 21st Century (Indian Statistical Institute, Kolkata, India), 2006.
10.M. Sreehari and R. Vasudeva, Metrika, Vol. 75, 2012, pp. 271-286.
11. J.D. Esary, Multivariate Geometric Distributions Generated by a Cumulative Damage Process, Naval Postgraduate School, Monterey, CA, 1973.
12.A.W. Marshall and I. Olkin, J. Am. Stat. Assoc., Vol. 62, 1967, pp. 30-44.
ISSN (Online): 2214-1766
ISSN (Print): 1538-7887
