Journal of Statistical Theory and Applications

Volume 19, Issue 3, September 2020, Pages 383 - 390

Deriving Mixture Distributions Through Moment-Generating Functions

Authors
Subhash Bagui1, *, Jia Liu1, Shen Zhang2
1Department of Mathematics and Statistics, The University of West Florida, Pensacola, FL, USA
2Department of Statistics, The University of Texas at San Antonio, San Antonio, TX, USA
*Corresponding author. Email: sbagui@uwf.edu
Received 10 March 2020, Accepted 14 August 2020, Available Online 2 September 2020.
DOI
10.2991/jsta.d.200826.001
Keywords
Mixture distributions; Moment-generating functions; Characteristic functions; Hierarchical models; Over-dispersed models
Abstract

This article aims to make use of moment-generating functions (mgfs) to derive the density of mixture distributions from hierarchical models. When the mgf of a mixture distribution does not exist, one can extend the approach to characteristic functions to derive the mixture density. This article uses a result given by E.R. Villa and L.A. Escobar, Am. Stat. 60 (2006), 75–80, and complements that article with many new examples.

Copyright
© 2020 The Authors. Published by Atlantis Press B.V.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).

1. INTRODUCTION

A random variable X is said to have a mixture distribution if its distribution depends on a quantity that itself has a distribution. Mixture distributions arise from hierarchical models (see [1], p. 165). A typical example of a hierarchical model is as follows: Consider a large number of eggs laid by an insect. Each egg survives with probability θ. If the eggs survive independently, then we have a sequence of Bernoulli trials on egg survival. Assume that the “large number” of eggs laid is itself a random variable following the Poisson distribution with parameter λ. Hence, if we let X equal the number of survivors and Y equal the number of eggs laid, then

$$X|Y \sim \text{Binomial}(Y, \theta) \quad \text{and} \quad Y \sim \text{Poisson}(\lambda) \tag{1}$$
constitute a hierarchical model, where the notation X|Y = y means that the conditional distribution of X given Y = y is Binomial(y, θ), and the (marginal) distribution of Y is Poisson(λ). It turns out that the (marginal) distribution of X is Poisson(λθ). Thus, the distribution of the number of survivors X, namely Poisson(λθ), is a mixture distribution, as it results from combining the distribution of X|Y with the distribution of Y. In general, hierarchical models lead to mixture distributions.
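
As a quick numerical illustration (ours, not part of the original article), the following Python sketch simulates the hierarchical model (1) and checks that the marginal distribution of X matches Poisson(λθ); the parameter values and seed are arbitrary assumptions.

```python
# Minimal simulation sketch of the hierarchical model (1); lambda, theta,
# and the seed are illustrative choices, not values from the article.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lam, theta, n_sim = 6.0, 0.4, 200_000

y = rng.poisson(lam, size=n_sim)   # Y = number of eggs laid ~ Poisson(lambda)
x = rng.binomial(y, theta)         # X | Y ~ Binomial(Y, theta): survivors

support = np.arange(15)
empirical = np.array([(x == k).mean() for k in support])
claimed = stats.poisson.pmf(support, lam * theta)   # Poisson(lambda * theta)
print(np.abs(empirical - claimed).max())            # small, ~ Monte Carlo error
```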

Mixture models play an important role in theory and practice. There are textbooks, monographs, and journal articles discussing the history, theory, applications, and usefulness of mixture models. Mixture models became popular because, among other reasons, they (a) provide a simple device for incorporating extra variation and correlation into a model, (b) add modeling flexibility, and (c) allow modeling of data that arise in multiple stages. Several authors, namely Everitt and Hand [2], Titterington et al. [3], Böhning [4], and McLachlan and Peel [5], have discussed mixture models and provided statistical methodology and references on finite mixtures.

Lindsay [6] discussed, among other topics, the application of mixture models and their interrelation with other fields. In discussing mixture models, Casella and Berger [1] showed the derivation of mixture models from hierarchical models. Panjer and Willmot [7] considered applications of mixture models in actuarial science. Karlis and Xekalaki [8] derived results on Poisson mixture models with applications in various other fields. Mixture models of continuous and discrete types can be found in Johnson et al. [9,10] and Gelman et al. [11]. To fit plant quadrat data on the blue-green sedge Carex flacca, Skellam [12] used a mixture of binomials with varying sample sizes modeled by a Poisson distribution. The gamma mixture of Poisson r.v.'s yields the negative binomial, and Greenwood and Yule [13] used this mixture distribution to model “accident proneness”; see Bagui and Mehra [14]. Research dating back to Pearson [15] modeled a mixed population of two crab types with a mixture of two normals. The negative binomial also arises as the distribution of the sum of N independent random variables, each having the same logarithmic distribution, with N having a Poisson distribution; this mixture distribution was used in modeling biological spatial data; see Gurland [16] and Bagui and Mehra [14].

2. MIXTURE MODEL

Consider a two-stage mixture model of type (1), where $X|Y \sim f_{X|Y}(x|y)$ and $Y \sim f_Y(y)$. It is customary to derive the mixture distribution $f_X(x)$ of the mixture X from the joint density of X and Y, $f_{X,Y}(x,y) = f_Y(y)\, f_{X|Y}(x|y)$, as

$$f_X(x) = \begin{cases} \displaystyle\int f_Y(y)\, f_{X|Y}(x|y)\, dy, & \text{if } Y \text{ is continuous,} \\[4pt] \displaystyle\sum_{y} f_Y(y)\, f_{X|Y}(x|y), & \text{if } Y \text{ is discrete.} \end{cases} \tag{2}$$
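
As a small numerical illustration (ours, not from the article) of the discrete case of Eq. (2), applied to model (1) with arbitrary parameter values: summing $f_Y(y)\, f_{X|Y}(x|y)$ over y reproduces the Poisson(λθ) pmf.

```python
# Exact marginalization via the discrete case of Eq. (2) for model (1):
# f_X(k) = sum_y f_Y(y) f_{X|Y}(k|y) should equal the Poisson(lambda*theta) pmf.
import numpy as np
from scipy import stats

lam, theta = 6.0, 0.4
y = np.arange(200)   # truncation point; the Poisson tail beyond is negligible

for k in range(8):
    f_x = np.sum(stats.poisson.pmf(y, lam) * stats.binom.pmf(k, y, theta))
    print(k, f_x, stats.poisson.pmf(k, lam * theta))   # last two columns agree
```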

Villa and Escobar [17] derived mixture distributions from the moment-generating function (mgf) of X, writing the mgf of X as $M_X(t) = E\left[M_{X|Y}(t)\right]$, where $M_{X|Y}(t)$ is the mgf of X|Y and the expectation is over the distribution of the r.v. Y. Starting from a known distribution of X|Y with known mgf, Villa and Escobar [17] rewrote $M_{X|Y}(t)$ as

$$M_{X|Y}(t) = a_1(t)\, e^{a_2(t) Y}, \tag{3}$$

where $a_1(t)$ and $a_2(t)$ are functions of t that may also depend on the parameters of the distribution of X|Y. They then arrived at the mgf of X as

$$M_X(t) = E\left[M_{X|Y}(t)\right] = a_1(t)\, M_Y(a_2(t)). \tag{4}$$

In the above, we assumed that all mgfs exist. When the mgf of X|Y cannot be matched to a known form such as (3), the mgf of X can be computed directly as

$$M_X(t) = E\left[e^{tX}\right] = E\left[E\left(e^{tX} \mid Y\right)\right]. \tag{5}$$

One can also derive the mgf of X from the joint mgf of X and Y, $M_{X,Y}(t,s)$, which can be written as

$$M_{X,Y}(t,s) = E\left[e^{tX+sY}\right] = E\left[E\left(e^{tX+sY} \mid Y\right)\right] = E\left[e^{sY}\, E\left(e^{tX} \mid Y\right)\right] = E\left[e^{sY} M_{X|Y}(t)\right]. \tag{6}$$

Then, setting s = 0 in Eq. (6) yields the mgf of X, $M_X(t)$.

The main goal of this article is to use the above mgf method to derive mixture distributions that complement the examples of Villa and Escobar [17].

3. EXAMPLES

There are situations where obtaining a mixture distribution by using mgfs is much easier than obtaining it by marginalization of the joint distribution. The examples considered here complement the cases discussed by Villa and Escobar [17].

3.1. The Binomial–Binomial Mixture

The mixture model of the Binomial mixture of Binomial random variables is

$$X|Y \sim \text{BIN}(Y, p_1);\quad Y \sim \text{BIN}(n, p_2) \;\Longrightarrow\; X \sim \text{BIN}(n, p_1 p_2). \tag{7}$$

Proof.

The mgf for X|Y is

$$M_{X|Y}(t) = \left(q_1 + p_1 e^t\right)^{Y} = e^{\ln\left(q_1 + p_1 e^t\right) Y} = a_1(t)\, e^{a_2(t) Y},$$

where $q_1 = 1 - p_1$, $a_1(t) = 1$, and $a_2(t) = \ln\left(q_1 + p_1 e^t\right)$.

Now by Eq. (4),

$$M_X(t) = a_1(t)\, M_Y(a_2(t)) = M_Y\!\left(\ln\left(q_1 + p_1 e^t\right)\right) = \left(q_2 + p_2\, e^{\ln\left(q_1 + p_1 e^t\right)}\right)^{n} = \left(q_2 + p_2\left(q_1 + p_1 e^t\right)\right)^{n} = \left(q + p\, e^t\right)^{n}, \tag{8}$$

which is a binomial mgf, where $p = p_1 p_2$ and $q = q_2 + p_2 q_1 = 1 - p_1 p_2 = 1 - p$.

Thus, it follows from (8) that $X \sim \text{BIN}(n, p_1 p_2)$ (see Appendix, Table A1; Villa and Escobar [17]).
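
The algebra behind Eq. (8) can also be checked symbolically. Below is a minimal sympy sketch (ours, not from the article; the symbol names are our own):

```python
# Symbolic check of Eq. (8): M_Y(a2(t)) reduces to the base of the BIN(n, p1*p2) mgf.
import sympy as sp

t = sp.symbols('t')
p1, p2 = sp.symbols('p1 p2', positive=True)
q1, q2 = 1 - p1, 1 - p2

a2 = sp.log(q1 + p1 * sp.exp(t))   # a2(t) from the conditional mgf; a1(t) = 1
base = q2 + p2 * sp.exp(a2)        # M_X(t) = base**n by Eq. (4)

p = p1 * p2
target = (1 - p) + p * sp.exp(t)   # base of the binomial mgf (q + p e^t)^n
print(sp.simplify(sp.expand(base) - target))   # prints 0
```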

3.2. The Negative-Binomial–Binomial Mixture

The mixture model of the Negative-binomial mixture of Binomial random variables is

$$X|Y \sim \text{BIN}(Y, p_1);\quad Y \sim \text{NEGBIN}(\alpha, p_2) \;\Longrightarrow\; X \sim \text{NEGBIN}(\alpha, p), \tag{9}$$

where $p = p_2/(p_2 + p_1 q_2)$ and $q_2 = 1 - p_2$.

Proof.

The mgf for X|Y is

$$M_{X|Y}(t) = \left(q_1 + p_1 e^t\right)^{Y} = e^{\ln\left(q_1 + p_1 e^t\right) Y} = a_1(t)\, e^{a_2(t) Y},$$

where $q_1 = 1 - p_1$, $a_1(t) = 1$, and $a_2(t) = \ln\left(q_1 + p_1 e^t\right)$.

Now by Eq. (4),

$$M_X(t) = a_1(t)\, M_Y(a_2(t)) = M_Y\!\left(\ln\left(q_1 + p_1 e^t\right)\right) = \left(\frac{p_2}{1 - q_2\, e^{\ln\left(q_1 + p_1 e^t\right)}}\right)^{\!\alpha} = \left(\frac{p_2}{1 - q_2\left(q_1 + p_1 e^t\right)}\right)^{\!\alpha} = \left(\frac{p}{1 - (1-p)\, e^t}\right)^{\!\alpha}, \tag{10}$$

which is a negative-binomial mgf; the last equality uses $1 - q_1 q_2 = p_2 + p_1 q_2$ and $p = p_2/(p_2 + p_1 q_2)$. Thus, it follows from (10) that $X \sim \text{NEGBIN}(\alpha, p)$ (see Appendix, Table A2; Villa and Escobar [17]).
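
As a sanity check (ours; the parameter values and seed are illustrative assumptions), the following sketch simulates the hierarchy in (9) and compares the empirical pmf of X with NEGBIN(α, p):

```python
# Monte Carlo sketch of (9); alpha, p1, p2, and the seed are illustrative.
# numpy/scipy's negative binomial counts failures before the alpha-th success,
# matching the NEGBIN(alpha, p) convention of Table A2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, p1, p2, n_sim = 3, 0.3, 0.5, 200_000
p = p2 / (p2 + p1 * (1 - p2))       # the mixture parameter from Eq. (10)

y = rng.negative_binomial(alpha, p2, size=n_sim)
x = rng.binomial(y, p1)             # X | Y ~ BIN(Y, p1)

support = np.arange(20)
empirical = np.array([(x == k).mean() for k in support])
print(np.abs(empirical - stats.nbinom.pmf(support, alpha, p)).max())  # ~0
```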

3.3. The Exponential–Exponential Mixture

The mixture model of the exponential mixture of shifted exponential random variables is

$$X|Y \sim \lambda e^{-\lambda(x-Y)},\ x \ge Y;\quad Y \sim (1+\lambda)\, e^{-(1+\lambda)y},\ y \ge 0 \;\Longrightarrow\; X \sim \lambda(1+\lambda)\, e^{-\lambda x}\left(1-e^{-x}\right),\ x \ge 0. \tag{11}$$

Proof.

The mgf for X|Y is given by

$$M_{X|Y}(t) = E_{X|Y}\left[e^{tX}\right] = \lambda \int_{Y}^{\infty} e^{tx}\, e^{-\lambda(x-Y)}\, dx = \lambda\, e^{\lambda Y} \int_{Y}^{\infty} e^{-(\lambda-t)x}\, dx = \frac{\lambda\, e^{tY}}{\lambda - t},\quad t < \lambda, \tag{12}$$

where $a_1(t) = \frac{\lambda}{\lambda - t}$ and $a_2(t) = t$.

Now by Eq. (4),

$$M_X(t) = \frac{\lambda}{\lambda - t}\, M_Y(t) = \frac{\lambda(1+\lambda)}{(\lambda - t)(1 + \lambda - t)}. \tag{13}$$

The right-hand side of (13) is the mgf of the density $f_X(x) = \lambda(1+\lambda)\, e^{-\lambda x}\left(1 - e^{-x}\right)$, x ≥ 0. It follows from (13) that X has this density.
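
A simulation check (ours; λ and the seed are illustrative). Since X|Y is a shifted exponential, X can be drawn as Y plus an independent EXP(λ) variable; the cdf used below comes from integrating the claimed mixture density ourselves.

```python
# Simulation sketch of (11); lambda and the seed are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
lam, n_sim = 2.0, 100_000

y = rng.exponential(scale=1 / (1 + lam), size=n_sim)   # Y with rate 1 + lambda
x = y + rng.exponential(scale=1 / lam, size=n_sim)     # X | Y ~ lambda e^{-lambda(x-Y)}

# cdf obtained by integrating f_X(x) = lambda(1+lambda) e^{-lambda x} (1 - e^{-x})
cdf = lambda s: (1 + lam) * (1 - np.exp(-lam * s)) - lam * (1 - np.exp(-(1 + lam) * s))
print(stats.kstest(x, cdf))   # large p-value: consistent with the mixture density
```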

3.3.1. A specific exponential–exponential mixture

The mixture model of a specific exponential mixture of shifted exponential random variables is

$$X|Y \sim e^{-(x-Y)},\ x \ge Y;\quad Y \sim e^{-y},\ y \ge 0 \;\Longrightarrow\; X \sim \text{Gamma}(2, 1). \tag{14}$$

Proof.

The mgf for X|Y is given by

$$M_{X|Y}(t) = E_{X|Y}\left[e^{tX}\right] = \frac{e^{tY}}{1-t}, \tag{15}$$

where $a_1(t) = \frac{1}{1-t}$ and $a_2(t) = t$. Now by Eq. (4),

$$M_X(t) = \frac{1}{1-t}\, M_Y(t) = \frac{1}{1-t} \times \frac{1}{1-t} = \frac{1}{(1-t)^2}. \tag{16}$$

Eq. (16) is the mgf of the Gamma(2, 1) distribution, confirming that X ~ Gamma(2, 1).
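
A one-line check of this special case (ours; the sample size is arbitrary): here X is distributionally the sum of two independent EXP(1) variables, which a Kolmogorov–Smirnov test compares with Gamma(2, 1).

```python
# Quick check of (14): Y + E with Y, E independent EXP(1) should be Gamma(2, 1).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.exponential(size=100_000) + rng.exponential(size=100_000)
print(stats.kstest(x, stats.gamma(a=2).cdf))   # large p-value expected
```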

3.3.2. The exponential–normal mixture

The mixture model of the exponential mixture of normal random variables is

$$X|Y \sim \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x-Y)^2},\ -\infty < x < \infty;\quad Y \sim e^{-y},\ y \ge 0 \;\Longrightarrow\; X \sim \text{convolution of } Y \text{ and an independent } Z \sim N(0,1). \tag{17}$$

Proof.

The mgf for X|Y is given by

$$M_{X|Y}(t) = E_{X|Y}\left[e^{tX}\right] = e^{Yt + t^2/2}, \tag{18}$$

where $a_1(t) = e^{t^2/2}$ and $a_2(t) = t$. Now by Eq. (4),

$$M_X(t) = e^{t^2/2}\, M_Y(t) = \frac{e^{t^2/2}}{1-t}, \tag{19}$$

which is the mgf of the convolution of Y and an independent Z ~ N(0,1). Thus, X is distributed as Y + Z, the sum of an EXP(1) variable and an independent standard normal variable.
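
A two-sample check (ours; the seed and sample size are arbitrary): draws from the hierarchical model and direct draws of Y + Z should be statistically indistinguishable.

```python
# Two-sample sketch of (17); the seed is an illustrative choice.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_sim = 100_000

y = rng.exponential(size=n_sim)
x_mix = rng.normal(loc=y, scale=1.0)                           # X | Y ~ N(Y, 1)
x_conv = rng.exponential(size=n_sim) + rng.normal(size=n_sim)  # Y + Z directly

print(stats.ks_2samp(x_mix, x_conv))   # large p-value: same distribution
```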

3.4. The Poisson–Chi-square Mixture

The mixture model of the Poisson mixture of Chi-square random variables is

$$X|Y \sim \chi^2_{n+2Y};\quad Y \sim \text{POI}(\lambda) \;\Longrightarrow\; X \sim \chi^2_{n;\,2\lambda}. \tag{20}$$

Proof.

The mgf for X|Y is given by

$$M_{X|Y}(t) = E_{X|Y}\left[e^{tX}\right] = (1-2t)^{-(n+2Y)/2} = (1-2t)^{-n/2}\, e^{-\ln(1-2t)\, Y}, \tag{21}$$

where $a_1(t) = (1-2t)^{-n/2}$ and $a_2(t) = -\ln(1-2t)$. Now by Eq. (4),

$$M_X(t) = (1-2t)^{-n/2}\, M_Y\!\left(-\ln(1-2t)\right) = (1-2t)^{-n/2}\, e^{\lambda\left(e^{-\ln(1-2t)}-1\right)} = (1-2t)^{-n/2}\, e^{2\lambda t/(1-2t)}, \tag{22}$$

which is the mgf of a noncentral chi-square distribution with n degrees of freedom and noncentrality parameter 2λ. It follows from (22) that X has this noncentral chi-square distribution.
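
A simulation check (ours; n, λ, and the seed are illustrative assumptions):

```python
# Simulation sketch of (20); n, lambda, and the seed are illustrative choices.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, lam, n_sim = 4, 1.5, 100_000

y = rng.poisson(lam, size=n_sim)
x = rng.chisquare(df=n + 2 * y)    # X | Y ~ chi-square with n + 2Y df

# Compare with the noncentral chi-square(n; 2*lambda) claimed in Eq. (20)
print(stats.kstest(x, stats.ncx2(df=n, nc=2 * lam).cdf))   # large p-value expected
```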

3.5. The Geometric–Gamma Mixture

The mixture model of the Geometric mixture of Gamma random variables is

$$X|Y \sim \text{Gamma}(Y, \beta);\quad Y \sim \text{Geometric}(\theta) \;\Longrightarrow\; X \sim \text{EXP}(\beta/\theta). \tag{23}$$

Proof.

The mgf for X|Y is given by

$$M_{X|Y}(t) = E_{X|Y}\left[e^{tX}\right] = (1-\beta t)^{-Y} = e^{-Y \ln(1-\beta t)}, \tag{24}$$

where $a_1(t) = 1$ and $a_2(t) = -\ln(1-\beta t)$. Now by Eq. (4),

$$M_X(t) = M_Y\!\left(-\ln(1-\beta t)\right) = \frac{\theta\, e^{-\ln(1-\beta t)}}{1-(1-\theta)\, e^{-\ln(1-\beta t)}} = \frac{\theta (1-\beta t)^{-1}}{1-(1-\theta)(1-\beta t)^{-1}} = \frac{\theta}{\theta-\beta t} = \frac{1}{1-(\beta/\theta)t}, \tag{25}$$

which is the mgf of an exponential distribution with scale parameter (mean) β/θ. Thus, X ~ EXP(β/θ).
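
A simulation check (ours; θ, β, and the seed are illustrative assumptions):

```python
# Simulation sketch of (23); theta, beta, and the seed are illustrative.
# numpy's geometric has support 1, 2, ..., matching GEO(p) in Table A2, and
# beta enters Gamma(Y, beta) as a scale parameter, matching GAM in Table A2.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
theta, beta, n_sim = 0.3, 2.0, 100_000

y = rng.geometric(theta, size=n_sim)    # Y ~ Geometric(theta)
x = rng.gamma(shape=y, scale=beta)      # X | Y ~ Gamma(Y, beta)

print(stats.kstest(x, stats.expon(scale=beta / theta).cdf))   # large p-value expected
```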

4. EXTENSION TO MIXTURES THAT DO NOT HAVE AN mgf

When the mgf of a mixture distribution does not exist, one can use the characteristic function (cf) instead.

The cf of an r.v. X is defined by $\phi_X(t) = E\left[e^{itX}\right]$, where $t \in \mathbb{R}$ and $i = \sqrt{-1}$. The conditional cf of X|Y is denoted and defined by $\phi_{X|Y}(t) = E\left[e^{itX} \mid Y\right]$.

The Chi-square–Normal Mixture

$$X|Y \sim N(0, 1/Y);\quad Y \sim \chi^2_1 \;\Longrightarrow\; X \sim \text{Cauchy}(0, 1). \tag{26}$$

Proof.

The cf of X|Y is

$$\phi_{X|Y}(t) = E\left[e^{itX} \mid Y\right] = e^{-t^2/(2Y)}. \tag{27}$$

The cf of X can be obtained from Eq. (27) as

$$\phi_X(t) = E\left[\phi_{X|Y}(t)\right] = E\left[e^{-t^2/(2Y)}\right] = \int_0^\infty e^{-t^2/(2y)}\, \frac{1}{\sqrt{2\pi}}\, y^{-1/2}\, e^{-y/2}\, dy. \tag{28}$$

Now, making the transformation y = |t|z in (28) and simplifying, we have

$$\phi_X(t) = \sqrt{\frac{2|t|}{\pi}} \times \frac{1}{2}\int_0^\infty z^{1/2-1}\, e^{-\frac{|t|}{2}\left(z + 1/z\right)}\, dz = \sqrt{\frac{2|t|}{\pi}} \times K_{1/2}(|t|), \tag{29}$$

where $K_u(v)$ is the modified Bessel function of the third kind, defined by

$$K_u(v) = \frac{1}{2}\int_0^\infty z^{u-1}\, e^{-\frac{v}{2}\left(z + 1/z\right)}\, dz, \quad -\infty < u < \infty. \tag{30}$$

It should be noted that the asymptotic form of the modified Bessel function of the third kind is

$$K_\alpha(z) \approx \sqrt{\frac{\pi}{2z}}\, e^{-z}\left[1 + \frac{4\alpha^2-1}{8z} + \frac{\left(4\alpha^2-1\right)\left(4\alpha^2-9\right)}{2!\,(8z)^2} + \frac{\left(4\alpha^2-1\right)\left(4\alpha^2-9\right)\left(4\alpha^2-25\right)}{3!\,(8z)^3} + \cdots\right]. \tag{31}$$

Therefore, by Eq. (31), since $4\alpha^2 - 1 = 0$ when $\alpha = 1/2$ and hence every correction term vanishes, we have

$$K_{1/2}(v) = \sqrt{\frac{\pi}{2v}}\, e^{-v}. \tag{32}$$

Now by Eqs. (29) and (32), we obtain the cf of X as

$$\phi_X(t) = \sqrt{\frac{2|t|}{\pi}} \times \sqrt{\frac{\pi}{2|t|}}\, e^{-|t|} = e^{-|t|}, \tag{33}$$

which is the cf of the standard Cauchy distribution. Thus, we conclude that X ~ Cauchy(0, 1).
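
The pieces of this derivation can be checked numerically (our sketch; the test point, seed, and sample size are arbitrary): Eq. (32) against scipy's Bessel function, the cf in Eq. (33) by Monte Carlo (using the real part E[cos(tX)], since the imaginary part vanishes by symmetry), and the Cauchy claim by a KS test.

```python
# Numerical sketch of Section 4; the test point and seed are illustrative.
import numpy as np
from scipy import stats, special

# Eq. (32): K_{1/2}(v) = sqrt(pi/(2v)) e^{-v}. scipy's kv is the same Bessel
# function K (scipy calls it "second kind"; the paper says "third kind").
v = np.linspace(0.1, 5.0, 50)
print(np.abs(special.kv(0.5, v) - np.sqrt(np.pi / (2 * v)) * np.exp(-v)).max())

rng = np.random.default_rng(7)
y = rng.chisquare(df=1, size=200_000)
x = rng.normal(size=200_000) / np.sqrt(y)   # X | Y ~ N(0, 1/Y)

t = 1.7                                         # arbitrary test point
print(np.cos(t * x).mean(), np.exp(-abs(t)))    # Monte Carlo cf vs Eq. (33)
print(stats.kstest(x, stats.cauchy.cdf))        # large p-value expected
```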

Remarks.

The F distribution arises from the mixture of chi-square and gamma distributions and has no mgf; in this case, one may derive the cf of the mixture distribution. The t-distribution arises from the mixture of chi-square and normal distributions and likewise has no mgf; see Hurst [18]. Similarly, the Pareto distribution is a mixture distribution with no mgf. In all these cases, a cf can be used to derive the mixture distribution.

5. CONCLUDING REMARKS

Mixture models play vital roles in statistics. They are used in modeling actuarial applications, biological spatial data, “accident proneness,” and plant quadrat data on the sedge Carex flacca, and they are applied in many other areas of statistics. Because of the great importance of mixture distributions, students should be exposed to them as soon as they are familiar with conditional expectations. In current textbooks, mixture distributions are derived as marginal distributions from the joint distribution that originates in a hierarchical mixture model.

This article derives mixture distributions using the mgf method. The derivation of a mixture distribution using mgfs is, in general, more straightforward and shorter than deriving the marginal density of the mixture random variable from a joint density, because the present method relies on mgfs that have already been derived and tabulated. However, there are examples where deriving the marginal density of the mixture r.v. from a joint density is much simpler.

On the other hand, there are two difficulties with the mgf method. First, one cannot obtain $a_1(t)$ and $a_2(t)$ as given in Eq. (3) for every mgf $M_{X|Y}(t)$ of the conditional distribution of X|Y. Second, it is sometimes hard to match the derived mgf $M_X(t)$ to a distribution; this requires good familiarity with various distributions and their corresponding mgfs. For mixtures that do not have an mgf, one can extend the mgf method to the cf method.

As pointed out in [17], the idea of using the mgf method for mixture distributions can be introduced in senior mathematical statistics courses at the level of Wackerly et al. [19] and Larsen and Marx [20], for students who have been exposed to conditional expectations and mgfs. This article is directed at first-year graduate students in a mathematical statistics course at the level of Casella and Berger [1]. The mgf technique is underexposed in current textbooks. From a pedagogical standpoint, it is a useful tool for deriving mixture distributions. For mixtures that do not have an mgf, students with a background in complex analysis may use cfs to extend the approach. Finally, the techniques learned here can be profitably used in the study of Bayes' procedures.

CONFLICTS OF INTEREST

Authors have no conflict of interest to declare.

AUTHORS' CONTRIBUTIONS

All authors contributed equally.

ACKNOWLEDGMENTS

The authors would like to thank the Editor and referees for their careful reading of the paper.

APPENDIX

Conditional X|Y ~ f(x|Y) | Mixing Y ~ g(y) | Marginal X ~ f_X(x)
BIN(Y, p) | POI(λ) | POI(pλ)
POI(Y) | GAM(α, β) | NEGBIN(α, 1/(1+β))
NOR(Y, σ²) | NOR(μ, τ²) | NOR(μ, σ² + τ²)
POI(Yϕ) | POI(λ) | Neyman-A(λ, ϕ)
POI(Y) | GIG(γ, χ, ψ) | SICHEL(γ, 2/(ψ+2), √(χ(ψ+2)))
LEV(Y, σ) | σ·SEV(ξ, 1) | LOGIS(μ, σ), μ = ξσ
$\sum_{i=1}^{Y} X_i$, $X_i$ iid LOGSER(p) | POI(λ) | NEGBIN(−λ/ln p, p)
Table A1

Examples of other mixture distributions that have moment-generating functions (mgfs) [17].

Distribution | pdf/pmf f(x) | mgf
BIN(n, p) | $\binom{n}{k} p^k (1-p)^{n-k}$, k = 0, 1, …, n | $\left(1 - p + p e^t\right)^n$
CHISQ(n) – χ²ₙ | $\frac{1}{\Gamma(n/2)\, 2^{n/2}}\, x^{n/2-1} e^{-x/2}$, x ≥ 0 | $(1-2t)^{-n/2}$, t < 1/2
CHISQ – noncentral χ²_{n;λ} | $\frac{1}{2} e^{-(x+\lambda)/2} \left(\frac{x}{\lambda}\right)^{n/4-1/2} I_{n/2-1}\!\left(\sqrt{\lambda x}\right)$, x ≥ 0 | $\exp\!\left(\frac{\lambda t}{1-2t}\right) (1-2t)^{-n/2}$, t < 1/2
EXP(λ) | $\lambda e^{-\lambda x}$, x ≥ 0 | $\frac{\lambda}{\lambda - t}$, t < λ
GAM(α, β) | $\frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}$, x ≥ 0 | $(1-\beta t)^{-\alpha}$, t < 1/β
GEO(p) | $p(1-p)^{k-1}$, k = 1, 2, … | $\frac{p e^t}{1-(1-p)e^t}$, t < −ln(1−p)
GIG(γ, χ, ψ) | $\frac{(\psi/\chi)^{\gamma/2}\, x^{\gamma-1}}{2 K_{\gamma}(\sqrt{\chi\psi})} \exp\!\left(-\frac{1}{2}\left(\frac{\chi}{x} + \psi x\right)\right)$, x > 0 | $\frac{K_{\gamma}\left(\sqrt{\chi(\psi-2t)}\right)}{K_{\gamma}\left(\sqrt{\chi\psi}\right)} \left(1 - \frac{2t}{\psi}\right)^{-\gamma/2}$, t < ψ/2
LEV(μ, σ) | $\frac{1}{\sigma}\, \phi_{\mathrm{lev}}\!\left(\frac{x-\mu}{\sigma}\right)$, x ∈ ℝ | $e^{\mu t}\, \Gamma(1 - \sigma t)$, t < 1/σ
LOGIS(μ, σ) | $\frac{1}{\sigma}\, \phi_{\mathrm{logis}}\!\left(\frac{x-\mu}{\sigma}\right)$, x ∈ ℝ | $e^{\mu t}\, \Gamma(1 - \sigma t)\, \Gamma(1 + \sigma t)$, |t| < 1/σ
LOGSER(p) | $\frac{(1-p)^x}{-x \ln p}$, x = 1, 2, … | $\frac{\ln\left(1-(1-p)e^t\right)}{\ln p}$, t < −ln(1−p)
NEGBIN(α, p) | $\binom{\alpha+x-1}{x} p^{\alpha} (1-p)^x$, x = 0, 1, … | $\left(\frac{p}{1-(1-p)e^t}\right)^{\alpha}$, t < −ln(1−p)
NOR(μ, σ²) | $\frac{1}{\sqrt{2\pi}\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, x ∈ ℝ | $\exp\left(\mu t + \sigma^2 t^2/2\right)$
SEV(μ, σ) | $\frac{1}{\sigma}\, \phi_{\mathrm{sev}}\!\left(\frac{x-\mu}{\sigma}\right)$, x ∈ ℝ | $e^{\mu t}\, \Gamma(1 + \sigma t)$, t > −1/σ
SICHEL(γ, θ, α) | $\frac{(1-\theta)^{\gamma/2} (\alpha\theta/2)^x}{K_{\gamma}\left(\alpha\sqrt{1-\theta}\right)\, x!}\, K_{x+\gamma}(\alpha)$, x = 0, 1, … | $\left(\frac{1-\theta}{1-\theta e^t}\right)^{\gamma/2} \frac{K_{\gamma}\left(\alpha\sqrt{1-\theta e^t}\right)}{K_{\gamma}\left(\alpha\sqrt{1-\theta}\right)}$, t < −ln θ
Table A2

List of probability density functions or probability mass functions and corresponding moment-generating functions (mgfs) [17].

REFERENCES

1. G. Casella and R.L. Berger, Statistical Inference, second ed., Duxbury, Pacific Grove, CA, USA, 2002.
2. B.S. Everitt and D.J. Hand, Finite Mixture Distributions, Chapman and Hall, London, UK, 1981.
3. D.M. Titterington, A.F.M. Smith, and U.E. Makov, Statistical Analysis of Finite Mixture Distributions, Wiley, New York, NY, USA, 1985.
4. D. Böhning, Computer-assisted Analysis of Mixtures, Marcel Dekker, New York, NY, USA, 1999.
5. G.J. McLachlan and D. Peel, Finite Mixture Models, Wiley, New York, NY, USA, 2000.
6. B.G. Lindsay, Mixture Models: Theory, Geometry, and Applications, NSF-CBMS Regional Conference Series in Probability and Statistics, Vol. 5, Institute of Mathematical Statistics, Hayward, CA, USA, 1995.
7. H.H. Panjer and G.E. Willmot, Insurance Risk Models, Society of Actuaries, Schaumburg, IL, USA, 1992.
9. N.L. Johnson, S. Kotz, and A.W. Kemp, Univariate Discrete Distributions, Wiley, New York, NY, USA, 1992.
10. N.L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distributions, Wiley, New York, NY, USA, 1994.
17. E.R. Villa and L.A. Escobar, Using moment generating functions to derive mixture distributions, Am. Stat. 60 (2006), 75–80.
18. S. Hurst, Financial Mathematics Research Report No. FMRR 006-95, The Australian National University, Canberra, Australia, 1995. http://www.maths.anu.edu.au/research.reports/srr/95/
19. D.D. Wackerly, W. Mendenhall III, and R.L. Scheaffer, Mathematical Statistics with Applications, sixth ed., Duxbury, Pacific Grove, CA, USA, 2002.
20. R.J. Larsen and M.L. Marx, An Introduction to Mathematical Statistics and Its Applications, sixth ed., Pearson, Boston, MA, USA, 2018.