New Discrete Lifetime Distribution with Applications to Count Data
- https://doi.org/10.2991/jsta.d.210203.001
- Generalized Hermite distribution, Hermite polynomials, Genocchi polynomials, Hermite–Genocchi polynomials, Discrete distribution, Reliability
In this paper, we present a new class of distributions called the generalized Hermite–Genocchi distribution (GHGD). This model is obtained by compounding the generalized Hermite–Genocchi polynomials given by Gould and Hopper with the power series distribution. Statistical properties and reliability characteristics are studied. The model is applied to several real data sets. Finally, a simulation study is performed to assess the performance of the model.
- © 2021 The Authors. Published by Atlantis Press B.V.
- Open Access
- This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
In this paper, we introduce a new discrete distribution based on the generalized Hermite polynomials.
Gupta and Jain extended the Hermite distribution (HD) to the generalized HD.
The distribution has been applied to the frequency of bacteria in leucocytes and the frequency of larvae in corn plants.
Moreover, many popular statistical distributions have specific applications, but observed data sometimes show features that these classical distributions do not capture. To overcome this limitation, researchers often develop new distributions for the cases where the classical ones do not provide a suitable fit. There are many techniques for constructing new distributions; for more details see [7–9].
Recently, El-Desouky et al. introduced a new multivariate distribution called the generalized Hermite–Genocchi distribution (GHGD), obtained by compounding (1) with the power series distribution.
is convergent and positive for
The paper is organized as follows. In Section 2, as a special case of (1.1), we introduce a new univariate discrete distribution and discuss the mathematical and statistical properties of the model. In Section 3, we introduce monotonic properties. In Section 4, reliability characteristics are obtained. In Section 5, moment and maximum likelihood estimates of the unknown parameters are presented and a simulation study is performed. In Section 6, we apply the new model to real data sets to illustrate its usefulness and applicability, and present a graphical assessment of goodness of fit based on the empirical probability generating function. Finally, in Section 7, conclusions and remarks are given.
2. GENERALIZED HERMITE–GENOCCHI DISTRIBUTION
A discrete random variable taking value in the set is said to follow GHGD with three parameters, that is , if its probability mass function can be written as
is convergent and positive for
2.1. Structural Properties of GHGD Model
2.1.1. Shape and behavior of the pmf
Plots of the pmf of the GHG distribution for several values of the parameters α, β and γ are presented in Figure 1.
Figure 1 shows three examples illustrating the effects of the scale and shape parameters.
2.1.2. Cumulative distribution function
The cumulative distribution function (cdf) of GHGD is given by
Figure 2 shows the shape and behavior of the cdf of the GHG distribution for several values of the parameters α, β and γ.
2.1.3. Moments and related measures
The moment-generating function of GHGD is given by
The factorial moments are given by
The moments are given by
The mean and variance are given, respectively, by
From the plots in Figure 3, it is apparent that both the mean and the variance of the GHGD are bounded.
The over-dispersion (OD) index of GHGD is given by
OD if and only if , and
The GHGD can exhibit equi-dispersion, over-dispersion or under-dispersion, depending on the parameter values.
These results were obtained numerically.
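As a small illustration of how a dispersion index can be checked on data, the sketch below assumes the usual definition (variance divided by mean) and classifies a sample of counts accordingly. The function names and the sample are hypothetical and are not taken from the paper; the GHGD formula itself is not used here.

```python
from statistics import mean, pvariance


def dispersion_index(counts):
    """Return variance / mean for a sample of nonnegative counts.

    Assumption: the index is the variance-to-mean ratio, with the
    population variance (divisor n) used for simplicity.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return pvariance(counts) / m


def classify_dispersion(counts, tol=1e-9):
    """Label a sample as over-, under- or equi-dispersed."""
    od = dispersion_index(counts)
    if od > 1 + tol:
        return "over-dispersed"
    if od < 1 - tol:
        return "under-dispersed"
    return "equi-dispersed"


# Hypothetical count sample, for illustration only.
sample = [0, 0, 1, 1, 2, 5, 9]
print(dispersion_index(sample), classify_dispersion(sample))
```

An index above 1 points to over-dispersion relative to the Poisson distribution, below 1 to under-dispersion.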
2.1.5. Surprise index
The surprise index (SI) of GHGD is given by
From Figure 4 for various value of where decreases, large values of become more surprising.
2.1.6. Generating function
The probability-generating function of GHGD is given by
3. MONOTONIC PROPERTIES
Log-concavity is an important property of a probability distribution. For log-concave distributions, characteristics such as the reliability function, failure rate, mean residual life and moments have specific properties; see [11–14].
The GHG distribution is log-concave.
Consider the function
Its derivative is given by
Note that this function is decreasing, and thus the GHG distribution is log-concave. The behavior of the GHG distribution is illustrated in Figure 1.
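Log-concavity of a discrete distribution can also be checked numerically through the standard condition p(x)^2 >= p(x-1) p(x+1) on the support. The following sketch assumes the pmf is available as a list of values on consecutive integers; the Poisson pmf is used only as a stand-in, since the GHGD pmf is not reproduced here, and all function names are hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """Poisson pmf, used here only as a stand-in example."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def is_log_concave(pmf_values, rel_tol=1e-12):
    """Check p(x)^2 >= p(x-1) * p(x+1) for all interior points.

    pmf_values holds the pmf on consecutive integers. A small relative
    tolerance absorbs floating-point rounding in boundary cases.
    """
    for x in range(1, len(pmf_values) - 1):
        lhs = pmf_values[x] ** 2
        rhs = pmf_values[x - 1] * pmf_values[x + 1]
        if lhs < rhs * (1 - rel_tol):
            return False
    return True


values = [poisson_pmf(x, 2.5) for x in range(30)]
print(is_log_concave(values))                 # Poisson is log-concave
print(is_log_concave([0.5, 0.0001, 0.4999]))  # a bimodal pmf is not
```

A numerical check of this kind only covers the grid and parameter values tested, so it supports an analytical proof rather than replacing it.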
As a direct consequence of log-concavity, the following results hold for the GHG distribution:
It is strongly unimodal.
It has all moments.
It has an increasing failure rate (IFR).
It has a monotonically decreasing mean residual life function.
It remains log-concave if truncated.
It gives a unimodal and log-concave distribution when convolved with any other discrete distribution.
4. RELIABILITY PROPERTIES
The survival function of GHGD is given by
Figure 5 shows the shape and behavior of the survival function of the GHG distribution for several values of the parameters α, β and γ.
Also, the hazard rate function is given by
The mean residual life (MRL) of the GHGD is given by
The mean time to failure (MTTF) of GHGD is given by
The reversed hazard rate is given by
Figure 7 shows the shape and behavior of the reversed hazard rate of the GHG distribution for several values of the parameters α, β and γ.
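The reliability functions above can be evaluated numerically from any pmf given on 0, 1, 2, .... The sketch below is a minimal illustration under stated assumptions: the survival function is taken as S(x) = P(X >= x), the hazard rate as p(x)/S(x) and the reversed hazard rate as p(x)/F(x). Conventions for discrete lifetimes vary, and the paper's exact definitions are not reproduced here. A geometric pmf, truncated far in the tail, serves as a stand-in for the GHGD, and the function name is hypothetical.

```python
def reliability_functions(pmf_values):
    """Return (cdf, survival, hazard, reversed_hazard) lists.

    Assumed conventions: F(x) = P(X <= x), S(x) = P(X >= x),
    h(x) = p(x) / S(x), r(x) = p(x) / F(x).
    """
    n = len(pmf_values)

    # Forward cumulative sum gives the cdf.
    cdf, running = [], 0.0
    for p in pmf_values:
        running += p
        cdf.append(running)

    # Backward cumulative sum gives S(x) = P(X >= x).
    survival, running = [0.0] * n, 0.0
    for x in range(n - 1, -1, -1):
        running += pmf_values[x]
        survival[x] = running

    nan = float("nan")
    hazard = [p / s if s > 0 else nan for p, s in zip(pmf_values, survival)]
    reversed_hazard = [p / f if f > 0 else nan for p, f in zip(pmf_values, cdf)]
    return cdf, survival, hazard, reversed_hazard


# Stand-in example: geometric pmf q * (1 - q)^x, whose hazard rate is constant.
q = 0.3
pmf = [q * (1 - q) ** x for x in range(200)]
cdf, survival, hazard, reversed_hazard = reliability_functions(pmf)
print(hazard[:3], reversed_hazard[:3])
```

For an IFR distribution the `hazard` list would be nondecreasing, which gives a quick numerical cross-check of the monotonicity results.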
Definition 4.1. 
The distribution of a nonnegative discrete random variable is said to be
New better (worse) than used, denoted by NBU(NWU) if
New better (worse) than used in expectation, denoted by NBUE(NWUE) if
Since the GHGD has an increasing failure rate (IFR), the following results hold:
GHGD is IFRA (increasing failure rate in average).
GHGD is NBU.
GHGD is DMRL (decreasing mean residual life).
GHGD is NBUE.
5. PARAMETER ESTIMATION AND SIMULATION
5.1. Maximum Likelihood Estimators
Let be a random sample of size drawn from GHGD. Then, the likelihood function of vector is given by
The log-likelihood function can be written as
Computing the first partial derivatives of (5.1) with respect to α, β and γ, we get
The MLEs are obtained by equating Equations (5.2)–(5.4) to zero and solving them with the help of R software. These equations cannot be solved analytically, so a numerical procedure such as Newton–Raphson is required.
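To illustrate how a likelihood equation without a closed-form solution can be solved by Newton–Raphson, the sketch below uses the zero-truncated Poisson distribution as a stand-in, because the GHGD score equations are not reproduced here. For that stand-in, the score equation reduces to x̄ = λ / (1 − e^{−λ}). The function name and the sample are hypothetical, and the paper itself uses R rather than Python.

```python
import math


def ztp_mle(sample, tol=1e-10, max_iter=100):
    """Newton-Raphson MLE of lambda for a zero-truncated Poisson sample.

    Solves g(lam) = lam / (1 - exp(-lam)) - xbar = 0, which is the
    likelihood equation of the stand-in model (not the GHGD).
    """
    if any(x < 1 for x in sample):
        raise ValueError("zero-truncated data must be >= 1")
    xbar = sum(sample) / len(sample)
    if xbar <= 1.0:
        raise ValueError("no interior maximum when the sample mean is 1")

    lam = xbar  # starting value; g(xbar) > 0, so iterates decrease to the root
    for _ in range(max_iter):
        denom = 1.0 - math.exp(-lam)
        g = lam / denom - xbar
        g_prime = (denom - lam * math.exp(-lam)) / denom**2
        step = g / g_prime
        lam -= step
        if abs(step) < tol:
            return lam
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical sample, for illustration only.
lam_hat = ztp_mle([1, 1, 2, 3, 1, 2, 4, 1])
print(lam_hat)
```

With three parameters, as in the GHGD, the same idea applies with the gradient vector and Hessian matrix in place of `g` and `g_prime`, or with a general-purpose optimizer applied to the log-likelihood.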
In this section, we evaluate the performance of the MLEs for various sample sizes n. The evaluation is based on a simulation study with the following steps:
Generate samples with and from GHGD.
Calculate the MLEs for the samples.
Calculate the absolute bias, standard errors and mean square errors (MSE).
The results are reported in Table 1.
|n||Parameter||MLE||Standard Error||Abs. Bias||MSE|
Result from the simulated data.
It can be seen that
The bias values decrease as n increases.
The MSEs decrease as n increases. This indicates the consistency of the estimators.
The MLE method performs well for the parameters.
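The simulation steps described in this section can be sketched as follows. Because the GHGD sampler and pmf are not reproduced here, the sketch uses a Poisson distribution as a stand-in, whose MLE is simply the sample mean. The function names, parameter value, replication count and seed are all hypothetical choices for illustration.

```python
import math
import random
from statistics import mean, pstdev


def poisson_draw(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_mle(lam, n, reps, seed=0):
    """Generate `reps` samples of size n, estimate lam, summarize the errors."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [poisson_draw(rng, lam) for _ in range(n)]
        estimates.append(mean(sample))  # Poisson MLE is the sample mean
    bias = mean(estimates) - lam
    return {
        "mle": mean(estimates),
        "se": pstdev(estimates),  # spread of the estimates across replications
        "bias": bias,
        "abs_bias": abs(bias),
        "mse": mean((e - lam) ** 2 for e in estimates),
    }


if __name__ == "__main__":
    for n in (20, 50, 200):
        print(n, simulate_mle(lam=2.0, n=n, reps=300, seed=1))
```

Replacing the stand-in sampler and estimator with a GHGD sampler and a numerical MLE routine would give a table of the same shape as Table 1.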
6. DATA ANALYSIS
In this section, we illustrate the empirical importance of the GHGD using real data applications. The fitted models are compared using goodness-of-fit statistics, the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the corrected Akaike information criterion (AICc).
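For reference, the three information criteria can be computed from the maximized log-likelihood with the standard formulas AIC = 2k − 2 log L, BIC = k log n − 2 log L and AICc = AIC + 2k(k+1)/(n − k − 1), where k is the number of parameters and n the sample size. The sketch below is a minimal illustration; the function name and the log-likelihood value in the usage line are hypothetical, not results from the paper.

```python
import math


def information_criteria(log_lik, k, n):
    """Return AIC, BIC and AICc for a fitted model.

    log_lik: maximized log-likelihood, k: number of estimated parameters,
    n: number of observations.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    return {"AIC": aic, "BIC": bic, "AICc": aicc}


# Hypothetical log-likelihood for a three-parameter model on 111 observations.
print(information_criteria(log_lik=-100.0, k=3, n=111))
```

Smaller values indicate a better trade-off between fit and complexity, so the criteria are compared across candidate models fitted to the same data.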
6.1. Data Set 1
These data are counts of cysts in embryonic mouse kidneys subjected to steroids, taken from McElduff et al. We compare the fit of the GHGD with those of the HD, zero-inflated Poisson distribution (ZIPD), negative binomial distribution (NBD), zero-inflated negative binomial distribution (ZINBD), zero-inflated generalized Poisson distribution (ZIGPD) and zero-inflated Hermite distribution (ZIHD). The MLEs and goodness-of-fit measures are presented in Table 2.
|Estimates of the parameter|
Distribution of the counts of cysts from 111 steroid-treated kidneys  and the expected frequencies computed using HD, ZIPD, NBD, ZIGPD, ZINBD, ZIHD and GHGD.
From the plots of the log-likelihood function of α, β and γ in Figures 8a–8c, we observe that the likelihood equations have a unique solution.
6.2. Data Set 2
These data represent the distribution of mistakes in copying groups of random digits. We compare the fit of the GHGD with those of the hyper-Poisson distribution (HPD), zero-inflated Poisson distribution (ZIPD), zero-inflated Conway–Maxwell–Poisson distribution (ZICMPD), ZINBD, ZIGPD and zero-inflated hyper-Poisson distribution (ZIHPD). The MLEs and goodness-of-fit measures are presented in Table 3.
|Estimates of the parameter|
Distribution of mistakes in copying groups of random digits and the expected frequencies computed using HPD, ZIPD, ZIHPD, ZICMPD, ZINBD, ZIGPD and GHGD.
From the plots of the log-likelihood function of α, β and γ in Figures 9a–9c, we observe that the likelihood equations have a unique solution.
6.3. Data Set 3
These data are counts of Collembola microarthropods in 200 samples of forest soil; see [18,19]. We compare the fit of the GHGD with those of the HPD, ZIPD, ZICMPD, ZINBD, ZIGPD and ZIHPD. The MLEs and goodness-of-fit measures are presented in Table 4.
|Estimates of the parameter|
Distribution of the counts of Collembola microarthropods in 200 samples of forest soil and the expected frequencies computed using HPD, ZIPD, ZIHPD, ZICMPD, ZINBD, ZIGPD and GHGD.
From the plots of the log-likelihood function of α, β and γ in Figures 10a–10c, we observe that the likelihood equations have a unique solution.
6.4. Graphical Assessment of Goodness of Fit
Plotting the log of the empirical probability generating function (EPGF) together with the log pgfs of the fitted distributions on the same graph allows us to compare the fit of several discrete distributions in a single plot.
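The EPGF of a sample x_1, ..., x_n is the average of t^{x_i} over the sample, evaluated for t in [0, 1]. The sketch below computes the log EPGF on a grid and measures its largest gap from the log pgf of a fitted model. A Poisson log pgf, λ(t − 1), is used only as a stand-in for the fitted distributions, and the function names, grid and sample are hypothetical.

```python
import math


def epgf(sample, t):
    """Empirical probability generating function: mean of t**x over the sample."""
    return sum(t**x for x in sample) / len(sample)


def poisson_log_pgf(t, lam):
    """Log pgf of the Poisson distribution (stand-in fitted model)."""
    return lam * (t - 1.0)


def max_log_pgf_gap(sample, log_pgf, grid):
    """Largest absolute gap between log EPGF and a model's log pgf on the grid."""
    return max(abs(math.log(epgf(sample, t)) - log_pgf(t)) for t in grid)


# Hypothetical count sample; the Poisson fit uses the sample mean as its MLE.
sample = [0, 0, 1, 2, 3, 1, 0, 4, 2, 1]
lam_hat = sum(sample) / len(sample)
grid = [i / 20 for i in range(1, 21)]  # t in (0, 1], avoiding log(0)
print(max_log_pgf_gap(sample, lambda t: poisson_log_pgf(t, lam_hat), grid))
```

Plotting `math.log(epgf(sample, t))` and each model's log pgf against `t` gives the kind of comparison described above; the numerical gap is just one possible summary of that plot.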
The log of the EPGF of data set 1 is plotted in Figure 11. The EPGF is shown as a black line, together with the log pgfs of the distributions fitted to the data. The GHGD, shown by the red line, is a good fit to the data.
The log of the EPGF of data set 2 is plotted in Figure 12. The EPGF is shown as a black line, together with the log pgfs of the distributions fitted to the data. The GHGD, shown by the red line, is a good fit to the data.
The log of the EPGF of data set 3 is plotted in Figure 13. The EPGF is shown as a black line, together with the log pgfs of the distributions fitted to the data. The GHGD, shown by the red line, is a good fit to the data.
A new three-parameter discrete distribution is proposed, and its main monotonic and reliability properties are introduced. The parameters of the proposed model are estimated by maximum likelihood, and a simulation study is performed to assess the accuracy of the maximum likelihood estimators. Applications of the new model to three real data sets show that the proposed distribution can yield better fits than some other distributions.
CONFLICTS OF INTEREST
The authors declare they have no conflicts of interest.
All authors have read and agreed to the published version of the manuscript.
The authors would like to thank the Editor-in-Chief and the anonymous referees for their careful reading and constructive comments and suggestions, which greatly improved the presentation of the paper.
Cite this article
Beih S. El-Desouky, Rabab S. Gomaa and Alia M. Magar (2021). New Discrete Lifetime Distribution with Applications to Count Data. Journal of Statistical Theory and Applications, 20(2), 304–317. https://doi.org/10.2991/jsta.d.210203.001