Journal of Statistical Theory and Applications

Volume 17, Issue 1, March 2018, Pages 122 - 135

Estimating the parameters of Lomax distribution from imprecise information

Authors
Abbas Pak*, pak@sci.sku.ac.ir
Department of Computer Sciences, Faculty of Mathematical Sciences, Shahrekord University, P. O. Box 115, Shahrekord, Iran.
Mohammad Reza Mahmoudi, mahmoudi.m.r@fasau.ac.ir
Department of Statistics, Fasa University, Fasa, Iran
*Corresponding author
Received 16 November 2016, Accepted 9 May 2017, Available Online 31 March 2018.
DOI
10.2991/jsta.2018.17.1.9
Keywords
Imprecise data; Fuzzy information; Lomax distribution; Maximum likelihood estimation; Bayesian estimation
Abstract

Traditional statistical approaches for estimating the parameters of the Lomax distribution deal with precise information. However, in real-world situations, some information about an underlying system may be imprecise and is represented in the form of fuzzy information. In this paper, we consider the problem of estimating the parameters of the Lomax distribution when the available observations are described by means of fuzzy information. We obtain the maximum likelihood estimates of the parameters by using the Newton-Raphson method as well as the EM algorithm. We also use Tierney and Kadane’s approximation to compute the Bayes estimates of the unknown parameters. The estimation procedures are discussed in detail and compared via Monte Carlo simulations in terms of their estimated biases and mean squared errors. Finally, the analysis of one data set is provided for illustration.

Copyright
Copyright © 2018, the Authors. Published by Atlantis Press.
Open Access
This is an open access article under the CC BY-NC license (http://creativecommons.org/licences/by-nc/4.0/).

1. Introduction

A random variable X is said to have a Lomax distribution if its probability density function (pdf) and cumulative distribution function are given, respectively, by

$$f(x)=\alpha\lambda(1+\lambda x)^{-(\alpha+1)},\quad x>0, \tag{1.1}$$
and
$$F(x)=1-(1+\lambda x)^{-\alpha},\quad x>0,$$
where α > 0 and λ > 0 are the shape and scale parameters, respectively. From now on the Lomax distribution with parameters α and λ will be denoted by LOM(α, λ). This distribution is also known as the Pareto distribution of the second type, since it is a special case of the generalized Pareto (GP) model with pdf
$$f(x;\xi,\mu,\sigma)=\frac{1}{\sigma}\left(1+\frac{\xi(x-\mu)}{\sigma}\right)^{-\left(\frac{1}{\xi}+1\right)},$$
where μ, ξ ∈ ℝ and σ ∈ (0, +∞). For ξ > 0 the range of x is x > 0, and for ξ < 0 the range of x is 0 < x < σ/|ξ|. The limit case ξ = 0 corresponds to the exponential distribution. Note that, by defining ξ/σ ≡ λ, 1/ξ ≡ α and μ ≡ 0, the pdf of the Lomax distribution in relation (1.1) is obtained. There is a large literature on the estimation of GP distribution parameters using different approaches; see, for example, Singh and Guo [30], Kremer [18], Shi et al. [28], Lin and Wang [20], Oztekun [22], Hsu et al. [16], Castillo and Daoudi [9] and Bermudez and Kotz [6]. For more details about the GP model and its properties, see Arnold [2].
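As a minimal sketch (not part of the original paper), the density and distribution function of LOM(α, λ) can be coded directly, and inverting F gives a simple sampler of the kind needed for simulation. The function names are illustrative assumptions, and only the Python standard library is used.

```python
import random


def lomax_pdf(x, alpha, lam):
    """Density alpha*lam*(1 + lam*x)^-(alpha+1) for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return alpha * lam * (1.0 + lam * x) ** (-(alpha + 1.0))


def lomax_cdf(x, alpha, lam):
    """Distribution function 1 - (1 + lam*x)^-alpha for x > 0, and 0 otherwise."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + lam * x) ** (-alpha)


def lomax_quantile(u, alpha, lam):
    """Solve u = F(x) for x, with 0 <= u < 1."""
    return ((1.0 - u) ** (-1.0 / alpha) - 1.0) / lam


def lomax_sample(n, alpha, lam, rng=None):
    """Draw n i.i.d. values by the inverse-transform method."""
    rng = rng or random.Random()
    return [lomax_quantile(rng.random(), alpha, lam) for _ in range(n)]
```

For instance, `lomax_sample(15, 1.0, 1.0, random.Random(1))` would produce one sample of size 15 from LOM(1, 1); the seed is arbitrary.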

The Lomax distribution provides a very good alternative to common lifetime distributions such as the exponential, Weibull, or gamma when the experimenter presumes that the population distribution may be heavy-tailed (see Bryson [7]). It has also proved useful for modeling and analyzing lifetime data in medical and biological sciences, engineering, etc. It has therefore received considerable attention from theoreticians and statisticians, primarily due to its use in lifetime testing studies. The Lomax distribution has been used in the literature in a number of ways. For example, it has been extensively used for reliability modelling and life testing; see, for example, Balkema and de Haan [5]. Ahsanullah [1] studied the record values of the Lomax distribution. Balakrishnan and Ahsanullah [4] introduced some recurrence relations between the moments of record values from the Lomax distribution.

Several authors have addressed inferential issues for the Lomax distribution based on complete and censored samples. The order statistics from nonidentical right-truncated Lomax random variables have been studied by Childs et al. [10]. Howlader and Hossain [15] considered Bayesian estimation of the survival function of the Lomax distribution. Ghitany et al. [13] considered the Marshall-Olkin approach and extended the Lomax distribution. Elfattah et al. [12] derived the Bayesian and the non-Bayesian estimators for the same sample size from the Lomax distribution based on progressive type-I censoring. Raqab et al. [26] discussed different predictors of failure times, such as best linear unbiased predictors and maximum-likelihood and approximate maximum-likelihood predictors, based on multistage progressive censoring from the Pareto distribution, and recently Cramer and Schmiedt [11] discussed the optimal censoring scheme for estimating the parameters of the Lomax distribution using progressive type-II censoring. Asgharzadeh and Valiollahi [3] derived the Bayesian estimator of the scale parameter of the Lomax distribution based on progressive type-II censoring.

From the above, it is evident that all the earlier works on the estimation of the parameters of the Lomax distribution assume precise data. However, in real-world situations, we often deal with experiments whose observations provide imprecise rather than exact information. For example, the reaction time of a person to a certain stimulus in a psychological experiment cannot be determined exactly, but the psychologist is able to describe it by means of the following imprecise information: “The time of reaction is approximately 25 to 35 seconds”. To deal with this lack of precision in the data, it is necessary to incorporate fuzzy concepts into statistical techniques. In recent years, many papers generalizing classical statistical methods to the analysis of fuzzy data have been published. Among others, Buckley [8] considered fuzzy systems whose performance depends on fuzzy probability distributions and whose measures of performance can be described by fuzzy numbers. Pak et al. [23, 24] conducted a series of studies to develop inferential procedures for lifetime distributions on the basis of fuzzy numbers. Maximum likelihood estimation of the exponential model using type-II fuzzy censored data is considered by Khoolenjani and Shahsanaie [17]. Makhdoom et al. [21] studied Bayesian estimation of the parameter of the exponential distribution under type-II censoring from fuzzy data.

To our knowledge there are no reports on estimating the parameters of Lomax distribution based on fuzzy data. The main aim of this paper is to obtain the inferential procedures for the Lomax distribution when the available observations are reported by means of fuzzy information. In Section 2, we use the Newton-Raphson and the EM algorithms to determine the maximum likelihood estimates of the unknown parameters. In Section 3, we obtain the Bayes estimates of the parameters α and λ by using the approximation form of Tierney and Kadane [32] under the assumption of gamma priors. A Monte Carlo simulation study is presented in Section 4, which provides a comparison of all estimation procedures developed in this paper and analysis of a data set is provided. Finally, conclusions and recommendations are provided in Section 5.

Let us first review the fundamental notation and basic definitions of fuzzy set theory used in the paper. Consider an experiment characterized by a probability space X = (X, X, Pθ), where (X, X) is a measurable space and Pθ belongs to a specified family of probability measures {Pθ, θ ∈ Θ} on (X, X). Assume that the observer cannot distinguish or transmit with exactness the outcome in the performance of X, but that rather the available observation may be described in terms of fuzzy information which is defined as follows:

Definition 1.

A fuzzy event x˜ on X, characterized by a Borel measurable membership function μx˜(x) from X to [0, 1], where μx˜(x) represents the ”grade of membership” of x to x˜, is called fuzzy information associated with the experiment X.

Among the various types of fuzzy information, the triangular and trapezoidal fuzzy numbers are most convenient and useful in describing fuzzy data. For triangular membership function, the triangular fuzzy number can be defined as x˜=(σ1,σ2,σ3) and its membership function is defined by the following expression:

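$$\mu_{\tilde{x}}(x)=\begin{cases}\dfrac{x-\sigma_1}{\sigma_2-\sigma_1}, & \sigma_1\le x\le\sigma_2,\\[4pt] \dfrac{\sigma_3-x}{\sigma_3-\sigma_2}, & \sigma_2\le x\le\sigma_3,\\[4pt] 0, & \text{otherwise.}\end{cases}$$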
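The triangular membership function can be evaluated with a few comparisons. The following sketch is illustrative only; the function name and the choice (25, 30, 35) for the statement “approximately 25 to 35 seconds” are assumptions, not values given in the paper.

```python
def triangular_membership(x, s1, s2, s3):
    """Grade of membership of x in the triangular fuzzy number (s1, s2, s3).

    Assumes s1 <= s2 <= s3. The grade rises linearly from 0 at s1 to 1 at s2
    and falls linearly back to 0 at s3.
    """
    if x < s1 or x > s3:
        return 0.0
    if x <= s2:
        # Left branch; a degenerate left side (s1 == s2) has grade 1 at the peak.
        return (x - s1) / (s2 - s1) if s2 > s1 else 1.0
    # Right branch; here s2 < x <= s3, so s3 > s2 is guaranteed.
    return (s3 - x) / (s3 - s2)


# Hypothetical encoding of "approximately 25 to 35 seconds".
reaction_time = (25.0, 30.0, 35.0)
grade = triangular_membership(27.5, *reaction_time)
```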
The set consisting of all observable events from the experiment X determines a fuzzy information system associated with it, which is defined as follows.

Definition 2.

(see Tanaka et al. [31]). A fuzzy information system (f.i.s.) X˜ associated with the experiment X is a fuzzy partition with fuzzy events on X, that is a finite set of fuzzy events on X satisfying the orthogonality condition

$$\sum_{\tilde{x}\in\tilde{X}}\mu_{\tilde{x}}(x)=1\quad\text{for all } x\in X.$$
On the other hand, according to Zadeh [33] given the experiment X = (X, X, Pθ), θ ∈ Θ, and a f.i.s. X˜ associated with it, each probability measure Pθ on (X, X) induces a probability measure on X˜ defined as follows:

Definition 3.

The probability distribution on X˜ induced by Pθ is the mapping P from X˜ to [0, 1] such that

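$$P(\tilde{x})=\int_{X}\mu_{\tilde{x}}(x)\,dP_{\theta}(x),\quad \tilde{x}\in\tilde{X}. \tag{1.4}$$
In particular, the conditional density of a continuous random variable Y with p.d.f. g(y) given the fuzzy event Ã can be defined as
$$g(y\,|\,\tilde{A})=\frac{\mu_{\tilde{A}}(y)\,g(y)}{\int\mu_{\tilde{A}}(u)\,g(u)\,du}. \tag{1.5}$$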
For more details about the membership functions and probability measures of fuzzy sets, one can refer to Pak et al. [23] and the references therein.
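To make the induced probability and the conditional density concrete, the sketch below evaluates both numerically for a triangular fuzzy event. It is an illustration under stated assumptions: the function names are hypothetical, the integrals are approximated by a composite Simpson rule (a choice made here, not prescribed by the paper), and the fuzzy number must satisfy s1 < s2 < s3.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def tri_membership(x, s1, s2, s3):
    """Triangular membership function with support [s1, s3] and peak at s2."""
    if x < s1 or x > s3:
        return 0.0
    if x <= s2:
        return (x - s1) / (s2 - s1)
    return (s3 - x) / (s3 - s2)


def fuzzy_event_probability(pdf, tri):
    """P(x~) = integral of mu(x) * pdf(x), split at the peak where mu has a kink."""
    s1, s2, s3 = tri
    integrand = lambda x: tri_membership(x, s1, s2, s3) * pdf(x)
    return simpson(integrand, s1, s2) + simpson(integrand, s2, s3)


def conditional_density(y, pdf, tri):
    """g(y | A~) = mu(y) * g(y) / integral of mu(u) * g(u) du."""
    return tri_membership(y, *tri) * pdf(y) / fuzzy_event_probability(pdf, tri)
```

With the LOM(1, 1) density (1 + x)^(-2) and the fuzzy event (0, 1, 2), the integral can be done by hand and equals log(4/3) ≈ 0.2877, which gives a quick check of the numerical routine.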

2. Maximum likelihood estimation

Suppose that X1,...,Xn is a random sample of size n from the Lomax distribution with pdf given by (1.1). Let X = (X1,...,Xn) denote the corresponding random vector. If a realization x of X were known exactly, we could obtain the complete-data likelihood function as

$$L(x;\alpha,\lambda)=(\alpha\lambda)^{n}\prod_{i=1}^{n}(1+\lambda x_i)^{-(\alpha+1)}. \tag{2.1}$$
Now consider the problem where x is not observed precisely, and only partial information about x is available in the form of a fuzzy observation $\tilde{x}=(\tilde{x}_1,\ldots,\tilde{x}_n)$ with Borel measurable membership function $\mu_{\tilde{x}}(x)$. In practice, the grade of membership $\mu_{\tilde{x}}(x)$ is often regarded as a kind of “probability with which the observer gets the information $\tilde{x}$ when he really has obtained the exact outcome x”.

Example 1.

A geologist is interested in analyzing the length of the largest axis of boulders in the upper reaches of a particular river. Assume that the geologist does not have a measurement mechanism precise enough to determine exactly the length of the largest axis of the boulders. More precisely, suppose that the lack of roundness of these boulders only allows him to approximate the length of the largest axes by means of the following fuzzy observations: $\tilde{x}_1$ = “approximately lower than 30 inches”, $\tilde{x}_2$ = “approximately 35 to 50 inches”, $\tilde{x}_3$ = “approximately 55 inches”, $\tilde{x}_4$ = “approximately 60 to 65 inches”, $\tilde{x}_5$ = “approximately higher than 70 inches”, which are characterized by the membership functions in Fig. 1. (Clearly, a f.i.s. $\tilde{X}=\{\tilde{x}_1,\ldots,\tilde{x}_6\}$ can be immediately constructed by defining $\mu_{\tilde{x}_6}=1-\sum_{i=1}^{5}\mu_{\tilde{x}_i}$.)

Once x˜ is given, we can obtain the observed data log-likelihood function by using the expression (1.4) as follows:

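$$L_O(\tilde{x};\alpha,\lambda)=n\log\alpha+n\log\lambda+\sum_{i=1}^{n}\log\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx. \tag{2.2}$$
The maximum likelihood estimates of the parameters α and λ can be obtained by maximizing the log-likelihood $L_O(\tilde{x};\alpha,\lambda)$. Equating the partial derivatives of the log-likelihood (2.2) with respect to α and λ to zero, the resulting two equations are:
$$\frac{\partial L_O(\tilde{x};\alpha,\lambda)}{\partial\alpha}=\frac{n}{\alpha}-\sum_{i=1}^{n}\frac{\int(1+\lambda x)^{-(\alpha+1)}\log(1+\lambda x)\,\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}=0, \tag{2.3}$$
$$\frac{\partial L_O(\tilde{x};\alpha,\lambda)}{\partial\lambda}=\frac{n}{\lambda}-\sum_{i=1}^{n}\frac{\int(\alpha+1)\,x\,(1+\lambda x)^{-(\alpha+2)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}=0. \tag{2.4}$$
Since the likelihood equations (2.3) and (2.4) have no closed-form solutions, an iterative numerical search can be used to obtain the MLEs. In the following, we describe the Newton-Raphson method and the EM algorithm to determine the MLEs of the parameters α and λ.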
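As a sanity check on the observed-data log-likelihood (an illustration added here, not a result stated in the paper), suppose each membership function is the indicator of an interval $[a_i,b_i]$ with $0\le a_i<b_i$. The integral then has a closed form,

$$\int_{a_i}^{b_i}(1+\lambda x)^{-(\alpha+1)}\,dx=\frac{(1+\lambda a_i)^{-\alpha}-(1+\lambda b_i)^{-\alpha}}{\alpha\lambda},$$

so the terms $n\log\alpha+n\log\lambda$ cancel and the log-likelihood reduces to

$$L_O(\tilde{x};\alpha,\lambda)=\sum_{i=1}^{n}\log\left[(1+\lambda a_i)^{-\alpha}-(1+\lambda b_i)^{-\alpha}\right]=\sum_{i=1}^{n}\log\left[F(b_i)-F(a_i)\right],$$

which is the familiar log-likelihood for interval-censored Lomax data. Fuzzy observations with non-rectangular membership functions can thus be read as a weighted generalization of interval censoring.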

Fig. 1.

Membership functions of the fuzzy observations x˜1, x˜2, x˜3, x˜4 and x˜5.

2.1. Newton-Raphson algorithm

The Newton-Raphson algorithm is a direct approach for estimating the relevant parameters in a likelihood function. In this algorithm, the solution of the likelihood equation is obtained through an iterative procedure. Let θ = (α, λ)T be the parameter vector. Then, at the (h+1)th step of iteration process, the updated parameter is obtained as

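$$\theta^{(h+1)}=\theta^{(h)}-\left[\left.\frac{\partial^{2}L_O(\tilde{x};\theta)}{\partial\theta\,\partial\theta^{T}}\right|_{\theta=\theta^{(h)}}\right]^{-1}\left[\left.\frac{\partial L_O(\tilde{x};\theta)}{\partial\theta}\right|_{\theta=\theta^{(h)}}\right],$$
where
$$\frac{\partial L_O(\tilde{x};\theta)}{\partial\theta}=\begin{pmatrix}\dfrac{\partial L_O(\tilde{x};\alpha,\lambda)}{\partial\alpha}\\[8pt] \dfrac{\partial L_O(\tilde{x};\alpha,\lambda)}{\partial\lambda}\end{pmatrix}$$
and
$$\frac{\partial^{2}L_O(\tilde{x};\theta)}{\partial\theta\,\partial\theta^{T}}=\begin{pmatrix}\dfrac{\partial^{2}L_O(\tilde{x};\alpha,\lambda)}{\partial\alpha^{2}} & \dfrac{\partial^{2}L_O(\tilde{x};\alpha,\lambda)}{\partial\alpha\,\partial\lambda}\\[8pt] \dfrac{\partial^{2}L_O(\tilde{x};\alpha,\lambda)}{\partial\alpha\,\partial\lambda} & \dfrac{\partial^{2}L_O(\tilde{x};\alpha,\lambda)}{\partial\lambda^{2}}\end{pmatrix}.$$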
The second-order derivatives of the log-likelihood with respect to the parameters, required for proceeding with the Newton-Raphson method, are obtained as follows.
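$$\frac{\partial^{2}L_O(\tilde{x};\alpha,\lambda)}{\partial\alpha^{2}}=-\frac{n}{\alpha^{2}}+\sum_{i=1}^{n}\frac{\int(1+\lambda x)^{-(\alpha+1)}[\log(1+\lambda x)]^{2}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}-\sum_{i=1}^{n}\left[\frac{\int(1+\lambda x)^{-(\alpha+1)}\log(1+\lambda x)\,\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}\right]^{2},$$
$$\frac{\partial^{2}L_O(\tilde{x};\alpha,\lambda)}{\partial\lambda^{2}}=-\frac{n}{\lambda^{2}}+\sum_{i=1}^{n}\frac{\int(\alpha+1)(\alpha+2)\,x^{2}(1+\lambda x)^{-(\alpha+3)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}-\sum_{i=1}^{n}\left[\frac{\int(\alpha+1)\,x\,(1+\lambda x)^{-(\alpha+2)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}\right]^{2},$$
$$\frac{\partial^{2}L_O(\tilde{x};\alpha,\lambda)}{\partial\alpha\,\partial\lambda}=\sum_{i=1}^{n}\frac{\int x\,(1+\lambda x)^{-(\alpha+2)}\left[(\alpha+1)\log(1+\lambda x)-1\right]\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}-\sum_{i=1}^{n}\left(\frac{\int(1+\lambda x)^{-(\alpha+1)}\log(1+\lambda x)\,\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}\times\left[\frac{\int(\alpha+1)\,x\,(1+\lambda x)^{-(\alpha+2)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}\right]\right).$$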
The iteration process then continues until convergence, i.e., until $\|\theta^{(h+1)}-\theta^{(h)}\|<\varepsilon$ for some pre-fixed ε > 0. The MLE of (α, λ) obtained via the NR algorithm is hereafter referred to as $(\hat{\alpha}_{NR},\hat{\lambda}_{NR})$.
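The Newton-Raphson iteration described above can be sketched for triangular fuzzy numbers as follows. This is a hedged illustration rather than the authors' implementation (the paper reports using R): all function names are assumptions, the integrals are approximated with a composite Simpson rule, each fuzzy number (s1, s2, s3) must satisfy s1 < s2 < s3, and no safeguard is included against steps that leave the region α > 0, λ > 0.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def fuzzy_integral(weight, tri):
    """Integral of weight(x) * mu(x) for a triangular fuzzy number tri."""
    s1, s2, s3 = tri
    left = simpson(lambda x: weight(x) * (x - s1) / (s2 - s1), s1, s2)
    right = simpson(lambda x: weight(x) * (s3 - x) / (s3 - s2), s2, s3)
    return left + right


def loglik(theta, data):
    """Observed-data log-likelihood L_O for a list of triangular fuzzy numbers."""
    a, l = theta
    return len(data) * (math.log(a) + math.log(l)) + sum(
        math.log(fuzzy_integral(lambda x: (1 + l * x) ** (-(a + 1)), t)) for t in data)


def gradient_and_hessian(theta, data):
    """First and second derivatives of L_O with respect to (alpha, lambda)."""
    a, l = theta
    n = len(data)
    ga, gl = n / a, n / l
    haa, hll, hal = -n / a ** 2, -n / l ** 2, 0.0
    for t in data:
        d = fuzzy_integral(lambda x: (1 + l * x) ** (-(a + 1)), t)
        n1 = fuzzy_integral(lambda x: (1 + l * x) ** (-(a + 1)) * math.log(1 + l * x), t) / d
        n2 = fuzzy_integral(lambda x: (1 + l * x) ** (-(a + 1)) * math.log(1 + l * x) ** 2, t) / d
        m1 = fuzzy_integral(lambda x: (a + 1) * x * (1 + l * x) ** (-(a + 2)), t) / d
        m2 = fuzzy_integral(lambda x: (a + 1) * (a + 2) * x * x * (1 + l * x) ** (-(a + 3)), t) / d
        c = fuzzy_integral(
            lambda x: x * (1 + l * x) ** (-(a + 2)) * ((a + 1) * math.log(1 + l * x) - 1), t) / d
        ga -= n1
        gl -= m1
        haa += n2 - n1 ** 2
        hll += m2 - m1 ** 2
        hal += c - n1 * m1
    return (ga, gl), (haa, hal, hll)


def newton_raphson(data, theta0, eps=1e-6, max_iter=100):
    """Iterate theta <- theta - H^{-1} g until the step length falls below eps."""
    a, l = theta0
    for _ in range(max_iter):
        (ga, gl), (haa, hal, hll) = gradient_and_hessian((a, l), data)
        det = haa * hll - hal ** 2
        # Explicit inverse of the symmetric 2x2 Hessian applied to the gradient.
        a_new = a - (hll * ga - hal * gl) / det
        l_new = l - (haa * gl - hal * ga) / det
        if math.hypot(a_new - a, l_new - l) < eps:
            return a_new, l_new
        a, l = a_new, l_new
    raise RuntimeError("Newton-Raphson did not converge within max_iter steps")
```

Whether the iteration converges depends on the starting values and on the data; in practice one might add step halving or restart from several initial points, which is outside what the paper describes.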

It should be pointed out that the second-order derivatives of the log-likelihood are required at every iteration in the Newton-Raphson method. Sometimes the calculation of the derivatives based on fuzzy data can be rather tedious. Another viable alternative to the Newton-Raphson algorithm is the well-known EM algorithm. In the following, we discuss how that can be used to determine the MLEs in this case.

2.2. EM algorithm

The Expectation Maximization (EM) algorithm is a broadly applicable approach to the iterative computation of maximum likelihood estimates and useful in a variety of incomplete-data problems. Since the observed fuzzy data x˜ can be seen as an incomplete specification of a complete data vector x, the EM algorithm is applicable to obtain the maximum likelihood estimates of the unknown parameters. In the following, we use the EM algorithm to determine the MLEs of α and λ.

From (2.1), the log-likelihood function for the complete data vector x becomes:

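$$\log L(x;\alpha,\lambda)=n\log\alpha+n\log\lambda-(\alpha+1)\sum_{i=1}^{n}\log(1+\lambda x_i). \tag{2.6}$$
Taking the derivatives of (2.6) with respect to α and λ, respectively, the following likelihood equations are obtained:
$$\frac{n}{\alpha}=\sum_{i=1}^{n}\log(1+\lambda x_i), \tag{2.7}$$
$$\frac{n}{\lambda}=(\alpha+1)\sum_{i=1}^{n}\frac{x_i}{1+\lambda x_i}. \tag{2.8}$$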
Therefore the EM algorithm is given by the following iterative process:
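  1. Given starting values of α and λ, say $\alpha^{(0)}$ and $\lambda^{(0)}$, set h = 0.

  2. In the (h + 1)th iteration,

    (1) The E-step computes the following conditional expectations using expression (1.5):

      $$E_{\alpha^{(h)},\lambda^{(h)}}\left(\log(1+\lambda^{(h)}X)\,\middle|\,\tilde{x}_i\right)=\frac{\int(1+\lambda^{(h)}x)^{-(\alpha^{(h)}+1)}\log(1+\lambda^{(h)}x)\,\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda^{(h)}x)^{-(\alpha^{(h)}+1)}\mu_{\tilde{x}_i}(x)\,dx},$$
      $$E_{\alpha^{(h)},\lambda^{(h)}}\left(\frac{X}{1+\lambda^{(h)}X}\,\middle|\,\tilde{x}_i\right)=\frac{\int x\,(1+\lambda^{(h)}x)^{-(\alpha^{(h)}+2)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda^{(h)}x)^{-(\alpha^{(h)}+1)}\mu_{\tilde{x}_i}(x)\,dx},$$
      and the likelihood equations (2.7) and (2.8) are replaced by
      $$\frac{n}{\alpha}=\sum_{i=1}^{n}E_{\alpha^{(h)},\lambda^{(h)}}\left(\log(1+\lambda^{(h)}X)\,\middle|\,\tilde{x}_i\right), \tag{2.9}$$
      $$\frac{n}{\lambda}=(\alpha+1)\sum_{i=1}^{n}E_{\alpha^{(h)},\lambda^{(h)}}\left(\frac{X}{1+\lambda^{(h)}X}\,\middle|\,\tilde{x}_i\right). \tag{2.10}$$

    (2) The M-step solves Eqs. (2.9) and (2.10) to obtain the next values, $\alpha^{(h+1)}$ and $\lambda^{(h+1)}$, of α and λ, respectively, as follows:

      $$\alpha^{(h+1)}=\frac{n}{\sum_{i=1}^{n}E_{\alpha^{(h)},\lambda^{(h)}}\left(\log(1+\lambda^{(h)}X)\,\middle|\,\tilde{x}_i\right)},$$
      $$\lambda^{(h+1)}=\frac{n}{(\alpha^{(h+1)}+1)\sum_{i=1}^{n}E_{\alpha^{(h)},\lambda^{(h)}}\left(\dfrac{X}{1+\lambda^{(h)}X}\,\middle|\,\tilde{x}_i\right)}.$$

  3. Check convergence: if convergence has occurred, the current $\alpha^{(h+1)}$ and $\lambda^{(h+1)}$ are the maximum likelihood estimates of α and λ via the EM algorithm; otherwise, set h = h + 1 and go to Step 2. The MLE of (α, λ) obtained via the EM algorithm is hereafter referred to as $(\hat{\alpha}_{EM},\hat{\lambda}_{EM})$.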
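The three steps above can be sketched as follows for triangular fuzzy numbers. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the conditional expectations are approximated with a composite Simpson rule, each fuzzy number (s1, s2, s3) must satisfy s1 < s2 < s3, and the convergence criterion (sum of absolute changes) is one possible choice.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def fuzzy_integral(weight, tri):
    """Integral of weight(x) * mu(x) for a triangular fuzzy number tri."""
    s1, s2, s3 = tri
    left = simpson(lambda x: weight(x) * (x - s1) / (s2 - s1), s1, s2)
    right = simpson(lambda x: weight(x) * (s3 - x) / (s3 - s2), s2, s3)
    return left + right


def em_step(alpha, lam, data):
    """One E-step and M-step, starting from the current (alpha, lam)."""
    n = len(data)
    sum_log = 0.0    # sum over i of E[log(1 + lam*X) | x~_i]
    sum_ratio = 0.0  # sum over i of E[X / (1 + lam*X) | x~_i]
    for tri in data:
        denom = fuzzy_integral(lambda x: (1 + lam * x) ** (-(alpha + 1)), tri)
        sum_log += fuzzy_integral(
            lambda x: (1 + lam * x) ** (-(alpha + 1)) * math.log(1 + lam * x), tri) / denom
        sum_ratio += fuzzy_integral(
            lambda x: x * (1 + lam * x) ** (-(alpha + 2)), tri) / denom
    alpha_new = n / sum_log
    lam_new = n / ((alpha_new + 1) * sum_ratio)
    return alpha_new, lam_new


def em(data, alpha0, lam0, eps=1e-6, max_iter=5000):
    """Repeat em_step until successive iterates differ by less than eps."""
    alpha, lam = alpha0, lam0
    for _ in range(max_iter):
        alpha_new, lam_new = em_step(alpha, lam, data)
        if abs(alpha_new - alpha) + abs(lam_new - lam) < eps:
            return alpha_new, lam_new
        alpha, lam = alpha_new, lam_new
    raise RuntimeError("EM iteration did not converge within max_iter steps")
```

Each step needs only three integrals per observation and no derivatives, which is the practical appeal of this scheme compared with Newton-Raphson.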

3. Bayesian estimation

In recent decades, the Bayes viewpoint, as a powerful and valid alternative to traditional statistical perspectives, has received frequent attention for statistical inference. In this section, we consider the Bayesian estimation of the unknown parameters by using a squared error loss function. In order to evaluate the behavior of the parameters, it is assumed that α and λ have independent gamma priors with the pdfs

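$$\pi_1(\alpha)=\frac{d^{c}}{\Gamma(c)}\,\alpha^{c-1}e^{-\alpha d},\quad\alpha>0,$$
and
$$\pi_2(\lambda)=\frac{b^{a}}{\Gamma(a)}\,\lambda^{a-1}e^{-\lambda b},\quad\lambda>0.$$
Based on the above priors, the joint posterior density function of α and λ given the data can be written as follows:
$$\pi(\alpha,\lambda\,|\,\tilde{x})=\frac{\pi_1(\alpha)\,\pi_2(\lambda)\,\ell(\tilde{x};\alpha,\lambda)}{\int_0^\infty\int_0^\infty\pi_1(\alpha)\,\pi_2(\lambda)\,\ell(\tilde{x};\alpha,\lambda)\,d\alpha\,d\lambda},$$
where
$$\ell(\tilde{x};\alpha,\lambda)=\alpha^{n}\lambda^{n}\prod_{i=1}^{n}\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx$$
is the likelihood function based on the fuzzy sample $\tilde{x}$. Then, under the squared error loss function, the Bayes estimate of any function of α and λ, say g(α, λ), is
$$E(g(\alpha,\lambda)\,|\,\tilde{x})=\frac{\int_0^\infty\int_0^\infty g(\alpha,\lambda)\,\pi_1(\alpha)\,\pi_2(\lambda)\,\ell(\tilde{x};\alpha,\lambda)\,d\alpha\,d\lambda}{\int_0^\infty\int_0^\infty\pi_1(\alpha)\,\pi_2(\lambda)\,\ell(\tilde{x};\alpha,\lambda)\,d\alpha\,d\lambda}=\frac{\int_0^\infty\int_0^\infty g(\alpha,\lambda)\,e^{Q(\alpha,\lambda)}\,d\alpha\,d\lambda}{\int_0^\infty\int_0^\infty e^{Q(\alpha,\lambda)}\,d\alpha\,d\lambda}, \tag{3.4}$$
where $Q(\alpha,\lambda)=\ln[\pi_1(\alpha)\pi_2(\lambda)]+\ln\ell(\tilde{x};\alpha,\lambda)\equiv\rho(\alpha,\lambda)+L(\alpha,\lambda)$. Note that Eq. (3.4) cannot be evaluated analytically; therefore, in the following, we adopt Tierney and Kadane’s approximation for computing the Bayes estimates.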

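Setting H(α, λ) = Q(α, λ)/n and H*(α, λ) = [ln g(α, λ) + Q(α, λ)]/n, the expression in (3.4) can be re-expressed as

$$E(g(\alpha,\lambda)\,|\,\tilde{x})=\frac{\int_0^\infty\int_0^\infty e^{nH^{*}(\alpha,\lambda)}\,d\alpha\,d\lambda}{\int_0^\infty\int_0^\infty e^{nH(\alpha,\lambda)}\,d\alpha\,d\lambda}. \tag{3.5}$$
Following Tierney and Kadane [32], Eq. (3.5) can be approximated as
$$\hat{g}_{BT}(\alpha,\lambda)=\left(\frac{\det\Sigma^{*}}{\det\Sigma}\right)^{1/2}\exp\left\{n\left[H^{*}(\bar{\alpha}^{*},\bar{\lambda}^{*})-H(\bar{\alpha},\bar{\lambda})\right]\right\}, \tag{3.6}$$
where $(\bar{\alpha}^{*},\bar{\lambda}^{*})$ and $(\bar{\alpha},\bar{\lambda})$ maximize H*(α, λ) and H(α, λ), respectively, and Σ* and Σ are the negatives of the inverse Hessians of H*(α, λ) and H(α, λ) at $(\bar{\alpha}^{*},\bar{\lambda}^{*})$ and $(\bar{\alpha},\bar{\lambda})$, respectively.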

In our case, we have

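$$H(\alpha,\lambda)=\frac{1}{n}\left\{k+(n+c-1)\log\alpha+(n+a-1)\log\lambda-\lambda b-\alpha d+\sum_{i=1}^{n}\log\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx\right\},$$
where k is a constant. Therefore, $(\bar{\alpha},\bar{\lambda})$ can be obtained by solving the following two equations:
$$\frac{\partial H(\alpha,\lambda)}{\partial\alpha}=\frac{1}{n}\left\{\frac{n+c-1}{\alpha}-d-\sum_{i=1}^{n}\frac{\int(1+\lambda x)^{-(\alpha+1)}\log(1+\lambda x)\,\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}\right\}=0,$$
$$\frac{\partial H(\alpha,\lambda)}{\partial\lambda}=\frac{1}{n}\left\{\frac{n+a-1}{\lambda}-b-\sum_{i=1}^{n}\frac{\int(\alpha+1)\,x\,(1+\lambda x)^{-(\alpha+2)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\lambda x)^{-(\alpha+1)}\mu_{\tilde{x}_i}(x)\,dx}\right\}=0,$$
and from the second derivatives of H(α, λ), the determinant of the negative of the inverse Hessian of H(α, λ) at $(\bar{\alpha},\bar{\lambda})$ is given by
$$\det\Sigma=\left(H_{11}H_{22}-H_{12}^{2}\right)^{-1},$$
where
$$H_{11}=\frac{1}{n}\left\{-\frac{n+c-1}{\bar{\alpha}^{2}}+\sum_{i=1}^{n}\left(\frac{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}[\log(1+\bar{\lambda}x)]^{2}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\mu_{\tilde{x}_i}(x)\,dx}\right)-\sum_{i=1}^{n}\left[\frac{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\log(1+\bar{\lambda}x)\,\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\mu_{\tilde{x}_i}(x)\,dx}\right]^{2}\right\},$$
$$H_{22}=\frac{1}{n}\left\{-\frac{n+a-1}{\bar{\lambda}^{2}}+\sum_{i=1}^{n}\left(\frac{\int(\bar{\alpha}+1)(\bar{\alpha}+2)\,x^{2}(1+\bar{\lambda}x)^{-(\bar{\alpha}+3)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\mu_{\tilde{x}_i}(x)\,dx}\right)-\sum_{i=1}^{n}\left[\frac{\int(\bar{\alpha}+1)\,x\,(1+\bar{\lambda}x)^{-(\bar{\alpha}+2)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\mu_{\tilde{x}_i}(x)\,dx}\right]^{2}\right\},$$
$$H_{12}=\frac{1}{n}\left\{\sum_{i=1}^{n}\frac{\int x\,(1+\bar{\lambda}x)^{-(\bar{\alpha}+2)}\left[(\bar{\alpha}+1)\log(1+\bar{\lambda}x)-1\right]\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\mu_{\tilde{x}_i}(x)\,dx}-\sum_{i=1}^{n}\left(\frac{\int(\bar{\alpha}+1)\,x\,(1+\bar{\lambda}x)^{-(\bar{\alpha}+2)}\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\mu_{\tilde{x}_i}(x)\,dx}\times\left[\frac{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\log(1+\bar{\lambda}x)\,\mu_{\tilde{x}_i}(x)\,dx}{\int(1+\bar{\lambda}x)^{-(\bar{\alpha}+1)}\mu_{\tilde{x}_i}(x)\,dx}\right]\right)\right\}.$$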
Now, following the same argument with g(α, λ) = α and λ, respectively, in H *(α, λ), α^BT and λ^BT in Eq. (3.6) can then be obtained straightforwardly.
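To spell out what "the same argument" involves, the following worked step (added for illustration and derived from the definitions above, not quoted from the paper) takes g(α, λ) = α. Since $H^{*}=H+\frac{1}{n}\ln g$,

$$H^{*}(\alpha,\lambda)=H(\alpha,\lambda)+\frac{1}{n}\log\alpha,\qquad \frac{\partial H^{*}}{\partial\alpha}=\frac{\partial H}{\partial\alpha}+\frac{1}{n\alpha},\qquad \frac{\partial H^{*}}{\partial\lambda}=\frac{\partial H}{\partial\lambda},$$

$$H^{*}_{11}=H_{11}-\frac{1}{n\alpha^{2}},\qquad H^{*}_{22}=H_{22},\qquad H^{*}_{12}=H_{12},\qquad \det\Sigma^{*}=\left(H^{*}_{11}H^{*}_{22}-H^{*2}_{12}\right)^{-1},$$

with all second derivatives evaluated at $(\bar{\alpha}^{*},\bar{\lambda}^{*})$. Equivalently, $H^{*}$ is $H$ with the coefficient $(n+c-1)$ of $\log\alpha$ replaced by $(n+c)$, so the same solver can be reused with $c+1$ in place of $c$. For g(α, λ) = λ the analogous change is $a\to a+1$, giving $H^{*}_{22}=H_{22}-\frac{1}{n\lambda^{2}}$ and leaving $H^{*}_{11}$ and $H^{*}_{12}$ in the same form as $H_{11}$ and $H_{12}$.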

4. Numerical Study

4.1. Monte Carlo Simulations

In this section, we present some experimental results, mainly to observe how the different methods behave for different sample sizes. We obtain the estimates of the unknown parameters α and λ by using the methods provided in the preceding sections. The performance of the competitive estimates has been compared on the basis of their estimated biases and mean squared errors. The computations are performed using R 2.14.0 [27], which is a non-commercial, open source software package for statistical computing.

First, for two sets of parameter values, namely (α = 1, λ = 1) and (α = 2, λ = 1.5), and various choices of sample size n, we generated i.i.d. random samples from the Lomax distribution. Then, using the method proposed by Pak et al. [24], each realization of the generated samples was fuzzified by employing the fuzzy information system shown in Fig. 2, and the estimates of the parameters for the fuzzy sample were computed using the maximum likelihood method (via the Newton-Raphson and EM algorithms) and the Bayesian procedure. To make the comparison between ML and Bayes estimates meaningful, two types of priors are suggested, namely a non-informative prior and an informative gamma prior (Lin et al. [19]). The first is used when we do not have any prior information about the parameters, while the informative prior is used to investigate any improvement in the performance of the Bayes estimators. Here, we have considered two different choices of hyperparameters:

  • Prior I: (non-informative) a = b = c = d = 0

  • Prior II: (informative ) a = b = c = d = 2.

Figure 2:

Fuzzy information system used to encode the simulated data
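The fuzzy information system of Fig. 2 is not recoverable from the text, so the following sketch uses a hypothetical one built from arbitrary knots. It illustrates one plausible reading of the fuzzification step, in which a crisp value x is reported as the fuzzy event with index j with probability equal to its membership grade, in line with the interpretation of the grade of membership given in Section 2. The knots, function names, and sampling rule are all assumptions and may differ from the method of Pak et al. [24].

```python
import bisect
import random

# Hypothetical peak locations of the fuzzy events (not taken from Fig. 2).
KNOTS = [0.25, 0.5, 1.0, 2.0, 4.0]


def memberships(x, knots=KNOTS):
    """Membership grades of x in each fuzzy event; the grades sum to 1.

    Interior events are triangular with peaks at the knots; the first and
    last events are shoulders equal to 1 beyond the outermost knots.
    """
    grades = [0.0] * len(knots)
    if x <= knots[0]:
        grades[0] = 1.0
    elif x >= knots[-1]:
        grades[-1] = 1.0
    else:
        j = bisect.bisect_right(knots, x)  # knots[j-1] <= x < knots[j]
        w = (x - knots[j - 1]) / (knots[j] - knots[j - 1])
        grades[j - 1] = 1.0 - w
        grades[j] = w
    return grades


def fuzzify(x, rng, knots=KNOTS):
    """Return the index of the fuzzy event reported for the crisp value x."""
    grades = memberships(x, knots)
    return rng.choices(range(len(knots)), weights=grades, k=1)[0]
```

Because the grades form a partition of unity, they satisfy the orthogonality condition of Definition 2 and can be used directly as sampling weights.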

It must be noted that the non-informative Prior I is also improper. Press [25] suggested using very small positive values of the hyperparameters in this case, which makes the priors proper. We have tried a = b = c = d = 0.0001. The results are not significantly different from the corresponding results obtained using the improper priors, and are not reported for reasons of space. From now on, the Bayes estimates of the parameters obtained by using the above priors will be denoted by BET1 and BET2, respectively. The estimated biases and mean squared errors of the estimates over 1000 replications are presented in Tables 1-4.
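The paper does not spell out how the estimated bias and mean squared error are computed; the sketch below assumes the usual Monte Carlo definitions, namely the average of (estimate − true value) and of its square over the replications. The function name is illustrative.

```python
def bias_and_mse(estimates, true_value):
    """Monte Carlo estimates of bias and mean squared error.

    estimates: one parameter estimate per replication.
    """
    m = len(estimates)
    bias = sum(e - true_value for e in estimates) / m
    mse = sum((e - true_value) ** 2 for e in estimates) / m
    return bias, mse
```

In a study like the one described here, `estimates` would hold the 1000 values of, say, the EM estimate of α for one sample size, and `true_value` would be 1 or 2 depending on the parameter setting.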

| n | NR EB | NR MSE | EM EB | EM MSE | BET1 EB | BET1 MSE | BET2 EB | BET2 MSE |
|---|---|---|---|---|---|---|---|---|
| 15 | 0.0567 | 0.0396 | 0.0569 | 0.0397 | 0.0553 | 0.0391 | 0.0535 | 0.0366 |
| 20 | 0.0509 | 0.0378 | 0.0511 | 0.0380 | 0.0504 | 0.0375 | 0.0477 | 0.0332 |
| 25 | 0.0482 | 0.0341 | 0.0485 | 0.0343 | 0.0479 | 0.0338 | 0.0439 | 0.0319 |
| 30 | 0.0417 | 0.0276 | 0.0419 | 0.0277 | 0.0403 | 0.0269 | 0.0381 | 0.0253 |
| 40 | 0.0362 | 0.0219 | 0.0364 | 0.0220 | 0.0361 | 0.0217 | 0.0345 | 0.0196 |
| 50 | 0.0319 | 0.0154 | 0.0319 | 0.0154 | 0.0315 | 0.0152 | 0.0307 | 0.0122 |
| 70 | 0.0254 | 0.0117 | 0.0256 | 0.0118 | 0.0237 | 0.0114 | 0.0239 | 0.0104 |
| 100 | 0.0213 | 0.0093 | 0.0214 | 0.0093 | 0.0213 | 0.0092 | 0.0208 | 0.0089 |
| 200 | 0.0173 | 0.0081 | 0.0174 | 0.0082 | 0.0171 | 0.0080 | 0.0168 | 0.0076 |

Table 1. Estimated bias (EB) and mean squared error (MSE) of the ML and Bayes estimates of α for different sample sizes (α = 1, λ = 1).

| n | NR EB | NR MSE | EM EB | EM MSE | BET1 EB | BET1 MSE | BET2 EB | BET2 MSE |
|---|---|---|---|---|---|---|---|---|
| 15 | 0.0873 | 0.0542 | 0.0875 | 0.0543 | 0.0856 | 0.0537 | 0.0822 | 0.0517 |
| 20 | 0.0846 | 0.0507 | 0.0849 | 0.0508 | 0.0840 | 0.0498 | 0.0805 | 0.0492 |
| 25 | 0.0721 | 0.0440 | 0.0722 | 0.0441 | 0.0713 | 0.0432 | 0.0689 | 0.0420 |
| 30 | 0.0697 | 0.0416 | 0.0698 | 0.0417 | 0.0685 | 0.0409 | 0.0644 | 0.0397 |
| 40 | 0.0611 | 0.0357 | 0.0611 | 0.0357 | 0.0605 | 0.0348 | 0.0572 | 0.0308 |
| 50 | 0.0558 | 0.0313 | 0.0560 | 0.0315 | 0.0557 | 0.0310 | 0.0521 | 0.0245 |
| 70 | 0.0432 | 0.0271 | 0.0432 | 0.0271 | 0.0429 | 0.0262 | 0.0419 | 0.0228 |
| 100 | 0.0395 | 0.0249 | 0.0396 | 0.0251 | 0.0395 | 0.0248 | 0.0387 | 0.0244 |
| 200 | 0.0246 | 0.0177 | 0.0247 | 0.0177 | 0.0242 | 0.0176 | 0.0241 | 0.0169 |

Table 2. Estimated bias (EB) and mean squared error (MSE) of the ML and Bayes estimates of λ for different sample sizes (α = 1, λ = 1).

| n | NR EB | NR MSE | EM EB | EM MSE | BET1 EB | BET1 MSE | BET2 EB | BET2 MSE |
|---|---|---|---|---|---|---|---|---|
| 15 | 0.0941 | 0.0687 | 0.0942 | 0.0687 | 0.0926 | 0.0675 | 0.0871 | 0.0619 |
| 20 | 0.0912 | 0.0634 | 0.0914 | 0.0635 | 0.0908 | 0.0631 | 0.0835 | 0.0578 |
| 25 | 0.0858 | 0.0573 | 0.0860 | 0.0574 | 0.0839 | 0.0559 | 0.0772 | 0.0513 |
| 30 | 0.0730 | 0.0506 | 0.0733 | 0.0506 | 0.0724 | 0.0502 | 0.0694 | 0.0462 |
| 40 | 0.0691 | 0.0428 | 0.0692 | 0.0428 | 0.0690 | 0.0427 | 0.0645 | 0.0408 |
| 50 | 0.0617 | 0.0382 | 0.0617 | 0.0384 | 0.0611 | 0.0380 | 0.0603 | 0.0374 |
| 70 | 0.0582 | 0.0341 | 0.0582 | 0.0342 | 0.0579 | 0.0338 | 0.0578 | 0.0336 |
| 100 | 0.0424 | 0.0296 | 0.0426 | 0.0296 | 0.0424 | 0.0294 | 0.0421 | 0.0290 |
| 200 | 0.0316 | 0.0182 | 0.0318 | 0.0183 | 0.0314 | 0.0178 | 0.0310 | 0.0177 |

Table 3. Estimated bias (EB) and mean squared error (MSE) of the ML and Bayes estimates of α for different sample sizes (α = 2, λ = 1.5).

| n | NR EB | NR MSE | EM EB | EM MSE | BET1 EB | BET1 MSE | BET2 EB | BET2 MSE |
|---|---|---|---|---|---|---|---|---|
| 15 | 0.1456 | 0.1075 | 0.1457 | 0.1076 | 0.1408 | 0.1055 | 0.1289 | 0.0966 |
| 20 | 0.1391 | 0.0928 | 0.1393 | 0.0930 | 0.1375 | 0.0918 | 0.1321 | 0.0874 |
| 25 | 0.1328 | 0.0861 | 0.1328 | 0.0861 | 0.1327 | 0.0854 | 0.1296 | 0.0815 |
| 30 | 0.1279 | 0.0827 | 0.1282 | 0.0828 | 0.1276 | 0.0823 | 0.1208 | 0.0792 |
| 40 | 0.1044 | 0.0695 | 0.1045 | 0.0697 | 0.1032 | 0.0692 | 0.0983 | 0.0648 |
| 50 | 0.0953 | 0.0613 | 0.0954 | 0.0613 | 0.0941 | 0.0610 | 0.0877 | 0.0580 |
| 70 | 0.0765 | 0.0489 | 0.0767 | 0.0489 | 0.0762 | 0.0488 | 0.0714 | 0.0456 |
| 100 | 0.0492 | 0.0325 | 0.0494 | 0.0327 | 0.0490 | 0.0324 | 0.0478 | 0.0313 |
| 200 | 0.0378 | 0.0219 | 0.0379 | 0.0219 | 0.0377 | 0.0215 | 0.0364 | 0.0209 |

Table 4. Estimated bias (EB) and mean squared error (MSE) of the ML and Bayes estimates of λ for different sample sizes (α = 2, λ = 1.5).

From the experiments, we found that the NR and EM algorithms give similar maximum likelihood estimates of α and λ. Moreover, with the EM algorithm there is no need to evaluate the first and second derivatives of the log-likelihood function, which helps save central processing unit (CPU) time in each iteration; although the NR method requires fewer iterations, the CPU time required per iteration is substantially shorter for the EM algorithm. The performance of the estimates is satisfactory in terms of biases and MSEs, even for small sample sizes. It can further be observed that when we do not have any prior information about the parameters, the Bayes estimates offer little gain over the MLEs. For all the methods, the biases and MSEs of the estimates decrease as the sample size increases, as expected. For small and moderate sample sizes, the Bayesian approach based on informative priors gives the most precise parameter estimates, as shown by the EBs and MSEs in Tables 1-4. For large sample sizes (n = 100, 200), the performances of the MLEs and Bayes estimates are almost identical.

4.2. Illustrative example

In this example, we consider a data set that was obtained from a meteorological study by Simpson [29] and further analyzed by Helu et al. [14]. The study was based on the radar-evaluated rainfall from 52 south Florida cumulus clouds, 26 seeded clouds and 26 control clouds. Since rainfall data evaluated by radar systems inevitably have some degree of imprecision, it is suggested to report the partial information on the rainfalls by means of lower and upper bounds, as well as a point estimate. Assume that the imprecision of the rainfalls is formulated by triangular fuzzy numbers $\tilde{x}_i=(\mu_i,x_i,\eta_i)$, i = 1,...,52, where $\mu_i=0.05x_i$ and $\eta_i=0.03x_i$, with the membership functions

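$$\mu_{\tilde{x}_i}(x)=\begin{cases}\dfrac{x-(x_i-\mu_i)}{\mu_i}, & x_i-\mu_i\le x\le x_i,\\[6pt] \dfrac{x_i+\eta_i-x}{\eta_i}, & x_i\le x\le x_i+\eta_i.\end{cases}$$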
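As a small illustration of this encoding, the sketch below builds the support points and membership grade for one recorded value with left spread 0.05x and right spread 0.03x. The function names and the value 100.0 are hypothetical; the actual rainfall data are not reproduced here.

```python
def relative_triangular(x, left=0.05, right=0.03):
    """Support points (lower, peak, upper) for a recorded value x > 0."""
    return (x - left * x, x, x + right * x)


def membership(x, point, left=0.05, right=0.03):
    """Grade of membership of x in the fuzzy number centred at point > 0."""
    mu_i, eta_i = left * point, right * point
    if point - mu_i <= x <= point:
        return (x - (point - mu_i)) / mu_i
    if point < x <= point + eta_i:
        return (point + eta_i - x) / eta_i
    return 0.0


# Hypothetical recorded value, for illustration only.
lower, peak, upper = relative_triangular(100.0)
```

The triple returned by `relative_triangular` uses endpoints rather than spreads, which is the (σ1, σ2, σ3) convention of Section 1.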
Here, for the fuzzy sample $\tilde{x}_1,\ldots,\tilde{x}_{52}$, we employ the NR and EM algorithms to compute the ML estimates of the parameters. The stopping criterion is based on the difference between two consecutive iterates, with a tolerance value $\varepsilon=10^{-6}$. The final MLEs are $(\hat{\alpha}_{NR}=0.7843,\ \hat{\lambda}_{NR}=0.0054)$ and $(\hat{\alpha}_{EM}=0.7844,\ \hat{\lambda}_{EM}=0.0052)$. For computing the Bayes estimates, two different choices of hyperparameters are considered: (a, b, c, d) = (0.0001, 0.0001, 0.0001, 0.0001) and (2, 2, 2, 2). Then, using Tierney and Kadane’s approximation, the Bayes estimates of the parameters become $(\hat{\alpha}_{BT}=0.7824,\ \hat{\lambda}_{BT}=0.0057)$ and $(\hat{\alpha}_{BT}=0.7761,\ \hat{\lambda}_{BT}=0.0063)$ for the above non-informative and informative priors, respectively. It is observed that the ML estimates of the parameters from the Newton-Raphson and EM algorithms are about the same. Moreover, the Bayes estimates based on the non-informative priors are very close to the corresponding MLEs.

5. Conclusions

In the literature, there are well-developed estimation techniques for the parameters of the Lomax distribution based on complete and censored data. Traditionally, however, it is assumed that the available data are obtained as exact numbers. In real-world situations, the results of an experiment cannot always be recorded or measured precisely, and each observable event may only be identified with a fuzzy subset of the sample space. Therefore, we need suitable statistical methodology to handle such data as well. In this paper, we have discussed different estimation procedures for the Lomax distribution when the data are reported in terms of fuzzy information. They include the maximum likelihood method (via the Newton-Raphson and EM algorithms) and the Bayesian procedure. We have then carried out a simulation study to assess the performance of these procedures. Based on the results of the simulation study, the maximum likelihood estimates based on the Newton-Raphson and EM algorithms behave in a very similar manner, but the EM algorithm is computationally slower. Of the two estimation procedures developed in the paper, the Bayesian procedure with informative priors gives smaller biases and MSEs than the maximum likelihood method. Also, the MSE values of the estimates become close to each other as the sample size increases.

Acknowledgements.

The authors would like to thank the referees for their constructive comments and suggestions which improved and enriched the presentation of the paper.

References

[3] A Asgharzadeh and R Valiollahi, Estimation of the scale parameter of the Lomax distribution under progressive censoring, International Journal of Statistics and Economics, Vol. 6, 2011, pp. 37-48.
[4] N Balakrishnan and M Ahsanullah, Relations for single and product moments of record values from Lomax distribution, Sankhya B, Vol. 56, No. 1.2, 1994, pp. 140-146.
[12] A Elfattah, F Alaboud, and A Alharby, On sample size estimation for Lomax distribution, Australian Journal of Basic and Applied Sciences, Vol. 4, 2007, pp. 373-378.
[16] BM Hsu, BS Chen, and MH Shu, Evaluating lifetime performance for the Pareto model with censored information, Second International Conference on Innovative Computing, Information and Control, 2007.
[22] T Oztekun, Comparison of parameter estimation methods for the three-parameter generalized Pareto distribution, Turkish Journal of Agriculture & Forestry, Vol. 29, 2005, pp. 419-428.
[23] A Pak, GH Parham, and M Saraj, Inference for the Weibull Distribution Based on Fuzzy Data, Revista Colombiana de Estadistica, Vol. 36, No. 1.2, 2013, pp. 339-358.
[25] SJ Press, The Subjectivity of Scientists and the Bayesian Approach, Wiley, New York, NY, USA, 2001.
[27] R Development Core Team, A Language and Environment for Statistical Computing: R Foundation for Statistical Computing, Vienna, Austria, 2011.
[31] H Tanaka, T Okuda, and K Asai, Fuzzy information and decision in statistical model, Advances in Fuzzy Sets Theory and Applications, North-Holland, Amsterdam, 1979, pp. 303-320.
Journal
Journal of Statistical Theory and Applications
Volume-Issue
17 - 1
Pages
122 - 135
Publication Date
2018/03/31
ISSN (Online)
2214-1766
ISSN (Print)
1538-7887
