# Journal of Statistical Theory and Applications

Volume 20, Issue 1, March 2021, Pages 46 - 60

# Construction of Strata for a Model-Based Allocation Under a Superpopulation Model

Authors
Bhuwaneshwar Kumar Gupt1, *, Md. Irphan Ahamed2
1Department of Statistics, North-Eastern Hill University, Shillong, 793022, India
2Department of Mathematics, Umshyrpi College, Shillong, 793004, India
*Corresponding author. Email: bhuwaneshwargupt@gmail.com
Corresponding Author
Bhuwaneshwar Kumar Gupt
Received 21 April 2019, Accepted 7 October 2020, Available Online 13 January 2021.
DOI
10.2991/jsta.d.210107.001
Keywords
Approximately optimum strata boundaries; Auxiliary variable; Optimum strata boundaries; Superpopulation models
Abstract

This paper considers the problem of optimum stratification for a model-based allocation under a superpopulation model. The equations giving optimum points of stratification have been derived and a few methods for finding approximately optimum points of stratification have been obtained from the equations. Numerical illustrations using generated data have been worked out and the proposed methods of stratification have been compared with equal interval stratification.

Open Access

## 1. INTRODUCTION

The classical work of Tschuprow [1] and Neyman [2] on the allocation of sample size to strata opened the way for further research on allocation and stratification in stratified sampling. Early work on both the allocation of sample size to strata and optimum stratification was, however, based on the values of the study variable y itself. Cochran [3] demonstrated that, when information on an auxiliary variable highly correlated with the study variable y is available, a superpopulation model can be constructed in which the finite population under study is treated as a random sample from an infinite population (superpopulation). Such a model can also be used for constructing strata and allocating the sample size to strata.

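Hanurav [4] and Rao [5] initiated the use of auxiliary information for allocating the sample size to strata, considering the following superpopulation model:

$$
\begin{aligned}
&\text{(i)}\quad \xi(y_i \mid x_i) = \alpha + \beta x_i,\\
&\text{(ii)}\quad v(y_i \mid x_i) = \sigma^2 x_i^{g},\\
&\text{(iii)}\quad \varsigma(y_i, y_j \mid x_i, x_j) = 0,
\end{aligned}
\tag{1}
$$

where $\alpha$, $\beta$, $\sigma^2$ and $g$ are superpopulation parameters with $\sigma^2 > 0$ and $g \ge 0$. The script letters $\xi$, $v$ and $\varsigma$ denote conditional expectation, variance and covariance given the x's, respectively.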
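To make model (1) concrete, the following minimal sketch generates a finite population from it. The model fixes only the first two conditional moments, so the normal error distribution, the helper name `generate_population`, the callback `draw_x` and the parameter values are all assumptions made for illustration.

```python
import random


def generate_population(N, alpha, beta, sigma2, g, draw_x, seed=0):
    """Draw N pairs (x, y) with E(y|x) = alpha + beta*x and Var(y|x) = sigma2 * x**g.

    Normal errors are an assumption; model (1) only specifies the moments.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(N):
        x = draw_x(rng)                       # auxiliary variable
        sd = (sigma2 * x ** g) ** 0.5         # conditional standard deviation
        y = alpha + beta * x + rng.gauss(0.0, sd)
        population.append((x, y))
    return population


# Hypothetical example: x uniform on [1, 2], alpha = 0, beta = 1, g = 2.
pop = generate_population(5000, 0.0, 1.0, 0.01, 2.0,
                          lambda rng: rng.uniform(1.0, 2.0))
```

Independent draws for different units give zero conditional covariance, in line with condition (iii).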

Hanurav [4] studied the allocation problem and obtained the allocation $n_h \propto N_h\sigma_{hx}$ for simple random sampling with replacement (SRSWR) within each stratum under the particular case $g=2$ of model (1). This allocation follows from the Tschuprow–Neyman allocation when the unknown proportionate values of the $\sigma_{hy}^2$ are replaced by the known proportionate values of the $\sigma_{hx}^2$, which serve as estimates of the $\sigma_{hy}^2$. Rao [5] examined analytically the justification for assuming that the unknown proportionate values of the $\sigma_{hy}^2$ do not differ much from the known proportionate values of the $\sigma_{hx}^2$. He proved that the $\sigma_{hy}^2$ can be expected to be in the same proportion as the $\sigma_{hx}^2$ if the squares of the corrected coefficients of variation of the x character, defined by σhx2Xh2δhNh2, where δh=Nhj=1NhXhj2j=1NhXhjg, are equal in all strata. He also obtained the allocation that minimizes the expected variance of the strategy consisting of the πPS sampling scheme and the estimator of Narain [6] and Horvitz and Thompson [7], under the particular case of model (1) with intercept $\alpha=0$.

The problem of optimum allocation of sample size to strata for probability proportional to size with replacement (PPSWR) within each stratum under a particular case α=0 of the model (1) was considered by Gupt and Rao [8].

On the other hand, ever since Dalenius [9] pioneered the work on optimum stratification based on the estimation variable for the Tschuprow–Neyman allocation, later workers have extended it in various directions. For approximate solutions to the equations giving the optimum strata boundaries (OSBs), Dalenius and Hodges [10] were the first to propose the cum $\sqrt{f}$ rule. For optimum stratification using an auxiliary variable, the original main works, inter alia, were those of Dalenius and Gurney [11] and Taga [12], who considered the Tschuprow–Neyman and proportional allocations (PA), respectively. Subsequently, Singh and Sukhatme [13,14], Serfling [15], Singh and Parkash [16] and Singh [17–20] furthered the work on optimum stratification based on an auxiliary variable for various allocation methods, and also obtained a number of methods for finding approximate solutions to the equations giving the OSB.

For the case where an auxiliary variable highly correlated with the study variable is available, Singh and Sukhatme [13] stipulated the following superpopulation model, in which the form of the regression of the estimation variable y on the concomitant variable x and the form of the variance function are known:

$$
y = c(x) + e, \qquad \xi(e \mid x) = 0, \qquad v(e \mid x) = \varphi(x),
\tag{2}
$$

where $c(x)$ and $\varphi(x)$ are real-valued functions of x with $\varphi(x) > 0$ for all x in the range $(a, b)$, and $b - a < \infty$.

Singh and Sukhatme [13] derived OSB and approximately optimum strata boundaries (AOSBs) based on the auxiliary variable for Tschuprow–Neyman Optimum Allocation (TNOA) and PA under the superpopulation model (2) and empirically illustrated under model (1) for the particular case g=1.

Yadava and Singh [21] derived equations for OSB for allocation proportional to strata totals of an auxiliary variable and also developed a few methods of obtaining their approximate solutions.

Gupt [22,23] considered sample size allocation problem by modifying the above model (1) in such a way that the element of correlation among units within the same stratum was incorporated; he derived three model-based allocations by assuming some conditions for approximation.

In this paper, we consider the problem of optimum stratification for the following allocation, which is one of the three model-based allocations:

$$
n_h \propto N_h\sqrt{\mu_h(x^g)},
\tag{3}
$$

provided that the $\theta_{hg} = \sigma_{hx}/\sqrt{\mu_h(x^g)}$ are equal in all strata.

The equations giving the OSB for allocation (3) under stratified SRSWR are derived in Section 2. These results also hold for stratified simple random sampling without replacement (SRSWOR) when the finite population correction is ignored in each stratum. Methods of obtaining AOSB from the equations giving the OSB are developed in Section 3. The limiting lower bound of the variance of the estimate of the population mean, as the number of strata tends to infinity, is given in Section 4. Numerical illustrations using generated populations are worked out in Section 5, and the conclusion is given in Section 6.

## 2. EQUATIONS GIVING OSB

The model-based allocation (3) can be expressed as

$$
n_h = n\,\frac{W_h\sqrt{\mu_h(x^g)}}{\sum_{h=1}^{L} W_h\sqrt{\mu_h(x^g)}},
\tag{4}
$$

where $W_h$ is the proportion of population units in the hth stratum and $\mu_h(x^g)$ is the mean of $x^g$ in the hth stratum.
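As a small illustration of (4), the sketch below turns stratum weights and stratum means of $x^g$ into sample sizes. The function name `model_based_allocation` and the input values are hypothetical, and the fractional sizes returned would still need rounding in practice.

```python
def model_based_allocation(n, weights, mean_xg):
    """Allocate n as n_h = n * W_h * sqrt(mu_h(x^g)) / sum_h W_h * sqrt(mu_h(x^g))."""
    roots = [w * m ** 0.5 for w, m in zip(weights, mean_xg)]
    total = sum(roots)
    return [n * r / total for r in roots]


# Hypothetical two-stratum example: equal weights, mu_1(x^g) = 1, mu_2(x^g) = 4.
alloc = model_based_allocation(100, [0.5, 0.5], [1.0, 4.0])
```

With these inputs the second stratum receives twice the sample of the first, because its $\sqrt{\mu_h(x^g)}$ is twice as large.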

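Under the allocation (4), the variance of the estimate of the population mean is

$$
v(\bar{y}_{st}) = \frac{1}{n}\left[\sum_{h=1}^{L} W_h\,\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}}\right]\left[\sum_{h=1}^{L} W_h\sqrt{\mu_h(x^g)}\right].
\tag{5}
$$

If $f(x)$ denotes the marginal density function of the stratification variable x, we know that

$$
W_h\,\mu_h(x^g) = \int_{x_{h-1}}^{x_h} x^g f(x)\,dx,
\tag{6}
$$

$$
2W_h\sigma_{hx}\frac{\partial\sigma_{hx}}{\partial x_h} = (x_h-\mu_{hx})^2 f(x_h) - \sigma_{hx}^2 f(x_h).
\tag{7}
$$

Under the superpopulation model (1), we get

$$
\sigma_{hy}^2 = \beta^2\sigma_{hx}^2 + \sigma^2\mu_h(x^g).
\tag{8}
$$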
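The quantities in (5), (6) and (8) can be evaluated numerically for any density and any set of boundaries. The following sketch does this with a composite Simpson rule; the helper names `integrate` and `n_times_variance`, the step count and the parameter values are assumptions chosen for illustration, not part of the paper.

```python
def integrate(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3.0


def n_times_variance(boundaries, f, g, beta, sigma2):
    """Return n * v(ybar_st) from (5); boundaries = [a, x_1, ..., x_{L-1}, b]."""
    sum_eta = 0.0    # sum of W_h * sigma_hy^2 / sqrt(mu_h(x^g))
    sum_root = 0.0   # sum of W_h * sqrt(mu_h(x^g))
    for lo, hi in zip(boundaries[:-1], boundaries[1:]):
        W = integrate(f, lo, hi)
        mu_x = integrate(lambda t: t * f(t), lo, hi) / W
        var_x = integrate(lambda t: (t - mu_x) ** 2 * f(t), lo, hi) / W
        mu_xg = integrate(lambda t: t ** g * f(t), lo, hi) / W   # from (6)
        var_y = beta ** 2 * var_x + sigma2 * mu_xg               # relation (8)
        sum_eta += W * var_y / mu_xg ** 0.5
        sum_root += W * mu_xg ** 0.5
    return sum_eta * sum_root


# Hypothetical example: rectangular density on [1, 2], two equal strata, g = 1.
value = n_times_variance([1.0, 1.5, 2.0], lambda t: 1.0, 1.0, 1.0, 0.01)
```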

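To minimize the variance, expression (5) is differentiated partially with respect to $x_h$ $(h = 1, 2, \ldots, L-1)$ and the derivatives are equated to zero. Writing, for brevity,

$$
\phi_h = \sqrt{\mu_h(x^g)}, \qquad \eta_h = \frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}}, \qquad i = h+1,
$$

differentiation of (5) with respect to $x_h$ gives

$$
\left[\sum_{h=1}^{L} W_h\eta_h\right]\frac{\partial}{\partial x_h}\left[\sum_{h=1}^{L} W_h\phi_h\right] + \left[\sum_{h=1}^{L} W_h\phi_h\right]\frac{\partial}{\partial x_h}\left[\sum_{h=1}^{L} W_h\eta_h\right] = 0,
$$

that is, since only the hth and ith strata depend on $x_h$,

$$
\left[\sum_{h=1}^{L} W_h\eta_h\right]\left[W_h\frac{\partial\phi_h}{\partial x_h}+\phi_h\frac{\partial W_h}{\partial x_h}+W_i\frac{\partial\phi_i}{\partial x_h}+\phi_i\frac{\partial W_i}{\partial x_h}\right] + \left[\sum_{h=1}^{L} W_h\phi_h\right]\left[W_h\frac{\partial\eta_h}{\partial x_h}+\eta_h\frac{\partial W_h}{\partial x_h}+W_i\frac{\partial\eta_i}{\partial x_h}+\eta_i\frac{\partial W_i}{\partial x_h}\right] = 0.
\tag{9}
$$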

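Using the definitions in (6) and (7) and relation (8), with $i = h+1$, we get

$$
\frac{\partial}{\partial x_h}\left[W_h\sqrt{\mu_h(x^g)}+W_i\sqrt{\mu_i(x^g)}\right] = \frac{\mu_h(x^g)+x_h^g}{2\sqrt{\mu_h(x^g)}}\,f(x_h) - \frac{\mu_i(x^g)+x_h^g}{2\sqrt{\mu_i(x^g)}}\,f(x_h)
\tag{10}
$$

and

$$
\begin{aligned}
\frac{\partial}{\partial x_h}&\left[W_h\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}}+W_i\frac{\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)}{\sqrt{\mu_i(x^g)}}\right]\\
&= \frac{\left[\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)\right]\left[\mu_h(x^g)-x_h^g\right]+2\mu_h(x^g)\left[\beta^2(x_h-\mu_{hx})^2+\sigma^2x_h^g\right]}{2\left[\mu_h(x^g)\right]^{3/2}}\,f(x_h)\\
&\quad - \frac{\left[\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)\right]\left[\mu_i(x^g)-x_h^g\right]+2\mu_i(x^g)\left[\beta^2(x_h-\mu_{ix})^2+\sigma^2x_h^g\right]}{2\left[\mu_i(x^g)\right]^{3/2}}\,f(x_h).
\end{aligned}
\tag{11}
$$

The terms for the ith stratum carry a minus sign because $x_h$ is the lower boundary of that stratum.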

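Finally, using (10) and (11), the equations giving the OSB for the model-based allocation (3) are obtained as

$$
\begin{aligned}
&\frac{\left[\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)\right]\left[\mu_h(x^g)-x_h^g\right]+2\mu_h(x^g)\left[\beta^2(x_h-\mu_{hx})^2+\sigma^2x_h^g\right]}{2\left[\mu_h(x^g)\right]^{3/2}}\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)}\\
&\qquad+\frac{\mu_h(x^g)+x_h^g}{2\sqrt{\mu_h(x^g)}}\sum_{h=1}^{L}W_h\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}}\\
&=\frac{\left[\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)\right]\left[\mu_i(x^g)-x_h^g\right]+2\mu_i(x^g)\left[\beta^2(x_h-\mu_{ix})^2+\sigma^2x_h^g\right]}{2\left[\mu_i(x^g)\right]^{3/2}}\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)}\\
&\qquad+\frac{\mu_i(x^g)+x_h^g}{2\sqrt{\mu_i(x^g)}}\sum_{h=1}^{L}W_h\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}},
\end{aligned}
\tag{12}
$$

where $i = h+1$ and $h = 1, 2, \ldots, L-1$. The equations (12) give the OSB on the auxiliary variable x.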
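Since the boundaries solving (12) are the stationary points of the variance (5), a quick numerical cross-check is to minimize (5) directly. The sketch below does this for L = 2 and the rectangular density on [1, 2], using closed-form stratum moments and a golden-section search. This is a swapped-in technique, not the iterative solution of (12) used in the paper, and the function names and parameter values are assumptions.

```python
def n_var_uniform(boundaries, g, beta, sigma2):
    """n * v(ybar_st) from (5) for f(x) = 1 on [1, 2], with closed-form moments."""
    sum_eta = 0.0
    sum_root = 0.0
    for lo, hi in zip(boundaries[:-1], boundaries[1:]):
        weight = hi - lo                                   # W_h
        var_x = (hi - lo) ** 2 / 12.0                      # sigma_hx^2
        mean_xg = (hi ** (g + 1) - lo ** (g + 1)) / ((g + 1) * (hi - lo))
        var_y = beta ** 2 * var_x + sigma2 * mean_xg       # relation (8)
        sum_eta += weight * var_y / mean_xg ** 0.5
        sum_root += weight * mean_xg ** 0.5
    return sum_eta * sum_root


def golden_section_min(func, lo, hi, tol=1e-8):
    """Minimize a unimodal function on [lo, hi] by golden-section search."""
    ratio = (5 ** 0.5 - 1) / 2
    c = hi - ratio * (hi - lo)
    d = lo + ratio * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(d):
            hi = d
        else:
            lo = c
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
    return (lo + hi) / 2


def objective(x1):
    # Hypothetical parameter values: g = 1, beta = 1, sigma^2 = 0.01.
    return n_var_uniform([1.0, x1, 2.0], g=1.0, beta=1.0, sigma2=0.01)


x1_opt = golden_section_min(objective, 1.0, 2.0)
```

For L > 2 the same objective could be minimized coordinate by coordinate, but the search then relies on the variance being well behaved in each boundary, which this sketch does not verify.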

## 3. A FEW METHODS OF FINDING APPROXIMATE SOLUTIONS TO THE EQUATIONS GIVING OSB

To find approximate solutions of the equations (12), we follow the techniques of Singh and Sukhatme [13] and Yadava and Singh [21]. For this purpose, we assume the existence of the derivatives of $f(x)$, $c(x)$ and $\psi(x)$, where $\psi(x) = x^g$ and $c(x) = \alpha + \beta x$. We then obtain series expansions of the system of equations (Singh and Sukhatme [13]) using the identities of Ekman [24,25] and Taylor's series expansion about the point $x_h$, the common boundary of the hth and (h+1)th strata. Consider the right-hand side of Equation (12). With $f$, $\psi$ and all derivatives used hereafter evaluated at $t = x_h$, and with $k_i = x_{h+1} - x_h$ the width of the interval $[x_h, x_{h+1}]$, we have

$$
\mu_i(\psi) = \psi + \frac{\psi'}{2}k_i + \frac{\psi'f' + 2f\psi''}{12f}k_i^2 + \frac{ff''\psi' + ff'\psi'' + f^2\psi''' - \psi'f'^2}{24f^2}k_i^3 + O(k_i^4),
$$

$$
\mu_i(x^g) - x_h^g = \frac{\psi'}{2}k_i + \frac{\psi'f' + 2f\psi''}{12f}k_i^2 + \frac{ff''\psi' + ff'\psi'' + f^2\psi''' - \psi'f'^2}{24f^2}k_i^3 + O(k_i^4).
$$

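Now, we get

$$
\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g) = \sigma^2\psi + \frac{\sigma^2\psi'}{2}k_i + \frac{\beta^2f + 2\sigma^2f\psi'' + \sigma^2f'\psi'}{12f}k_i^2 + \frac{\sigma^2ff''\psi' + \sigma^2ff'\psi'' + \sigma^2f^2\psi''' - \sigma^2f'^2\psi'}{24f^2}k_i^3 + O(k_i^4),
$$

$$
\begin{aligned}
\left[\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)\right]\left[\mu_i(x^g)-x_h^g\right] &= \frac{\sigma^2\psi\psi'}{2}k_i + \frac{\sigma^2f'\psi\psi' + 2\sigma^2f\psi\psi'' + 3\sigma^2f\psi'^2}{12f}k_i^2\\
&\quad+\frac{4\sigma^2f^2\psi'\psi'' + \beta^2f^2\psi' + 2\sigma^2ff'\psi'^2 + \sigma^2ff''\psi\psi' + \sigma^2ff'\psi\psi'' + \sigma^2f^2\psi\psi''' - \sigma^2f'^2\psi\psi'}{24f^2}k_i^3 + O(k_i^4).
\end{aligned}
\tag{13}
$$

Also,

$$
\begin{aligned}
2\mu_i(x^g)\left[\beta^2(\mu_{ix}-x_h)^2+\sigma^2x_h^g\right] &= 2\sigma^2\psi^2 + \sigma^2\psi\psi'k_i + \frac{3\beta^2f\psi + \sigma^2f'\psi\psi' + 2\sigma^2f\psi\psi''}{6f}k_i^2\\
&\quad+\frac{\sigma^2ff''\psi\psi' + \sigma^2f^2\psi\psi''' - \sigma^2f'^2\psi\psi' + 3\beta^2f^2\psi' + 2\beta^2ff'\psi + \sigma^2ff'\psi\psi''}{12f^2}k_i^3 + O(k_i^4).
\end{aligned}
\tag{14}
$$

Adding (13) and (14), we get

$$
\begin{aligned}
&\left[\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)\right]\left[\mu_i(x^g)-x_h^g\right] + 2\mu_i(x^g)\left[\beta^2(\mu_{ix}-x_h)^2+\sigma^2x_h^g\right]\\
&\quad= 2\sigma^2\psi^2 + \frac{3\sigma^2\psi\psi'}{2}k_i + \frac{\sigma^2f\psi'^2 + 2\beta^2f\psi + \sigma^2f'\psi\psi' + 2\sigma^2f\psi\psi''}{4f}k_i^2\\
&\qquad+\frac{2\sigma^2ff'\psi'^2 + 4\sigma^2f^2\psi'\psi'' + 7\beta^2f^2\psi' + 4\beta^2ff'\psi + 3\sigma^2ff''\psi\psi' + 3\sigma^2ff'\psi\psi'' + 3\sigma^2f^2\psi\psi''' - 3\sigma^2f'^2\psi\psi'}{24f^2}k_i^3 + O(k_i^4).
\end{aligned}
\tag{15}
$$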

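Moreover,

$$
\begin{aligned}
\frac{1}{2\left[\mu_i(x^g)\right]^{3/2}} = \frac{1}{2\psi^{3/2}}\Bigg[&1 - \frac{3\psi'}{4\psi}k_i + \frac{15f\psi'^2 - 4f'\psi\psi' - 8f\psi\psi''}{32f\psi^2}k_i^2\\
&-\frac{8ff''\psi^2\psi' + 8f^2\psi^2\psi''' - 8f'^2\psi^2\psi' - 20ff'\psi\psi'^2 - 40f^2\psi\psi'\psi'' + 35f^2\psi'^3 + 8ff'\psi^2\psi''}{128f^2\psi^3}k_i^3 + O(k_i^4)\Bigg].
\end{aligned}
\tag{16}
$$

Multiplying Equations (15) and (16), we get

$$
\begin{aligned}
&\frac{\left[\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)\right]\left[\mu_i(x^g)-x_h^g\right] + 2\mu_i(x^g)\left[\beta^2(\mu_{ix}-x_h)^2+\sigma^2x_h^g\right]}{2\left[\mu_i(x^g)\right]^{3/2}}\\
&\quad= \sigma^2\sqrt{\psi} + \frac{\sigma^2\psi'^2 + 8\beta^2\psi}{32\psi^{3/2}}k_i^2 + \frac{2\sigma^2f'\psi\psi'^2 + 4\sigma^2f\psi\psi'\psi'' - 8\beta^2f\psi\psi' + 16\beta^2f'\psi^2 - 3\sigma^2f\psi'^3}{192f\psi^{5/2}}k_i^3 + O(k_i^4).
\end{aligned}
\tag{17}
$$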

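Similarly, we have

$$
\frac{\mu_i(x^g)+x_h^g}{2\sqrt{\mu_i(x^g)}} = \sqrt{\psi} + \frac{\psi'^2}{32\psi^{3/2}}k_i^2 + \frac{2f'\psi\psi'^2 + 4f\psi\psi'\psi'' - 3f\psi'^3}{192f\psi^{5/2}}k_i^3 + O(k_i^4).
\tag{18}
$$

The left-hand side of (12) is expanded in the same way about $x_h$, in powers of $k_h = x_h - x_{h-1}$. Thus, from (9), (17) and (18), the equations (12) can be expressed as

$$
\begin{aligned}
&\left[g_1fk_h^2 - \frac{(g_1f)'k_h^3}{3} + O(k_h^4)\right]\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)} + \left[g_2fk_h^2 - \frac{(g_2f)'k_h^3}{3} + O(k_h^4)\right]\sum_{h=1}^{L}W_h\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}}\\
&= \left[g_1fk_i^2 + \frac{(g_1f)'k_i^3}{3} + O(k_i^4)\right]\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)} + \left[g_2fk_i^2 + \frac{(g_2f)'k_i^3}{3} + O(k_i^4)\right]\sum_{h=1}^{L}W_h\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}},
\end{aligned}
\tag{19}
$$

where, in (19), $g_1 = \left(\sigma^2\psi'^2 + 8\beta^2\psi\right)/\psi^{3/2}$ and $g_2 = \psi'^2/\psi^{3/2}$, as read off from the coefficients in (17) and (18).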

Now, for the purpose of tackling (19), we prove the following lemma:

### Lemma 3.1.

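If $x_h$, $x_{h+1}$ are the boundaries of the ith stratum, where $i = h+1$, and $k_i = x_{h+1} - x_h$, then the expressions $\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)}$ and $\sum_{h=1}^{L}W_h\left[\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)\right]/\sqrt{\mu_h(x^g)}$ can, under the conditions of AOSB, be approximated by fixed values for a given number of strata L.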

### Proof:

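We use series expansions in powers of the interval width $k_i$ for $W_i$, $\mu_i(\psi)$ and $\sigma_{ix}^2$, following the techniques of Ekman [24,25], Singh and Sukhatme [13] and Yadava and Singh [21].

From the expansion of $\mu_i(x^g)$ obtained above, we have

$$
\begin{aligned}
\sqrt{\mu_i(x^g)} = \sqrt{\psi}\Bigg[&1 + \frac{\psi'}{4\psi}k_i + \frac{8f\psi\psi'' + 4f'\psi\psi' - 3f\psi'^2}{96f\psi^2}k_i^2\\
&+\frac{8ff''\psi^2\psi' + 8ff'\psi^2\psi'' + 8f^2\psi^2\psi''' - 8f'^2\psi^2\psi' - 8f^2\psi\psi'\psi'' - 4ff'\psi\psi'^2 + 3f^2\psi'^3}{384f^2\psi^3}k_i^3 + O(k_i^4)\Bigg]
\end{aligned}
\tag{20}
$$

and

$$
W_i = k_if + \frac{k_i^2}{2}f' + \frac{k_i^3}{6}f'' + \frac{k_i^4}{24}f''' + O(k_i^5).
\tag{21}
$$

Multiplying (20) and (21),

$$
\begin{aligned}
W_i\sqrt{\mu_i(x^g)} = \sqrt{\psi}\Bigg[&k_if + \frac{f\psi' + 2f'\psi}{4\psi}k_i^2 + \frac{8f\psi\psi'' + 16f'\psi\psi' - 3f\psi'^2 + 16f''\psi^2}{96\psi^2}k_i^3\\
&+\frac{-10f'\psi\psi'^2 + 3f\psi'^3 + 16f'''\psi^3 + 24f''\psi^2\psi' + 24f'\psi^2\psi'' + 8f\psi^2\psi''' - 8f\psi\psi'\psi''}{384\psi^3}k_i^4 + O(k_i^5)\Bigg].
\end{aligned}
\tag{22}
$$

Expanding the following integral by Taylor's series,

$$
\begin{aligned}
\int_{x_h}^{x_{h+1}}\sqrt{\psi(t)}\,f(t)\,dt &= \sqrt{\psi}f\,k_i + \left(\sqrt{\psi}f\right)'\frac{k_i^2}{2!} + \left(\sqrt{\psi}f\right)''\frac{k_i^3}{3!} + \left(\sqrt{\psi}f\right)'''\frac{k_i^4}{4!} + O(k_i^5)\\
&= \sqrt{\psi}\Bigg[k_if + \frac{f\psi' + 2f'\psi}{4\psi}k_i^2 + \frac{2f\psi\psi'' + 4f'\psi\psi' - f\psi'^2 + 4f''\psi^2}{24\psi^2}k_i^3\\
&\qquad+\frac{24f''\psi^2\psi' + 8f\psi^2\psi''' + 24f'\psi^2\psi'' - 12f'\psi\psi'^2 + 16f'''\psi^3 - 12f\psi\psi'\psi'' + 6f\psi'^3}{384\psi^3}k_i^4 + O(k_i^5)\Bigg].
\end{aligned}
\tag{23}
$$

Subtracting (23) from (22),

$$
\begin{aligned}
W_i\sqrt{\mu_i(x^g)} - \int_{x_h}^{x_{h+1}}\sqrt{\psi(t)}\,f(t)\,dt &= \frac{k_i^2}{96}\left[\frac{f\psi'^2}{\psi\sqrt{\psi}}k_i + \frac{4f\psi\psi'\psi'' + 2f'\psi\psi'^2 - 3f\psi'^3}{4\psi^2\sqrt{\psi}}k_i^2 + O(k_i^3)\right]\\
&= \frac{k_i^2}{96}\left[g_1fk_i + \frac{1}{2}\frac{d}{dx_h}(g_1f)\,k_i^2 + O(k_i^3)\right], \quad\text{where } g_1 = \frac{\psi'^2}{\psi^{3/2}}.
\end{aligned}
\tag{24}
$$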

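The following lemma was proved by Singh and Sukhatme [13] using Taylor's series expansion about the point $t = y$.

### Lemma 3.2.

$$
\left[\int_y^x\sqrt[\lambda]{f(t)}\,dt\right]^{\lambda} = k^{\lambda}\left[f(y) + \frac{f'(y)}{2}k + O(k^2)\right] = k^{\lambda-1}\int_y^xf(t)\,dt\,\left[1+O(k^2)\right],
$$

where $k = x - y$.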
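Lemma 3.2 can be checked numerically. The sketch below compares the two outer expressions of the lemma for the right triangular density f(t) = 2(2 − t), for which both integrals have closed forms, and reports the relative discrepancy, which should shrink roughly like $k^2$. The function names and the evaluation points are illustrative assumptions.

```python
def lemma_sides(y, k, lam=3.0):
    """Return ([int f^(1/lam)]^lam, k^(lam-1) * int f) over [y, y+k] for f(t) = 2(2 - t)."""
    x = y + k
    # Closed form: integral of 2(2 - t) dt over [y, x].
    int_f = (2 - y) ** 2 - (2 - x) ** 2
    # Closed form: integral of (2(2 - t))^(1/lam) dt over [y, x].
    p = 1.0 / lam + 1.0
    int_root = 2 ** (1.0 / lam) * ((2 - y) ** p - (2 - x) ** p) / p
    return int_root ** lam, k ** (lam - 1) * int_f


def relative_gap(y, k, lam=3.0):
    left, right = lemma_sides(y, k, lam)
    return abs(left - right) / right


gap_wide = relative_gap(1.2, 0.2)    # interval width k = 0.2
gap_narrow = relative_gap(1.2, 0.1)  # interval width k = 0.1
```

Halving the width should reduce the relative gap by a factor of about four, which is what the $1 + O(k^2)$ factor in the lemma predicts.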

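We now apply Lemma 3.2 to (24). With a large number of strata, the strata widths $k_h$ are small and the higher powers of $k_h$ in the expansion can be neglected. Neglecting terms of order $O(m^5)$, where $m = \sup_{(a,b)}(k_h)$, we get

$$
W_i\sqrt{\mu_i(x^g)} - \int_{x_h}^{x_{h+1}}\sqrt{\psi(t)}\,f(t)\,dt = \frac{1}{96}\left[\int_{x_h}^{x_{h+1}}\sqrt[3]{g_1(t)f(t)}\,dt\right]^3,
$$

where $g_1 = \psi'^2/\psi^{3/2}$ as in (24). Using the cum $\sqrt[3]{g_1(t)f(t)}$ rule in the manner of Yadava and Singh [21] for obtaining AOSB, that is, taking

$$
\int_{x_h}^{x_{h+1}}\sqrt[3]{g_1(t)f(t)}\,dt = \frac{1}{L}\int_a^b\sqrt[3]{g_1(t)f(t)}\,dt,
$$

and summing over the L strata, we get

$$
\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)} = \int_a^b\sqrt{\psi(x)}\,f(x)\,dx + \frac{1}{96L^2}\left[\int_a^b\sqrt[3]{g_1(x)f(x)}\,dx\right]^3.
\tag{25}
$$

From (25), under the conditions of AOSB, $\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)}$ in (19) can be taken as a fixed value for a given L.

In the same way, the expression $\sum_{h=1}^{L}W_h\left[\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)\right]/\sqrt{\mu_h(x^g)}$ in (19) can also be reduced to a fixed value, for a given L, as follows: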

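$$
\begin{aligned}
W_i\frac{\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)}{\sqrt{\mu_i(x^g)}} = \frac{1}{\sqrt{\psi}}\Bigg[&\sigma^2f\psi k_i + \frac{\sigma^2f\psi' + 2\sigma^2f'\psi}{4}k_i^2\\
&+\frac{16\sigma^2ff''\psi^2 - 3\sigma^2f^2\psi'^2 + 8\beta^2f^2\psi + 8\sigma^2f^2\psi\psi'' + 16\sigma^2ff'\psi\psi'}{96f\psi}k_i^3\\
&+\frac{N_{26}}{384f^2\psi^2}k_i^4 + O(k_i^5)\Bigg],
\end{aligned}
\tag{26}
$$

with

$$
N_{26} = 8\sigma^2f^3\psi^2\psi''' + 3\sigma^2f^3\psi'^3 + 16\beta^2f^2f'\psi^2 + 16\sigma^2f^2f'''\psi^3 - 8\beta^2f^3\psi\psi' - 10\sigma^2f^2f'\psi\psi'^2 - 8\sigma^2f^3\psi\psi'\psi'' + 24\sigma^2f^2f''\psi^2\psi' + 24\sigma^2f^2f'\psi^2\psi'',
$$

and

$$
\begin{aligned}
\int_{x_h}^{x_{h+1}}\sigma^2\sqrt{\psi(t)}\,f(t)\,dt = \frac{1}{\sqrt{\psi}}\Bigg[&\sigma^2f\psi k_i + \frac{\sigma^2f\psi' + 2\sigma^2f'\psi}{4}k_i^2\\
&+\frac{8\sigma^2f^2\psi\psi'' + 16\sigma^2ff'\psi\psi' - 4\sigma^2f^2\psi'^2 + 16\sigma^2ff''\psi^2}{96f\psi}k_i^3\\
&+\frac{N_{27}}{384f^2\psi^2}k_i^4 + O(k_i^5)\Bigg],
\end{aligned}
\tag{27}
$$

with

$$
N_{27} = 16\sigma^2f^2f'''\psi^3 - 12\sigma^2f^3\psi\psi'\psi'' + 6\sigma^2f^3\psi'^3 + 24\sigma^2f^2f''\psi^2\psi' + 8\sigma^2f^3\psi^2\psi''' + 24\sigma^2f^2f'\psi^2\psi'' - 12\sigma^2f^2f'\psi\psi'^2.
$$

Subtracting (27) from (26),

$$
\begin{aligned}
W_i\frac{\beta^2\sigma_{ix}^2+\sigma^2\mu_i(x^g)}{\sqrt{\mu_i(x^g)}} - \int_{x_h}^{x_{h+1}}\sigma^2\sqrt{\psi(t)}\,f(t)\,dt &= \frac{\left(8\beta^2\psi + \sigma^2\psi'^2\right)f}{96\psi^{3/2}}k_i^3\\
&\quad+\frac{2\sigma^2f'\psi\psi'^2 + 4\sigma^2f\psi\psi'\psi'' - 3\sigma^2f\psi'^3 + 16\beta^2f'\psi^2 - 8\beta^2f\psi\psi'}{384\psi^{5/2}}k_i^4 + O(k_i^5).
\end{aligned}
$$

Thus, in this case too, proceeding as for (25), we get

$$
\sum_{h=1}^{L}W_h\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}} = \int_a^b\sigma^2\sqrt{\psi(x)}\,f(x)\,dx + \frac{1}{96L^2}\left[\int_a^b\sqrt[3]{g_2(t)f(t)}\,dt\right]^3,
\tag{28}
$$

where $g_2(x) = \left[8\beta^2\psi(x) + \sigma^2\psi'^2(x)\right]/\psi^{3/2}(x)$.

Under the conditions of AOSB, the expression $\sum_{h=1}^{L}W_h\left[\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)\right]/\sqrt{\mu_h(x^g)}$ in (19) can therefore also be taken as a fixed value for a given L. This completes the proof of the lemma.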

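For convenience, we now write the two constants P and Q in place of $\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)}$ and $\sum_{h=1}^{L}W_h\left[\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)\right]/\sqrt{\mu_h(x^g)}$, respectively, so that Equation (19) can be rewritten as

$$
\left[g_1fk_h^2 - \frac{(g_1f)'k_h^3}{3} + O(k_h^4)\right]P + \left[g_2fk_h^2 - \frac{(g_2f)'k_h^3}{3} + O(k_h^4)\right]Q = \left[g_1fk_i^2 + \frac{(g_1f)'k_i^3}{3} + O(k_i^4)\right]P + \left[g_2fk_i^2 + \frac{(g_2f)'k_i^3}{3} + O(k_i^4)\right]Q,
$$

where $g_1(t) = \left[\sigma^2\psi'^2(t) + 8\beta^2\psi(t)\right]/\psi^{3/2}(t)$ and $g_2(t) = \psi'^2(t)/\psi^{3/2}(t)$. (The labels $g_1$ and $g_2$ are interchanged here relative to those used in the proof of Lemma 3.1.) Hence

$$
g_3fk_h^2 - \frac{(g_3f)'k_h^3}{3} + O(k_h^4) = g_3fk_i^2 + \frac{(g_3f)'k_i^3}{3} + O(k_i^4),
$$

where $g_3(t) = Pg_1(t) + Qg_2(t)$, that is,

$$
g_3fk_h^2\left[1 - \frac{(g_3f)'}{g_3f}\frac{k_h}{3} + O(k_h^2)\right] = g_3fk_i^2\left[1 + \frac{(g_3f)'}{g_3f}\frac{k_i}{3} + O(k_i^2)\right].
$$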

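Raising both sides to the power 3/2 and applying the binomial expansion,

$$
(g_3f)^{3/2}k_h^3\left[1 - \frac{k_h}{2}\frac{(g_3f)'}{g_3f} + O(k_h^2)\right] = (g_3f)^{3/2}k_i^3\left[1 + \frac{k_i}{2}\frac{(g_3f)'}{g_3f} + O(k_i^2)\right].
\tag{29}
$$

Using Lemma 3.2 and simplifying further, the system of Equations (29) can be transformed into

$$
k_h^2\int_{x_{h-1}}^{x_h}g_3(t)f(t)\,dt\left[1+O(k_h^2)\right] = k_i^2\int_{x_h}^{x_{h+1}}g_3(t)f(t)\,dt\left[1+O(k_i^2)\right],
$$

so that, to this order of approximation,

$$
k_h^2\int_{x_{h-1}}^{x_h}g_3(t)f(t)\,dt = \text{constant} = c_1.
\tag{30}
$$

By Lemma 3.2 with $\lambda = 3$, Equation (30) is equivalent to

$$
\int_{x_{h-1}}^{x_h}\sqrt[3]{g_3(t)f(t)}\,dt = c_2,
\tag{31}
$$

where

$$
c_2 = \frac{1}{L}\int_a^b\sqrt[3]{g_3(t)f(t)}\,dt.
\tag{32}
$$

The constant $c_2$ is evaluated from (32), and the approximate solutions are then determined successively by fixing $x_{h-1}$ and calculating the upper boundary $x_h$ from (31).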

The procedure described above yields the following theorem.

### Theorem 3.1.

If the function $g_3(t)f(t)$ is bounded and its first two derivatives exist for all x in (a, b), then, for a given number of strata, taking equal intervals on the cumulative of $\sqrt[3]{g_3(t)f(t)}$ yields approximately optimum strata boundaries on the auxiliary variable.
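The cum $\sqrt[3]{g_3 f}$ rule of Theorem 3.1 can be sketched numerically as follows. The function `cum_cube_root_boundaries` accumulates $\sqrt[3]{h(t)}$ with $h = g_3 f$ on a fine trapezoidal grid and cuts the cumulative total into L equal parts. In the example, the ratio P : Q is replaced by its leading-order value 1 : σ² (from the first terms of (25) and (28)); that simplification, the function names and the parameter values are assumptions of this sketch.

```python
def cum_cube_root_boundaries(h, a, b, L, grid=20000):
    """Boundaries giving equal intervals on the cumulative of h(t)**(1/3) over [a, b]."""
    step = (b - a) / grid
    ts = [a + j * step for j in range(grid + 1)]
    roots = [h(t) ** (1.0 / 3.0) for t in ts]
    cum = [0.0]
    for j in range(grid):                       # trapezoidal cumulative integral
        cum.append(cum[-1] + 0.5 * (roots[j] + roots[j + 1]) * step)
    bounds, j = [], 0
    for m in range(1, L):
        target = cum[-1] * m / L
        while cum[j + 1] < target:
            j += 1
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        bounds.append(ts[j] + frac * step)      # linear interpolation inside the cell
    return bounds


beta, sigma2 = 1.0, 0.01   # hypothetical parameter values


def h_uniform(t):
    """g3(t) * f(t) for psi(t) = t (g = 1) and f(t) = 1 on [1, 2], up to a constant factor."""
    g1 = (sigma2 + 8 * beta ** 2 * t) / t ** 1.5
    g2 = 1.0 / t ** 1.5
    return g1 + sigma2 * g2    # P : Q taken as 1 : sigma2 (leading-order assumption)


bounds = cum_cube_root_boundaries(h_uniform, 1.0, 2.0, 3)
```

A constant factor in $g_3 f$ does not change the boundaries, which is why only the ratio P : Q matters here.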

## 4. LIMIT EXPRESSION FOR THE VARIANCE

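The variance expression in (5) can be reduced further to a form that gives insight into how the variance of the estimate $\bar{y}_{st}$ decreases as the number of strata increases. It is also shown that $V(\bar{y}_{st})$ does not tend to zero when the number of strata tends to infinity. The argument follows the techniques used by Yadava and Singh [21].

In Lemma 3.1, it was shown that

$$
\sum_{h=1}^{L}W_h\sqrt{\mu_h(x^g)} = A + \frac{B}{L^2}, \quad\text{where}\quad A = \int_a^b\sqrt{\psi(t)}\,f(t)\,dt \quad\text{and}\quad B = \frac{1}{96}\left[\int_a^b\sqrt[3]{g_1(t)f(t)}\,dt\right]^3,
$$

and

$$
\sum_{h=1}^{L}W_h\frac{\beta^2\sigma_{hx}^2+\sigma^2\mu_h(x^g)}{\sqrt{\mu_h(x^g)}} = C + \frac{D}{L^2}, \quad\text{where}\quad C = \sigma^2\int_a^b\sqrt{\psi(t)}\,f(t)\,dt \quad\text{and}\quad D = \frac{1}{96}\left[\int_a^b\sqrt[3]{g_2(t)f(t)}\,dt\right]^3,
$$

with $g_1 = \psi'^2/\psi^{3/2}$ and $g_2 = \left(8\beta^2\psi + \sigma^2\psi'^2\right)/\psi^{3/2}$ as in (25) and (28).

Therefore,

$$
V(\bar{y}_{st}) = \frac{1}{n}\left(A + \frac{B}{L^2}\right)\left(C + \frac{D}{L^2}\right).
\tag{33}
$$

From (33), $V(\bar{y}_{st}) \to AC/n$ as $L \to \infty$. Thus $V(\bar{y}_{st})$ does not tend to zero when the number of strata tends to infinity, and we have the following theorem.

### Theorem 4.1.

When the approximately optimum strata boundaries are obtained by using the cum $\sqrt[3]{g_3(t)f(t)}$ rule, $\lim_{L\to\infty}V(\bar{y}_{st}) = AC/n$.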
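The behaviour described by (33) can be illustrated numerically. The sketch below evaluates A, B, C and D by Simpson's rule for a given density and then tabulates $nV(\bar{y}_{st})$ from (33) for several L, which decreases towards the limit AC. The helper names, the choice of the rectangular density with $\psi(t) = t$, and the parameter values are assumptions for illustration only.

```python
def simpson(func, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    h = (hi - lo) / steps
    total = func(lo) + func(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * func(lo + j * h)
    return total * h / 3.0


def limit_terms(f, psi, dpsi, beta, sigma2, a, b):
    """Return the constants A, B, C, D appearing in (33)."""
    A = simpson(lambda t: psi(t) ** 0.5 * f(t), a, b)
    C = sigma2 * A
    g1 = lambda t: dpsi(t) ** 2 / psi(t) ** 1.5
    g2 = lambda t: (8 * beta ** 2 * psi(t) + sigma2 * dpsi(t) ** 2) / psi(t) ** 1.5
    B = simpson(lambda t: (g1(t) * f(t)) ** (1.0 / 3.0), a, b) ** 3 / 96.0
    D = simpson(lambda t: (g2(t) * f(t)) ** (1.0 / 3.0), a, b) ** 3 / 96.0
    return A, B, C, D


def n_var_approx(L, A, B, C, D):
    """n * V(ybar_st) from (33) for L strata."""
    return (A + B / L ** 2) * (C + D / L ** 2)


# Hypothetical example: rectangular density on [1, 2], psi(t) = t, beta = 1, sigma^2 = 0.01.
A, B, C, D = limit_terms(lambda t: 1.0, lambda t: t, lambda t: 1.0, 1.0, 0.01, 1.0, 2.0)
values = [n_var_approx(L, A, B, C, D) for L in range(2, 7)]
limit_value = A * C
```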

## 5. NUMERICAL ILLUSTRATION BY USING GENERATED DATA

For the numerical investigation, we use the following three densities of x, which were used by Singh and Sukhatme [13] and by most later workers on the construction of strata. We calculate the solutions of Equation (12), the corresponding solutions of the approximation method given by (31) and (32), and the sampling variances of stratified sampling under equal interval stratification, for L = 2, 3, 4, 5, 6. In the case of the exponential density of x, a slight deviation from equal interval stratification is used for L = 6 to avoid difficulties that arose with the generated data. The relative efficiencies of the equations giving the OSB and of the corresponding approximation method, with respect to equal interval stratification, are shown in Tables 1–9 for g = 1, g = 1.5 and g = 2 successively.

1. Rectangular: $f(x) = 1$, $1 \le x \le 2$.

2. Right triangular: $f(x) = 2(2 - x)$, $1 \le x \le 2$.

3. Exponential: $f(x) = e^{-x+1}$, $1 \le x < \infty$.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.5170 | 0.02783 | 1.5 | 0.0277 | 99.68 | 1.5011 | 0.0283 | 97.87 |
| 3 | 1.3259, 1.6712 | 0.01641 | 1.334, 1.667 | 0.0164 | 100.00 | 1.3203, 1.6582 | 0.0163 | 100.61 |
| 4 | 1.2564, 1.4789, 1.7343 | 0.01278 | 1.25, 1.5, 1.75 | 0.0128 | 99.84 | 1.2388, 1.49009, 1.7477 | 0.0130 | 98.41 |
| 5 | 1.2124, 1.3695, 1.5302, 1.7451 | 0.01202 | 1.20, 1.40, 1.60, 1.80 | 0.0132 | 109.58 | 1.1904, 1.3861, 1.5865, 1.7912 | 0.0119 | 110.39 |
| 6 | 1.1699, 1.3208, 1.5200, 1.7233, 1.8767 | 0.00964 | 1.1667, 1.3334, 1.499, 1.667, 1.8334 | 0.0088 | 90.87 | 1.1583, 1.32033, 1.4857, 1.6543, 1.8257 | 0.0089 | 98.87 |

Table 1

Uniform distribution, g = 1.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.5135 | 0.0279 | 1.5 | 0.0278 | 99.53 | 1.4800 | 0.0283 | 98.20 |
| 3 | 1.3418, 1.6792 | 0.0163 | 1.334, 1.667 | 0.0164 | 100.68 | 1.3169, 1.6557 | 0.0163 | 100.68 |
| 4 | 1.2558, 1.4785, 1.7337 | 0.0127 | 1.25, 1.50, 1.75 | 0.0127 | 99.92 | 1.2361, 1.4832, 1.7403 | 0.0129 | 98.53 |
| 5 | 1.2121, 1.3695, 1.5302, 1.7447 | 0.0120 | 1.20, 1.40, 1.60, 1.80 | 0.0131 | 109.26 | 1.1882, 1.3832, 1.5853, 1.7940 | 0.0119 | 110.46 |
| 6 | 1.1697, 1.3205, 1.5107, 1.7232, 1.8774 | 0.0095 | 1.1667, 1.334, 1.499, 1.6667, 1.8334 | 0.0086 | 90.52 | 1.1571, 1.3183, 1.4854, 1.6562, 1.8305 | 0.0089 | 96.95 |

Table 2

Uniform distribution, g = 1.5.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.5102 | 0.0280 | 1.5 | 0.0278 | 99.39 | 1.4734 | 0.0283 | 98.41 |
| 3 | 1.3408, 1.6783 | 0.0167 | 1.334, 1.667 | 0.0164 | 98.26 | 1.3118, 1.6536 | 0.0165 | 99.45 |
| 4 | 1.2516, 1.4762, 1.7334 | 0.0127 | 1.25, 1.50, 1.75 | 0.0127 | 99.69 | 1.2300, 1.4813, 1.7422 | 0.0128 | 98.60 |
| 5 | 1.2119, 1.3696, 1.5303, 1.7446 | 0.0119 | 1.20, 1.40, 1.60, 1.80 | 0.0130 | 109.59 | 1.1840, 1.3774, 1.5853, 1.7955 | 0.0118 | 110.05 |
| 6 | 1.1696, 1.3204, 1.5106, 1.7233, 1.8784 | 0.0094 | 1.1667, 1.334, 1.50, 1.6667, 1.8334 | 0.0084 | 90.05 | 1.1543, 1.3163, 1.4841, 1.6570, 1.8323 | 0.0086 | 97.68 |

Table 3

Uniform distribution, g = 2.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.3794 | 0.0226 | 1.5 | 0.0278 | 122.98 | 1.3926 | 0.0226 | 123.04 |
| 3 | 1.2754, 1.5898 | 0.0131 | 1.334, 1.667 | 0.0138 | 105.11 | 1.2529, 1.5504 | 0.0135 | 102.15 |
| 4 | 1.1915, 1.4006, 1.6349 | 0.0104 | 1.25, 1.50, 1.75 | 0.0121 | 117.29 | 1.1864, 1.3944, 1.6373 | 0.0103 | 117.64 |
| 5 | 1.1804, 1.3812, 1.5800, 1.7367 | 0.0099 | 1.20, 1.40, 1.60, 1.80 | 0.0101 | 102.33 | 1.1480, 1.3085, 1.4871, 1.6960 | 0.0093 | 108.73 |
| 6 | 1.1450, 1.2891, 1.4357, 1.5897, 1.7367 | 0.0079 | 1.1667, 1.334, 1.50, 1.6667, 1.8334 | 0.0093 | 117.35 | 1.1507, 1.2842, 1.4300, 1.5920, 1.7831 | 0.0081 | 115.17 |

Table 4

Right triangular distribution, g = 1.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.3759 | 0.0225 | 1.5 | 0.0280 | 124.73 | 1.3864 | 0.0225 | 124.73 |
| 3 | 1.2737, 1.5889 | 0.0131 | 1.334, 1.667 | 0.0142 | 108.54 | 1.2477, 1.5435 | 0.0136 | 104.94 |
| 4 | 1.1842, 1.3959, 1.6344 | 0.0103 | 1.25, 1.50, 1.75 | 0.0122 | 118.40 | 1.1832, 1.3892, 1.6330 | 0.0101 | 120.52 |
| 5 | 1.1799, 1.3807, 1.5799, 1.7371 | 0.0099 | 1.20, 1.40, 1.60, 1.80 | 0.0101 | 102.34 | 1.1450, 1.3034, 1.4183, 1.6903 | 0.0092 | 109.92 |
| 6 | 1.1447, 1.2888, 1.4356, 1.5897, 1.7371 | 0.0078 | 1.1667, 1.334, 1.50, 1.6667, 1.8334 | 0.0092 | 117.92 | 1.1477, 1.2791, 1.4230, 1.5850, 1.7776 | 0.0080 | 115.70 |

Table 5

Right triangular distribution, g = 1.5.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 1.3732 | 0.0224 | 1.5 | 0.0283 | 126.40 | 1.3803 | 0.0221 | 127.71 |
| 3 | 1.2720, 1.5883 | 0.0130 | 1.334, 1.667 | 0.0142 | 109.29 | 1.2431, 1.5375 | 0.0129 | 110.39 |
| 4 | 1.1837, 1.3957, 1.6341 | 0.0102 | 1.25, 1.50, 1.75 | 0.0121 | 119.17 | 1.1790, 1.3830, 1.6273 | 0.0098 | 123.42 |
| 5 | 1.1794, 1.3804, 1.5800, 1.7377 | 0.0098 | 1.20, 1.40, 1.60, 1.80 | 0.0101 | 102.97 | 1.1430, 1.3002, 1.4775, 1.6884 | 0.0087 | 115.33 |
| 6 | 1.1445, 1.2887, 1.4358, 1.5900, 1.7377 | 0.0077 | 1.1667, 1.334, 1.50, 1.6667, 1.8334 | 0.0091 | 118.73 | 1.1447, 1.2744, 1.4174, 1.5791, 1.7734 | 0.0072 | 126.70 |

Table 6

Right triangular distribution, g = 2.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 2.2012 | 0.1817 | 2.50 | 0.1984 | 109.19 | 2.0632 | 0.1747 | 113.52 |
| 3 | 1.6974, 2.5992 | 0.1037 | 2.0, 3.0 | 0.1208 | 116.43 | 1.6503, 2.5605 | 0.1055 | 114.49 |
| 4 | 1.6526, 2.3296, 3.1672 | 0.0857 | 1.75, 2.50, 3.25 | 0.0888 | 103.63 | 1.4686, 2.0625, 2.8591 | 0.0802 | 110.75 |
| 5 | 1.4058, 1.9199, 2.5131, 3.2297 | 0.0674 | 1.6, 2.2, 2.8, 3.4 | 0.0807 | 119.77 | 1.3663, 1.8066, 2.3489, 3.0445 | 0.0657 | 122.86 |
| 6 | 1.3710, 1.8163, 2.3082, 2.7549, 3.2835 | 0.0575 | 1.5, 2.0, 2.3, 2.8, 3.5 | 0.0644 | 111.90 | 1.3007, 1.6503, 2.0626, 2.5602, 3.1815 | 0.0504 | 126.40 |

Table 7

Exponential distribution, g = 1.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 2.1016 | 0.1715 | 2.50 | 0.1997 | 116.42 | 2.0247 | 0.1704 | 117.16 |
| 3 | 1.6954, 2.5915 | 0.1017 | 2.0, 3.0 | 0.1219 | 119.78 | 1.6264, 2.5684 | 0.1041 | 117.08 |
| 4 | 1.6513, 2.3279, 3.1658 | 0.0856 | 1.75, 2.50, 3.25 | 0.0891 | 104.10 | 1.4466, 2.0249, 2.8127 | 0.0762 | 116.98 |
| 5 | 1.4039, 1.9191, 2.5132, 3.2294 | 0.0661 | 1.6, 2.2, 2.8, 3.4 | 0.0806 | 121.88 | 1.3484, 1.7741, 2.3082, 3.0633 | 0.0645 | 125.07 |
| 6 | 1.3695, 1.1816, 2.3095, 2.7586, 3.2748 | 0.0606 | 1.5, 2.0, 2.30, 2.8, 3.5 | 0.0682 | 112.57 | 1.2856, 1.6221, 2.0250, 2.5191, 3.1482 | 0.0540 | 126.38 |

Table 8

Exponential distribution, g = 1.5.

| No. of strata (L) | Equations (12): points | Equations (12): $nV(\bar{y}_{st})$ | Equal interval: points | Equal interval: $nV(\bar{y}_{st})$ | Relative efficiency, Equations (12) | Approximation: points | Approximation: $nV(\bar{y}_{st})$ | Relative efficiency, approximation |
|---|---|---|---|---|---|---|---|---|
| 2 | 2.0613 | 0.1672 | 2.50 | 0.2014 | 120.90 | 1.9881 | 0.1683 | 119.71 |
| 3 | 1.6939, 2.5846 | 0.1001 | 2.0, 3.0 | 0.1228 | 122.73 | 1.5951, 2.4780 | 0.1045 | 117.55 |
| 4 | 1.6503, 2.3271, 3.1667 | 0.0854 | 1.75, 2.50, 3.25 | 0.0893 | 104.56 | 1.4259, 1.9983, 2.7732 | 0.0739 | 120.89 |
| 5 | 1.4021, 1.9186, 2.5144, 3.2311 | 0.0670 | 1.6, 2.2, 2.8, 3.4 | 0.0804 | 120.11 | 1.3315, 1.7429, 2.2680, 2.9714 | 0.0648 | 124.10 |
| 6 | 1.3511, 1.8094, 2.3114, 2.7637, 3.2886 | 0.0585 | 1.5, 2.0, 2.30, 2.8, 3.5 | 0.0675 | 115.31 | 1.2714, 1.5952, 1.9883, 2.4780, 3.1141 | 0.0530 | 127.23 |

Table 9

Exponential distribution, g = 2.

The regression function $c(x)$ is taken to be linear with a slope of 45°. For each value of g, the constant $\sigma^2$ is determined so that 90% of the total variation is accounted for by the regression. In the case of the exponential distribution, the distribution is truncated so that the area under the curve to the right of the truncation point is 0.05. The optimum points of stratification are found by successive iterations. In solving (31) and (32) for the AOSB, suitable techniques for the numerical solution of algebraic and transcendental equations and for numerical integration are used.
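The two calibration steps described above can be sketched as follows. The reading of "90% of the total variation is accounted for by the regression" as $\beta^2\operatorname{Var}(x) / [\beta^2\operatorname{Var}(x) + \sigma^2 E(x^g)] = 0.9$ is an assumption of this sketch, as are the function names; a slope of 45° corresponds to $\beta = 1$.

```python
import math


def sigma2_for_share(beta, var_x, mean_xg, share=0.9):
    """sigma^2 with beta^2 Var(x) / [beta^2 Var(x) + sigma^2 E(x^g)] equal to share.

    This interpretation of the 90% condition is an assumption.
    """
    return beta ** 2 * var_x * (1.0 - share) / (share * mean_xg)


def exponential_truncation_point(tail=0.05):
    """Point x* such that the integral of exp(-x + 1) from x* to infinity equals tail."""
    return 1.0 - math.log(tail)


# Rectangular density on [1, 2] with g = 1: Var(x) = 1/12 and E(x) = 1.5.
sigma2_uniform_g1 = sigma2_for_share(1.0, 1.0 / 12.0, 1.5)
x_star = exponential_truncation_point()
```

The truncation point is close to 4, which agrees with the equal interval points listed for the exponential density (for example 2.50 for L = 2 on a range starting at 1).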

## 6. CONCLUSION

In this paper, all the proposed methods of stratification, namely the equations giving the OSBs and the corresponding methods of approximation, are found to be efficient in stratifying heteroscedastic populations. For uniform populations, equal interval stratification is regarded as an efficient method; in the generated populations following the uniform density, the proposed methods perform with nearly the same efficiency as equal interval stratification in most cases, and with higher efficiency in a few cases, for all the strengths of heteroscedasticity considered, i.e., g = 1, 1.5, 2. Moreover, the proposed methods perform with relatively higher efficiency than equal interval stratification in stratifying populations with right triangular and exponential probability density functions, for all the strengths of heteroscedasticity considered. Hence, the proposed methods perform efficiently both for less skewed populations with a lower level of heteroscedasticity and for highly skewed populations with a higher level of heteroscedasticity. These methods can be used effectively for stratifying heteroscedastic populations on the basis of an auxiliary variable that is highly correlated with the estimation variable.

## CONFLICTS OF INTEREST

There is no conflict of interest between the authors.

## AUTHORS' CONTRIBUTIONS

The authors were equally involved in the work, and their joint effort has led to the making of the paper.

## Funding Statement

There is no funding for the research work from any funding agency.

## ACKNOWLEDGEMENTS

We, the authors, are grateful to the anonymous reviewers for the inputs and guidance provided to us in improving the quality of the paper.

## REFERENCES

1. A.A. Tschuprow, Metron, Vol. 2, 1923, pp. 461-493.
4. T.V. Hanurav, Optimum Sampling Strategies and Some Related Problems, Indian Statistical Institute, 1965. Unpublished Ph.D. thesis.
6. R.D. Narain, J. Indian Soc. Agric. Stat., Vol. 3, 1951, pp. 169-174.
8. B.K. Gupt and T.J. Rao, J. Indian Soc. Agric. Stat., Vol. 50, 1997, pp. 199-208. http://isas.org.in/jsp/volume/vol50/issue2/B.K.Gupt.pdf
18. R. Singh, Sankhya (C), Vol. 37, 1975, pp. 109-115.
19. R. Singh, Sankhya (C), Vol. 37, 1975, pp. 100-108.
22. B.K. Gupt, Metron-Int. J. Stat., Vol. LXI, 2003, pp. 35-52. https://ideas.repec.org/a/mtn/ancoec/030105.html
23. B.K. Gupt, Allocation of Sample Size in Stratified Sampling Under Superpopulation Models, LAP LAMBERT Academic Publishing AV Akademikerverlag GmbH & Co. KG, Saarbrucken, Germany, 2012.
Journal
Journal of Statistical Theory and Applications
Volume-Issue
20 - 1
Pages
46 - 60
Publication Date
2021/01/13
ISSN (Online)
2214-1766
ISSN (Print)
1538-7887