Efficient Rotation Pattern in Two-Phase Sampling

The present investigation is an attempt to estimate the population mean on the current occasion in two-phase successive (rotation) sampling over two occasions. Utilizing information on two auxiliary variables, a chain-type estimator is proposed to estimate the population mean on the current occasion. Properties of the proposed estimator are studied and its optimum replacement strategy is discussed. The proposed estimator is compared with the sample mean estimator, when there is no matching, and with the natural optimum estimator, which is a linear combination of the means of the matched and unmatched portions of the sample on the current occasion. Results are demonstrated through empirical studies, which are followed by suitable recommendations.


Introduction
In many social, demographic, industrial and agricultural surveys, the characters under study tend to change over time. A survey carried out on a single occasion provides information about the characteristics of the surveyed population for that occasion only, and cannot, of itself, give any information on (a) the nature or rate of change of the characteristics over different occasions, and (b) the average value of the characteristics over all occasions or for the most recent occasion. To meet these requirements, the same population is sampled repeatedly and the study variable is measured on each occasion, so that its development over time can be followed. For example, data on the prices of goods are collected monthly to determine the consumer price index, and political opinion surveys are conducted at regular intervals to gauge voters' preferences.
The theory of successive sampling appears to have started with the work of Jessen (1942). In successive sampling, it is common practice to use the information on the character under study, collected from the sampled units on a previous occasion, as an auxiliary variable to improve the precision of the estimates on the current occasion. Sen (1971) developed estimators for the population mean on the current occasion using information on two auxiliary variables available on the previous occasion. Sen (1973) extended this work to several auxiliary variables. In many situations, information on an auxiliary variable may be readily available on the first as well as on the second occasion. For example, the tonnage (or seat capacity) of each vehicle or ship is known in transportation surveys, and the number of polluting industries and vehicles is known in environmental surveys. Utilizing the auxiliary information on both occasions, Feng and Zou (1997), Biradar and Singh (2001), Singh (2005), Singh and Priyanka (2008), Singh and Karna (2009) and Singh and Homa (2013) have proposed chain-type ratio and difference estimators for estimating the population mean on the second (current) occasion in two-occasion successive sampling.
It is worth mentioning that all the above recent works are based on the assumption that the population means of the auxiliary variables are known, which may not often be the case. In such practical situations, it is advisable to adopt two-phase successive sampling. Two-phase sampling is a well-tested scheme in which the first-phase sample provides estimates of the unknown population parameters of the auxiliary variables. Motivated by this argument and utilizing the information on two auxiliary variables, we propose a chain-type estimator under two-phase sampling to estimate the population mean on the second (current) occasion in two-occasion successive sampling. The performance of the proposed estimator is demonstrated empirically.

Formulation of the Estimator
Consider a finite population $U = (U_1, U_2, \ldots, U_N)$ of $N$ units which has been sampled over two occasions. The character under study is denoted by $x$ ($y$) on the first (second) occasion. Assume that information on the auxiliary variables $z$ and $w$ (stable over occasions) is readily available on both occasions and that they are positively correlated with $x$ and $y$ on the first and second occasions respectively. We also assume that the population mean of the auxiliary variable $z$ is unknown, while the population mean of the auxiliary variable $w$ is known. Let a sample of size $n$ be selected by simple random sampling without replacement (SRSWOR) on the first occasion from the population $U$, and let a random subsample of $m = n\lambda$ units from it be retained (matched) for use on the second (current) occasion. Again following the SRSWOR scheme, a fresh preliminary (first-phase) sample of size $u'$ is drawn from the non-sampled units of the population, and a second-phase sample of size $u = (n - m) = n\mu$ (with $u = p u'$) is drawn from the first-phase (preliminary) sample. The sample size on the second occasion is therefore also $n$. Here $\lambda$ and $\mu$ ($\lambda + \mu = 1$) are the fractions of the matched and fresh samples, respectively, on the second (current) occasion, and $p$ is a real scalar with $0 \le p \le 1$.
Henceforth, we use the following notation:
$\bar{X}$, $\bar{Y}$, $\bar{Z}$ and $\bar{W}$: population means of the variables $x$, $y$, $z$ and $w$ respectively.
To estimate the population mean $\bar{Y}$ on the second (current) occasion, two different estimators are considered. One estimator, $T_u$, is based on the sample of size $u = n\mu$ drawn afresh on the second occasion, and the second estimator, $T_m$, is based on the sample of size $m$ ($= n\lambda$) common to both occasions.
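The sampling design described above can be sketched in code. The following is a minimal illustration only, assuming units are identified by integer labels; the function name `draw_rotation_samples` and the rounding of $m$ and $u'$ to integers are assumptions made for the sketch and are not part of the original formulation. It also assumes $p > 0$, since $u' = u/p$ is undefined at $p = 0$.

```python
import random


def draw_rotation_samples(N, n, lam, p, seed=0):
    """Draw unit labels for a two-occasion, two-phase rotation design.

    Hypothetical helper: N = population size, n = sample size on each
    occasion, lam = matched fraction (lambda), p = u / u' with 0 < p <= 1.
    """
    if not (0 < p <= 1):
        raise ValueError("p must satisfy 0 < p <= 1 so that u' = u / p is defined")
    rng = random.Random(seed)
    population = range(N)

    # First occasion: SRSWOR sample of size n.
    first = rng.sample(population, n)

    # Matched part: random subsample of m = n * lambda units retained
    # for the second occasion (rounded to an integer, an assumption).
    m = round(n * lam)
    matched = rng.sample(first, m)

    # Fresh part: u = n - m units, obtained in two phases.
    u = n - m
    u_prime = round(u / p)  # first-phase size, from u = p * u'

    # First phase: preliminary SRSWOR sample from the non-sampled units.
    first_set = set(first)
    non_sampled = [i for i in population if i not in first_set]
    if u_prime > len(non_sampled):
        raise ValueError("not enough non-sampled units for the first-phase sample")
    preliminary = rng.sample(non_sampled, u_prime)

    # Second phase: SRSWOR subsample of size u from the preliminary sample.
    fresh = rng.sample(preliminary, u)

    return {"first": first, "matched": matched,
            "preliminary": preliminary, "fresh": fresh}


samples = draw_rotation_samples(N=1000, n=100, lam=0.6, p=0.8)
print(len(samples["matched"]), len(samples["preliminary"]), len(samples["fresh"]))
```

With $n = 100$, $\lambda = 0.6$ and $p = 0.8$, the sketch retains 60 matched units, draws a preliminary sample of 50 units and subsamples 40 fresh units, so the second-occasion sample again has $n = 100$ units.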
Since information on the variables $y$, $z$ and $w$ is available at the sample level and the population mean of $z$ is unknown, we define the following two-phase chain ratio and regression type estimator $T_u$ as
Following the standard practice of utilizing the information from the previous occasion as an auxiliary variable to improve the precision of estimates on the current occasion, and motivated by the estimation procedures of Singh and Vishwakarma (2007), who discussed an exponential type structure for estimating the population mean, we propose the following chain exponential and regression type estimator $T_m$ based on the sample of size $m$ ($= n\lambda$) common to both occasions as:
The two estimators are combined to give the final estimator of $\bar{Y}$ as
$T = \psi T_u + (1 - \psi) T_m$ (3)
where $\psi$ is an unknown real constant to be determined so as to make the estimator $T$ more precise.

Bias and Mean Square Error of the Proposed Estimator T
It can be seen that $T_u$ and $T_m$ are chain-type ratio, exponential and regression type estimators, and they are biased for $\bar{Y}$. Therefore, the resulting estimator $T$ defined in (3) is also a biased estimator of $\bar{Y}$. Thus, we have the following theorems.
The bias of the estimator $T$ is given by
Proof. It is clear from (3) that the mean square error of $T$ is given by $E(T - \bar{Y})^2 = E[\psi (T_u - \bar{Y}) + (1 - \psi)(T_m - \bar{Y})]^2$. Using the expressions given in equations (4)-(5) and taking expectations up to $o(n^{-1})$, we obtain the expression for the mean square error of the estimator $T$ given in equation (10).
(i) It should be noted that the estimators $T_u$ and $T_m$ are based on two non-overlapping samples of sizes $u$ and $m$ respectively. Therefore, their covariance term $\mathrm{Cov}(T_u, T_m)$ is of order $N^{-1}$ and is ignored for large population size.
(ii) Since $x$ and $y$ denote the same study variable on the first and second (current) occasions respectively, we take $\rho_{yz} = \rho_{xz}$ and $\rho_{yw} = \rho_{xw}$. These are intuitive assumptions, which are also considered by Cochran (1977) and Feng and Zou (1997).
(iii) Considering the stability of the coefficient of variation (Reddy 1978) and following the work of Cochran (1977) and Feng and Zou (1997), the coefficients of variation of $y$, $x$, $z$ and $w$ are taken to be approximately equal.

Minimum Mean Square Error of the Proposed Estimator
Since the mean square error of the estimator $T$ derived in equation (10) is a function of the unknown constant $\psi$, it can be minimized with respect to $\psi$, and the optimum value of $\psi$ is derived as
Substituting the value of $\psi_{opt}$ in equation (10), we get the optimum mean square error of the estimator $T$ as
Again, substituting the findings from equations (11) and (12) in equations (14) and (15)
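The minimization step can be illustrated in its generic form. The derivation below is a sketch only: it writes $M(\cdot)$ for the mean square error (a notation assumed here) and relies on the covariance between $T_u$ and $T_m$ being ignored for large $N$, as noted earlier. It does not reproduce the explicit expressions of the numbered equations, which depend on the first-order mean square errors of $T_u$ and $T_m$.

```latex
% Generic form of the minimization, assuming Cov(T_u, T_m) is negligible.
\begin{align*}
M(T) &= \psi^{2} M(T_u) + (1-\psi)^{2} M(T_m), \\
\frac{\partial M(T)}{\partial \psi}
     &= 2\psi\, M(T_u) - 2(1-\psi)\, M(T_m) = 0
     \;\Longrightarrow\;
     \psi_{opt} = \frac{M(T_m)}{M(T_u) + M(T_m)}, \\
M(T)_{opt} &= \frac{M(T_u)\, M(T_m)}{M(T_u) + M(T_m)}.
\end{align*}
```

Since the second derivative $2[M(T_u) + M(T_m)]$ is positive, the stationary point is a minimum, and $\psi_{opt}$ lies between 0 and 1.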

Optimum Replacement Strategy for the estimator T
To determine the optimum value of $\mu$ (the fraction of the sample to be drawn afresh on the second occasion) so that the population mean $\bar{Y}$ may be estimated with maximum precision, we minimize the mean square error of the estimator $T$ given in equation (17) with respect to $\mu$, which results in a quadratic equation in $\mu$:
Solving equation (18) gives the candidate values of $\mu$. While choosing the value of $\mu$, it should be remembered that $0 \le \mu \le 1$; if two such admissible values are obtained, the smaller one is chosen, since it attains the same mean square error while replacing a smaller fraction of the total sample, which reduces the cost of the survey. All other values of $\mu$ are inadmissible. Substituting the admissible value of $\mu$, say $\mu^{(o)}$, into equation (17), we have the optimum value of the mean square error of the proposed estimator $T$ as
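The selection rule for the replacement fraction can be sketched as follows. This is an illustration under stated assumptions: the quadratic is written generically as $a\mu^2 + b\mu + c = 0$, the coefficients `a`, `b`, `c` are hypothetical placeholders for the coefficients of equation (18), which are not reproduced here, and the function name `admissible_mu` is introduced only for this sketch.

```python
import math


def admissible_mu(a, b, c, tol=1e-12):
    """Return the smallest real root of a*mu^2 + b*mu + c = 0 in [0, 1].

    Returns None when no root is admissible. The coefficients stand in for
    those of the quadratic in mu; they are placeholders, not the paper's values.
    """
    if abs(a) < tol:
        # Degenerate (linear) case: b*mu + c = 0.
        roots = [] if abs(b) < tol else [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0:
            return None  # no real roots
        s = math.sqrt(disc)
        roots = [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]

    # Keep only fractions between 0 and 1, then prefer the smaller one,
    # since it replaces the smallest part of the sample.
    admissible = [r for r in roots if 0.0 <= r <= 1.0]
    return min(admissible) if admissible else None


# Illustrative coefficients only: roots are 0.5 and 1.0, so 0.5 is chosen.
print(admissible_mu(1.0, -1.5, 0.5))
```

Returning `None` rather than raising an error makes it easy to flag parameter combinations for which no admissible replacement fraction exists.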

Efficiency Comparisons
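The percent relative efficiencies of $T$ with respect to (i) $\bar{y}_n$, when there is no matching, and (ii) the natural optimum estimator $\hat{\bar{Y}}$, a linear combination of the means of the matched and unmatched portions of the sample on the current occasion, have been obtained. For different choices of $\rho_{yx}$, $\rho_{yz}$, $\rho_{yw}$ and $\rho_{zw}$, and for $p = 0.8$, Table 1 shows the optimum (admissible) values of $\mu$ and the percent relative efficiencies $E^{(1)}$ and $E^{(2)}$ of the proposed estimator $T$ with respect to the estimators $\bar{y}_n$ and $\hat{\bar{Y}}$ respectively.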
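The efficiency comparison can be sketched numerically. The code below is a minimal illustration, assuming the usual definition of percent relative efficiency as 100 times the ratio of the reference variance to the optimum mean square error of $T$. The reference variances use the standard large-population results $V(\bar{y}_n) = S_y^2/n$ and $V(\hat{\bar{Y}}) = [1 + \sqrt{1 - \rho_{yx}^2}]\, S_y^2 / (2n)$ (Cochran 1977). The function names and the value passed as `mse_T_opt` are hypothetical; the optimum mean square error of $T$ must come from the expressions derived in the paper.

```python
import math


def variance_sample_mean(s2_y, n):
    """Variance of the sample mean under SRSWOR, ignoring the fpc (large N)."""
    return s2_y / n


def variance_natural_optimum(s2_y, n, rho_yx):
    """Optimum variance of the natural successive sampling estimator."""
    return (1.0 + math.sqrt(1.0 - rho_yx ** 2)) * s2_y / (2.0 * n)


def percent_relative_efficiency(var_reference, mse_proposed):
    """100 * (variance of reference estimator) / (MSE of proposed estimator)."""
    return 100.0 * var_reference / mse_proposed


# Illustrative inputs only; mse_T_opt is a placeholder, not a value from Table 1.
s2_y, n, rho_yx = 1.0, 100, 0.8
mse_T_opt = 0.005

E1 = percent_relative_efficiency(variance_sample_mean(s2_y, n), mse_T_opt)
E2 = percent_relative_efficiency(variance_natural_optimum(s2_y, n, rho_yx), mse_T_opt)
print(E1, E2)
```

Values above 100 indicate that the proposed estimator is more precise than the reference estimator for the given parameter combination.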

Conclusions
The following conclusions can be drawn from Table 1.
(a) For fixed values of $\rho_{yx}$, $\rho_{zw}$ and $\rho_{yz}$, the values of $E^{(1)}$ and $E^{(2)}$ increase while the value of $\mu^{(o)}$ decreases with increasing $\rho_{yw}$, except when $\rho_{yx} = 0.7$ and $\rho_{yz} = 0.9$. This behaviour is highly desirable, since it indicates that if a highly correlated auxiliary variable is available, it pays in terms of enhanced precision of the estimates as well as reduced survey cost.
(b) For fixed values of $\rho_{zw}$, $\rho_{yw}$ and $\rho_{yx}$, the values of $E^{(1)}$ and $E^{(2)}$ increase while the value of $\mu^{(o)}$ decreases with increasing $\rho_{yz}$. This pattern is also highly desirable, as explained in (a).
(d) The minimum value of $\mu^{(o)}$ is 0.2662, which indicates that only 26.62 percent of the total sample size needs to be replaced on the second (current) occasion for the corresponding choices of correlations.
From the above analysis, we may conclude that the proposed estimator is highly rewarding in terms of precision as well as in reducing the cost of the survey. This is in agreement with the principle of optimization in sample surveys. Hence, the proposed estimator may be recommended to survey statisticians for real-life applications.