Journal of Statistical Theory and Applications

Volume 17, Issue 2, June 2018, Pages 271 - 282

A New Test for Simple Tree Alternative in a 2 x k Table

Parthasarathi Chakrabarti
Department of Statistics, R.K.M. Residential College, Narendrapur, Kolkata 700 103, India
Uttam Bandyopadhyay*
Department of Statistics, University of Calcutta, 35 Ballygunge Circular Road, Kolkata 700 019, India
*Corresponding author.
Received 20 October 2016, Accepted 31 May 2017, Available Online 30 June 2018.
Keywords: order restriction; simple tree; empirical size; empirical power; bootstrap

This paper considers the simple tree order restriction in a 2 × k cohort study and provides a consistent test in which the usual multiple comparison test statistics are modified by using the characteristic roots of a consistent estimator of the associated correlation matrix. The relevant performance measures of the proposed test are obtained and are compared numerically with those of existing competitors via simulation. It is shown that the proposed test is comparable to or better than the competitors in terms of type I error rate and power. Finally, a data study illustrates the use of such a test.

Copyright © 2018, the Authors. Published by Atlantis Press.
Open Access
This is an open access article under the CC BY-NC license.

1. Introduction

Testing the equality of multiple mortality rates from different exposure categories against an ordered alternative occurs frequently in epidemiological studies. For example, consider the cohort study by Gupta and Mehta (2000), in which the age-adjusted mortality rates among women in Mumbai, India, using mishri (a roasted, powdered form of tobacco used to clean teeth) and betel nut are, respectively, 12.3 and 12.6 per 1000 per annum, whereas the rate for the control group is 9.9. Hence, it would be reasonable to assume the simple tree restriction π1 ≤ π2, π3, where π1, π2 and π3 represent, respectively, the risks of dying among women in the control group, among those who use mishri and among those who chew betel nut. In general, if H : π1 = π2 = ⋯ = πk represents no restriction on mortality rates for k exposure categories, H can be tested against the patterned alternative Hst − H, where Hst : π1 ≤ π2, π3, …, πk.

Several tests are available in the literature for testing H against Hst − H. These are based, for example, on the restricted maximum likelihood estimator (RMLE), multiple comparison procedures and nonparametric kernels (see, for example, Fligner and Wolfe, 1982; Magel, 1988; Desu et al., 1996). In detecting order restrictions on binomial probabilities from a 2 × k cohort study, the multinomial allocation probabilities corresponding to the exposure levels play an important role. The existing tests for detecting a simple tree order restriction in a 2 × k table occasionally fail to attain the nominal level for small values of π1 when the allocation probabilities are unbalanced. Our aim is to propose a consistent multiple comparison test, using the characteristic roots of a consistent estimator of the associated correlation matrix based on the multinomial allocation probabilities, in which this shortfall is overcome.

Among the RMLE based approaches, the work on confidence interval estimation subject to order restriction (Hwang and Peddada, 1994) is based on the modified generalised isotonic regression estimator (MGIRE). A number of testing procedures have been obtained following MGIRE (see, for example, Peddada et al., 2001; Peddada and Haseman, 2006; Teoh et al., 2008). In this paper we choose an MGIRE based test as a competitor and refer to it as the MGIRE test. Other RMLE based procedures to detect a simple tree alternative are, for example, due to Wright and Tran (1985), Conaway et al. (1991), Singh et al. (1993), Futschik and Pflug (1998) and Tsai (2004). The multiple comparison procedure (Bretz et al., 2001, 2003; Genz, 2004; Schaarschmidt et al., 2008; Hothorn et al., 2009), based on normal and binary responses, is proposed as a method in which the cut-off points of the related tests are obtained from the distribution functions of the multivariate normal and multivariate t distributions and are provided numerically through the R packages mnormt and mvtnorm. In our setting we choose one such test under binary response as another competitor and call it the GBH (Genz-Bretz-Hothorn) test. Besides these multiple comparison tests, some single contrast tests are available to detect order restriction among binomial probabilities (see, for example, Leuraud and Benichou, 2001, 2004; Bretz and Hothorn, 2003; Bandyopadhyay and Chakrabarti, 2013 and the references therein). Our numerical computation shows that, for small sample sizes, the MGIRE and GBH tests fail to attain the nominal level more often under unbalanced allocation than under balanced allocation. The proposed test overcomes this shortfall and increases its power locally.

The outline of the paper is as follows. Section 2 provides the data layout and notations. Section 3 contains some asymptotics and formulation of the proposed test. Section 4 describes competitors of the proposed test. Simulation results on size and power of the tests are given in Section 5. Section 6 contains data study. The paper concludes with some discussions in Section 7, followed by some technical details in Appendices A and B.

2. Data layout and notations

Consider a cohort study on n individuals, where the dichotomous response variable Y, indicating survival status, is recorded for the exposure X consisting of the levels x1, x2, …, xk, measured on a nominal scale, satisfying x1 ≤ x2, x3, …, xk. Let pj = P(X = xj) > 0 be the chance of occurrence of the exposure level xj, j = 1,2,…,k, with $\sum_{j=1}^{k} p_j = 1$, and let πj = P(Y = 1|X = xj) = 1 − P(Y = 0|X = xj) be the mortality rate at xj, j = 1,2,…,k. Define nj = #(X = xj) as the number of individuals observed at xj and sj = #(Y = 1|X = xj) as the disease count at xj, j = 1,2,…,k, where $n = \sum_{j=1}^{k} n_j$.

Let us write nT = (n1,n2,…,nk), pT = (p1, p2,…,pk) and πT = (π1,π2,…,πk). Evidently, the distribution of n is multinomial on k categories with index n and parameter p. Further, conditionally on n, (s1,s2,…,sk) constitutes k independent binomial random variables, where sj follows the binomial distribution with index nj and parameter πj, j = 1,2,…,k. In order to understand the simple tree order of the mortality rates at different exposure levels, H is tested against Hst − H.

In the subsequent discussion, $\hat p_j$ and $\hat\pi_j$ denote, respectively, the observed proportions of individuals and of successes at xj, where $\hat p_j = n_j/n$ and $\hat\pi_j = s_j/n_j$. The overall proportion of successes is then $\hat\pi = \frac{1}{n}\sum_{j=1}^{k} n_j\hat\pi_j = \hat{p}^T\hat{\boldsymbol{\pi}}$, where $\hat{p}^T = (\hat p_1,\hat p_2,\ldots,\hat p_k)$ and $\hat{\boldsymbol{\pi}}^T = (\hat\pi_1,\hat\pi_2,\ldots,\hat\pi_k)$. If nj vanishes for some j = 1,2,…,k, a Dirichlet prior is used to choose $\hat p_j = (n_j + 1/k)/(n+1)$, j = 1,2,…,k. Similarly, if $\hat\pi$ is found to be 0 or 1 for a specific sample, we choose $\hat\pi = \left(\sum_{j=1}^{k} n_j\hat\pi_j + 1/2\right)/(n+1)$ by use of a beta prior.
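The estimators above, including the two prior-based adjustments, can be written out as a short function. This is a minimal sketch rather than the authors' code: the function name and the choice to report 0.0 as the success proportion of an empty exposure level are assumptions made purely for illustration.

```python
from typing import List, Tuple


def estimate_proportions(
    n_counts: List[int], s_counts: List[int]
) -> Tuple[List[float], List[float], float]:
    """Return (p_hat, pi_hat, pooled pi_hat) for a 2 x k table.

    n_counts[j] is the number of individuals at level j and
    s_counts[j] the number of events among them.
    """
    k = len(n_counts)
    n = sum(n_counts)
    # Allocation proportions; use the Dirichlet-smoothed form if a level is empty.
    if any(nj == 0 for nj in n_counts):
        p_hat = [(nj + 1.0 / k) / (n + 1.0) for nj in n_counts]
    else:
        p_hat = [nj / n for nj in n_counts]
    # Level-wise event proportions.
    # ASSUMPTION: an empty level is reported as 0.0 (the text does not say).
    pi_hat = [sj / nj if nj > 0 else 0.0 for sj, nj in zip(s_counts, n_counts)]
    # Pooled proportion, with the beta-prior adjustment when it is 0 or 1.
    total = sum(s_counts)
    pooled = total / n
    if pooled in (0.0, 1.0):
        pooled = (total + 0.5) / (n + 1.0)
    return p_hat, pi_hat, pooled
```

For a table with counts `[50, 30, 20]` and events `[5, 6, 4]`, no adjustment is triggered and the function simply returns the raw proportions.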

3. Proposed test and related asymptotic results

A naive test, analogous to Dunnett’s procedure (1955), can be constructed through Bonferroni’s correction, in which H is rejected at level α against Hst − H if and only if

$$T = \max_{2 \le j \le k} t_j$$

exceeds $\tau_{\alpha/(k-1)}$, where τα is the (1 − α)th quantile of the standard normal distribution, 0 < α < 1. Such a test is referred to as the T-test. In this paper a modification of the T-test is proposed by standardizing t = (t2,t3,…,tk)T through the estimators of the characteristic roots of the correlation matrix of t. Towards such a modification, H is expressed in terms of multiple contrasts of π by

$$H : C\pi = 0,$$

where $C_{(k-1)\times k} = (-1_{k-1}\;\; e_1\;\; e_2\; \cdots\; e_{k-1})$ with ej, j = 1,2,…,k − 1, as (k − 1)-component independent unit vectors and $1_{k-1} = \sum_{j=1}^{k-1} e_j$. Then, H is rejected against Hst − H if and only if Hj is rejected against Hja for at least one j, where Hj : π1 = πj and Hja : π1 < πj, j = 2,3,…,k. Furthermore, an upper tail test based on tj is appropriate for the testing problem (Hj, Hja), j = 2,3,…,k. Hence, combining all such component tests, the resulting test becomes the T-test. Now, we consider the proposed modification.
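As a rough illustration of the Bonferroni-type T-test, the sketch below computes T and its cut-off. It assumes that tj is the difference between the jth and the control proportions divided by a pooled-variance standard error, sqrt(π̂(1 − π̂)(1/n1 + 1/nj)); this form is inferred from the null dispersion matrix quoted in this section and is not a formula confirmed by the text. The function names are hypothetical, and every exposure level is assumed to be non-empty.

```python
import math
from statistics import NormalDist


def pairwise_statistics(n_counts, s_counts):
    """Return [t_2, ..., t_k] under the ASSUMED pooled-variance standardisation."""
    n = sum(n_counts)
    total = sum(s_counts)
    pooled = total / n
    if pooled in (0.0, 1.0):  # beta-prior adjustment described in Section 2
        pooled = (total + 0.5) / (n + 1.0)
    var0 = pooled * (1.0 - pooled)
    control = s_counts[0] / n_counts[0]
    return [
        (s / m - control) / math.sqrt(var0 * (1.0 / n_counts[0] + 1.0 / m))
        for s, m in zip(s_counts[1:], n_counts[1:])
    ]


def bonferroni_t_test(n_counts, s_counts, alpha=0.05):
    """Return (T, cut-off, reject) for the Bonferroni-corrected upper-tail test."""
    t = pairwise_statistics(n_counts, s_counts)
    # len(t) equals k - 1, the number of comparisons with the control level.
    cutoff = NormalDist().inv_cdf(1.0 - alpha / len(t))
    t_max = max(t)
    return t_max, cutoff, t_max > cutoff
```

Calling `bonferroni_t_test([100, 100, 100], [10, 20, 10])` compares the larger of the two standardized differences with the upper α/2 point of the standard normal distribution.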

Modifying T:

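It is not difficult to see that, for 0 < pj < 1, j = 1,2,…,k, as n → ∞,

$$\sqrt{n}\;\mathrm{Diag}\!\left(\sqrt{\frac{\hat p_1}{\pi_1(1-\pi_1)}},\ldots,\sqrt{\frac{\hat p_k}{\pi_k(1-\pi_k)}}\right)(\hat{\boldsymbol{\pi}}-\boldsymbol{\pi})$$

converges in distribution to Nk(0,I), the k-variate normal distribution with mean vector 0 and unit dispersion matrix I. That means, for large n,

$$\sqrt{n}\,(\hat{\boldsymbol{\pi}}-\boldsymbol{\pi}) \overset{a}{\sim} N_k\!\left(0,\ \mathrm{Diag}\!\left(\frac{\pi_1(1-\pi_1)}{\hat p_1},\ldots,\frac{\pi_k(1-\pi_k)}{\hat p_k}\right)\right),$$

where the dispersion matrix reduces under H to $\pi(1-\pi)\,\mathrm{Diag}(\hat p_1^{-1},\hat p_2^{-1},\ldots,\hat p_k^{-1})$, and the notation $u_n \overset{a}{\sim} v_n$ means that the asymptotic distributions of the random variables un and vn are the same. Therefore, the asymptotic distribution of t, shown in Appendix A, is Nk−1(0, R(p)) under H, where R(p) is the correlation matrix with elements

$$r_{ij}(p) = \sqrt{\frac{p_{i+1}\,p_{j+1}}{(p_1+p_{i+1})(p_1+p_{j+1})}},\quad i \ne j, \qquad r_{ii}(p) = 1, \qquad i,j = 1,2,\ldots,k-1.$$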
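Let λj = λj(p) > 0, j = 1,2,…,k − 1, be the characteristic roots of R(p) and wj be the unit-norm characteristic vector corresponding to λj, j = 1,2,…,k − 1. Then, setting W = (w1 w2 ⋯ wk−1), it follows that

$$W^T R(p)\, W = \Lambda(p) = \mathrm{Diag}(\lambda_1,\lambda_2,\ldots,\lambda_{k-1}).$$

Hence, there exists a positive definite matrix $R^{-1/2}(p)$ for which, as n → ∞,

$$R^{-1/2}(p)\,t \to N_{k-1}(0, I) \tag{1}$$

in distribution under H, where

$$R^{-1/2}(p) = W\,\Lambda^{-1/2}(p)\,W^T.$$

Since $R^{-1/2}(\hat p) \to R^{-1/2}(p)$ in probability, we get, as n → ∞,

$$R^{-1/2}(\hat p)\,t \to N_{k-1}(0, I) \tag{2}$$

in distribution under H, and hence T can be modified by

$$T_m = \max_{1 \le j \le k-1} \left\{R^{-1/2}(\hat p)\,t\right\}_j .$$

As usual, an upper tail test based on Tm would be appropriate. Such a test can be described by the critical region

$$T_m > \tau^{*}_{\alpha}, \tag{3}$$

where, for given α : 0 < α < 1, $\tau^{*}_{\alpha}$ is obtained approximately from the relation

$$P_H\!\left(T_m > \tau^{*}_{\alpha}\right) = \alpha,$$

which, by use of (1) and (2), yields the approximate relation

$$\left\{\Phi(\tau^{*}_{\alpha})\right\}^{k-1} = 1 - \alpha, \tag{4}$$

with Φ(.) as the distribution function of a standard normal variable. Thus the test (referred to as the Tm-test), given by (3) and (4), is an asymptotically level α test for the testing problem (H, Hst − H) and is a modification of the T-test. It is shown (see Appendix B) that the test is consistent.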
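The cut-off of the Tm-test can be compared numerically with the Bonferroni cut-off of the T-test. The sketch below assumes that the Tm cut-off τ solves Φ(τ)^(k−1) = 1 − α, which is what asymptotically independent standard normal components imply; the function names are illustrative only.

```python
from statistics import NormalDist


def tm_cutoff(k: int, alpha: float = 0.05) -> float:
    """Solve Phi(tau)^(k-1) = 1 - alpha for tau (assumed form of the Tm cut-off)."""
    return NormalDist().inv_cdf((1.0 - alpha) ** (1.0 / (k - 1)))


def bonferroni_cutoff(k: int, alpha: float = 0.05) -> float:
    """Upper alpha/(k-1) point of the standard normal, as used by the T-test."""
    return NormalDist().inv_cdf(1.0 - alpha / (k - 1))


# Print both cut-offs side by side for a few table widths.
for k in (3, 4, 5):
    print(k, round(tm_cutoff(k), 4), round(bonferroni_cutoff(k), 4))
```

Because (1 − α)^(1/(k−1)) never exceeds 1 − α/(k − 1), the first cut-off is never larger than the second, although for small α the two are close.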

4. Competitors

MGIRE test

Here the components of π are estimated (subject to a general order restriction) by


Then, incorporating Bonferroni’s corrections, the test, described by the critical region

is an asymptotically level α test for the testing problem (H, Hst − H) and is used as a competitor of the proposed test. From Appendix B, it is not difficult to see that under any π,
in probability, which is ‘zero’ or positive according as π ∈ H or π ∈ Hst − H. This implies that the MGIRE test is consistent for testing H against π ∈ Hst − H.

GBH test

Here H is rejected at level α against Hst − H if and only if

$$T = \max_{2 \le j \le k} t_j$$

exceeds the 100(1 − α)% equi-percentage point, $c_{k-1,\,R(\hat p),\,1-\alpha}$, of $N_{k-1}(0, R(\hat p))$, the approximated null distribution of t. The consistency of this test for testing H against π ∈ Hst − H can be established by the same technique as in the previous test.
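The authors obtain such percentage points from R packages. As a stand-in that needs only the Python standard library, the sketch below approximates the equi-percentage point for k = 3 by Monte Carlo: it draws pairs of standard normal variables with correlation ρ and takes the empirical (1 − α)th quantile of their maximum. The function name, the number of draws and the seed are illustrative choices and not part of the GBH procedure.

```python
import math
import random


def equicoordinate_cutoff_k3(
    rho: float, alpha: float = 0.05, draws: int = 100_000, seed: int = 1
) -> float:
    """Monte Carlo estimate of c with P(max(Z1, Z2) <= c) = 1 - alpha,
    where (Z1, Z2) is bivariate standard normal with correlation rho."""
    rng = random.Random(seed)
    root = math.sqrt(1.0 - rho * rho)
    maxima = []
    for _ in range(draws):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second standard normal with correlation rho to z1.
        z2 = rho * z1 + root * rng.gauss(0.0, 1.0)
        maxima.append(max(z1, z2))
    maxima.sort()
    # Empirical (1 - alpha) quantile of the simulated maxima.
    return maxima[math.ceil((1.0 - alpha) * draws) - 1]
```

As ρ grows, the two comparisons carry more shared information and the cut-off moves from the independent-components value towards the one-sided normal quantile.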

5. Simulation study

We perform a simulation study with one hundred thousand replications taking k = 3 and, for the purpose of illustration, the nominal level (α) is chosen as 0.05. The proposed test and the competitors are compared with respect to both empirical type I error rate and empirical power. The empirical type I error rate (power) of a test is computed as the proportion of the hundred thousand replications of the experiment under H (Hst − H) in which the test statistic exceeds the 0.95th quantile of its asymptotic null distribution.

For a 2 × k cohort data set, setting $p = \hat p$ and the common success probability under H at $\pi = \hat\pi$, 100,000 tables similar to the data are generated. If the bootstrap percentile points of the simulated null distributions of the statistics agree with the percentile points of the asymptotic null distributions of the respective statistics, the P-values of the tests are obtained using the approximate null distributions. Otherwise the P-values are determined by bootstrapping (see, for example, Efron and Tibshirani (1993), Noreen (1989) and Romano (1988, 1989) for details), as the proportion of the 100,000 generated tables in which the test statistic exceeds its observed value obtained from the data set.

Similarly, if the empirical type I error rates do not agree with the nominal level, the power of the corresponding test is evaluated using the empirical cut-off point (the 0.95th quantile of the simulated null distribution of the test statistic) instead of the approximate cut-off point.
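The size and power computations described above can be sketched for k = 3 as follows. Several ingredients are assumptions rather than the authors' stated implementation: the pooled-variance form of t2 and t3, the closed form of Tm obtained from the 2 × 2 correlation matrix with off-diagonal element ρ̂, the cut-off solving Φ(τ)² = 1 − α, and the convention that a table with an empty exposure level counts as a non-rejection. All function names are hypothetical, and far fewer replications are used than the 100,000 of the study.

```python
import math
import random
from statistics import NormalDist


def draw_table(n, p, pi, rng):
    """Draw one 2 x 3 table: multinomial allocation, then Bernoulli outcomes."""
    counts, events = [0, 0, 0], [0, 0, 0]
    cum1, cum2 = p[0], p[0] + p[1]
    for _ in range(n):
        u = rng.random()
        j = 0 if u < cum1 else (1 if u < cum2 else 2)
        counts[j] += 1
        if rng.random() < pi[j]:
            events[j] += 1
    return counts, events


def tm_statistic_k3(counts, events):
    """Tm for k = 3, or None if a level is empty (not handled in this sketch)."""
    if min(counts) == 0:
        return None
    n, total = sum(counts), sum(events)
    pooled = total / n
    if pooled in (0.0, 1.0):  # beta-prior adjustment from Section 2
        pooled = (total + 0.5) / (n + 1.0)
    var0 = pooled * (1.0 - pooled)
    rates = [s / m for s, m in zip(events, counts)]
    # ASSUMPTION: pooled-variance standardisation of each difference from control.
    t2 = (rates[1] - rates[0]) / math.sqrt(var0 * (1 / counts[0] + 1 / counts[1]))
    t3 = (rates[2] - rates[0]) / math.sqrt(var0 * (1 / counts[0] + 1 / counts[2]))
    p1, p2, p3 = (m / n for m in counts)
    rho = math.sqrt(p2 * p3 / ((p1 + p2) * (p1 + p3)))
    # Largest component of R^(-1/2) t for a 2 x 2 correlation matrix.
    return (t2 + t3) / (2 * math.sqrt(1 + rho)) + abs(t2 - t3) / (2 * math.sqrt(1 - rho))


def empirical_rejection_rate(n, p, pi, reps=2000, alpha=0.05, seed=0):
    """Proportion of simulated tables in which Tm exceeds its asymptotic cut-off."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(math.sqrt(1.0 - alpha))
    rejections = 0
    for _ in range(reps):
        tm = tm_statistic_k3(*draw_table(n, p, pi, rng))
        if tm is not None and tm > cutoff:
            rejections += 1
    return rejections / reps
```

Passing a constant π vector gives an empirical type I error rate, while a vector in Hst − H gives an empirical power; the replication count controls the Monte Carlo error of either figure.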

The simulation study is performed for different choices of n and p. For illustration, we choose n = 100, 200, 300, 400 and 500 for both balanced (p1 = p2 = p3) and unbalanced situations. As most cohort studies are highly unbalanced, we take p = (0.9,0.05,0.05) (more allocation towards control) and p = (0.1,0.45,0.45) (less allocation towards control) for the present computation. For balanced allocation, ρ = r12(p) is equal to 0.5, and for p = (0.9,0.05,0.05) and (0.1,0.45,0.45), ρ is, respectively, equal to 0.053 and 0.818. The common value π under H is chosen from {0.1,0.3,0.5} in order to assess the conformity of the type I error rates to the nominal level. The empirical powers of the tests are obtained under the following cases of the parametric configurations:

  • Case A: π lying in the boundary of the alternative region, such as: π1 = π3 < π2.

  • Case B: π is well within the alternative region, such as: (B1) π1 < π2 = π3, (B2) π1 < π3 < π2.

To reveal the behaviour of the tests under Case A, we choose π = (0.1,0.2,0.1); under Case B, we choose π = (0.1,0.2,0.2) and (0.1,0.3,0.2) for (B1) and (B2), respectively.
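The three values of ρ quoted for the simulation settings can be checked directly from the k = 3 expression for the correlation, ρ = sqrt(p2 p3 / ((p1 + p2)(p1 + p3))). The function name below is an illustrative choice.

```python
import math


def rho_k3(p):
    """Correlation between t2 and t3 for allocation probabilities p = (p1, p2, p3)."""
    p1, p2, p3 = p
    return math.sqrt(p2 * p3 / ((p1 + p2) * (p1 + p3)))


# Evaluate rho for the balanced and the two unbalanced allocations.
for p in [(1 / 3, 1 / 3, 1 / 3), (0.9, 0.05, 0.05), (0.1, 0.45, 0.45)]:
    print(p, round(rho_k3(p), 3))
```

Heavy allocation to the control level makes the two comparisons nearly uncorrelated, whereas a small control group makes them strongly correlated, because both differences then share the same noisy control estimate.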

Simplification: k = 3.

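Setting $\hat\rho = \sqrt{\dfrac{\hat p_2\,\hat p_3}{(\hat p_1+\hat p_2)(\hat p_1+\hat p_3)}}$, we get

$$R(\hat p) = \begin{pmatrix} 1 & \hat\rho \\ \hat\rho & 1 \end{pmatrix},$$

which gives

$$\Lambda(\hat p) = \begin{pmatrix} 1-\hat\rho & 0 \\ 0 & 1+\hat\rho \end{pmatrix} \quad\text{and}\quad W = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix},$$

and hence

$$R^{-1/2}(\hat p) = W\,\Lambda^{-1/2}(\hat p)\,W^T = \frac{1}{2}\begin{pmatrix} \dfrac{1}{\sqrt{1-\hat\rho}}+\dfrac{1}{\sqrt{1+\hat\rho}} & \dfrac{1}{\sqrt{1+\hat\rho}}-\dfrac{1}{\sqrt{1-\hat\rho}} \\[2ex] \dfrac{1}{\sqrt{1+\hat\rho}}-\dfrac{1}{\sqrt{1-\hat\rho}} & \dfrac{1}{\sqrt{1-\hat\rho}}+\dfrac{1}{\sqrt{1+\hat\rho}} \end{pmatrix}.$$

Consequently, Tm becomes

$$T_m = \frac{1}{2}\max\left\{\frac{t_2+t_3}{\sqrt{1+\hat\rho}}+\frac{t_2-t_3}{\sqrt{1-\hat\rho}},\ \frac{t_2+t_3}{\sqrt{1+\hat\rho}}+\frac{t_3-t_2}{\sqrt{1-\hat\rho}}\right\} = \frac{t_2+t_3}{2\sqrt{1+\hat\rho}}+\frac{|t_2-t_3|}{2\sqrt{1-\hat\rho}}.$$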
Computation of Type I error rate

In Table 1, the entries showing the largest departures of the type I error rates from the nominal level (more than 10% departure from the nominal level), for different choices of π and n, are marked in bold face. The table shows that under balanced allocation and under the unbalanced allocation probabilities (0.9,0.05,0.05), the Tm test and its competitors behave similarly, with one exception. Again, unlike the Tm test, the type I error rates of the MGIRE test do not agree with the nominal level under the allocation probabilities (0.1,0.45,0.45). However, in this situation, the GBH test maintains the nominal level except for small values of π. The larger the value of ρ, the greater the deviation of the type I error rates of the MGIRE and GBH tests from the nominal level.

               p = (1/3,1/3,1/3)       p = (0.9,0.05,0.05)     p = (0.1,0.45,0.45)
π     n        Tm     MGIRE  GBH       Tm     MGIRE  GBH       Tm     MGIRE  GBH
0.1   100      0.058  0.052  0.053     0.099  0.099  0.098     0.039  0.004  0.009
      200      0.055  0.050  0.052     0.083  0.083  0.083     0.404  0.014  0.026
      300      0.055  0.052  0.053     0.082  0.082  0.081     0.045  0.023  0.036
      400      0.053  0.050  0.050     0.077  0.077  0.076     0.045  0.029  0.040
      500      0.054  0.051  0.051     0.073  0.074  0.073     0.048  0.028  0.041

0.3   100      0.053  0.049  0.051     0.070  0.069  0.070     0.050  0.029  0.043
      200      0.050  0.049  0.051     0.063  0.064  0.063     0.047  0.033  0.043
      300      0.052  0.052  0.053     0.060  0.061  0.060     0.047  0.033  0.044
      400      0.049  0.049  0.050     0.061  0.061  0.061     0.050  0.034  0.046
      500      0.049  0.047  0.048     0.059  0.060  0.060     0.048  0.033  0.046

0.5   100      0.052  0.049  0.050     0.042  0.042  0.043     0.053  0.040  0.053
      200      0.049  0.048  0.049     0.048  0.048  0.048     0.050  0.039  0.051
      300      0.050  0.050  0.051     0.048  0.049  0.048     0.049  0.038  0.050
      400      0.049  0.046  0.047     0.052  0.052  0.052     0.052  0.040  0.051
      500      0.051  0.048  0.050     0.048  0.049  0.048     0.051  0.037  0.049

Table 1. Empirical type I error rate: Tm, MGIRE and GBH tests (α = 0.05).

Computation of empirical power

Table 2 and Table 3 show, respectively, the empirical powers of the tests under Case A and Case B. For each of the given choices of p, π and n maximum powers are marked in bold faces.

  • Case A:

    Table 2 shows the empirical powers of all the tests for the given choice of π lying on the boundary of the parameter space under Hst − H. For the given choices of n, the Tm test is found to be more powerful than the MGIRE and GBH tests under both balanced and unbalanced allocation probabilities. Based on this empirical power comparison, an approximate ordering of the tests is Tm, GBH, MGIRE, in which the Tm-test is the best in terms of having maximum empirical power.

  • Case B: Table 3 shows numerical computations of empirical power under both Case B1 and Case B2. Here, under Case B1, the GBH test is found to be more powerful than the MGIRE and Tm tests under both balanced and unbalanced allocation probabilities. Based on this empirical power comparison, an approximate ordering of the tests, from most to least powerful, is GBH, MGIRE, Tm. Under Case B2, for balanced allocation probabilities and for allocation probabilities (0.9,0.05,0.05), the ordering of the tests with respect to empirical power remains unaltered, with only negligible variation among the empirical powers. Under allocation probabilities (0.1,0.45,0.45) in Case B2, the Tm test is found to be more powerful for n ≥ 300, whereas the ordering of the tests remains the same as in Case B1 for n = 100 and 200.

                         p = (1/3,1/3,1/3)       p = (0.9,0.05,0.05)     p = (0.1,0.45,0.45)
π               n        Tm     MGIRE  GBH       Tm     MGIRE  GBH       Tm     MGIRE  GBH
(0.1,0.2,0.1)   100      0.261  0.230  0.240     0.116  0.115  0.116     0.280  0.240  0.243
                200      0.461  0.397  0.407     0.166  0.162  0.164     0.463  0.289  0.290
                300      0.623  0.549  0.561     0.292  0.290  0.292     0.620  0.384  0.384
                400      0.745  0.668  0.679     0.267  0.264  0.267     0.735  0.463  0.463
                500      0.832  0.764  0.772     0.307  0.306  0.307     0.813  0.535  0.537

Table 2. Empirical power: Tm, MGIRE and GBH tests (α = 0.05, Case A).


                         p = (1/3,1/3,1/3)       p = (0.9,0.05,0.05)     p = (0.1,0.45,0.45)
π               n        Tm     MGIRE  GBH       Tm     MGIRE  GBH       Tm     MGIRE  GBH
(0.1,0.2,0.2)   100      0.245  0.302  0.312     0.175  0.178  0.178     0.137  0.259  0.260
                200      0.422  0.507  0.521     0.265  0.268  0.270     0.206  0.311  0.312
                300      0.572  0.667  0.681     0.346  0.351  0.352     0.282  0.397  0.398
                400      0.683  0.775  0.788     0.404  0.409  0.411     0.347  0.492  0.493
                500      0.792  0.866  0.874     0.486  0.490  0.494     0.392  0.563  0.564

(0.1,0.3,0.2)   100      0.519  0.566  0.581     0.285  0.285  0.287     0.374  0.468  0.469
                200      0.810  0.847  0.854     0.444  0.448  0.450     0.629  0.634  0.635
                300      0.939  0.957  0.960     0.577  0.580  0.582     0.803  0.777  0.779
                400      0.980  0.989  0.990     0.683  0.688  0.690     0.903  0.879  0.880
                500      0.995  0.997  0.998     0.759  0.763  0.765     0.951  0.905  0.926

Table 3. Empirical power: Tm, MGIRE and GBH tests (α = 0.05, Case B).

6. Data study

Example 1:

The data, given in Table 4, are extracted from the cohort study (Gupta and Mehta, 2000) on the risk of mortality among tobacco users in Mumbai, India,

category     frequency   p̂        mortality risk (π̂)
Control      64414       0.5225    0.0099
Mishri       56515       0.4585    0.0123
Betel nut    2343        0.0190    0.0126
Total        123272      1         -

Table 4. Mortality risk by use of mishri and betel nut among women.

where users are classified gender-wise into those smoking (cigarette and bidi (tobacco hand-rolled in temburni leaf and flaked)) and those consuming smokeless tobacco (mishri, betel quid, betel nut, etc.). Table 4 shows the risk of mortality among women in Mumbai from the use of mishri and betel nut. Here, the P-values of all the tests, proposed and competitors, are obtained by bootstrapping. In addition, the bootstrap 0.95th percentile points of the simulated null distributions of the test statistics are obtained at various sample sizes. Furthermore, setting $p = \hat p$ and $\pi = \hat{\boldsymbol{\pi}}$, another 100,000 tables are generated for those sample sizes. From each such table the test statistics are computed. Finally, the powers of the tests are obtained as the proportions of cases in which the test statistics exceed the respective bootstrap percentile points. Estimated P-values and powers of the tests corresponding to Example 1 are given in Table 5.

P-value/Power   Tm        MGIRE     GBH
P-value         0.00020   0.00025   0.00025
n = 123272      0.979     0.980     0.981
n = 50000       0.722     0.728     0.735
n = 25000       0.434     0.442     0.451
n = 10000       0.208     0.209     0.216
n = 5000        0.139     0.134     0.139
n = 1000        0.075     0.073     0.074

Table 5. P-values and powers of the tests obtained by bootstrapping.
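The bootstrap P-value computation described for Example 1 can be sketched generically: tables are regenerated under H with the allocation probabilities set to p̂ and a common risk equal to the pooled π̂, and the P-value is the proportion of regenerated tables whose statistic exceeds the observed one. The function names are hypothetical, and `max_excess_risk` is only a placeholder statistic chosen to keep the example short; it is not one of the three tests compared in the paper. Pure-Python generation is slow for cohorts as large as the one in Table 4, so the sketch is meant for small tables.

```python
import random


def max_excess_risk(counts, events):
    """Placeholder statistic: largest observed excess risk over the control level."""
    if counts[0] == 0:
        return 0.0
    control = events[0] / counts[0]
    diffs = [s / m - control for s, m in zip(events[1:], counts[1:]) if m > 0]
    return max(diffs) if diffs else 0.0


def draw_null_table(n, p_hat, pi_hat, rng):
    """Draw one 2 x k table with allocation p_hat and a common event risk pi_hat."""
    k = len(p_hat)
    counts, events = [0] * k, [0] * k
    for _ in range(n):
        u, acc, j = rng.random(), 0.0, k - 1
        for idx, pj in enumerate(p_hat):
            acc += pj
            if u < acc:
                j = idx
                break
        counts[j] += 1
        if rng.random() < pi_hat:
            events[j] += 1
    return counts, events


def bootstrap_p_value(counts, events, statistic, reps=1000, seed=0):
    """Proportion of null tables whose statistic exceeds the observed value."""
    rng = random.Random(seed)
    n = sum(counts)
    p_hat = [m / n for m in counts]
    pi_hat = sum(events) / n  # common risk under H
    observed = statistic(counts, events)
    exceed = 0
    for _ in range(reps):
        if statistic(*draw_null_table(n, p_hat, pi_hat, rng)) > observed:
            exceed += 1
    return exceed / reps
```

Any statistic with the same `(counts, events)` signature can be passed in place of the placeholder, so the same routine serves all three tests once their statistics are coded.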

It is observed (Table 5) that all the tests, proposed and competitors, strongly reject the null hypothesis of no difference among the risks of mortality, where the Tm test has the least P-value. For different choices of n in Table 5 we see that empirical powers of the Tm, GBH and MGIRE tests are approximately equal. For n > 5,000, an approximate ordering of the tests with respect to empirical power is GBH, MGIRE, Tm, in which the GBH-test is the best in terms of having maximum power. However, for n ≤ 5,000, the ordering becomes Tm, GBH, MGIRE.

Example 2:

All the tests are applied to another data set (Graubard and Korn, 1987) relating to the effect of maternal alcoholism on congenital sex organ malformation among infants. The information on alcohol consumption is collected from would-be mothers after the first trimester and the malformations among infants are recorded following childbirth. Alcohol consumption categories are classified as average number of drinks per day. The data set is summarized in Table 6.

average number of drinks/day   frequency of mothers   p̂        risk of malformation
< 1                            31616                  0.9706    0.0027
1 − 2                          793                    0.0243    0.0063
> 2                            165                    0.0051    0.0121
Total                          32574                  1

Table 6. Risk of infant’s sex organ malformation for maternal alcoholism.

Adopting the same technique as in Example 1, P-values and powers of the tests are determined and exhibited in Table 7.

P-value/Power   Tm      MGIRE   GBH
P-value         0.062   0.062   0.062
n = 32574       0.629   0.631   0.633
n = 10000       0.323   0.323   0.323
n = 5000        0.264   0.228   0.228
n = 1000        0.130   0.130   0.130

Table 7. P-values and powers of the tests obtained by resampling.

Table 6 shows that $\hat p_1$ is almost unity and $\hat p_2$ is considerably larger than $\hat p_3$. Thus, the sample corresponds to an extremely unbalanced situation. Here, as the P-values suggest, all of the Tm, GBH and MGIRE tests strongly reject the null hypothesis. The Tm test is found to be at least as powerful as its competitors for n ≤ 10,000, with a visible gain at n = 5000.

7. Discussion

The failure of the type I error rate to attain the nominal level occurs more frequently in the MGIRE and GBH tests than in the Tm test under unbalanced allocation probabilities. On the boundary of the parameter space under Hst − H, that is, under Case A, the Tm test is found to be locally more powerful than its competitors. In this case the power advantage of the Tm test over its competitors grows as the value of ρ increases. Thus, for unbalanced allocation probabilities yielding high values of the $r_{ij}(p)$, the Tm test can be preferred for the agreement of its type I error rate with the nominal level.

Appendix A

Asymptotic Distribution under H


it follows that, under H, for large n,
with R(pˆ) given by Section 3 when p is estimated by pˆ. Hence the statistic t is identified as

Now, using the fact that $\hat p \to p$ in probability, it follows that, under H,

$$t \to N_{k-1}(0, R(p))$$

in distribution as n → ∞.

Appendix B


First, we prove the following result.

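Result B.1: Let A = (a1 a2 ⋯ ad) be a positive definite symmetric matrix, and α = (α1,α2,…,αd)T be a vector of non-negative elements with α ≠ 0. Then

$$\max_{1 \le j \le d} a_j^T \alpha > 0.$$

Assume that the assertion is false. Then $a_j^T\alpha \le 0$ for every j and, since each αj is non-negative, we have $\alpha^T A \alpha = \sum_{j=1}^{d} \alpha_j\, a_j^T \alpha \le 0$. But this is a contradiction, as A is positive definite and α ≠ 0. Hence the result follows.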

Next, writing θ = (θ1,θ2,…,θk−1)T with

we can find a vector valued function g = (g1,g2,…,gk−1)T of θ such that as n → ∞,
converges to
in probability for any π, where $M(\hat p)$ is defined in Appendix A. It is obvious that θ = 0 when π ∈ H, and θj ≥ 0, j = 1,2,…,k − 1, with θ ≠ 0 when π ∈ Hst − H. Furthermore, as n → ∞,
in probability, where $g_j(\theta) = a_j^T\theta$ with aj as the jth column of the symmetric positive definite matrix $R^{-1/2}(p)$. Hence, by use of Result B.1, we get

$$\max_{1 \le j \le k-1} g_j(\theta) > 0 \quad \text{for } \pi \in H_{st} - H.$$

This implies that the proposed test, described by (3), is consistent for testing H against any π under Hst − H.


References

[3] B Efron and R Tibshirani, An Introduction to the Bootstrap, Chapman & Hall, New York, 1993.
[10] F Bretz and LA Hothorn, Statistical analysis of monotone or non-monotone dose–response data from in vitro toxicological assays, ATLA-ALTERN LAB ANIM, Vol. 31, No. 1, 2003, pp. 81-96.
[17] K Leuraud and J Benichou, Tests for monotonic trend from case-control data: Cochran-Armitage-Mantel trend test, isotonic regression and single and multiple contrast tests, Biom. J, Vol. 46, 2004, pp. 731-749.
[18] LA Hothorn, M Vaeth, and T Hothorn, Trend tests for the evaluation of exposure-response relationships in epidemiological exposure studies, Epidemiol Perspect Innov, Vol. 6, 2009, pp. 1.
[23] PC Gupta and HC Mehta, Cohort study of all-cause mortality among tobacco users in Mumbai, India, Bull. World Health Organ, Vol. 78, No. 7, 2000, pp. 877-883.
Cite this article

AU  - Parthasarathi Chakrabarti
AU  - Uttam Bandyopadhyay
PY  - 2018
DA  - 2018/06/30
TI  - A New Test for Simple Tree Alternative in a 2 x k Table
JO  - Journal of Statistical Theory and Applications
SP  - 271
EP  - 282
VL  - 17
IS  - 2
SN  - 2214-1766
UR  -
DO  -
ID  - Chakrabarti2018
ER  -