
Open Mathematics

formerly Central European Journal of Mathematics

Editor-in-Chief: Vespri, Vincenzo / Marano, Salvatore Angelo



Open Access, Online ISSN 2391-5455
Volume 15, Issue 1


Quotient of information matrices in comparison of linear experiments for quadratic estimation

Czesław Stępniak
  • Corresponding author
  • Faculty of Mathematics and Natural Sciences, University of Rzeszów, Pigonia 1, PL-35-959 Rzeszów, Poland
Published Online: 2017-12-29 | DOI: https://doi.org/10.1515/math-2017-0135

Abstract

The ordering of normal linear experiments with respect to quadratic estimation, introduced by Stępniak in [Ann. Inst. Statist. Math. A 49 (1997), 569-584], is extended here to experiments involving nuisance parameters. Typical experiments of this kind are induced by allocations of treatments in blocks. Our main tool, called the quotient of information matrices, may be of interest in its own right. It is known that any orthogonal allocation of treatments in blocks is optimal with respect to linear estimation of all treatment contrasts. We show that such an allocation is, however, not optimal for quadratic estimation.

Keywords: Normal linear experiment; Comparison of experiments for quadratic estimation; Nuisance parameter; Quotient of information matrices; Orthogonal block design; Nonoptimality for quadratic estimation

MSC 2010: 62K10; 05B20; 62J05

1 Introduction

Any statistical experiment may be perceived as an information channel transforming a deterministic quantity (parameter) into a random one (observation) according to a design chosen by the experimenter. The primary aim of the statistician is to recover the information about the parameter from the observation. However, the efficiency of this process depends not only on the statistical rule but also on the experimental design. Such a design, which may be identified with the experiment, is represented by a probabilistic structure.

When the observations are normally distributed, the entire statistical analysis is based on their linear and quadratic forms. Thus the properties of such forms should be taken into account in any reasonable choice of statistical experiment.

Comparison of linear experiments by linear forms has been intensively studied in the statistical literature. It is well known (see, for instance, [1, 2, 3, 4, 5, 6]) that almost all criteria used for comparing two linear experiments with respect to linear estimation reduce to the Loewner order between their information matrices, say $M_1$ and $M_2$. However, the comparison of normal linear experiments with respect to quadratic estimation is still at an early stage, and suitable tools are still being sought.

It was revealed in Stępniak [7] that the relation “to be at least as good with respect to quadratic estimation” requires some knowledge of the matrix $M_1^{+} M_2$, where the superscript $+$ denotes the Moore-Penrose generalized inverse. We shall refer to this matrix as the quotient of $M_2$ by $M_1$. Properties of such a quotient may be of interest in their own right. It appears that the Loewner order may be expressed in terms of the quotient, but not vice versa.

In this note we use the quotient of positive semidefinite matrices as the main tool in the ordering of normal linear experiments with respect to quadratic estimation. The orderings of linear experiments with respect to linear and with respect to quadratic estimation are extended here to the experiments involving nuisance parameters. Typical experiments of this kind are induced by allocations of treatments in blocks.

It is well known (see [8]) that any orthogonal allocation of treatments in blocks is optimal with respect to linear estimation of all treatment contrasts. We show that this allocation is, however, not optimal for quadratic estimation.

2 Definitions and known results

In this paper the standard vector-matrix notation is used. All vectors and matrices considered here have real entries. The space of all $n \times 1$ vectors is denoted by $\mathbb{R}^n$. For any matrix $M$ the symbols $M^T$, $R(M)$, $N(M)$ and $r(M)$ denote, respectively, its transpose, range (column space), kernel (null space) and rank. The symbol $P_M$ stands for the orthogonal projector onto $R(M)$, i.e. the square matrix $P$ satisfying $Px = x$ for $x \in R(M)$ and $Px = 0$ for $x \in N(M^T)$. Moreover, if $M$ is square then $\operatorname{tr}(M)$ denotes its trace, and the symbol $M \geq 0$ means that $M$ is symmetric and positive semidefinite (psd, for short).

Let $x$ be a random vector with expectation $E(x) = A\alpha$ and variance-covariance matrix $\sigma V$, where $A$ and $V$ are known matrices while $\alpha = (\alpha_1, \ldots, \alpha_p)^T$ and $\sigma > 0$ are unknown parameters. In this situation we shall say that $x$ is subject to the linear experiment $\mathcal{L}(A\alpha, \sigma V)$. If $V = I$, then we say that the experiment is standard. If $x$ is normally distributed then instead of $\mathcal{L}(A\alpha, \sigma V)$ we shall use the symbol $\mathcal{N}(A\alpha, \sigma V)$.

Now let us consider two experiments $\mathcal{L}_1 = \mathcal{L}(A_1\alpha, \sigma V)$ and $\mathcal{L}_2 = \mathcal{L}(A_2\alpha, \sigma W)$ with the same parameters and with observation vectors $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$, respectively.

Definition 2.1

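([9]). Experiment $\mathcal{L}_1$ is said to be at least as good as $\mathcal{L}_2$ with respect to linear estimation [notation: $\mathcal{L}_1 \trianglerighteq \mathcal{L}_2$] if for any parametric function $\psi = c^T\alpha$ and for any estimator $b^T y$ there exists an estimator $a^T x$ with uniformly not greater squared risk. If $\mathcal{L}_1 \trianglerighteq \mathcal{L}_2$ and $\mathcal{L}_2 \trianglerighteq \mathcal{L}_1$ then we say that the experiments are equivalent for linear estimation.

The relation $\trianglerighteq$ may be expressed in terms of linear forms (see [8, 9]). Namely, $\mathcal{L}_1 \trianglerighteq \mathcal{L}_2$ if and only if for any $b \in \mathbb{R}^n$ there exists $a \in \mathbb{R}^m$ such that

$$E(a^T x) = E(b^T y) \quad \text{and} \quad \operatorname{var}(a^T x) \leq \operatorname{var}(b^T y) \tag{1}$$

for all $\alpha$ and $\sigma$. It is worth noting that the relation $\mathcal{L}(A_1\alpha, \sigma V) \trianglerighteq \mathcal{L}(A_2\alpha, \sigma W)$ does not depend on whether $\sigma$ is known or not. Thus $\mathcal{L}_1 \trianglerighteq \mathcal{L}_2$ if and only if $\mathcal{L}(A_1\alpha, V) \trianglerighteq \mathcal{L}(A_2\alpha, W)$.

Moreover, under the normality assumption, the condition (1) may be expressed in the form:

For any parametric function $\psi$ and for any $b \in \mathbb{R}^n$ there exists $a \in \mathbb{R}^m$ such that $|a^T x - \psi|$ is stochastically not greater than $|b^T y - \psi|$ for all $\alpha$ and $\sigma$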

(Sinha [10] and Stępniak [11]).

Now consider two normal linear experiments $\mathcal{N}_1 = \mathcal{N}(A\alpha, \sigma V)$ and $\mathcal{N}_2 = \mathcal{N}(B\alpha, \sigma W)$ with observation vectors $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$. It is well known (cf. [12, 13]) that such experiments are not comparable with respect to all possible statistical problems. Therefore we shall restrict our attention to quadratic estimation only.

Definition 2.2

([7]). Experiment $\mathcal{N}_1$ is said to be at least as good as $\mathcal{N}_2$ with respect to quadratic estimation [notation: $\mathcal{N}_1 \succeq \mathcal{N}_2$] if for any quadratic form $y^T G y$ there exists a quadratic form $x^T H x$ such that

$$E(x^T H x) = E(y^T G y) \quad \text{and} \quad \operatorname{var}(x^T H x) \leq \operatorname{var}(y^T G y)$$

for all $\alpha$ and $\sigma$. If $\mathcal{N}_1 \succeq \mathcal{N}_2$ and $\mathcal{N}_2 \succeq \mathcal{N}_1$ then we say that the experiments are equivalent for quadratic estimation.

In the last definition the quadratic forms $x^T H x$ and $y^T G y$ play the role of potential unbiased estimators for parametric functions of the type $\phi(\alpha, \sigma) = c\sigma + \alpha^T C \alpha$. It is known that any mean squared error of a linearly estimable parametric function $\psi = \psi(\alpha)$ in the experiment $\mathcal{N}_1$ (or in $\mathcal{N}_2$) has such a form (Stępniak [14]). The orderings $\trianglerighteq$ and $\succeq$ are invariant with respect to nonsingular linear transformations of the parameter $\alpha$ as well as of the observation vectors $x$ and $y$ ([7], Lemmas 2.1 and 2.2).

The main tool in the comparison of standard linear experiments is the information matrix $M$, defined as the Fisher information matrix $A^T A$ corresponding to the experiment $\mathcal{N}(A\alpha, I)$.

The relation ⊵ may be characterized by the following theorem.

Theorem 2.3

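([15], Theorem 1). For standard linear experiments $\mathcal{L}_1 = \mathcal{L}(A_1\alpha, \sigma I_m)$ and $\mathcal{L}_2 = \mathcal{L}(A_2\alpha, \sigma I_n)$ with information matrices $M_1 = A_1^T A_1$ and $M_2 = A_2^T A_2$ the following are equivalent:

  1. $\mathcal{L}_1 \trianglerighteq \mathcal{L}_2$,

  2. $M_1 - M_2$ is psd,

  3. $R(M_2) \subseteq R(M_1)$ and the maximal eigenvalue of the matrix $M_1^{+} M_2$ is not greater than 1.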
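The equivalence of conditions 2 and 3 can be checked numerically on small examples. The following Python sketch is only an illustration, assuming NumPy is available; the helper names `loewner_geq` and `quotient_condition`, the tolerances and the example matrices are hypothetical and do not come from the paper.

```python
import numpy as np

def loewner_geq(M1, M2, tol=1e-10):
    """Condition 2: M1 - M2 is positive semidefinite."""
    return bool(np.linalg.eigvalsh(M1 - M2).min() >= -tol)

def quotient_condition(M1, M2, tol=1e-10):
    """Condition 3: R(M2) is contained in R(M1) and the largest
    eigenvalue of the quotient M1^+ M2 does not exceed 1."""
    M1_pinv = np.linalg.pinv(M1)
    # For symmetric M1, M1 M1^+ is the orthogonal projector onto R(M1);
    # R(M2) lies in R(M1) exactly when this projector leaves M2 unchanged.
    P1 = M1 @ M1_pinv
    range_ok = np.allclose(P1 @ M2, M2, atol=1e-8)
    lam_max = np.linalg.eigvals(M1_pinv @ M2).real.max()
    return bool(range_ok and lam_max <= 1 + tol)

# Illustrative design matrices: A2 consists of the first two rows of A1.
A1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
A2 = np.array([[1.0, 0.0], [0.0, 1.0]])
M1, M2 = A1.T @ A1, A2.T @ A2

forward = (loewner_geq(M1, M2), quotient_condition(M1, M2))
backward = (loewner_geq(M2, M1), quotient_condition(M2, M1))
```

In this example both conditions hold for the pair $(M_1, M_2)$ and both fail for the reversed pair, as the theorem predicts.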

A corresponding result for the relation ⪰ is due to Stępniak ([7], Theorem 5.1) in the form

Theorem 2.4

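For standard normal linear experiments $\mathcal{N}_1 = \mathcal{N}(A_1\alpha, \sigma I_m)$ and $\mathcal{N}_2 = \mathcal{N}(A_2\alpha, \sigma I_n)$ with information matrices $M_1 = A_1^T A_1$ and $M_2 = A_2^T A_2$ the following are equivalent:

  1. $\mathcal{N}_1 \succeq \mathcal{N}_2$,

  2. $M_1 - M_2$ is psd and

    $$\sum_{i=1}^{q} \frac{1-\lambda_i}{1+\lambda_i} \leq m - n - r(A_1) + r(A_2), \tag{2}$$

    where $\lambda_i$, $i = 1, \ldots, q$, are the positive eigenvalues of the matrix $M_1^{+} M_2$, counted with their multiplicities.

  3. $R(M_2) \subseteq R(M_1)$, the maximal eigenvalue of the matrix $M_1^{+} M_2$ is not greater than 1 and the inequality (2) holds.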
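Condition 2 lends itself to a direct numerical check. The sketch below is a minimal illustration, assuming NumPy and reading inequality (2) as $\sum_{i}(1-\lambda_i)/(1+\lambda_i) \leq m - n - r(A_1) + r(A_2)$; the function name `at_least_as_good_quadratic`, the tolerance and the example matrices are hypothetical.

```python
import numpy as np

def at_least_as_good_quadratic(A1, A2, tol=1e-9):
    """Check condition 2 for N(A1 alpha, sigma I_m) versus N(A2 alpha, sigma I_n)."""
    m, n = A1.shape[0], A2.shape[0]
    M1, M2 = A1.T @ A1, A2.T @ A2
    # First part of condition 2: M1 - M2 must be psd.
    if np.linalg.eigvalsh(M1 - M2).min() < -tol:
        return False
    # Positive eigenvalues of the quotient M1^+ M2.
    lam = np.linalg.eigvals(np.linalg.pinv(M1) @ M2).real
    lam = lam[lam > tol]
    lhs = np.sum((1.0 - lam) / (1.0 + lam))
    rhs = m - n - np.linalg.matrix_rank(A1) + np.linalg.matrix_rank(A2)
    return bool(lhs <= rhs + tol)

A1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# A2 drops the last observation of A1, so it is a sub-experiment.
A2_sub = np.array([[1.0, 0.0], [0.0, 1.0]])
# Same information matrix as A2_sub, but with as many observations as A1.
A2_padded = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

result_sub = at_least_as_good_quadratic(A1, A2_sub)
result_padded = at_least_as_good_quadratic(A1, A2_padded)
```

Under this reading of (2), the first comparison succeeds while the second fails, although $M_1 - M_2$ is psd in both cases: the number of observations enters the quadratic ordering through the right-hand side of (2).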

It is interesting that both orderings $\trianglerighteq$ and $\succeq$ may be expressed in terms of the matrix $M_1^{+} M_2$, where $M_j$, $j = 1, 2$, are the information matrices corresponding to the experiments $\mathcal{N}(A_1\alpha, \sigma I_m)$ and $\mathcal{N}(A_2\alpha, \sigma I_n)$. A matrix of this kind will be called the quotient of $M_2$ by $M_1$.

3 Quotient of matrices in comparison of experiments

For given psd matrices $T$ and $U$ of the same order we shall refer to the expressions $Q_1 = TU^{+}$, $Q_2 = U^{+}T$, $Q_3 = (U^{+})^{1/2} T (U^{+})^{1/2}$ and $Q_4 = T^{1/2} U^{+} T^{1/2}$ as versions of the quotient of $T$ by $U$. We note that only $Q_3$ and $Q_4$ are always symmetric.

We shall start from basic properties of the quotients.

Theorem 3.1

For arbitrary positive semidefinite matrices $T$ and $U$ of the same order

  (a) All versions $Q_1 = TU^{+}$, $Q_2 = U^{+}T$, $Q_3 = (U^{+})^{1/2} T (U^{+})^{1/2}$ and $Q_4 = T^{1/2} U^{+} T^{1/2}$ of the quotient of $T$ by $U$ have the same eigenvalues.

  (b) All eigenvalues of an arbitrary quotient are nonnegative.

  (c) $U - T$ is psd if and only if $R(T) \subseteq R(U)$ and all eigenvalues of an arbitrary quotient $Q_i$ are not greater than 1.

Proof

  (a) If $Q_1 w = \lambda w$ then $U^{+} Q_1 w = Q_2 U^{+} w = \lambda U^{+} w$ and $\lambda$ is an eigenvalue of $Q_2$. Conversely, if $Q_2 w = \lambda w$ then $Q_1 T w = \lambda T w$. Thus $Q_1$ and $Q_2$ have the same eigenvalues. To prove the same for $Q_3$ and $Q_4$ we note that $Q_3 = FF^T$ and $Q_4 = F^T F$ for $F = (U^{+})^{1/2} T^{1/2}$, and the desired correspondence follows from the implications $Q_3 w = \lambda w \Rightarrow Q_4 F^T w = \lambda F^T w$ and $Q_4 w = \lambda w \Rightarrow Q_3 F w = \lambda F w$. Thus it remains to show a similar correspondence for $Q_2$ and $Q_3$.

    The equality $Q_2 w = \lambda w$ implies $Q_4 T^{1/2} w = \lambda T^{1/2} w$. Thus $\lambda$ is an eigenvalue of $Q_4$ and, in consequence, of $Q_3$. Similarly, if $Q_3 w = \lambda w$ then $Q_2 (U^{+})^{1/2} w = \lambda (U^{+})^{1/2} w$. This implies the desired condition and completes the proof of part (a).

  (b) It follows immediately from (a), since $Q_3$ is psd.

  (c) By (a) we only need to show the desired equivalence for $i = 3$. The implication $U - T \geq 0 \Rightarrow R(T) \subseteq R(U)$ is evident. For the remaining part we note that, under the assumption $R(T) \subseteq R(U)$, $\lambda_{\max}(Q_3) \leq 1$ if and only if $(U^{+})^{1/2} U (U^{+})^{1/2} \geq (U^{+})^{1/2} T (U^{+})^{1/2}$. This implies (c) and completes the proof of Theorem 3.1. □
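The statement that all four versions of the quotient share one spectrum can be illustrated numerically. The sketch below assumes NumPy; the helper names `psd_sqrt` and `quotient_versions`, the cut-off passed to `pinv` and the randomly generated matrices are hypothetical choices made only for this illustration.

```python
import numpy as np

def psd_sqrt(S):
    """Symmetric psd square root via the spectral decomposition."""
    w, V = np.linalg.eigh(S)
    w = np.clip(w, 0.0, None)          # clear tiny negative rounding errors
    return (V * np.sqrt(w)) @ V.T

def quotient_versions(T, U):
    """Return Q1 = T U^+, Q2 = U^+ T, Q3 and Q4 for psd T and U."""
    U_pinv = np.linalg.pinv(U, rcond=1e-10, hermitian=True)
    U_half = psd_sqrt(U_pinv)          # (U^+)^{1/2}
    T_half = psd_sqrt(T)               # T^{1/2}
    return (T @ U_pinv,
            U_pinv @ T,
            U_half @ T @ U_half,
            T_half @ U_pinv @ T_half)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
Y = rng.standard_normal((4, 2))
T = X @ X.T                            # psd of order 4
U = Y @ Y.T                            # psd and singular (rank at most 2)

# Sorted real parts of the eigenvalues of each version of the quotient.
spectra = [np.sort(np.linalg.eigvals(Q).real) for Q in quotient_versions(T, U)]
```

Up to rounding, the four arrays in `spectra` coincide and contain no negative entries, in line with parts (a) and (b). A singular $U$ is used on purpose, since the theorem does not require invertibility.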

Now we shall use Theorem 3.1 to compare the normal linear experiments $\mathcal{N}_1 = \mathcal{N}(A_1\alpha, \sigma I_m)$ and $\mathcal{N}_2 = \mathcal{N}(A_2\alpha, \sigma I_n)$ w.r.t. quadratic estimation.

We note that $r(A_1) = r(M_1)$, while $m - r(A_1)$ is the number of degrees of freedom in the experiment $\mathcal{N}_1$. By Theorem 2.4 we get the following result.

Lemma 3.2

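If the numbers of degrees of freedom in the experiments $\mathcal{N}_1 = \mathcal{N}(A_1\alpha, \sigma I_m)$ and $\mathcal{N}_2 = \mathcal{N}(A_2\alpha, \sigma I_n)$ are equal then $\mathcal{N}_1 \succeq \mathcal{N}_2$ if and only if $M_1 - M_2$ is psd and any quotient $Q_i$, $i = 3, 4$, of the information matrix $M_2$ by $M_1$ is idempotent, i.e. $Q_i^2 = Q_i$.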

Proof

Under our assumption the right-hand side of the inequality (2) is 0 and hence each eigenvalue of any quotient $Q_i$ is either 0 or 1. Since the quotients $Q_i$, $i = 3, 4$, are symmetric, this is equivalent to their idempotency. □
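The criterion of Lemma 3.2 is easy to test for a given pair of design matrices. The following sketch assumes NumPy; the function name `lemma_3_2_check`, the tolerances and the two small examples are hypothetical and serve only as an illustration.

```python
import numpy as np

def psd_sqrt(S):
    """Symmetric psd square root via the spectral decomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def lemma_3_2_check(A1, A2, tol=1e-9):
    """Apply the criterion of Lemma 3.2 (equal degrees of freedom required)."""
    m, n = A1.shape[0], A2.shape[0]
    r1, r2 = np.linalg.matrix_rank(A1), np.linalg.matrix_rank(A2)
    if m - r1 != n - r2:
        raise ValueError("degrees of freedom differ; Lemma 3.2 does not apply")
    M1, M2 = A1.T @ A1, A2.T @ A2
    loewner = np.linalg.eigvalsh(M1 - M2).min() >= -tol
    half = psd_sqrt(np.linalg.pinv(M1, rcond=1e-10, hermitian=True))
    Q3 = half @ M2 @ half              # symmetric version of the quotient
    idempotent = np.allclose(Q3 @ Q3, Q3, atol=1e-8)
    return bool(loewner and idempotent)

A1 = np.eye(2)                         # two observations, zero degrees of freedom
A2_sub = np.array([[1.0, 0.0]])        # first observation only
A2_half = np.array([[0.5, 0.0]])       # a rescaled single observation

check_sub = lemma_3_2_check(A1, A2_sub)
check_half = lemma_3_2_check(A1, A2_half)
```

In the first example $Q_3 = \operatorname{diag}(1, 0)$ is idempotent, so the criterion is met; in the second, $Q_3 = \operatorname{diag}(0.25, 0)$ is not, although $M_1 - M_2$ is psd in both cases.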

The case when the numbers of observations in both experiments are equal, i.e. $m = n$, is the most interesting. In this case by Theorem 3.1 we get

Lemma 3.3

For standard normal experiments $\mathcal{N}_1$ and $\mathcal{N}_2$ with the same number of observations the relation $\mathcal{N}_1 \succeq \mathcal{N}_2$ holds if and only if $M_1 = M_2$, i.e. when the experiments are equivalent.

Proof

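Assume that $M_1 - M_2$ is psd, the inequality (2) is true and $m = n$. Then $R(M_1) = R(M_2)$ and, by Lemma 3.2, $Q_3$ is idempotent. Thus $Q_3$ is the orthogonal projector onto $R(M_1) = R(M_2)$ and, in consequence, $Q_3 = (M_1^{+})^{1/2} M_1 (M_1^{+})^{1/2}$. Multiplying both sides of $(M_1^{+})^{1/2} M_2 (M_1^{+})^{1/2} = (M_1^{+})^{1/2} M_1 (M_1^{+})^{1/2}$ on the left and on the right by $M_1^{1/2}$ and using $R(M_2) = R(M_1)$ gives $M_2 = M_1$. This implies the desired result. □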

Now let us consider a linear experiment where the observation vector $x$ depends on several parameters but only some of them are of interest. More precisely, assume that

$$E(x) = A\alpha + B\beta$$

and

$$\operatorname{Cov}(x) = \sigma I$$

with unknown parameters $\alpha \in \mathbb{R}^p$, $\beta \in \mathbb{R}^k$ and $\sigma > 0$ such that $\alpha$ (or $\alpha$ and $\sigma$) is of interest, while $\beta$ is treated as a nuisance parameter. Such an experiment will be denoted by $\mathcal{L}(A\alpha + B\beta, \sigma I)$ or (under the normality assumption) by $\mathcal{N}(A\alpha + B\beta, \sigma I)$.

We shall say that a statistic $t = t(x)$ is invariant (with respect to $\beta$) if its first two moments exist and do not depend on $\beta$. It is evident that a linear form $a^T x$ is invariant in the experiment $\mathcal{L}(A\alpha + B\beta, \sigma I)$ if and only if it depends on $x$ only through $(I - P_B)x$. The same condition for invariance of a quadratic form $x^T H x$ follows from the well-known formula

$$\operatorname{var}(x^T H x) = 2\sigma^2 \operatorname{tr} H^2 + 4\sigma (\alpha^T, \beta^T)[A, B]^T H^2 [A, B](\alpha^T, \beta^T)^T$$

for the variance of quadratic forms in normal variables (cf. [16, 17]).
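The role of $(I - P_B)x$ can be seen by evaluating the moment formulas directly. The sketch below assumes NumPy and uses the variance formula above together with the standard identity $E(x^T H x) = \sigma \operatorname{tr} H + \mu^T H \mu$ for $x \sim N(\mu, \sigma I)$; the function name `moments_quadratic_form` and all numerical values are hypothetical.

```python
import numpy as np

def moments_quadratic_form(H, mean, sigma):
    """Mean and variance of x^T H x for x ~ N(mean, sigma * I), H symmetric."""
    H2 = H @ H
    expectation = sigma * np.trace(H) + mean @ H @ mean
    variance = 2 * sigma**2 * np.trace(H2) + 4 * sigma * (mean @ H2 @ mean)
    return expectation, variance

rng = np.random.default_rng(1)
n, p, k = 6, 2, 2
A = rng.standard_normal((n, p))
B = rng.standard_normal((n, k))
G = rng.standard_normal((n, n))
G = (G + G.T) / 2                       # arbitrary symmetric matrix

R = np.eye(n) - B @ np.linalg.pinv(B)   # I - P_B
H = R @ G @ R                           # x^T H x depends on x only through (I - P_B) x

alpha = np.array([1.0, -2.0])
sigma = 1.5
beta_a = np.array([0.0, 0.0])
beta_b = np.array([3.0, -1.0])

# Moments of x^T H x under two different values of the nuisance parameter.
mom_a = moments_quadratic_form(H, A @ alpha + B @ beta_a, sigma)
mom_b = moments_quadratic_form(H, A @ alpha + B @ beta_b, sigma)
# For comparison: the unprojected form x^T G x is in general affected by beta.
raw_a = moments_quadratic_form(G, A @ alpha + B @ beta_a, sigma)
raw_b = moments_quadratic_form(G, A @ alpha + B @ beta_b, sigma)
```

Since $HB = 0$, the pairs `mom_a` and `mom_b` agree up to rounding, whereas `raw_a` and `raw_b` need not.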

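Now let us consider two linear experiments $\mathcal{L}_1 = \mathcal{L}(A_1\alpha + B_1\beta, \sigma I_m)$ and $\mathcal{L}_2 = \mathcal{L}(A_2\alpha + B_2\beta, \sigma I_n)$ (or $\mathcal{N}_1 = \mathcal{N}(A_1\alpha + B_1\beta, \sigma I_m)$ and $\mathcal{N}_2 = \mathcal{N}(A_2\alpha + B_2\beta, \sigma I_n)$) with observation vectors $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$.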

Definition 3.4

We shall say that $\mathcal{L}_1$ is at least as good as $\mathcal{L}_2$ w.r.t. invariant linear estimation if for any invariant statistic $b^T y$ there exists an invariant $a^T x$ such that $E(a^T x) = E(b^T y)$ and $\operatorname{var}(a^T x) \leq \operatorname{var}(b^T y)$ for all $\alpha$ and $\sigma$. Similarly, we shall say that $\mathcal{N}_1$ is at least as good as $\mathcal{N}_2$ w.r.t. invariant quadratic estimation if for any invariant statistic $y^T G y$ there exists an invariant $x^T H x$ such that $E(x^T H x) = E(y^T G y)$ and $\operatorname{var}(x^T H x) \leq \operatorname{var}(y^T G y)$ for all $\alpha$ and $\sigma$.

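First we shall reduce the comparison of linear experiments with a nuisance parameter $\beta$ to the same problem for the usual linear experiments. To this end we need the invariance condition in a more explicit form.

Let $x$ be the observation vector in a linear experiment $\mathcal{L}(A\alpha + B\beta, \sigma I)$ or $\mathcal{N}(A\alpha + B\beta, \sigma I)$ and let $b_1, \ldots, b_{n-r}$ be an orthonormal basis in $N(B^T)$, where $n$ is the dimension of $x$ and $r = r(B)$. Then $I - P_B$ may be presented in the form

$$[b_1, \ldots, b_{n-r}] \begin{bmatrix} b_1^T \\ \vdots \\ b_{n-r}^T \end{bmatrix}.$$

Define

$$\tilde{A}_i = \begin{bmatrix} b_1^T \\ \vdots \\ b_{n-r}^T \end{bmatrix} A_i. \tag{3}$$

In this way $\mathcal{L}(A_1\alpha + B_1\beta, \sigma I_m)$ is at least as good as $\mathcal{L}(A_2\alpha + B_2\beta, \sigma I_n)$ w.r.t. invariant linear estimation if and only if $\mathcal{L}(\tilde{A}_1\alpha, \sigma I_{m-r_1}) \trianglerighteq \mathcal{L}(\tilde{A}_2\alpha, \sigma I_{n-r_2})$, where $\tilde{A}_i$ is defined by (3) and $r_i = r(B_i)$. Similarly, $\mathcal{N}(A_1\alpha + B_1\beta, \sigma I_m)$ is at least as good as $\mathcal{N}(A_2\alpha + B_2\beta, \sigma I_n)$ w.r.t. invariant quadratic estimation if and only if $\mathcal{N}(\tilde{A}_1\alpha, \sigma I_{m-r_1}) \succeq \mathcal{N}(\tilde{A}_2\alpha, \sigma I_{n-r_2})$.

For convenience the matrices $\tilde{A}_i^T \tilde{A}_i$, $i = 1, 2$, will be called the reduced information matrices and will be denoted by $\tilde{M}_i$. We note that

$$\tilde{M}_i = A_i^T (I - P_{B_i}) A_i. \tag{4}$$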
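The reduction (3) and the identity (4) can be traced in a few lines. The sketch below assumes NumPy and obtains the orthonormal basis of $N(B^T)$ from a singular value decomposition, which is one possible choice; the helper names and the small two-block example are hypothetical.

```python
import numpy as np

def null_space_basis(Bt, tol=1e-10):
    """Columns form an orthonormal basis of the null space of Bt (via SVD)."""
    _, s, Vt = np.linalg.svd(Bt)
    rank = int((s > tol).sum())
    return Vt[rank:].T

def reduced_design(A, B):
    """A-tilde from (3): rows b_1^T, ..., b_{n-r}^T applied to A."""
    basis = null_space_basis(B.T)       # n x (n - r)
    return basis.T @ A

def reduced_information(A, B):
    """M-tilde from (4): A^T (I - P_B) A."""
    P_B = B @ np.linalg.pinv(B, rcond=1e-10)
    return A.T @ (np.eye(A.shape[0]) - P_B) @ A

# Two treatments, two blocks of size 2, each treatment once per block.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
blocks = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
B = np.hstack([np.ones((4, 1)), blocks])   # general mean plus block effects (rank 2)

A_tilde = reduced_design(A, B)
M_tilde = reduced_information(A, B)
```

Here `A_tilde` has $n - r = 2$ rows, and `A_tilde.T @ A_tilde` coincides with `M_tilde`, which equals $\begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}$ for this layout. The result does not depend on which orthonormal basis of $N(B^T)$ is chosen.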

As a direct consequence of Theorems 2.3 and 2.4 we get the following lemmas.

Lemma 3.5

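For arbitrary linear experiments $\mathcal{L}_1 = \mathcal{L}(A_1\alpha + B_1\beta, \sigma I_m)$ and $\mathcal{L}_2 = \mathcal{L}(A_2\alpha + B_2\beta, \sigma I_n)$, $\mathcal{L}_1$ is at least as good as $\mathcal{L}_2$ w.r.t. invariant linear estimation if and only if $\tilde{M}_1 - \tilde{M}_2 \geq 0$.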

Lemma 3.6

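For arbitrary normal linear experiments $\mathcal{N}_1 = \mathcal{N}(A_1\alpha + B_1\beta, \sigma I_m)$ and $\mathcal{N}_2 = \mathcal{N}(A_2\alpha + B_2\beta, \sigma I_n)$, $\mathcal{N}_1$ is at least as good as $\mathcal{N}_2$ w.r.t. invariant quadratic estimation if and only if $\tilde{M}_1 - \tilde{M}_2 \geq 0$ and

$$\sum_{i=1}^{q} \frac{1-\lambda_i}{1+\lambda_i} \leq m - r(\tilde{M}_1) - r(B_1) - [n - r(\tilde{M}_2) - r(B_2)],$$

where $\lambda_i$, $i = 1, \ldots, q$, are the positive eigenvalues of an arbitrary version of the quotient of $\tilde{M}_2$ by $\tilde{M}_1$, counted with their multiplicities.

In particular, if $m - r(B_1) = n - r(B_2)$ then by Lemma 3.2 we get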

Corollary 3.7

If $m - r(B_1) = n - r(B_2)$ then $\mathcal{N}_1$ is at least as good as $\mathcal{N}_2$ w.r.t. invariant quadratic estimation if and only if the matrix $(\tilde{M}_1^{+})^{1/2} \tilde{M}_2 (\tilde{M}_1^{+})^{1/2}$ is idempotent.

Similarly, by Lemmas 3.3 and 3.6 we get

Corollary 3.8

If $m = n$ and $B_1 = B_2$ then $\mathcal{N}_1$ is at least as good as $\mathcal{N}_2$ w.r.t. invariant quadratic estimation if and only if $\tilde{M}_1 = \tilde{M}_2$.

4 Problem of optimal allocation of treatments in blocks

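Consider an allocation of $v$ treatments with replications $t_1, \ldots, t_v$ in $k$ blocks of sizes $b_1, \ldots, b_k$, where $\sum_i t_i = \sum_j b_j = n$. Let us introduce the matrices $B = \operatorname{diag}(1_{b_1}, \ldots, 1_{b_k})$, where $1_s$ denotes the $s \times 1$ vector of ones, and $D = (d_{ij})$, where

$$d_{ij} = \begin{cases} 1, & \text{if the } i\text{-th observation refers to the } j\text{-th treatment,} \\ 0, & \text{otherwise.} \end{cases}$$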

These matrices indicate allocation of treatments in blocks. For this reason D is sometimes identified with block design.

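To each pair $(B, D)$ corresponds a linear experiment $\mathcal{N} = \mathcal{N}(D\alpha + [1_n, B]\beta, \sigma I_n)$, where $\alpha = (\alpha_1, \ldots, \alpha_v)^T$ refers to the treatment effects, while $\beta = (\mu, \beta_1, \ldots, \beta_k)^T$ refers to the general mean and block effects. In this case the reduced information matrix (4), also called the C-matrix (see [18, 19, 20]), may be presented in the form

$$C = D^T D - D^T \operatorname{diag}\left(b_1^{-1} 1_{b_1} 1_{b_1}^T, \ldots, b_k^{-1} 1_{b_k} 1_{b_k}^T\right) D = \operatorname{diag}(t_1, \ldots, t_v) - N \operatorname{diag}\left(b_1^{-1}, \ldots, b_k^{-1}\right) N^T,$$

where $N = (n_{ij})$ is the $v \times k$ incidence matrix defined as $N = D^T B$. It is clear that $N 1_k = t$ and $1_v^T N = b^T$, where $t = (t_1, \ldots, t_v)^T$ and $b = (b_1, \ldots, b_k)^T$. A design $D$ is said to be orthogonal if $N = \frac{1}{n} t b^T$.

One can verify that

$$N \operatorname{diag}\left(b_1^{-1}, \ldots, b_k^{-1}\right) N^T = \begin{bmatrix} \sum_j \frac{n_{1j}^2}{b_j} & \sum_j \frac{n_{1j} n_{2j}}{b_j} & \cdots & \sum_j \frac{n_{1j} n_{vj}}{b_j} \\ \sum_j \frac{n_{2j} n_{1j}}{b_j} & \sum_j \frac{n_{2j}^2}{b_j} & \cdots & \sum_j \frac{n_{2j} n_{vj}}{b_j} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_j \frac{n_{vj} n_{1j}}{b_j} & \sum_j \frac{n_{vj} n_{2j}}{b_j} & \cdots & \sum_j \frac{n_{vj}^2}{b_j} \end{bmatrix}.$$

In particular, for the orthogonal design,

$$N \operatorname{diag}\left(b_1^{-1}, \ldots, b_k^{-1}\right) N^T = \frac{1}{n} t t^T.$$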
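The matrices $D$, $B$, $N$ and $C$ can be assembled for a concrete allocation as follows. The sketch assumes NumPy and takes the incidence matrix as $N = D^T B$; the helper names `design_matrices` and `c_matrix` and the example with $t = (4, 2)^T$, $b = (3, 3)^T$ are hypothetical.

```python
import numpy as np

def design_matrices(assignment, block_sizes, v):
    """Build D (n x v) and B (n x k) from treatment labels listed block by block."""
    n = sum(block_sizes)
    D = np.zeros((n, v))
    D[np.arange(n), assignment] = 1.0    # d_ij = 1 if observation i gets treatment j
    B = np.zeros((n, len(block_sizes)))
    start = 0
    for j, size in enumerate(block_sizes):
        B[start:start + size, j] = 1.0   # block-indicator column 1_{b_j}
        start += size
    return D, B

def c_matrix(D, B):
    """Return C = diag(t) - N diag(1/b) N^T together with N, t and b."""
    N = D.T @ B                          # v x k incidence matrix
    t = D.sum(axis=0)                    # replications
    b = B.sum(axis=0)                    # block sizes
    C = np.diag(t) - N @ np.diag(1.0 / b) @ N.T
    return C, N, t, b

# Treatments 0, 0, 1 in each of two blocks of size 3.
D, B = design_matrices([0, 0, 1, 0, 0, 1], [3, 3], v=2)
C, N, t, b = c_matrix(D, B)
n = int(t.sum())

is_orthogonal = np.allclose(N, np.outer(t, b) / n)
```

For this allocation $N = \frac{1}{n} t b^T$, so the design is orthogonal, $N \operatorname{diag}(b_1^{-1}, b_2^{-1}) N^T$ equals $\frac{1}{n} t t^T$, and the rows of $C$ sum to zero.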

Denote by $\mathcal{D} = \mathcal{D}(t; b)$ the class of all possible allocations of $v$ treatments with replications $t_1, \ldots, t_v$ in $k$ blocks of sizes $b_1, \ldots, b_k$ for $v, k \geq 2$. Such a class may or may not contain an orthogonal design. If it does, then by Stępniak [8] this design is optimal in $\mathcal{D}$ w.r.t. invariant linear estimation, i.e. it is at least as good as any other design in the class.

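It is natural to ask whether the orthogonal design is also optimal w.r.t. invariant quadratic estimation. The results of Section 3 suggest that the answer is negative; we now give a rigorous proof of this fact. By Corollary 3.8 we only need to show that for any incidence matrix $N = (n_{ij})$ corresponding to the orthogonal design there exists an incidence matrix $M = (m_{ij})$ such that

$$M \operatorname{diag}\left(b_1^{-1}, \ldots, b_k^{-1}\right) M^T \neq N \operatorname{diag}\left(b_1^{-1}, \ldots, b_k^{-1}\right) N^T.$$

Define

$$m_{ij} = \begin{cases} n_{ij} + 1, & \text{if } i = 1 \text{ and } j = 1, \text{ or } i = 2 \text{ and } j = 2, \\ n_{ij} - 1, & \text{if } i = 1 \text{ and } j = 2, \text{ or } i = 2 \text{ and } j = 1, \\ n_{ij}, & \text{otherwise.} \end{cases}$$

We note that $M 1_k = N 1_k$ and $1_v^T M = 1_v^T N$. Therefore, the designs represented by $M$ and $N$ belong to the same class. To show the desired inequality it suffices, for instance, to compare the upper left entries, say $u_{11}$ and $u_{11}^0$, of the matrices $M \operatorname{diag}(b_1^{-1}, \ldots, b_k^{-1}) M^T$ and $N \operatorname{diag}(b_1^{-1}, \ldots, b_k^{-1}) N^T$.

Since $n_{ij} = \frac{1}{n} t_i b_j$ we have

$$\begin{aligned} u_{11} - u_{11}^0 &= \frac{m_{11}^2}{b_1} + \frac{m_{12}^2}{b_2} - \left(\frac{n_{11}^2}{b_1} + \frac{n_{12}^2}{b_2}\right) \\ &= \frac{(n_{11}+1)^2}{b_1} + \frac{(n_{12}-1)^2}{b_2} - \left(\frac{n_{11}^2}{b_1} + \frac{n_{12}^2}{b_2}\right) \\ &= 2\left(\frac{n_{11}}{b_1} - \frac{n_{12}}{b_2}\right) + \frac{1}{b_1} + \frac{1}{b_2} \\ &= \frac{2}{n}\left(\frac{t_1 b_1}{b_1} - \frac{t_1 b_2}{b_2}\right) + \frac{1}{b_1} + \frac{1}{b_2} \\ &= \frac{1}{b_1} + \frac{1}{b_2} > 0. \end{aligned}$$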
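The computation above can be replayed on a small orthogonal design. The sketch assumes NumPy; the helper names `information_part` and `perturb` and the choice $t = (4, 2)^T$, $b = (3, 3)^T$ are hypothetical and serve only as an illustration of the construction of $M$ from $N$.

```python
import numpy as np

def information_part(N, b):
    """N diag(1/b_1, ..., 1/b_k) N^T for an incidence matrix N."""
    return N @ np.diag(1.0 / b) @ N.T

def perturb(N):
    """Build M from N by the +1 / -1 changes in the leading 2 x 2 corner."""
    M = N.copy()
    M[0, 0] += 1
    M[1, 1] += 1
    M[0, 1] -= 1
    M[1, 0] -= 1
    return M

t = np.array([4, 2])                    # replications
b = np.array([3, 3])                    # block sizes
n = int(t.sum())
N = np.outer(t, b) // n                 # orthogonal design: n_ij = t_i b_j / n
M = perturb(N)

# Difference of the upper left entries, u_11 - u_11^0.
gap = information_part(M, b)[0, 0] - information_part(N, b)[0, 0]
```

Here $N = \begin{bmatrix} 2 & 2 \\ 1 & 1 \end{bmatrix}$ and $M = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$ have the same row and column sums, and `gap` equals $1/b_1 + 1/b_2 = 2/3$ up to rounding.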

This leads to the following

Conclusion 4.1

No orthogonal block design is optimal w.r.t. invariant quadratic estimation. Moreover, for any $t = (t_1, \ldots, t_v)^T$ and $b = (b_1, \ldots, b_k)^T$ there is no optimal design in the class $\mathcal{D} = \mathcal{D}(t; b)$.

Incidentally, we have demonstrated that, for the orthogonal block design, optimality w.r.t. linear estimation may be strengthened in the sense that the words “at least as good” may be replaced by “better”.

Acknowledgement

This work was partially supported by the Centre for Innovation and Transfer of Natural Sciences and Engineering Knowledge.

References

  • [1] Ehrenfeld S., Complete class theorem in experimental design, Proc. Third Berkeley Symp. on Math. Statist. Probab., 1955, Vol. 1, 69-75.

  • [2] Heyer H., Order relations for linear models: A survey on recent developments, Statist. Papers, 2006, 47, 331-372.

  • [3] Kiefer J., Optimum experimental designs, J. Roy. Statist. Soc. Ser. B, 1959, 21, 272-319.

  • [4] Goel P.K., Ginebra J., When one experiment is ‘always better than’ another?, Statistician, 2003, 52, 515-537.

  • [5] Shaked M., Suarez-Llorens A., On the comparison of reliability experiments based on convolution order, J. Amer. Statist. Assoc., 2003, 98, 693-702.

  • [6] Torgersen E., Comparison of Statistical Experiments, Cambridge University Press, Cambridge, England, 1991.

  • [7] Stępniak C., Comparison of normal linear experiments by quadratic forms, Ann. Inst. Statist. Math. A, 1997, 49, 569-584.

  • [8] Stępniak C., Optimal allocation of treatments in block designs, Studia Sc. Math. Hung., 1987, 22, 341-345.

  • [9] Stępniak C., Optimal allocation of units in experimental designs with hierarchical and cross classification, Ann. Inst. Statist. Math. A, 1983, 35, 461-473.

  • [10] Sinha B.K., Comparison of some experiments from sufficiency consideration, Ann. Inst. Statist. Math. A, 1973, 25, 501-520.

  • [11] Stępniak C., Stochastic ordering and Schur-convex functions in comparison of linear experiments, Metrika, 1989, 36, 291-298.

  • [12] Hansen O.H., Torgersen E., Comparison of linear experiments, Ann. Statist., 1974, 2, 265-373.

  • [13] Lehmann E.L., Comparing location experiments, Ann. Statist., 1988, 16, 521-533.

  • [14] Stępniak C., On admissibility in estimating the mean squared error of a linear estimator, Probab. Math. Statist., 1998, 18, 33-38.

  • [15] Stępniak C., Ordering of nonnegative matrices with application to comparison of linear models, Linear Algebra Appl., 1985, 70, 67-71.

  • [16] Mathai A.M., Provost S.B., Quadratic Forms in Random Variables, Marcel Dekker, New York, 1992.

  • [17] Searle S.R., Linear Models, Wiley, New York, 1971.

  • [18] Chakrabarti M.C., On the C-matrix in design of experiments, J. Indian Statist. Assoc., 1963, 1, 8-23.

  • [19] Dey A., Theory of Block Designs, Wiley, New York, 1986.

  • [20] Raghavarao D., Padgett L.V., Block Designs: Analysis, Combinatorics and Applications, World Scientific Publishers, Singapore, 2005.

About the article

Received: 2017-04-13

Accepted: 2017-08-23

Published Online: 2017-12-29


Citation Information: Open Mathematics, Volume 15, Issue 1, Pages 1599–1605, ISSN (Online) 2391-5455, DOI: https://doi.org/10.1515/math-2017-0135.


© 2017 Stępniak. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. BY-NC-ND 4.0
