
# Open Mathematics

### formerly Central European Journal of Mathematics

Editor-in-Chief: Vespri, Vincenzo / Marano, Salvatore Angelo

Open Access | Online ISSN: 2391-5455
Volume 15, Issue 1

# Matrix rank and inertia formulas in the analysis of general linear models

Yongge Tian
• Corresponding author
• China Economics and Management Academy, Central University of Finance and Economics, Beijing 100081, China
Published Online: 2017-02-27 | DOI: https://doi.org/10.1515/math-2017-0013

## Abstract

Matrix mathematics provides a powerful tool set for addressing statistical problems; in particular, the theory of matrix ranks and inertias has been developed as an effective methodology for simplifying complicated matrix expressions and for establishing equalities and inequalities that occur in statistical analysis. This paper describes how to establish exact formulas for calculating the ranks and inertias of covariances of predictors and estimators of parameter spaces in general linear models (GLMs), and how to use these formulas in the statistical analysis of GLMs. We first derive analytical expressions of the best linear unbiased predictors/best linear unbiased estimators (BLUPs/BLUEs) of all unknown parameters in the model by solving a constrained quadratic matrix-valued function optimization problem, and present some well-known results on ordinary least-squares predictors/ordinary least-squares estimators (OLSPs/OLSEs). We then establish some fundamental rank and inertia formulas for covariance matrices related to BLUPs/BLUEs and OLSPs/OLSEs, and use them to characterize a variety of equalities and inequalities for these covariance matrices. As applications, we use the equalities and inequalities to compare the covariance matrices of BLUPs/BLUEs with those of OLSPs/OLSEs. The work on the formulations of BLUPs/BLUEs and OLSPs/OLSEs and their covariance matrices under GLMs provides direct access, as a standard example, to a very simple algebraic treatment of predictors and estimators in linear regression analysis, which leads to a deep insight into the linear nature of GLMs and gives an efficient way of summarizing the results.

MSC 2010: 15A03; 15A09; 62H12; 62J05

## 1 Introduction

Throughout this paper, the symbol ℝm×n stands for the collection of all m×n real matrices. The symbols 𝐀′, r(𝐀), and ℛ(𝐀) stand for the transpose, the rank, and the range (column space) of a matrix 𝐀 ∈ ℝm×n, respectively; 𝐈m denotes the identity matrix of order m. The Moore–Penrose generalized inverse of 𝐀, denoted by 𝐀+, is defined to be the unique solution 𝐆 satisfying the four matrix equations $\mathbf{A}\mathbf{G}\mathbf{A}=\mathbf{A},\quad \mathbf{G}\mathbf{A}\mathbf{G}=\mathbf{G},\quad (\mathbf{A}\mathbf{G})'=\mathbf{A}\mathbf{G},\quad (\mathbf{G}\mathbf{A})'=\mathbf{G}\mathbf{A}.$
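The four Penrose equations can be verified numerically. The following NumPy sketch (an illustration, not part of the paper; the matrix A is a randomly generated rank-deficient stand-in) checks them for the Moore–Penrose inverse returned by `np.linalg.pinv`:

```python
import numpy as np

# A hypothetical 4x3 matrix of rank 2, built as a product of thin factors.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))

G = np.linalg.pinv(A)  # Moore-Penrose inverse A^+

# The four Penrose equations that uniquely characterize G = A^+.
assert np.allclose(A @ G @ A, A)        # A G A = A
assert np.allclose(G @ A @ G, G)        # G A G = G
assert np.allclose((A @ G).T, A @ G)    # (A G)' = A G
assert np.allclose((G @ A).T, G @ A)    # (G A)' = G A
```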

Further, let 𝐏𝐀, 𝐄𝐀, and 𝐅𝐀 stand for the three orthogonal projectors (symmetric idempotent matrices) 𝐏𝐀 = 𝐀𝐀+, 𝐄𝐀 = 𝐈m − 𝐀𝐀+, and 𝐅𝐀 = 𝐈n − 𝐀+𝐀. Results on the orthogonal projectors 𝐏𝐀, 𝐄𝐀, and 𝐅𝐀 and their applications in linear statistical models can be found in [13]. The symbols i+(𝐀) and i−(𝐀) for 𝐀 = 𝐀′ ∈ ℝm×m, called the positive inertia and negative inertia of 𝐀, denote the numbers of positive and negative eigenvalues of 𝐀 counted with multiplicities, respectively. For brevity, we use i±(𝐀) to denote both numbers. 𝐀 ≻ 0, 𝐀 ≽ 0, 𝐀 ≺ 0, and 𝐀 ≼ 0 mean that 𝐀 is symmetric positive definite, positive semi-definite, negative definite, and negative semi-definite, respectively. Two symmetric matrices 𝐀 and 𝐁 of the same size are said to satisfy the inequalities 𝐀 ≻ 𝐁, 𝐀 ≽ 𝐁, 𝐀 ≺ 𝐁, and 𝐀 ≼ 𝐁 in the Löwner partial ordering if 𝐀 − 𝐁 is positive definite, positive semi-definite, negative definite, and negative semi-definite, respectively. The Löwner partial ordering is a surprisingly strong and useful relation between two symmetric matrices. For more results on connections between inertias and the Löwner partial ordering of real symmetric (complex Hermitian) matrices, as well as applications of inertias and the Löwner partial ordering in statistics, see, e.g., [2, 49].
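The projectors and the inertia counts above are easy to compute explicitly. Below is a NumPy sketch (illustrative only; the matrices are randomly generated or hypothetical examples) that builds P_A, E_A, F_A and counts positive/negative eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 3))  # 5x3, rank 2
m, n = A.shape
Ap = np.linalg.pinv(A)

PA = A @ Ap                 # P_A = A A^+, projector onto R(A)
EA = np.eye(m) - A @ Ap     # E_A = I_m - A A^+
FA = np.eye(n) - Ap @ A     # F_A = I_n - A^+ A

# Orthogonal projectors are symmetric and idempotent.
assert np.allclose(PA, PA.T) and np.allclose(PA @ PA, PA)
assert np.allclose(EA @ EA, EA) and np.allclose(FA @ FA, FA)

def inertia(S, tol=1e-10):
    """Positive and negative inertia of a symmetric matrix S."""
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > tol)), int(np.sum(w < -tol))

S = np.diag([3.0, -1.0, 0.0, 2.0])   # eigenvalues 3, -1, 0, 2
ip, im = inertia(S)                  # i_+(S) = 2, i_-(S) = 1
assert (ip, im) == (2, 1)
```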

Recall that linear models were the first type of regression models to be studied extensively in statistical inference; they have had a profound impact on the field of statistics and its applications and are regarded as a core part of current statistical theory. A typical form of linear model is defined by $y=Xβ+ε, \quad E(ε)=0, \quad D(ε)=σ2Σ,$(1)

where y ∈ ℝn × 1 is a vector of observable response variables, 𝐗 ∈ ℝn × p is a known model matrix of arbitrary rank, β ∈ ℝp × 1 is a vector of fixed but unknown parameters, E(𝜺) and D(𝜺) denote the expectation vector and the dispersion matrix of the random error vector 𝜺 ∈ ℝn× 1, 𝚺 ∈ ℝn× n is a known positive semi-definite matrix of arbitrary rank, and σ2 is an unknown positive number.

Once a general linear model (GLM) is formulated, the first and most important task is to estimate or predict the unknown parameters in the model by using various mathematical and statistical tools. As a foundation of current regression theory, there is a substantial body of results on estimators and predictors of parameter spaces in GLMs. Even so, new and valuable results on the statistical inference of GLMs continue to be obtained. Estimation of β and prediction of 𝜺 in (1) are major concerns in the statistical inference of (1), and it is always desirable, as claimed in [10, 11], to identify estimators and predictors of all unknown parameters in GLMs simultaneously. As formulated in [10, 11], a general vector of linear parametric functions involving the two unknown parameter vectors β and 𝜺 in (1) is given by $ϕ=Kβ+Jε,$(2)

where 𝐊 ∈ ℝk × p and 𝐉 ∈ ℝk × n are given matrices of arbitrary ranks. Eq. (2) includes all vector and matrix operations in (1) as its special cases. For instance,

1. if 𝐊 = 𝐗 and 𝐉 = 𝐈n, (2) becomes ϕ = 𝐗 β + 𝜺 = y, the observed response vector;

2. if 𝐉 = 0, (2) becomes ϕ = 𝐊 β, a general vector of linear parametric functions;

3. if 𝐊 = 𝐗 and 𝐉 = 0, (2) becomes ϕ = 𝐗 β, the mean vector;

4. if 𝐊 = 0 and 𝐉 = 𝐈n, (2) becomes ϕ = 𝜺, the random error vector.

Theoretical and applied research seeks to develop various possible predictors/estimators of (2) and its specific cases. In these approaches, the unbiasedness of a predictor/estimator of ϕ with respect to the parameter spaces in (1) is an important property. In the statistical inference of a GLM, there are usually many unbiased predictors/estimators of the same parameter space. In that situation, it is natural to look for an unbiased predictor/estimator that has the smallest dispersion matrix among all unbiased predictors/estimators. Thus, unbiasedness and smallest dispersion matrices of predictors/estimators are the most intrinsic requirements in the statistical analysis of GLMs. Based on these requirements, we introduce the following classic concepts of predictability, estimability, and BLUPs/BLUEs of ϕ in (2) and its special cases, which originated in [12].

#### Definition 1.1

The vector ϕ in (2) is said to be predictable under (1) if there exists an 𝐋 ∈ ℝk × n such that 𝐄(𝐋y – ϕ) = 0. In particular, 𝐊β is said to be estimable under (1) if there exists an 𝐋∈ ℝk × n such that 𝐄(𝐋y – 𝐊β) = 0.

#### Definition 1.2

Let ϕ be as given in (2) and assume that it is predictable under (1). If there exists a matrix 𝐋 such that $E(Ly−ϕ)=0andD(Ly−ϕ)=min$(3) holds in the Löwner partial ordering, the linear statistic Ly is defined to be the best linear unbiased predictor (BLUP) of ϕ under (1), and is denoted by $Ly=BLUP(ϕ)=BLUP(Kβ+Jε).$(4)

If 𝐉 = 0 in (2) or 𝐊 = 0 in (2), then the Ly satisfying (3) is called the best linear unbiased estimator (BLUE) of 𝐊β or the BLUP of 𝐉𝜺 under (1), respectively, and they are denoted by $Ly=BLUE(Kβ),Ly=BLUP(Jε),$(5) respectively.

#### Definition 1.3

Let ϕ be as given in (2). The ordinary least-squares estimator (OLSE) of the unknown parameter vector β in (1) is defined to be $OLSE(β)=argminβ∈Rp×1(y−Xβ)′(y−Xβ),$(6)

while the OLSE of 𝐊β under (1) is defined to be OLSE(𝐊β) = 𝐊OLSE(β); the ordinary least-squares predictor (OLSP) of the random error vector 𝜺 in (1) is defined to be $OLSP(ε)=y−OLSE(Xβ),$(7)

while the OLSP of 𝐉𝜺 under (1) is defined to be OLSP (𝐉𝜺) = 𝐉OLSP(𝜺). The OLSP of ϕ under (1) is defined to be $OLSP(ϕ)=OLSE(Kβ)+OLSP(Jε).$(8)

The above definitions enable us to deal with various prediction and estimation problems under most general assumptions of GLMs.
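Definition 1.3 translates directly into matrix computations. The following NumPy sketch (illustrative only; X, y, K, J are randomly generated stand-ins, and the minimum-norm minimizer X⁺y is used for OLSE(β) since X may be rank deficient) computes OLSE(β), OLSP(𝜺), and OLSP(ϕ):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 8, 3
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

beta_ols = np.linalg.pinv(X) @ y     # OLSE(beta) = X^+ y, a minimizer of (y - X beta)'(y - X beta)
olsp_eps = y - X @ beta_ols          # OLSP(eps) = y - OLSE(X beta) = (I - P_X) y

# Hypothetical K, J for phi = K beta + J eps; OLSP(phi) = K OLSE(beta) + J OLSP(eps).
K = rng.standard_normal((2, p))
J = rng.standard_normal((2, n))
olsp_phi = K @ beta_ols + J @ olsp_eps

# The least-squares residual is orthogonal to the column space of X.
assert np.allclose(X.T @ olsp_eps, 0)
```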

The purpose of this paper is to investigate the performances of BLUP(ϕ), BLUP(ϕ)–ϕ, OLSP(ϕ), and OLSP(ϕ)–ϕ under the assumption that ϕ is predictable under (1) by using the matrix rank/inertia methodology. Our work in this paper includes:

1. establishing linear matrix equations and exact analytical expressions of the BLUPs and the OLSPs of (2);

2. characterizing algebraic and statistical properties of the BLUPs and the OLSPs of (2);

3. deriving formulas for calculating $i±(A−D[ϕ−BLUP(ϕ)]),r(A−D[ϕ−BLUP(ϕ)]),$(9)

$i±(A−D[ϕ−OLSP(ϕ)]),r(A−D[ϕ−OLSP(ϕ)]),$(10)

where A is a symmetric matrix, such as the dispersion matrix of another predictor or estimator;

4. establishing necessary and sufficient conditions for the following matrix equalities and inequalities $D[ϕ−BLUP(ϕ)]=A,D[BLUP(ϕ)−ϕ]≻A(≽A,≺A,≼A),$(11)

$D[ϕ−OLSP(ϕ)]=A,D[OLSP(ϕ)−ϕ]≻A(≽A,≺A,≼A)$(12) to hold, respectively, in the Löwner partial ordering;

5. establishing equalities and inequalities between the BLUPs and OLSPs of ϕ and their dispersion matrices.

As defined in (1), we do not attach any restrictions to the ranks of the given matrices in the model, in order to obtain general results on this group of problems. Regression analysis is an important statistical method that investigates the relationship between a response variable and a set of other variables known as independent variables. Linear regression models were the first type of models to be studied rigorously in regression analysis and are regarded as a central part of current statistical theory. As demonstrated in most statistical textbooks, common estimators of the unknown parameters in a linear regression model, such as the well-known OLSEs and BLUEs, are usually formulated through algebraic operations on the observed response vector, the given model matrix, and the covariance matrix of the error term in the model. Hence, the standard inference theory of linear regression models can be established, without tedious and ambiguous assumptions, from the exact algebraic expressions of the estimators, which is easily acceptable from both mathematical and statistical points of view. In fact, linear regression models are the only type of statistical models that have complete and solid support from linear algebra and matrix theory, since almost all results on linear regression models can be formulated in matrix form. It is precisely this fact that attracts linear algebraists to make matrix-theoretic contributions to statistical analysis.

The paper is organized as follows. In Section 2, we introduce a variety of mathematical tools that can be used to simplify matrix expressions and to characterize matrix equalities, and also present a known result on analytical solutions of a quadratic matrix-valued function optimization problem. In Section 3, we present a group of known results on the predictability of (2), the exact expression of the BLUP of (2) and its special cases, as well as various statistical properties and features of the BLUP. In Section 4, we establish a group of formulas for calculating the rank and inertia in (9) and use the formulas to characterize the equality and inequalities in (11). In Section 5, we first give a group of results on the OLSP of (2) and its statistical properties. We then establish a group of formulas for calculating the rank and inertia in (10) and use the formulas to characterize the equality and inequalities in (12). The connections between the OLSP and BLUP of (2), as well as the equality and inequalities between the dispersion matrices of the OLSP and BLUP of (2), are investigated in Section 6. Conclusions and discussions on algebraic tools in matrix theory, as well as the applications of the rank/inertia formulas in statistical analysis, are presented in Section 7.

## 2 Preliminaries in linear algebra

This section begins by introducing various formulas for ranks and inertias of matrices and illustrating their usefulness in matrix analysis and statistical theory. Recall that the rank of a matrix and the inertia of a real symmetric matrix are two basic concepts in matrix theory; they are the most significant finite nonnegative integers reflecting intrinsic properties of matrices, and are thus cornerstones of matrix mathematics. The mathematical prerequisites for understanding ranks and inertias of matrices are minimal and do not go beyond elementary linear algebra, while many simple and classic formulas for calculating ranks and inertias of matrices can be found in most textbooks of linear algebra. The intriguing connections between generalized inverses and ranks of matrices were recognized in the 1970s. A variety of fundamental formulas for calculating the ranks of matrices and their generalized inverses were established, and many applications of the rank formulas in matrix theory and statistics were presented in [13]. Since then, the matrix rank theory has been greatly developed and has become an influential and effective tool for simplifying complicated matrix expressions and establishing various matrix equalities.

In order to establish and characterize various possible equalities for predictors and estimators under GLMs, and to simplify various matrix equalities composed of the Moore–Penrose inverses of matrices, we need the following well-known results on ranks and inertias of matrices to make the paper self-contained.

#### Lemma 2.1

Let 𝐀, 𝐁 ∈ ℝm×n, or 𝐀 = 𝐀′, 𝐁 = 𝐁′ ∈ ℝm × m. Then the following results hold:

1. 𝐀 = 𝐁 if and only if r(𝐀–𝐁) = 0,

2. 𝐀 − 𝐁 is nonsingular if and only if r(𝐀−𝐁) = m = n,

3. 𝐀 ≻ 𝐁 (𝐀 ≺ 𝐁) if and only if i+(𝐀 − 𝐁) = m (i−(𝐀 − 𝐁) = m),

4. 𝐀 ≽ 𝐁 (𝐀 ≼ 𝐁) if and only if i−(𝐀−𝐁) = 0 (i+(𝐀−𝐁) = 0).

The assertions in Lemma 2.1 follow directly from the definitions of rank, inertia, definiteness, and semi-definiteness of (symmetric) matrices, and were first summarized and effectively utilized in [4]. This lemma shows that once explicit formulas for calculating the ranks/inertias of differences of (symmetric) matrices are established, we can use them to characterize the corresponding matrix equalities and inequalities. Establishing formulas for calculating ranks/inertias of matrices therefore has important consequences, and this fact reflects the central role of matrix ranks/inertias in matrix analysis and its applications. It is thus technically worthwhile, from both the theoretical and the applied points of view, to establish as many exact matrix rank/inertia formulas as possible.
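Lemma 2.1 can be illustrated numerically. The sketch below (illustrative NumPy code, not part of the paper; A and B are small hypothetical diagonal matrices) checks the Löwner-ordering criterion in item (4):

```python
import numpy as np

def inertia(S, tol=1e-9):
    """Positive and negative inertia of a symmetric matrix S."""
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > tol)), int(np.sum(w < -tol))

A = np.diag([3.0, 2.0, 1.0])
B = np.diag([1.0, 2.0, 0.0])

ip, im = inertia(A - B)
assert im == 0   # i_-(A - B) = 0, so A >= B in the Loewner ordering (Lemma 2.1(4))
assert ip < 3    # A - B is singular here, so the strict inequality A > B fails (Lemma 2.1(3))
```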

#### Lemma 2.2

([13]). Let 𝐀 ∈ ℝm×n, 𝐁 ∈ ℝm × k, and 𝐂 ∈ ℝl × n. Then $r[\mathbf{A},\,\mathbf{B}]=r(\mathbf{A})+r(\mathbf{E}_{\mathbf{A}}\mathbf{B})=r(\mathbf{B})+r(\mathbf{E}_{\mathbf{B}}\mathbf{A}),$(13)

$r\begin{bmatrix}\mathbf{A}\\ \mathbf{C}\end{bmatrix}=r(\mathbf{A})+r(\mathbf{C}\mathbf{F}_{\mathbf{A}})=r(\mathbf{C})+r(\mathbf{A}\mathbf{F}_{\mathbf{C}}),$(14)

$r\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{C} & \mathbf{0}\end{bmatrix}=r(\mathbf{B})+r(\mathbf{C})+r(\mathbf{E}_{\mathbf{B}}\mathbf{A}\mathbf{F}_{\mathbf{C}}).$(15)

In particular, the following results hold.

1. $r[\mathbf{A},\,\mathbf{B}]=r(\mathbf{A})\;\Leftrightarrow\;\mathcal{R}(\mathbf{B})\subseteq\mathcal{R}(\mathbf{A})\;\Leftrightarrow\;\mathbf{A}\mathbf{A}^{+}\mathbf{B}=\mathbf{B}\;\Leftrightarrow\;\mathbf{E}_{\mathbf{A}}\mathbf{B}=\mathbf{0}.$

2. $r\begin{bmatrix}\mathbf{A}\\ \mathbf{C}\end{bmatrix}=r(\mathbf{A})\;\Leftrightarrow\;\mathcal{R}(\mathbf{C}')\subseteq\mathcal{R}(\mathbf{A}')\;\Leftrightarrow\;\mathbf{C}\mathbf{A}^{+}\mathbf{A}=\mathbf{C}\;\Leftrightarrow\;\mathbf{C}\mathbf{F}_{\mathbf{A}}=\mathbf{0}.$

3. $r[\mathbf{A},\,\mathbf{B}]=r(\mathbf{A})+r(\mathbf{B})\;\Leftrightarrow\;\mathcal{R}(\mathbf{A})\cap\mathcal{R}(\mathbf{B})=\{\mathbf{0}\}\;\Leftrightarrow\;\mathcal{R}[(\mathbf{E}_{\mathbf{A}}\mathbf{B})']=\mathcal{R}(\mathbf{B}')\;\Leftrightarrow\;\mathcal{R}[(\mathbf{E}_{\mathbf{B}}\mathbf{A})']=\mathcal{R}(\mathbf{A}').$

4. $r\begin{bmatrix}\mathbf{A}\\ \mathbf{C}\end{bmatrix}=r(\mathbf{A})+r(\mathbf{C})\;\Leftrightarrow\;\mathcal{R}(\mathbf{A}')\cap\mathcal{R}(\mathbf{C}')=\{\mathbf{0}\}\;\Leftrightarrow\;\mathcal{R}(\mathbf{C}\mathbf{F}_{\mathbf{A}})=\mathcal{R}(\mathbf{C})\;\Leftrightarrow\;\mathcal{R}(\mathbf{A}\mathbf{F}_{\mathbf{C}})=\mathcal{R}(\mathbf{A}).$

5. $r\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{C} & \mathbf{0}\end{bmatrix}=r(\mathbf{B})+r(\mathbf{C})\;\Leftrightarrow\;\mathbf{E}_{\mathbf{B}}\mathbf{A}\mathbf{F}_{\mathbf{C}}=\mathbf{0}.$

6. $r(\mathbf{A}+\mathbf{B})=r(\mathbf{A})+r(\mathbf{B})\;\Leftrightarrow\;\mathcal{R}(\mathbf{A})\cap\mathcal{R}(\mathbf{B})=\{\mathbf{0}\}\ \text{and}\ \mathcal{R}(\mathbf{A}')\cap\mathcal{R}(\mathbf{B}')=\{\mathbf{0}\}.$
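The rank formulas (13)-(15) are easy to check numerically. The NumPy sketch below (illustrative only; the matrices are randomly generated, with A deliberately rank deficient) verifies all three:

```python
import numpy as np

rank = np.linalg.matrix_rank
rng = np.random.default_rng(3)

m, n, k, l = 5, 4, 3, 2
A = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))  # rank 2
B = rng.standard_normal((m, k))
C = rng.standard_normal((l, n))

EA = np.eye(m) - A @ np.linalg.pinv(A)   # E_A
EB = np.eye(m) - B @ np.linalg.pinv(B)   # E_B
FA = np.eye(n) - np.linalg.pinv(A) @ A   # F_A
FC = np.eye(n) - np.linalg.pinv(C) @ C   # F_C

# (13): r[A, B] = r(A) + r(E_A B) = r(B) + r(E_B A)
assert rank(np.hstack([A, B])) == rank(A) + rank(EA @ B) == rank(B) + rank(EB @ A)
# (14): r[A; C] = r(A) + r(C F_A) = r(C) + r(A F_C)
assert rank(np.vstack([A, C])) == rank(A) + rank(C @ FA) == rank(C) + rank(A @ FC)
# (15): r[[A, B], [C, 0]] = r(B) + r(C) + r(E_B A F_C)
M = np.block([[A, B], [C, np.zeros((l, k))]])
assert rank(M) == rank(B) + rank(C) + rank(EB @ A @ FC)
```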

#### Lemma 2.3

([14]). If ℛ(𝐀1′) ⊆ ℛ(𝐁1′), ℛ(𝐀2) ⊆ ℛ(𝐁1), ℛ(𝐀2′) ⊆ ℛ(𝐁2′), and ℛ(𝐀3) ⊆ ℛ(𝐁2), then $r(\mathbf{A}_1\mathbf{B}_1^{+}\mathbf{A}_2)=r\begin{bmatrix}\mathbf{B}_1 & \mathbf{A}_2\\ \mathbf{A}_1 & \mathbf{0}\end{bmatrix}-r(\mathbf{B}_1),$(16)

$r(\mathbf{A}_1\mathbf{B}_1^{+}\mathbf{A}_2\mathbf{B}_2^{+}\mathbf{A}_3)=r\begin{bmatrix}\mathbf{0} & \mathbf{B}_2 & \mathbf{A}_3\\ \mathbf{B}_1 & \mathbf{A}_2 & \mathbf{0}\\ \mathbf{A}_1 & \mathbf{0} & \mathbf{0}\end{bmatrix}-r(\mathbf{B}_1)-r(\mathbf{B}_2).$(17)
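Formula (16) can be sanity-checked numerically. In the sketch below (illustrative NumPy code, not from the paper), B1 is a random rank-deficient matrix, and A1, A2 are constructed so that the range conditions of Lemma 2.3 hold:

```python
import numpy as np

rank = np.linalg.matrix_rank
rng = np.random.default_rng(4)

B1 = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # 5x4, rank 3
A1 = rng.standard_normal((2, 5)) @ B1     # rows of A1 lie in the row space of B1
A2 = B1 @ rng.standard_normal((4, 2))     # columns of A2 lie in the column space of B1

# (16): r(A1 B1^+ A2) = r([[B1, A2], [A1, 0]]) - r(B1)
lhs = rank(A1 @ np.linalg.pinv(B1) @ A2)
M = np.block([[B1, A2], [A1, np.zeros((2, 2))]])
assert lhs == rank(M) - rank(B1)
```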

The results collected in the following lemma are obvious or well known.

#### Lemma 2.4

Let 𝐀 = 𝐀′ ∈ ℝm × m, 𝐁 = 𝐁′ ∈ ℝn× n, 𝐐 ∈ ℝm×n, and assume that 𝐏 ∈ ℝm × m is nonsingular. Then $r(\mathbf{A})=i_{+}(\mathbf{A})+i_{-}(\mathbf{A}),$(18)

$i_{\pm}(\mathbf{P}\mathbf{A}\mathbf{P}')=i_{\pm}(\mathbf{A})\quad(\text{Sylvester's law of inertia}),$(19)

$i_{\pm}(\mathbf{A}^{+})=i_{\pm}(\mathbf{A}),\quad i_{\pm}(-\mathbf{A})=i_{\mp}(\mathbf{A}),$(20)

$i_{\pm}\begin{bmatrix}\mathbf{A} & \mathbf{0}\\ \mathbf{0} & \mathbf{B}\end{bmatrix}=i_{\pm}(\mathbf{A})+i_{\pm}(\mathbf{B}),$(21)

$i_{+}\begin{bmatrix}\mathbf{0} & \mathbf{Q}\\ \mathbf{Q}' & \mathbf{0}\end{bmatrix}=i_{-}\begin{bmatrix}\mathbf{0} & \mathbf{Q}\\ \mathbf{Q}' & \mathbf{0}\end{bmatrix}=r(\mathbf{Q}).$(22)
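Several of the formulas in Lemma 2.4 can be confirmed numerically. This NumPy sketch (illustrative only; A is a hypothetical diagonal matrix, P and Q are random) checks (18), (19), and (22):

```python
import numpy as np

def inertia(S, tol=1e-9):
    """Positive and negative inertia of a symmetric matrix S."""
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > tol)), int(np.sum(w < -tol))

rng = np.random.default_rng(5)
A = np.diag([2.0, -3.0, 0.0, 1.0])   # i_+(A) = 2, i_-(A) = 1
P = rng.standard_normal((4, 4))      # nonsingular with probability 1

# (19): congruence preserves inertia (Sylvester's law)
assert inertia(P @ A @ P.T) == inertia(A)
# (18): r(A) = i_+(A) + i_-(A)
assert np.linalg.matrix_rank(A) == sum(inertia(A))
# (22): i_+ = i_- = r(Q) for the bordered matrix [[0, Q], [Q', 0]]
Q = rng.standard_normal((4, 3))
Z = np.block([[np.zeros((4, 4)), Q], [Q.T, np.zeros((3, 3))]])
r = np.linalg.matrix_rank(Q)
assert inertia(Z) == (r, r)
```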

#### Lemma 2.5

([4]). Let 𝐀 = 𝐀′ ∈ ℝm × m, 𝐁 ∈ ℝm×n, and 𝐃 = 𝐃′ ∈ ℝn× n. Then $i_{\pm}\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{B}' & \mathbf{0}\end{bmatrix}=r(\mathbf{B})+i_{\pm}(\mathbf{E}_{\mathbf{B}}\mathbf{A}\mathbf{E}_{\mathbf{B}}),\quad r\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{B}' & \mathbf{0}\end{bmatrix}=2r(\mathbf{B})+r(\mathbf{E}_{\mathbf{B}}\mathbf{A}\mathbf{E}_{\mathbf{B}}),$(23)

$i_{\pm}\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{B}' & \mathbf{D}\end{bmatrix}=i_{\pm}(\mathbf{A})+i_{\pm}\begin{bmatrix}\mathbf{0} & \mathbf{E}_{\mathbf{A}}\mathbf{B}\\ \mathbf{B}'\mathbf{E}_{\mathbf{A}} & \mathbf{D}-\mathbf{B}'\mathbf{A}^{+}\mathbf{B}\end{bmatrix}.$(24)

In particular, $i_{+}\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{B}' & \mathbf{0}\end{bmatrix}=r[\mathbf{A},\,\mathbf{B}],\quad i_{-}\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{B}' & \mathbf{0}\end{bmatrix}=r(\mathbf{B}),\quad r\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{B}' & \mathbf{0}\end{bmatrix}=r[\mathbf{A},\,\mathbf{B}]+r(\mathbf{B})\quad \text{if } \mathbf{A}\succcurlyeq\mathbf{0},$(25) $i_{\pm}\begin{bmatrix}\mathbf{A} & \mathbf{B}\\ \mathbf{B}' & \mathbf{D}\end{bmatrix}=i_{\pm}(\mathbf{A})+i_{\pm}(\mathbf{D}-\mathbf{B}'\mathbf{A}^{+}\mathbf{B})\quad \text{if } \mathcal{R}(\mathbf{B})\subseteq\mathcal{R}(\mathbf{A}).$(26)
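The Schur-complement inertia formula (26) can be verified numerically. In the sketch below (illustrative NumPy code, not from the paper), A is a random symmetric rank-deficient matrix and B is built so that ℛ(B) ⊆ ℛ(A):

```python
import numpy as np

def inertia(S, tol=1e-8):
    """Positive and negative inertia of a symmetric matrix S."""
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > tol)), int(np.sum(w < -tol))

rng = np.random.default_rng(6)
V = rng.standard_normal((4, 3))
A = V @ np.diag([1.0, -2.0, 3.0]) @ V.T   # symmetric, rank 3
B = A @ rng.standard_normal((4, 2))       # guarantees R(B) subset of R(A)
D = rng.standard_normal((2, 2)); D = D + D.T

# (26): i_pm([[A, B], [B', D]]) = i_pm(A) + i_pm(D - B' A^+ B)
M = np.block([[A, B], [B.T, D]])
schur = D - B.T @ np.linalg.pinv(A) @ B
iA, iS = inertia(A), inertia(schur)
assert inertia(M) == (iA[0] + iS[0], iA[1] + iS[1])
```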

#### Lemma 2.6

([4]). Let 𝐀 = 𝐀′ ∈ ℝm × m, 𝐁 ∈ ℝq×n, 𝐃 = 𝐃′ ∈ ℝn×n, and 𝐏 ∈ ℝq×m with ℛ(𝐀) ⊆ ℛ(𝐏′) and ℛ(𝐁) ⊆ ℛ(𝐏). Also let $\mathbf{M}=\begin{bmatrix}-\mathbf{A} & \mathbf{P}' & \mathbf{0}\\ \mathbf{P} & \mathbf{0} & \mathbf{B}\\ \mathbf{0} & \mathbf{B}' & \mathbf{D}\end{bmatrix}.$ Then $i_{\pm}[\mathbf{D}-\mathbf{B}'(\mathbf{P}')^{+}\mathbf{A}\mathbf{P}^{+}\mathbf{B}]=i_{\pm}(\mathbf{M})-r(\mathbf{P}),$(27) $r[\mathbf{D}-\mathbf{B}'(\mathbf{P}')^{+}\mathbf{A}\mathbf{P}^{+}\mathbf{B}]=r(\mathbf{M})-2r(\mathbf{P}).$(28)

Hence, $\mathbf{B}'(\mathbf{P}')^{+}\mathbf{A}\mathbf{P}^{+}\mathbf{B}=\mathbf{D}\;\Leftrightarrow\;r(\mathbf{M})=2r(\mathbf{P}),\quad \mathbf{B}'(\mathbf{P}')^{+}\mathbf{A}\mathbf{P}^{+}\mathbf{B}\succcurlyeq\mathbf{D}\;\Leftrightarrow\;i_{+}(\mathbf{M})=r(\mathbf{P}),\quad \mathbf{B}'(\mathbf{P}')^{+}\mathbf{A}\mathbf{P}^{+}\mathbf{B}\preccurlyeq\mathbf{D}\;\Leftrightarrow\;i_{-}(\mathbf{M})=r(\mathbf{P}).$

#### Lemma 2.7

([15]). The linear matrix equation AX = B is consistent if and only if r[A, B] = r(A), or equivalently, AA+B = B. In this case, the general solution of the equation can be written in the parametric form X = A+B + FAU, where U is an arbitrary matrix.
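Lemma 2.7 translates directly into code. The NumPy sketch below (illustrative only; A and the particular solution used to manufacture a consistent B are random stand-ins) checks the consistency criterion and the parametric form of the general solution:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))  # 4x3, rank 2
Ap = np.linalg.pinv(A)
X0 = rng.standard_normal((3, 2))
B = A @ X0                      # guarantees that AX = B is consistent

# Consistency test: r[A, B] = r(A), equivalently A A^+ B = B.
assert np.linalg.matrix_rank(np.hstack([A, B])) == np.linalg.matrix_rank(A)
assert np.allclose(A @ Ap @ B, B)

# General solution X = A^+ B + F_A U for an arbitrary U.
FA = np.eye(3) - Ap @ A
U = rng.standard_normal((3, 2))
X = Ap @ B + FA @ U
assert np.allclose(A @ X, B)
```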

In statistical inference of parametric regression models, the unknown parameters are usually predicted/estimated by various optimization methods or algorithms. A brief survey of modern optimization methods in statistical analysis can be found in [16]. In any case, we expect the optimization problems that occur in parameter prediction/estimation under a GLM to have analytical solutions, so that we can use these solutions to establish a complete theory of the statistical inference of the GLM. The notion of the BLUP is well established in the literature, but analytical results on BLUPs of a very general nature have been scarce, owing to the lack of explicit solutions of the BLUPs' optimization problems. The present author recently developed an algebraic method for solving quadratic matrix-valued function optimization problems in [17], and used it to derive many new analytical matrix equations and formulas for BLUPs of all unknown parameters in GLMs with random effects. By this optimization method, we can now directly obtain exact formulas for calculating the BLUPs of ϕ in (2) and their dispersion matrices, and use these formulas to deal with various statistical inference problems under general assumptions. In order to solve the matrix minimization problem associated with the BLUPs under a GLM directly, we need the following known result on analytical solutions of a constrained quadratic matrix-valued function minimization problem.

#### Lemma 2.8

([17]). Let $f(\mathbf{L})=(\mathbf{L}+\mathbf{C})\mathbf{M}(\mathbf{L}+\mathbf{C})'\quad \text{s.t.}\quad \mathbf{L}\mathbf{A}=\mathbf{B},$ where 𝐀 ∈ ℝp×q, 𝐁 ∈ ℝn×q, and 𝐂 ∈ ℝn×p are given, 𝐌 ∈ ℝp×p is positive semi-definite, and the matrix equation 𝐋𝐀 = 𝐁 is consistent. Then there always exists a solution 𝐋0 of 𝐋0𝐀 = 𝐁 such that $f(\mathbf{L})\succcurlyeq f(\mathbf{L}_0)$ holds for all solutions 𝐋 of 𝐋𝐀 = 𝐁, and the matrix 𝐋0 satisfying this inequality is determined by the following consistent matrix equation $\mathbf{L}_0[\mathbf{A},\,\mathbf{M}\mathbf{A}^{\perp}]=[\mathbf{B},\,-\mathbf{C}\mathbf{M}\mathbf{A}^{\perp}].$

In this case, the general expression of 𝐋0 and the corresponding f(𝐋0) and f(𝐋) − f(𝐋0) are given by $\mathbf{L}_0=\underset{\mathbf{L}\mathbf{A}=\mathbf{B}}{\operatorname{argmin}}\,f(\mathbf{L})=[\mathbf{B},\,-\mathbf{C}\mathbf{M}\mathbf{A}^{\perp}][\mathbf{A},\,\mathbf{M}\mathbf{A}^{\perp}]^{+}+\mathbf{U}[\mathbf{A},\,\mathbf{M}]^{\perp},$ $f(\mathbf{L}_0)=\underset{\mathbf{L}\mathbf{A}=\mathbf{B}}{\min}\,f(\mathbf{L})=\mathbf{K}\mathbf{M}\mathbf{K}'-\mathbf{K}\mathbf{M}(\mathbf{A}^{\perp}\mathbf{M}\mathbf{A}^{\perp})^{+}\mathbf{M}\mathbf{K}',$ $f(\mathbf{L})-f(\mathbf{L}_0)=(\mathbf{L}\mathbf{M}\mathbf{A}^{\perp}+\mathbf{C}\mathbf{M}\mathbf{A}^{\perp})(\mathbf{A}^{\perp}\mathbf{M}\mathbf{A}^{\perp})^{+}(\mathbf{L}\mathbf{M}\mathbf{A}^{\perp}+\mathbf{C}\mathbf{M}\mathbf{A}^{\perp})',$ where 𝐊 = 𝐁𝐀+ + 𝐂 and 𝐔 ∈ ℝn×p is arbitrary.
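The minimizing property of Lemma 2.8 can be tested numerically. In the sketch below (an illustration under stated assumptions, not part of the paper: M is taken positive definite and B is manufactured so that LA = B is consistent), the particular minimizer L0 with U = 0 is compared against random feasible solutions L:

```python
import numpy as np

rng = np.random.default_rng(8)
p, q, n = 4, 2, 3
A = rng.standard_normal((p, q))
B = rng.standard_normal((n, p)) @ A            # makes LA = B consistent
C = rng.standard_normal((n, p))
H = rng.standard_normal((p, p)); M = H @ H.T   # positive (semi-)definite

f = lambda L: (L + C) @ M @ (L + C).T          # f(L) = (L + C) M (L + C)'

Aperp = np.eye(p) - A @ np.linalg.pinv(A)      # A^perp = I - A A^+
L0 = np.hstack([B, -C @ M @ Aperp]) @ np.linalg.pinv(np.hstack([A, M @ Aperp]))
assert np.allclose(L0 @ A, B)                  # L0 satisfies the constraint

# f(L) - f(L0) should be positive semi-definite for every solution L of LA = B.
for _ in range(20):
    L = B @ np.linalg.pinv(A) + rng.standard_normal((n, p)) @ Aperp
    assert np.linalg.eigvalsh(f(L) - f(L0)).min() > -1e-7
```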

The assertions in this lemma provide a clear way of obtaining analytical solutions of a typical constrained quadratic matrix-valued function minimization problem, and all the properties and features of the minimization problem can be discovered from the analytical solutions. With the support of the assertions in Lemma 2.1 and the rank/inertia formulas in Lemmas 2.2–2.6, we are able to convert many inference problems in statistics into algebraic problems of characterizing matrix equalities and inequalities composed of matrices and their generalized inverses, and to derive, as demonstrated in Sections 4–6 below, analytical solutions of these problems by using the methods of matrix equations, matrix rank formulas, and various tricky partitioned matrix calculations.

## 3 Formulas and properties of BLUPs

In what follows, we assume that (1) is consistent, that is, y ∊ ℛ[X, Σ] holds with probability 1; see [18, 19].

Notice from (3) that the BLUP of ϕ in (2) is defined by minimizing the dispersion matrix of Ly − ϕ subject to E(Ly − ϕ) = 0. This minimization property of the BLUP has motivated many problems in the statistical inference of GLMs. In particular, the minimum dispersion matrix of Ly − ϕ can be used to assess the optimality and efficiency of other types of predictors of ϕ under (1). In fact, BLUPs are primary choices among all possible predictors owing to their simplicity and optimality properties, and they have wide applications in both pure and applied disciplines of statistical inference. The theory of BLUPs under GLMs belongs to the classical methods of mathematical statistics and has been a core research issue in the field of statistics and its applications. It should be pointed out that (3) can equivalently be assembled as a constrained matrix-valued function optimization problem in the Löwner partial ordering. This kind of equivalence between dispersion matrix minimization problems and matrix-valued function minimization problems was first characterized in [11]; see also [20, 21]. Along with recent developments in optimization methods in matrix theory, it is now easy to deal with the various complicated matrix operations occurring in the statistical inference of (2).

We next show how to translate the above statistical problems under GLMs into mathematical problems on matrix analysis, and solve the problems by using various results and methods in matrix algebra. Under (1) and (2), $E(ϕ)=Kβ,D(ϕ)=σ2JΣJ′,Cov{ϕ,y}=σ2JΣ,$(29) while Lyϕ can be rewritten as $Ly−ϕ=LXβ+Lε−Kβ−Jε=(LX−K)β+(L−J)ε.$(30)

Then, the expectation of Ly − ϕ can be expressed as $E(Ly−ϕ)=E[(LX−K)β]=(LX−K)β;$(31) the matrix mean square error of Ly − ϕ is $E[(Ly−ϕ)(Ly−ϕ)′]=(LX−K)ββ′(LX−K)′+σ2(L−J)Σ(L−J)′;$(32) the dispersion matrix of Ly − ϕ is $D(Ly−ϕ)=σ2(L−J)Σ(L−J)′ ≜ σ2f(L).$(33)

Hence, the constrained covariance matrix minimization problem in (3) converts to a mathematical problem of minimizing the quadratic matrix-valued function f(L) subject to (LXK)β = 0. A general method for solving this kind of matrix optimization problems in the Löwner partial ordering was formulated in Lemma 2.8. In particular, the following comprehensive results were established in [22]; see also [23].

#### Theorem 3.1

The vector ϕ in (2) is predictable by y in (1) if and only if $R(X′)⊇R(K′).$(34)

#### Proof

It follows from (31) that E(Ly − ϕ) = 0 ⇔ (LX − K)β = 0 for all β ⇔ LX = K. From Lemma 2.7, this matrix equation is consistent if and only if (34) holds.

#### Theorem 3.2

(Fundamental BLUP equation). Assume that ϕ in (2) is predictable under (1). Then $E(Ly−ϕ)=0andD(Ly−ϕ)=min⇔L[X,ΣX⊥]=[K,JΣX⊥].$(35) The equation in (35), called the fundamental BLUP equation, is consistent, i.e., $[K,JΣX⊥][X,ΣX⊥]+[X,ΣX⊥]=[K,JΣX⊥]$(36) holds under (34), while the general expression of L and the corresponding BLUP (ϕ) are given by $BLUP(ϕ)=Ly=[K,JΣX⊥][X,ΣX⊥]++U[X,ΣX⊥]⊥y,$(37) where U ∊ ℝk×n is arbitrary. In particular, $BLUE(Kβ)=([K,0][X,ΣX⊥]++U1[X,ΣX⊥]⊥)y,$(38) $BLUP(Jε)=([0,JΣX⊥][X,ΣX⊥]++U2[X,ΣX⊥]⊥)y,$(39) where U1, U2 ∊ ℝk×n are arbitrary. Furthermore, the following results hold.

1. r[X, ΣX⊥] = r[X, Σ], ℛ[X, ΣX⊥] = ℛ[X, Σ], and ℛ(X) ∩ ℛ(ΣX⊥) = {0}.

2. L is unique if and only if r[X, Σ] = n.

3. BLUP (ϕ) is unique with probability 1 if and only if (1) is consistent.

4. The dispersion matrices of BLUP (ϕ) and ϕ-BLUP(ϕ), as well as the covariance matrix between BLUP (ϕ) and ϕ are unique, and satisfy the following equalities $D[BLUP(ϕ)]=σ2[K,JΣX⊥][X,ΣX⊥]+Σ([K,JΣX⊥][X,ΣX⊥]+)′,$(40) $Cov{BLUP(ϕ),ϕ}=σ2[K,JΣX⊥][X,ΣX⊥]+ΣJ′,$(41) $D(ϕ)−D[BLUP(ϕ)]=σ2JΣJ′−σ2[K,JΣX⊥][X,ΣX⊥]+Σ([K,JΣX⊥][X,ΣX⊥]+)′,$(42) $D[ϕ−BLUP(ϕ)]=σ2(J−[K,JΣX⊥][X,ΣX⊥]+)Σ(J−[K,JΣX⊥][X,ΣX⊥]+)′.$(43)

5. BLUP (ϕ), BLUE (K β ), and BLUP (J𝜺) satisfy $BLUP(ϕ)=BLUE(Kβ)+BLUP(Jε),$(44) $Cov{BLUE(Kβ),BLUP(Jε)}=0,$(45) $D[BLUP(ϕ)]=D[BLUE(Kβ)]+D[BLUP(Jε)].$(46)

6. Tϕ is predictable for any matrix T ∊ ℝt× k, and BLUP (Tϕ)=TBLUP(ϕ) holds.

7. In particular, $BLUE(Xβ)=([X,0][X,ΣX⊥]++U[X,ΣX⊥]⊥)y,$(47) $D[BLUE(Xβ)]=σ2[X,0][X,ΣX⊥]+Σ([X,0][X,ΣX⊥]+)′,$(48) where U ∊ ℝn×n is arbitrary, and $BLUP(ε)=([0,ΣX⊥][X,ΣX⊥]++U[X,ΣX⊥]⊥)y=(Σ(X⊥ΣX⊥)++U[X,ΣX⊥]⊥)y,$(49) $Cov{BLUP(ε),ε}=D[BLUP(ε)]=σ2Σ(X⊥ΣX⊥)+Σ,$(50) $D[ε−BLUP(ε)]=D(ε)−D[BLUP(ε)]=σ2Σ−σ2Σ(X⊥ΣX⊥)+Σ,$(51) where U ∊ ℝn×n is arbitrary.

8. y, BLUE (X β ), and BLUP (𝜺) satisfy $y=BLUE(Xβ)+BLUP(ε),$(52) $Cov{BLUE(Xβ),BLUP(ε)}=0,$(53) $D(y)=D[BLUE(Xβ)]+D[BLUP(ε)].$(54)

#### Proof

Eq. (3) is equivalent to finding a solution L0 of L0X = K such that $f(L)≽f(L0) \quad \text{s.t.} \quad LX=K$(55) holds in the Löwner partial ordering. From Lemma 2.8, there always exists a solution L0 of L0X = K such that f(L) ≽ f(L0) holds for all solutions of LX = K, and this L0 is determined by the matrix equation L0[X, ΣX⊥] = [K, JΣX⊥], establishing the matrix equation in (35). Solving the matrix equation in (35) yields (37). Eqs. (38) and (39) follow directly from (37).

The three formulas in (a) are well known for the coefficient matrix in (37); see, e.g., [1] and [2, p. 123]. Note that [X, ΣX⊥]⊥ = 0 ⇔ [X, ΣX⊥][X, ΣX⊥]+ = In ⇔ r[X, ΣX⊥] = r[X, Σ] = n. Combining this fact with (37) leads to (b). The term U[X, ΣX⊥]⊥y in (37) vanishes with probability 1 if and only if (1) is consistent, which leads to (c). Eqs. (40)–(43) follow from (1), (29), (33), and (37).

Rewrite (37) as $BLUP(ϕ)=([K,JΣX⊥][X,ΣX⊥]++U[X,ΣX⊥]⊥)y=([K,0][X,ΣX⊥]++U1[X,ΣX⊥]⊥)y+([0,JΣX⊥][X,ΣX⊥]++U2[X,ΣX⊥]⊥)y=BLUE(Kβ)+BLUP(Jε),$ establishing (44).

From (38) and (39), the covariance matrix between BLUE (K β ) and BLUP (J𝜺) is $Cov{BLUE(Kβ),BLUP(Jε)}=σ2([K,0][X,ΣX⊥]++U1[X,ΣX⊥]⊥)Σ([0,JΣX⊥][X,ΣX⊥]++U2[X,ΣX⊥]⊥)′=σ2[K,0][X,ΣX⊥]+Σ([0,JΣX⊥][X,ΣX⊥]+)′.$

Applying (17) to this matrix and simplifying by elementary block matrix operations, we obtain $r(\mathrm{Cov}\{\mathrm{BLUE}(\mathbf{K}\beta),\,\mathrm{BLUP}(\mathbf{J}\varepsilon)\})=r([\mathbf{K},\,\mathbf{0}][\mathbf{X},\,\Sigma\mathbf{X}^{\perp}]^{+}\Sigma([\mathbf{0},\,\mathbf{J}\Sigma\mathbf{X}^{\perp}][\mathbf{X},\,\Sigma\mathbf{X}^{\perp}]^{+})')$ $=r\begin{bmatrix}\mathbf{0} & [\mathbf{X},\,\Sigma\mathbf{X}^{\perp}]' & [\mathbf{0},\,\mathbf{J}\Sigma\mathbf{X}^{\perp}]'\\ [\mathbf{X},\,\Sigma\mathbf{X}^{\perp}] & \Sigma & \mathbf{0}\\ [\mathbf{K},\,\mathbf{0}] & \mathbf{0} & \mathbf{0}\end{bmatrix}-2r[\mathbf{X},\,\Sigma\mathbf{X}^{\perp}]$ $=r\begin{bmatrix}\mathbf{X}\\ \mathbf{K}\end{bmatrix}+r\begin{bmatrix}\mathbf{X}'\\ \Sigma\end{bmatrix}+r[\mathbf{X},\,\Sigma\mathbf{X}^{\perp},\,\Sigma\mathbf{J}']-r(\mathbf{X})-2r[\mathbf{X},\,\Sigma]\quad(\text{by (13) and (25)})$ $=r(\mathbf{X})+r[\mathbf{X},\,\Sigma]+r[\mathbf{X},\,\Sigma]-r(\mathbf{X})-2r[\mathbf{X},\,\Sigma]\quad(\text{by (34)})=0,$ establishing (45). Eq. (46) follows from (44) and (45). Result (f) follows directly from (37). Results (g) and (h) follow from (38), (39), (d) and (e).

Eq. (35) shows that the BLUPs of all unknown parameters in (2) can be determined jointly by a linear matrix equation composed of the two given coefficient matrices K and J, the given model matrix X, the dispersion matrix of the observed random vector y, and the covariance matrix between ϕ and y. It is therefore convenient to present a simple yet general algebraic treatment of the BLUPs of all unknown parameters in a GLM via a basic linear matrix equation, while many algebraic and statistical properties of the BLUPs follow directly from the analytical solutions of that equation. Matrix equations and formulas for BLUPs like those in (35) and (37) under GLMs were established in the statistical literature by various direct and indirect methods; for instance, the BLUE of Kβ and the BLUP of J𝜺, as well as (52), were established separately in [11]. In comparison, the results collected in Theorem 3.2 provide a unified theory of BLUPs and BLUEs of parameter spaces and their properties under GLMs. As demonstrated in [22, 23], the results in Theorem 3.2 can serve as basic and useful references in the statistical inference of GLMs.
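The coefficient matrix produced by the fundamental BLUP equation can be checked numerically against the classical generalized least-squares form. The NumPy sketch below (illustrative only, and restricted to the special case where Σ is nonsingular and X has full column rank, so that the classical formula X(X′Σ⁻¹X)⁻¹X′Σ⁻¹ applies and the solution L is unique) compares the two expressions for the coefficient of BLUE(Xβ):

```python
import numpy as np

rng = np.random.default_rng(9)
n, p = 6, 2
X = rng.standard_normal((n, p))
H = rng.standard_normal((n, n)); Sigma = H @ H.T + np.eye(n)  # positive definite
Xperp = np.eye(n) - X @ np.linalg.pinv(X)                      # X^perp = I - X X^+

# Coefficient of BLUE(X beta) from the fundamental equation (taking U = 0):
W = np.hstack([X, Sigma @ Xperp])
L_blue = np.hstack([X, np.zeros((n, n))]) @ np.linalg.pinv(W)

# Classical GLS form, valid here because Sigma is nonsingular and r(X) = p.
Si = np.linalg.inv(Sigma)
L_gls = X @ np.linalg.inv(X.T @ Si @ X) @ X.T @ Si

assert np.allclose(L_blue, L_gls)
assert np.allclose(L_blue @ X, X)              # unbiasedness: L X = X
assert np.allclose(L_blue @ Sigma @ Xperp, 0)  # the defining equation L Sigma X^perp = 0
```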

From the fundamental matrix equation and formulas in Theorem 3.2, we can now derive many new and valuable consequences on the properties of BLUPs of parameter spaces in GLMs under various assumptions. For instance, one of the well-known special cases of (1) is $y=Xβ+ε,E(ε)=0,D(ε)=σ2In,$(56) where σ2 is an unknown positive scalar. In this setting, the BLUP and the OLSP of ϕ in (2) coincide, and Theorem 3.2 reduces to the following results.

#### Corollary 3.3

Assume that ϕ is predictable under (56), i.e., ℛ(X′) ⊇ ℛ(K′) holds. Then $\mathrm{E}(\mathbf{L}\mathbf{y} - \boldsymbol{\phi}) = \mathbf{0} \ \text{and} \ \mathrm{D}(\mathbf{L}\mathbf{y} - \boldsymbol{\phi}) = \min \ \Leftrightarrow \ \mathbf{L}[\mathbf{X},\, \mathbf{X}^{\perp}] = [\mathbf{K},\, \mathbf{J}\mathbf{X}^{\perp}].$

The matrix equation on the right-hand side is consistent as well, and the unique solution for L and the corresponding Ly are given by $\mathrm{BLUP}(\boldsymbol{\phi}) = \mathbf{L}\mathbf{y} = (\mathbf{K}\mathbf{X}^{+} + \mathbf{J}\mathbf{X}^{\perp})\mathbf{y}.$

In particular, $BLUE(Kβ)=KX+y,BLUP(Jε)=JX⊥y,BLUE(Xβ)=XX+y,BLUP(ε)=X⊥y.$

Furthermore, the following results hold.

1. BLUP (ϕ) satisfies the following covariance matrix equalities $D[BLUP(ϕ)]=σ2(KX++JX⊥)(KX++JX⊥)′,Cov{BLUP(ϕ),ϕ}=σ2(KX++JX⊥)J′,D(ϕ)−D[BLUP(ϕ)]=σ2JJ′−σ2(KX++JX⊥)(KX++JX⊥)′,D[ϕ−BLUP(ϕ)]=σ2(KX+−JPX)(KX+−JPX)′.$

2. BLUP (ϕ), BLUE (Kβ), and BLUP (J𝜺) satisfy $BLUP(ϕ)=BLUE(Kβ)+BLUP(Jε),Cov{BLUE(Kβ),BLUP(Jε)}=0,D[BLUP(ϕ)]=D[BLUE(Kβ)]+D[BLUP(Jε)].$

3. BLUP (Tϕ) = TBLUP(ϕ) holds for any matrix T ∈ ℝt × k.

4. BLUE (Xβ) and BLUP (𝜺) satisfy $BLUE(Xβ)=PXy,BLUP(ε)=(In−PX)y,D[BLUE(Xβ)]=σ2PX,Cov{BLUP(ε),ε}=D[BLUP(ε)]=σ2(In−PX),D[ε−BLUP(ε)]=D(ε)−D[BLUP(ε)]=σ2PX.$

5. y, BLUE (Xβ), and BLUP (𝜺) satisfy $y=BLUE(Xβ)+BLUP(ε),Cov{BLUE(Xβ),BLUP(ε)}=0,D(y)=D[BLUE(Xβ)]+D[BLUP(ε)].$
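Under model (56) the formulas in Corollary 3.3 involve nothing more than the orthogonal projector P<sub>X</sub> = XX⁺. The following sketch (an illustration with randomly generated data, not taken from the paper) confirms the decomposition in item 5 numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

Px = X @ np.linalg.pinv(X)        # orthogonal projector P_X = X X^+
Xperp = np.eye(n) - Px            # here X^perp denotes I_n - P_X

blue_Xb = Px @ y                  # BLUE(X beta) = P_X y
blup_e = Xperp @ y                # BLUP(eps) = (I_n - P_X) y

# y = BLUE(X beta) + BLUP(eps), and the two terms are uncorrelated
assert np.allclose(blue_Xb + blup_e, y)
assert np.allclose(Px @ Xperp, 0)             # hence Cov{BLUE(X beta), BLUP(eps)} = 0
assert np.allclose(Px @ Px, Px)               # P_X is idempotent
```

Since D(y) = σ²Iₙ here, the second identity gives Cov{BLUE(Xβ), BLUP(𝜺)} = σ²P<sub>X</sub>(Iₙ − P<sub>X</sub>) = 0, and the decomposition of the identity matrix gives D(y) = D[BLUE(Xβ)] + D[BLUP(𝜺)].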

## 4 Rank/inertia formulas for dispersion matrices of BLUPs

Once predictors/estimators of parameter spaces in GLMs are established, attention turns to the algebraic and statistical properties and features of these predictors/estimators. Since BLUPs/BLUEs are the fundamental statistical methodology for predicting and estimating unknown parameters under GLMs, they play an important role in statistical inference theory and are often taken as cornerstones for comparing the efficiency of different predictors/estimators, owing to the minimization property of the BLUPs’/BLUEs’ dispersion matrices. As demonstrated in Section 3, we can now give exact expressions of BLUPs/BLUEs under GLMs, so we can derive various algebraic and statistical properties of BLUPs/BLUEs and utilize these properties in the inference of GLMs. Since the dispersion matrix of a random vector is a conceptual foundation of statistical analysis and inference, statisticians are interested in the dispersion matrices of predictors/estimators and their algebraic properties. Some previous and present work on the dispersion matrices of BLUPs/BLUEs and their properties under GLMs can be found, e.g., in [24–27]. As is well known, equalities and inequalities for dispersion matrices of predictors/estimators under GLMs play an essential role in characterizing the behavior of the predictors/estimators. Once certain equalities and inequalities are established for dispersion matrices of predictors/estimators under various assumptions, we can use them to describe the performance of the predictors/estimators. This is, however, not an easy task from either a mathematical or a statistical point of view, because dispersion matrices of predictors/estimators often involve complicated matrix operations on the given matrices in GLMs and their generalized inverses, as formulated in (40)–(43). In recent years, the theory of matrix ranks and inertias has been introduced into the statistical analysis of GLMs.
We can establish various equalities and inequalities for dispersion matrices of predictors/estimators under GLMs by means of the matrix rank/inertia methodology; see [5, 6, 9]. Note from (3) that ϕ − BLUP(ϕ) has the smallest dispersion matrix among all Ly − ϕ with E(Ly − ϕ) = 0, so this dispersion matrix plays a key role in characterizing the performance of the BLUP of ϕ in (2). In order to establish possible equalities and inequalities for dispersion matrices of BLUPs, we first establish three basic formulas for calculating the rank/inertia described in (9).

#### Theorem 4.1

Assume that ϕ in (2) is predictable under (1), and let BLUP (ϕ) be as given in (37). Also let A = A′ ∈ ℝk×k, and denote $\mathbf{M} = \begin{bmatrix} \sigma^{2}\boldsymbol{\Sigma} & \mathbf{0} & \mathbf{X} \\ \mathbf{0} & -\mathbf{A} & \mathbf{K} - \mathbf{J}\mathbf{X} \\ \mathbf{X}' & \mathbf{K}' - \mathbf{X}'\mathbf{J}' & \mathbf{0} \end{bmatrix}.$

Then $i+(A−D[ϕ−BLUP(ϕ)])=i−(M)−r(X),$(57) $i−(A−D[ϕ−BLUP(ϕ)])=i+(M)−r[X,Σ],$(58) $r(A−D[ϕ−BLUP(ϕ)])=r(M)−r[X,Σ]−r(X).$(59)

In consequence, the following results hold.

1. D[ϕ − BLUP(ϕ)] ≻ A ⇔ i+(M) = r[X, Σ] + k.

2. D[ϕ − BLUP(ϕ)] ≺ A ⇔ i−(M) = r(X) + k.

3. D[ϕ − BLUP(ϕ)] ≽ A ⇔ i−(M) = r(X).

4. D[ϕ − BLUP(ϕ)] ≼ A ⇔ i+(M) = r[X, Σ].

5. D[ϕ − BLUP(ϕ)] = A ⇔ r(M) = r[X, Σ] + r(X).

#### Proof

Note from (43) that $\mathbf{A} - \mathrm{D}[\boldsymbol{\phi} - \mathrm{BLUP}(\boldsymbol{\phi})] = \mathbf{A} - \sigma^{2}\big([\mathbf{K},\, \mathbf{J}\boldsymbol{\Sigma}\mathbf{X}^{\perp}][\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}^{\perp}]^{+} - \mathbf{J}\big)\boldsymbol{\Sigma}\big([\mathbf{K},\, \mathbf{J}\boldsymbol{\Sigma}\mathbf{X}^{\perp}][\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}^{\perp}]^{+} - \mathbf{J}\big)' = \sigma^{2}\Big(\sigma^{-2}\mathbf{A} - \big([\mathbf{K},\, \mathbf{J}\boldsymbol{\Sigma}\mathbf{X}^{\perp}][\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}^{\perp}]^{+} - \mathbf{J}\big)\boldsymbol{\Sigma}\big([\mathbf{K},\, \mathbf{J}\boldsymbol{\Sigma}\mathbf{X}^{\perp}][\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}^{\perp}]^{+} - \mathbf{J}\big)'\Big).$(60)

Also note that ℛ([K, JΣ X]′) ⊆ ℛ([X, Σ X]′) and ℛ(Σ)⊆ ℛ[X, Σ X]. Then applying (26) to (60) and simplifying by Lemmas 2.4 and 2.5, and congruence operations, we obtain $i±(A−D[ϕ−BLUP(ϕ)])=i±σ−2A−[K,JΣX⊥][X,ΣX⊥]+−J)Σ([K,JΣX⊥][X,ΣX⊥]+−J′=i±ΣΣ([K,JΣX⊥][X,ΣX⊥]+−J)′([K,JΣX⊥][X,ΣX⊥]+−J)Σσ−2A−i±(Σ)=i±Σ−ΣJ′−JΣσ−2A+Σ00[K,JΣX⊥]0[X,ΣX⊥][X,ΣX⊥]′0+Σ00[K,JΣX⊥]′−i±(Σ)=i±0−X−ΣX⊥Σ0−X′000K′−X⊥Σ000X⊥ΣJ′Σ00Σ−ΣJ′0KJΣX⊥−JΣσ−2A−i∓0[X,ΣX⊥][X,ΣX⊥]′0−i±(Σ)=i±1−Σ−X−ΣX⊥0ΣJ′−X′000K′−X⊥Σ000X⊥ΣJ′000Σ0JΣKJΣX⊥0σ−2A−JΣJ′−r[X,ΣX⊥]−i±(Σ)(by(22))=i±−Σ−X−ΣX⊥ΣJ′X′00K′−X⊥Σ00X⊥ΣJ′JΣKJΣX⊥σ−2A−JΣJ′−r[X,Σ](by(21))=i±−Σ−X00−X′00K′−X′J′00X⊥ΣX⊥00K−JX0σ−2A−r[X,Σ]=i±−Σ0X0σ−2AJX−KX′X′J′−K′0+i±(X⊥ΣX⊥)−r[X,Σ](by(21))=i∓σ2Σ0X0−AK−JXX′K′−X′J′0+i±(X⊥ΣX⊥)−r[X,Σ](by(20)),$(61) that is, $i+(A−D[ϕ−BLUP(ϕ)])=i−(M)+i+(X⊥ΣX⊥)−r[X,Σ]=i−(M)+r(X⊥Σ)−r[X,Σ]=i−(M)−r(X),i−(A−D[ϕ−BLUP(ϕ)])=i+(M)+i−(X⊥ΣX⊥)−r[X,Σ]=i+(M)−r[X,Σ]$ by (13), establishing (57) and (58). Adding the two equalities in (57) and (58) yields (59). Applying Lemma 2.1 to (57)(59) yields (a)–(e).
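The inertia formulas (57)–(59) can be spot-checked numerically. The sketch below (an illustration only, not part of the paper’s development; the matrices are arbitrary choices satisfying the predictability condition ℛ(K′) ⊆ ℛ(X′), with σ² = 1) builds D[ϕ − BLUP(ϕ)] from (43), assembles M as in Theorem 4.1, and counts eigenvalue signs:

```python
import numpy as np

def inertia(S, tol=1e-8):
    """(i+, i-) of a symmetric matrix, counted from eigenvalue signs."""
    w = np.linalg.eigvalsh((S + S.T) / 2)
    return int((w > tol).sum()), int((w < -tol).sum())

# a small GLM instance: n = 2, p = 1, k = 1, sigma^2 = 1
X = np.array([[1.0], [0.0]])
Sig = np.array([[2.0, 1.0], [1.0, 1.0]])
K = np.array([[1.0]])                 # R(K') ⊆ R(X'), so phi = K beta + J eps is predictable
J = np.array([[0.0, 0.0]])
A = np.array([[3.0]])                 # an arbitrary symmetric k x k matrix
n, p = X.shape
k = K.shape[0]

Xperp = np.eye(n) - X @ np.linalg.pinv(X)      # orthogonal projector onto R(X)^perp
T = np.hstack([K, J @ Sig @ Xperp]) @ np.linalg.pinv(np.hstack([X, Sig @ Xperp])) - J
D = T @ Sig @ T.T                              # D[phi - BLUP(phi)] by (43), sigma^2 = 1

KJX = K - J @ X
M = np.block([[Sig,              np.zeros((n, k)), X],
              [np.zeros((k, n)), -A,               KJX],
              [X.T,              KJX.T,            np.zeros((p, p))]])

rX = np.linalg.matrix_rank(X)
rXS = np.linalg.matrix_rank(np.hstack([X, Sig]))
ip, im = inertia(A - D)
Mp, Mm = inertia(M)
assert ip == Mm - rX                            # (57)
assert im == Mp - rXS                           # (58)
assert np.linalg.matrix_rank(A - D) == np.linalg.matrix_rank(M) - rXS - rX   # (59)
```

The eigenvalue-sign counter is a standard numerical surrogate for inertia; the tolerance guards against rounding of zero eigenvalues.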

Eqs. (57)–(59) establish links between the dispersion matrix of the BLUP of ϕ and any symmetric matrix. Hence, they can be applied to characterize the behavior of BLUPs; in particular, they can be used to establish many equalities and inequalities for BLUPs’ dispersion matrices under various assumptions. Because the five matrices A, K, J, X, and Σ occur separately in the symmetric block matrix M in Theorem 4.1, it is easy to simplify (57)–(59) further for different choices of these five matrices. We next present several special cases of (57)–(59) and derive a number of interesting consequences on dispersion matrices of BLUPs/BLUEs and their operations.

Concerning the ranks and inertias of D(ϕ) − D[BLUP(ϕ)] and D[ϕ − BLUP(ϕ)] in (42) and (43), we have the following result.

#### Theorem 4.2

Assume that ϕ in (2) is predictable under (1), and let $\mathbf{M} = \begin{bmatrix} -\boldsymbol{\Sigma} & \mathbf{0} & \mathbf{X} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma} & \mathbf{0} & \mathbf{X} & \mathbf{0} \\ \mathbf{X}' & \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{K}' \\ \mathbf{0} & \mathbf{X}' & \mathbf{0} & \mathbf{0} & \mathbf{X}'\mathbf{J}' \\ \mathbf{0} & \mathbf{0} & \mathbf{K} & \mathbf{J}\mathbf{X} & \mathbf{0} \end{bmatrix}.$

Then $r\big(\mathrm{D}[\boldsymbol{\phi} - \mathrm{BLUP}(\boldsymbol{\phi})]\big) = r\begin{bmatrix} \boldsymbol{\Sigma} & \mathbf{X} \\ \mathbf{J}\boldsymbol{\Sigma} & \mathbf{K} \end{bmatrix} - r[\mathbf{X},\, \boldsymbol{\Sigma}],$(62) and $i_{\pm}\big(\mathrm{D}(\boldsymbol{\phi}) - \mathrm{D}[\mathrm{BLUP}(\boldsymbol{\phi})]\big) = i_{\pm}(\mathbf{M}) - r(\mathbf{X}) - r[\mathbf{X},\, \boldsymbol{\Sigma}],$(63) $r\big(\mathrm{D}(\boldsymbol{\phi}) - \mathrm{D}[\mathrm{BLUP}(\boldsymbol{\phi})]\big) = r(\mathbf{M}) - 2r(\mathbf{X}) - 2r[\mathbf{X},\, \boldsymbol{\Sigma}].$(64)

#### Proof

Eq. (62) follows from (59) by setting A = 0. Note from (42) that $\text{D}\left(\mathbit{\varphi }\right)-\text{D}\left[\text{BLUP}\left(\mathbit{\varphi }\right)\right]={\sigma }^{2}\mathbf{J}\mathbf{\Sigma }{\mathbf{J}}^{\prime }-{\sigma }^{2}\left[\mathbf{K},\mathbf{J}\mathbf{\Sigma }{\mathbf{X}}^{\perp }\right]\left[\mathbf{X},\mathbf{\Sigma }{\mathbf{X}}^{\perp }{\right]}^{+}\mathbf{\Sigma }\left(\left[\mathbf{K},\mathbf{J}\mathbf{\Sigma }{\mathbf{X}}^{\perp }\right]\left[\mathbf{X},\mathbf{\Sigma }{\mathbf{X}}^{\perp }{\right]}^{+}{\right)}^{\prime }.$ Applying (27) and simplifying by congruence transformations gives $i±(D(ϕ)−D[BLUP(ϕ)])=i±(JΣJ′−[K,JΣX⊥][X,ΣX⊥]+Σ([K,JΣX⊥][X,ΣX⊥]+)′)=i±−ΣXΣX⊥0X′00K′X⊥Σ00X⊥ΣJ′0KJΣX⊥JΣJ′−r[X,ΣX⊥]=i±−ΣX00X′00K′00X⊥ΣX⊥X⊥ΣJ′0KJΣX⊥JΣJ′−r[X,Σ]=i±−ΣX000X′00K′000ΣΣJ′X0KJΣJΣJ′000X′00−r(X)−r[X,Σ](by(23))=i±−ΣX000X′00K′000Σ0X0K00−JX00X′−X′J′0−r(X)−r[X,Σ]=i±−Σ0X000Σ0X0X′000K′0X′00−X′J′00K−JX0−r(X)−r[X,Σ]=i±−Σ0X000Σ0X0X′000K′0X′00X′J′00KJX0−r(X)−r[X,Σ]=i±(M)−r(X)−r[X,Σ],$

establishing (63) and (64).

Many consequences can be derived from the previous two theorems for different choices of K, J, and A in them. Here, we only give the rank/inertia formulas for the difference D(Ay) − D[ϕ − BLUP(ϕ)].

#### Corollary 4.3

Assume that ϕ in (2) is predictable under (1), and let BLUP (ϕ) be as given in (37). Also let A ∈ ℝk × n, and denote $\mathbf{M} = \begin{bmatrix} \boldsymbol{\Sigma} & \mathbf{0} & \mathbf{X} \\ \mathbf{0} & -\mathbf{A}\boldsymbol{\Sigma}\mathbf{A}' & \mathbf{K} - \mathbf{J}\mathbf{X} \\ \mathbf{X}' & \mathbf{K}' - \mathbf{X}'\mathbf{J}' & \mathbf{0} \end{bmatrix}.$

Then $i+(D(Ay)−D[ϕ−BLUP(ϕ)])=i−(M)−r(X),i−(D(Ay)−D[ϕ−BLUP(ϕ)])=i+(M)−r[X,Σ],r(D(Ay)−D[ϕ−BLUP(ϕ)])=r(M)−r[X,Σ]−r(X).$

## 5 Formulas and properties of OLSPs

The method of least squares is a standard statistical approach for estimating unknown parameters in linear models. It was first proposed as an algebraic procedure for solving overdetermined systems of equations by Gauss (in unpublished work) in 1795 and independently by Legendre in 1805, as remarked in [28–31]. The notion of least-squares estimation is well established in the literature, so we only briefly review the derivations of OLSEs and OLSPs. It is easy to verify that the norm (yXβ)′(yX β ) in (6) can be decomposed as the sum $(y−Xβ)′(y−Xβ)=y′EXy+(PXy−Xβ)′(PXy−Xβ),$

where yEXy ⩾ 0 and (PXyXβ)′(PXyXβ) ⩾ 0. Hence, $minβ∈Rp×1(y−Xβ)′(y−Xβ)=y′EXy+minβ∈Rp×1(PXy−Xβ)′(PXy−Xβ);$

see also [7, 8]. The matrix equation Xβ = PXy, which is equivalent to the so-called normal equation XXβ = Xy by pre-multiplying X′, is always consistent; see, e.g., [32, p. 114] and [33, pp. 164–165]. Solving this linear matrix equation by Lemma 2.7 yields the following general results.
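Both the norm decomposition above and the consistency of the normal equation are easy to confirm numerically. The following sketch (randomly generated data, for illustration only) also uses the Moore–Penrose identity X(X′X)⁺X′ = XX⁺ = P<sub>X</sub>:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 7, 3
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)
b = rng.standard_normal(p)

Px = X @ np.linalg.pinv(X)                     # P_X = X X^+ = X (X'X)^+ X'
Ex = np.eye(n) - Px                            # E_X = I_n - P_X

# (y - Xb)'(y - Xb) = y' E_X y + (P_X y - Xb)'(P_X y - Xb) for every b
res = y - X @ b
assert np.isclose(res @ res, y @ Ex @ y + (Px @ y - X @ b) @ (Px @ y - X @ b))

# the normal equation X'X b = X'y is consistent, and any solution fits P_X y
bhat = np.linalg.pinv(X.T @ X) @ X.T @ y
assert np.allclose(X @ bhat, Px @ y)
```

The first assertion shows why minimizing over β only affects the second summand, and the second shows that the fitted vector Xβ̂ is the projection P<sub>X</sub>y regardless of which solution of the normal equation is taken.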

#### Theorem 5.1

Assume that ϕ in (2) is predictable by y in (1), and K β is estimable under (1). Then the OLSPs and OLSEs of ϕ and its special cases are given by $OLSE(Kβ)=KX+y,$(65) $OLSE(Xβ)=XX+y,$(66) $OLSP(ε)=(In−XX+)y=X⊥y,$(67) $OLSP(Jε)=JX⊥y,$(68) $OLSP(ϕ)=(KX++JX⊥)y.$(69)

Furthermore, the following results hold.

1. OLSP (ϕ) satisfies the following equalities $D[OLSP(ϕ)]=σ2(KX++JX⊥)Σ(KX++JX⊥)′,$(70) $Cov{OLSP(ϕ),ϕ}=σ2(KX++JX⊥)ΣJ′,$(71) $D(ϕ)−D[OLSP(ϕ)]=σ2JΣJ′−σ2(KX++JX⊥)Σ(KX++JX⊥)′,$(72) $D[ϕ−OLSP(ϕ)]=σ2(KX+−JPX)Σ(KX+−JPX)′.$(73)

2. OLSP (ϕ), OLSE (Kβ), and OLSP (J𝜺) satisfy $OLSP(ϕ)=OLSE(Kβ)+OLSP(Jε),$(74) $Cov{OLSE(Kβ),OLSP(Jε)}=σ2KX+ΣX⊥J′.$(75)

3. OLSP (Tϕ) = TOLSP(ϕ) holds for any matrix T ∈ ℝt × k.

4. y, OLSE (Xβ), and OLSP (𝜺) satisfy $y=OLSE(Xβ)+OLSP(ε),$(76) $Cov{OLSE(Xβ),OLSP(ε)}=σ2PXΣX⊥,$(77) $D[OLSE(Xβ)]=D[ε−OLSP(ε)]=σ2PXΣPX,$(78) $D[OLSP(ε)]=σ2X⊥ΣX⊥,$(79) $D(ε)−D[OLSP(ε)]=σ2Σ−σ2X⊥ΣX⊥.$(80)
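Formulas (70)–(73) are mutually consistent through the usual covariance algebra, namely D[ϕ − OLSP(ϕ)] = D(ϕ) − Cov{OLSP(ϕ), ϕ} − Cov{ϕ, OLSP(ϕ)} + D[OLSP(ϕ)]. A brief numerical sketch (randomly generated illustrative matrices; σ² arbitrary) confirming this identity:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k = 5, 2, 3
X = rng.standard_normal((n, p))
B = rng.standard_normal((n, n))
Sig = B @ B.T                                      # an arbitrary PSD dispersion matrix
K = rng.standard_normal((k, p))
J = rng.standard_normal((k, n))
s2 = 1.7

Px = X @ np.linalg.pinv(X)
W = K @ np.linalg.pinv(X) + J @ (np.eye(n) - Px)   # coefficient matrix of OLSP(phi) in (69)

D_olsp = s2 * W @ Sig @ W.T                        # (70): D[OLSP(phi)]
C = s2 * W @ Sig @ J.T                             # (71): Cov{OLSP(phi), phi}
D_phi = s2 * J @ Sig @ J.T                         # D(phi)
V = K @ np.linalg.pinv(X) - J @ Px
D_err = s2 * V @ Sig @ V.T                         # (73): D[phi - OLSP(phi)]

assert np.allclose(D_err, D_phi - C - C.T + D_olsp)
```

The check works because the coefficient matrix of OLSP(ϕ) − ϕ applied to 𝜺 is W − J = KX⁺ − JP<sub>X</sub>, exactly the matrix appearing in (73).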

We next give a group of rank/inertia formulas related to OLSPs and their consequences.

#### Theorem 5.2

Assume that ϕ in (2) is predictable under (1), and let OLSP (ϕ) be as given in (69). Also let A = A′ ∈ ℝk × k, and denote $\mathbf{M} = \begin{bmatrix} \sigma^{2}\mathbf{X}'\boldsymbol{\Sigma}\mathbf{X} & \mathbf{0} & \mathbf{X}'\mathbf{X} \\ \mathbf{0} & -\mathbf{A} & \mathbf{K} - \mathbf{J}\mathbf{X} \\ \mathbf{X}'\mathbf{X} & \mathbf{K}' - \mathbf{X}'\mathbf{J}' & \mathbf{0} \end{bmatrix}.$

Then $i±{A−D[ϕ−OLSP(ϕ)]}=i∓(M)−r(X),$(81) $r{A−D[ϕ−OLSP(ϕ)]}=r(M)−2r(X).$(82)

In consequence, the following results hold.

1. $\mathrm{D}\left[\mathbit{\varphi }-\mathrm{O}\mathrm{L}\mathrm{S}\mathrm{P}\left(\mathbit{\varphi }\right)\right]\succ \mathbf{A}⇔{i}_{+}\left(\mathbf{M}\right)=r\left(\mathbf{X}\right)+k.$

2. $\mathrm{D}\left[\mathbit{\varphi }-\mathrm{O}\mathrm{L}\mathrm{S}\mathrm{P}\left(\mathbit{\varphi }\right)\right]\prec \mathbf{A}⇔{i}_{-}\left(\mathbf{M}\right)=r\left(\mathbf{X}\right)+k.$

3. $\mathrm{D}\left[\mathbit{\varphi }-\mathrm{O}\mathrm{L}\mathrm{S}\mathrm{P}\left(\mathbit{\varphi }\right)\right]\succcurlyeq \mathbf{A}⇔{i}_{-}\left(\mathbf{M}\right)=r\left(\mathbf{X}\right).$

4. $\mathrm{D}\left[\mathbit{\varphi }-\mathrm{O}\mathrm{L}\mathrm{S}\mathrm{P}\left(\mathbit{\varphi }\right)\right]\preccurlyeq \mathbf{A}⇔{i}_{+}\left(\mathbf{M}\right)=r\left(\mathbf{X}\right).$

5. $\mathrm{D}\left[\mathbit{\varphi }-\mathrm{O}\mathrm{L}\mathrm{S}\mathrm{P}\left(\mathbit{\varphi }\right)\right]=\mathbf{A}⇔r\left(\mathbf{M}\right)=2r\left(\mathbf{X}\right).$

#### Proof

Note from (73) that $A−D[ϕ−OLSP(ϕ)]=A−σ2(KX+−JPX)Σ(KX+−JPX)′=A−σ2(K−JX)X+Σ(X′)+(K′−X′J′)=A−σ2(K−JX)(X′X)+X′ΣX(X′X)+(K′−X′J′).$(83)

Applying (27) to (83) gives $i_{\pm}\{\mathbf{A} - \mathrm{D}[\boldsymbol{\phi} - \mathrm{OLSP}(\boldsymbol{\phi})]\} = i_{\pm}\big[\mathbf{A} - \sigma^{2}(\mathbf{K} - \mathbf{J}\mathbf{X})(\mathbf{X}'\mathbf{X})^{+}\mathbf{X}'\boldsymbol{\Sigma}\mathbf{X}(\mathbf{X}'\mathbf{X})^{+}(\mathbf{K}' - \mathbf{X}'\mathbf{J}')\big] = i_{\mp}\begin{bmatrix} \sigma^{2}\mathbf{X}'\boldsymbol{\Sigma}\mathbf{X} & \mathbf{X}'\mathbf{X} & \mathbf{0} \\ \mathbf{X}'\mathbf{X} & \mathbf{0} & \mathbf{K}' - \mathbf{X}'\mathbf{J}' \\ \mathbf{0} & \mathbf{K} - \mathbf{J}\mathbf{X} & -\mathbf{A} \end{bmatrix} - r(\mathbf{X}) = i_{\mp}\begin{bmatrix} \sigma^{2}\mathbf{X}'\boldsymbol{\Sigma}\mathbf{X} & \mathbf{0} & \mathbf{X}'\mathbf{X} \\ \mathbf{0} & -\mathbf{A} & \mathbf{K} - \mathbf{J}\mathbf{X} \\ \mathbf{X}'\mathbf{X} & \mathbf{K}' - \mathbf{X}'\mathbf{J}' & \mathbf{0} \end{bmatrix} - r(\mathbf{X}) = i_{\mp}(\mathbf{M}) - r(\mathbf{X}),$

establishing (81) and (82). Applying Lemma 2.1 to (81) and (82) yields the results in (a)–(e).
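As with Theorem 4.1, the formulas (81) and (82) can be spot-checked on a small instance (values chosen purely for illustration; σ² = 1):

```python
import numpy as np

def inertia(S, tol=1e-8):
    """(i+, i-) of a symmetric matrix, counted from eigenvalue signs."""
    w = np.linalg.eigvalsh((S + S.T) / 2)
    return int((w > tol).sum()), int((w < -tol).sum())

X = np.array([[1.0], [1.0]])                  # n = 2, p = 1
Sig = np.diag([2.0, 1.0])
K = np.array([[3.0]])
J = np.array([[1.0, 0.0]])
A = np.array([[5.0]])                         # symmetric k x k, k = 1
n, p = X.shape
k = K.shape[0]

Px = X @ np.linalg.pinv(X)
W = K @ np.linalg.pinv(X) - J @ Px            # K X^+ - J P_X
D = W @ Sig @ W.T                             # D[phi - OLSP(phi)] by (73), sigma^2 = 1

KJX = K - J @ X
M = np.block([[X.T @ Sig @ X,    np.zeros((p, k)), X.T @ X],
              [np.zeros((k, p)), -A,               KJX],
              [X.T @ X,          KJX.T,            np.zeros((p, p))]])

rX = np.linalg.matrix_rank(X)
ip, im = inertia(A - D)
Mp, Mm = inertia(M)
assert ip == Mm - rX and im == Mp - rX                                   # (81)
assert np.linalg.matrix_rank(A - D) == np.linalg.matrix_rank(M) - 2 * rX  # (82)
```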

#### Theorem 5.3

Assume that ϕ in (2) is predictable under (1), and let OLSP (ϕ) be as given in (69). Then y, OLSE (Xβ), and OLSP (𝜺) satisfy $r{D(y)−D[OLSE(Xβ)]−D[OLSP(ε)]}=r{D[ε−OLSP(ε)]−D(ε)+D[OLSP(ε)]}=2r[X,ΣX]−2r(X),$(84) $r(Cov{OLSE(Xβ),OLSP(ε)})=r[X,ΣX]−r(X).$(85)

#### Proof

It follows from (78)–(80) that $D(y)−D[OLSE(Xβ)]−D[OLSP(ε)]=D(ε)−D[ε−OLSP(ε)]−D[OLSP(ε)]=Cov{OLSE(Xβ),OLSP(ε)}+Cov{OLSP(ε),OLSE(Xβ)}=σ2PXΣX⊥+σ2X⊥ΣPX.$

Then by (13) and Lemma 2.2(f), $r{D(y)−D[OLSE(Xβ)]−D[OLSP(ε)]}=r{D[ε−OLSP(ε)]−D(ε)+D[OLSP(ε)]}=r(PXΣX⊥+X⊥ΣPX)=r(PXΣX⊥)+r(X⊥ΣPX)=2r(X⊥ΣPX)=2r[X,ΣPX]−2r(X)=2r[X,ΣX]−2r(X),$

as required for (84). By (77) and (13), $r(Cov{OLSE(Xβ),OLSP(ε)})=r(X⊥ΣPX)=r[X,ΣX]−r(X),$

as required for (85).
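The rank identity (85) holds for any Σ, including singular ones. A short numerical sketch (randomly generated illustrative data, with a deliberately singular Σ):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 6, 2
X = rng.standard_normal((n, p))
B = rng.standard_normal((n, 3))
Sig = B @ B.T                                  # a singular PSD Sigma (rank at most 3)

Px = X @ np.linalg.pinv(X)
Xperp = np.eye(n) - Px
C = Px @ Sig @ Xperp                           # Cov{OLSE(X beta), OLSP(eps)} / sigma^2

r = np.linalg.matrix_rank
assert r(C) == r(np.hstack([X, Sig @ X])) - r(X)   # formula (85)
assert r(C + C.T) == 2 * r(C)                      # the rank additivity used for (84)
```

The second assertion reflects that the column spaces of P<sub>X</sub>ΣX⊥ and X⊥ΣP<sub>X</sub> lie in the mutually orthogonal subspaces ℛ(X) and ℛ(X)⊥, so their ranks add.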

## 6 Twists of BLUPs and OLSPs under GLMs

After establishing the BLUPs and OLSPs of all the unknown parameter vectors in (1) and their algebraic and statistical properties in the previous three sections, it is of interest to consider the relations between the BLUPs and OLSPs and to establish possible equalities between these predictors. In fact, the equivalence problems for OLSEs and BLUEs in linear regression theory were initiated in the late 1940s from both theoretical and applied points of view, as reviewed in [34], and a considerable literature exists with many results and discussions on this kind of classic problem; see, e.g., [2, 27, 35–58]. After the preparations in the previous sections, we are now able to establish possible equalities between BLUPs and OLSPs under various assumptions.

#### Theorem 6.1

Assume that ϕ in (2) is predictable under (1), and let BLUP (ϕ) and OLSP (ϕ) be as given in (37) and (69), respectively. Then the following results hold.

1. The following statements are equivalent:

1. OLSP (ϕ) = BLUP(ϕ).

2. (KJX)X+ΣX⊥ = 0.

3. $\begin{array}{}r\left[\begin{array}{cc}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }{\mathbf{X}}^{\perp }\\ \mathbf{K}-\mathbf{J}\mathbf{X}& \mathbf{0}\end{array}\right]=r\left(\mathbf{X}\right).\end{array}$

4. $\begin{array}{}r\left[\begin{array}{cc}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }\\ \mathbf{0}& {\mathbf{X}}^{\prime }\\ \mathbf{K}-\mathbf{J}\mathbf{X}& \mathbf{0}\end{array}\right]=2r\left(\mathbf{X}\right).\end{array}$

5. ℛ{[(KJX)X+Σ]′} ⊆ ℛ(X).

6. ℛ([KJX, 0]′) ⊆ ℛ([XX, XΣ X⊥]′).

7. $\mathcal{R}\left(\left[\mathbf{K}-\mathbf{J}\mathbf{X},\, \mathbf{0}\right]^{\prime}\right) \subseteq \mathcal{R}\left({\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\boldsymbol{\Sigma} \\ \mathbf{0} & \mathbf{X}' \end{bmatrix}}^{\prime}\right).$

2. [47, 55] The following statements are equivalent:

1. OLSE (Kβ) = BLUE(Kβ).

2. KX+ΣX⊥ = 0.

3. $\begin{array}{}r\left[\begin{array}{cc}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }{\mathbf{X}}^{\perp }\\ \mathbf{K}& \mathbf{0}\end{array}\right]=r\left(\mathbf{X}\right).\end{array}$

4. $\begin{array}{}r\left[\begin{array}{cc}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }\\ \mathbf{0}& {\mathbf{X}}^{\prime }\\ \mathbf{K}& \mathbf{0}\end{array}\right]=2r\left(\mathbf{X}\right).\end{array}$

5. ℛ[(KX+Σ)′] ⊆ ℛ(X).

6. ℛ([K, 0]′) ⊆ ℛ([XX, XΣ X⊥]′).

7. $\mathcal{R}\left(\left[\mathbf{K},\, \mathbf{0}\right]^{\prime}\right) \subseteq \mathcal{R}\left({\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\boldsymbol{\Sigma} \\ \mathbf{0} & \mathbf{X}' \end{bmatrix}}^{\prime}\right).$

3. The following statements are equivalent:

1. OLSP (J𝜺) = BLUP(J𝜺).

2. JXX+ΣX⊥ = 0.

3. $\begin{array}{}r\left[\begin{array}{cc}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }{\mathbf{X}}^{\perp }\\ \mathbf{J}\mathbf{X}& \mathbf{0}\end{array}\right]=r\left(\mathbf{X}\right).\end{array}$

4. $\begin{array}{}r\left[\begin{array}{cc}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }\\ \mathbf{0}& {\mathbf{X}}^{\prime }\\ \mathbf{J}\mathbf{X}& \mathbf{0}\end{array}\right]=2r\left(\mathbf{X}\right).\end{array}$

5. ℛ[(JXX+Σ)′] ⊆ ℛ(X).

6. ℛ([JX, 0]′) ⊆ ℛ([XX, XΣ X⊥]′).

7. $\begin{array}{}\mathcal{R}\left(\left[\mathrm{J}\mathrm{X},\phantom{\rule{thinmathspace}{0ex}}0{\right]}^{\prime }\right)\phantom{\rule{thinmathspace}{0ex}}\subseteq \mathcal{R}\left({\left[\begin{array}{cc}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }\\ \mathbf{0}& {\mathbf{X}}^{\prime }\end{array}\right]}^{\prime }\right).\end{array}$

4. [34] The following statements are equivalent:

1. OLSE (Xβ) = BLUE(Xβ).

2. OLSP (𝜺) = BLUP(𝜺).

3. PX Σ = Σ PX.

4. X⊥Σ = Σ X⊥.

5. r[X, ΣX] = r(X).

6. r[X⊥, ΣX⊥] = r(X⊥).

7. ℛ(ΣX) ⊆ ℛ(X).

8. ℛ(ΣX⊥) ⊆ ℛ(X⊥).

9. ℛ(ΣX) = ℛ(Σ) ∩ ℛ(X).

10. ℛ(ΣX⊥) = ℛ(Σ) ∩ ℛ(X⊥).

#### Proof

From (37) and (69), the difference of OLSP (ϕ) and BLUP (ϕ) is given by $OLSP(ϕ)−BLUP(ϕ)=(KX++JX⊥−[K,JΣX⊥][X,ΣX⊥]+−U[X,ΣX⊥]⊥)y.$

Hence OLSP (ϕ) = BLUP(ϕ) holds if and only if the coefficient matrix of y is null, i.e., $KX++JX⊥−[K,JΣX⊥][X,ΣX⊥]+−U[X,ΣX⊥]⊥=0.$

From Lemma 2.7, the matrix equation is solvable for U if and only if $r\begin{bmatrix} \mathbf{K}\mathbf{X}^{+} + \mathbf{J}\mathbf{X}^{\perp} - [\mathbf{K},\, \mathbf{J}\boldsymbol{\Sigma}\mathbf{X}^{\perp}][\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}^{\perp}]^{+} \\ [\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}^{\perp}]^{\perp} \end{bmatrix} = r\big([\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}^{\perp}]^{\perp}\big),$(86)

where by (13), $rKX++JX⊥−[K,JΣX⊥][X,ΣX⊥]+[X,ΣX⊥]⊥=rKX++JX⊥−[K,JΣX⊥][X,ΣX⊥]+0In[X,ΣX⊥]−r[X,ΣX⊥]=rKX++JX⊥KJΣX⊥InXΣX⊥−r[X,Σ]=r00JΣX⊥−(KX++JX⊥)ΣX⊥In00−r[X,Σ]=n+r[(K−JX)X+ΣX⊥]−r[X,Σ]=n+r[(K−JX)X′X+X′ΣX⊥]−r[X,Σ]=rX′XX′ΣX⊥K−JX0+n−r(X)−r[X,Σ](by(16))=rX′XX′Σ0X′K−JX0+n−2r(X)−r[X,Σ](by(14)),$(87) $r([X,ΣX⊥]⊥)=n−r[X,Σ].$(88)

Substituting (87) and (88) into (86) yields the following equivalent equalities $(\mathbf{K} - \mathbf{J}\mathbf{X})\mathbf{X}^{+}\boldsymbol{\Sigma}\mathbf{X}^{\perp} = \mathbf{0}, \quad r\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\boldsymbol{\Sigma}\mathbf{X}^{\perp} \\ \mathbf{K} - \mathbf{J}\mathbf{X} & \mathbf{0} \end{bmatrix} = r(\mathbf{X}), \quad r\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\boldsymbol{\Sigma} \\ \mathbf{0} & \mathbf{X}' \\ \mathbf{K} - \mathbf{J}\mathbf{X} & \mathbf{0} \end{bmatrix} = 2r(\mathbf{X}),$

establishing the equivalences of (i), (ii), (iii), and (iv) in (a). The equivalences of (ii) and (v), (iii) and (vi), (iv) and (vii) in (a) follow from Lemma 2.2(a) and (b). Results (b)–(d) are special cases of (a) for different choices of K and J.

Note that (KJX)X+ΣX⊥ = 0 in (ii) of Theorem 6.1(a) is a linear matrix equation in K and J. Solving this equation produces all ϕ such that OLSP(ϕ) = BLUP(ϕ) holds. Concerning the relationships between D[ϕ − BLUP(ϕ)] and D[ϕ − OLSP(ϕ)], we have the following results.
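A simple way to generate examples in which all statements of Theorem 6.1(d) hold is to take Σ of the commuting form aP<sub>X</sub> + b(Iₙ − P<sub>X</sub>). A numerical sketch (illustrative values a = 2, b = 5, chosen by the present exposition, not by the paper):

```python
import numpy as np

X = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])
n = X.shape[0]
Px = X @ np.linalg.pinv(X)
Xperp = np.eye(n) - Px

Sig = 2.0 * Px + 5.0 * Xperp                   # commutes with P_X by construction

r = np.linalg.matrix_rank
assert np.allclose(Px @ Sig, Sig @ Px)         # statement (3): P_X Sig = Sig P_X
assert r(np.hstack([X, Sig @ X])) == r(X)      # statement (5): r[X, Sig X] = r(X)
assert np.allclose(Xperp @ Sig @ X, 0)         # statement (7): R(Sig X) ⊆ R(X)
```

Here Σ X = 2X, so ℛ(ΣX) ⊆ ℛ(X) holds trivially, and OLSE(Xβ) = BLUE(Xβ) by Theorem 6.1(d).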

#### Theorem 6.2

Assume that ϕ in (2) is predictable under (1), and let BLUP(ϕ) and OLSP(ϕ) be as given in (37) and (69), respectively. Then $i_{-}\{\mathrm{D}[\boldsymbol{\phi} - \mathrm{BLUP}(\boldsymbol{\phi})] - \mathrm{D}[\boldsymbol{\phi} - \mathrm{OLSP}(\boldsymbol{\phi})]\} = r\{\mathrm{D}[\boldsymbol{\phi} - \mathrm{BLUP}(\boldsymbol{\phi})] - \mathrm{D}[\boldsymbol{\phi} - \mathrm{OLSP}(\boldsymbol{\phi})]\} = r\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\boldsymbol{\Sigma} \\ \mathbf{0} & \mathbf{X}' \\ \mathbf{K} - \mathbf{J}\mathbf{X} & \mathbf{0} \end{bmatrix} - 2r(\mathbf{X}).$(89)

In particular, $i_{-}\{\mathrm{D}[\mathrm{BLUE}(\mathbf{K}\boldsymbol{\beta})] - \mathrm{D}[\mathrm{OLSE}(\mathbf{K}\boldsymbol{\beta})]\} = r\{\mathrm{D}[\mathrm{BLUE}(\mathbf{K}\boldsymbol{\beta})] - \mathrm{D}[\mathrm{OLSE}(\mathbf{K}\boldsymbol{\beta})]\} = r\begin{bmatrix} \boldsymbol{\Sigma}\mathbf{X} & \mathbf{X} & \mathbf{0} \\ \mathbf{X}'\mathbf{X} & \mathbf{0} & \mathbf{K}' \end{bmatrix} - 2r(\mathbf{X}) \ \ \text{if}\ \mathbf{K}\boldsymbol{\beta}\ \text{is estimable},$(90) $i_{-}\{\mathrm{D}[\mathrm{BLUE}(\mathbf{X}\boldsymbol{\beta})] - \mathrm{D}[\mathrm{OLSE}(\mathbf{X}\boldsymbol{\beta})]\} = r\{\mathrm{D}[\mathrm{BLUE}(\mathbf{X}\boldsymbol{\beta})] - \mathrm{D}[\mathrm{OLSE}(\mathbf{X}\boldsymbol{\beta})]\} = r[\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}] - r(\mathbf{X}),$(91) $i_{-}\{\mathrm{D}[\mathrm{BLUE}(\boldsymbol{\beta})] - \mathrm{D}[\mathrm{OLSE}(\boldsymbol{\beta})]\} = r\{\mathrm{D}[\mathrm{OLSE}(\boldsymbol{\beta})] - \mathrm{D}[\mathrm{BLUE}(\boldsymbol{\beta})]\} = r[\boldsymbol{\Sigma}\mathbf{X},\, \mathbf{X}] - p,$(92) $i_{-}\{\mathrm{D}[\boldsymbol{\varepsilon} - \mathrm{BLUP}(\boldsymbol{\varepsilon})] - \mathrm{D}[\boldsymbol{\varepsilon} - \mathrm{OLSP}(\boldsymbol{\varepsilon})]\} = r\{\mathrm{D}[\boldsymbol{\varepsilon} - \mathrm{BLUP}(\boldsymbol{\varepsilon})] - \mathrm{D}[\boldsymbol{\varepsilon} - \mathrm{OLSP}(\boldsymbol{\varepsilon})]\} = r[\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}] - r(\mathbf{X}),$(93) $i_{+}\{\mathrm{D}[\mathrm{BLUP}(\boldsymbol{\varepsilon})] - \mathrm{D}[\mathrm{OLSP}(\boldsymbol{\varepsilon})]\} = i_{-}\{\mathrm{D}[\mathrm{BLUP}(\boldsymbol{\varepsilon})] - \mathrm{D}[\mathrm{OLSP}(\boldsymbol{\varepsilon})]\} = r[\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}] - r(\mathbf{X}),$(94) $r\{\mathrm{D}[\mathrm{BLUP}(\boldsymbol{\varepsilon})] - \mathrm{D}[\mathrm{OLSP}(\boldsymbol{\varepsilon})]\} = 2r[\mathbf{X},\, \boldsymbol{\Sigma}\mathbf{X}] - 2r(\mathbf{X}) < 2n.$(95)

In consequence, the following results hold.

1. $\begin{array}{}\text{D}\left[\mathbit{\varphi }-\text{BLUP}\left(\mathbit{\varphi }\right)\right]\prec \text{D}\left[\mathbit{\varphi }-\text{OLSP}\left(\mathbit{\varphi }\right)\right]⇔r\left[\begin{array}{ll}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }\\ \phantom{\rule{1em}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\mathbf{0}& \phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{\mathbf{X}}^{\prime }\\ \mathbf{K}-\mathbf{J}\mathbf{X}& \phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\mathbf{0}\end{array}\right]=2r\left(\mathbf{X}\right)+k.\end{array}$

2. $\begin{array}{}\text{D}\left[\text{BLUE}\left(\mathbf{K}\mathbit{\beta }\right)\right]\prec \text{D}\left[\text{OLSE}\left(\mathbf{K}\mathbit{\beta }\right)\right]⇔r\left[\begin{array}{lll}\mathbf{\Sigma }\mathbf{X}& \mathbf{X}& 0\\ {\mathbf{X}}^{\prime }\mathbf{X}& 0& {\mathbf{K}}^{\prime }\end{array}\right]=2r\left(\mathbf{X}\right)+k.\end{array}$

#### Proof

By (37), (43), (69) and (73) $D[ϕ−BLUP(ϕ)]−D[ϕ−OLSP(ϕ)]=σ2([K,JΣX⊥][X,ΣX⊥]+−J)Σ([K,JΣX⊥][X,ΣX⊥]+−J)′−σ2(KX+−JPX)Σ(KX+−JPX)′,$(96) $D[OLSP(ϕ)−BLUP(ϕ)]=σ2(KX++JX⊥−[K,JΣX⊥][X,ΣX⊥]+)Σ(KX++JX⊥−[K,JΣX⊥][X,ΣX⊥]+)′.$(97)

Applying (81) to (96), we first obtain $i±(D[ϕ−BLUP(ϕ)]−D[ϕ−OLSP(ϕ)])=i∓σ2X′ΣX0X′X0−D[ϕ−BLUP(ϕ)]K−JXX′XK′−X′J′0−r(X).$(98)

Furthermore, substituting D[ϕ − BLUP(ϕ)] in (43) into the block matrix in (98) and applying (61), we obtain $i∓σ2X′ΣX0X′X0−D[ϕ−BLUP(ϕ)]K−JXX′XK′−X′J′0=i∓X′ΣX0X′X0−([K,JΣX⊥][X,ΣX⊥]+−J)Σ([K,JΣX⊥][X,ΣX⊥]+−J)′K−JXX′XK′−X′J′0=i∓X′ΣX0X′X00K−JXX′XK′−X′J′0−0Ik0([K,JΣX⊥][X,ΣX⊥]+−J)Σ([K,JΣX⊥][X,ΣX⊥]+−J)′[0,Ik,0]=i±Σ0X0−X′ΣX0X′X00K−JXX′XK′−X′J′00K−JX0X′[0,K′−X′J′0]0+i∓(X⊥ΣX⊥)−r[X,Σ]=i±Σ000X0−X′ΣX0−X′X0000−K+JXK−JX0−X′X−K′+X′J′00X′0K′−X′J′00+i∓(X⊥ΣX⊥)−r[X,Σ]=i±Σ000X0−X′ΣX0−X′X−X′X000−K+JX00−X′X−K′+X′J′00X′−X′X000+i∓(X⊥ΣX⊥)−r[X,Σ]=i±ΣΣX00XX′Σ00−X′X0000−K+JX00−X′X−K′+X′J′00X′0000+i∓(X⊥ΣX⊥)−r[X,Σ]=i±Σ0ΣXX000X′X0K′−X′J′X′ΣX′X000X′00000K−JX000+i∓(X⊥ΣX⊥)−r[X,Σ].$(99)

Substituting (99) into (98) gives $i±(D[ϕ−BLUP(ϕ)]−D[OLSP(ϕ)−ϕ])=i±Σ0ΣXX000X′X0K′−X′J′X′ΣX′X000X′00000K−JX000+i∓(X⊥ΣX⊥)−r(X)−r[X,Σ].$

Hence, $i+(D[ϕ−BLUP(ϕ)]−D[ϕ−OLSP(ϕ)])=i+Σ0ΣXX000X′X0K′−X′J′X′ΣX′X000X′00000K−JX000+i−(X⊥ΣX⊥)−r(X)−r[X,Σ]=rΣ0ΣXX000X′X0K′−X′J′−r(X)−r[X,Σ]=rΣ00X000X′X00−r(X)−r[X,Σ]=0,i−(D[ϕ−BLUP(ϕ)]−D[ϕ−OLSP(ϕ)])=i−Σ0ΣXX000X′X0K′−X′J′X′ΣX′X000X′00000K−JX000+r(X⊥Σ)−r(X)−r[X,Σ]=rX′ΣX′XX′00K−JX−2r(X),$

establishing (89). Eqs. (90)–(93) follow directly from (89). Eqs. (94) and (95) were proved in [5]. Applying Lemma 2.1 to (89) and (90) yields (a) and (b).
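Formula (89) can likewise be checked on a small instance (illustrative values; σ² = 1). The difference D[ϕ − BLUP(ϕ)] − D[ϕ − OLSP(ϕ)] is negative semi-definite, with rank given by the block matrix on the right-hand side of (89):

```python
import numpy as np

X = np.array([[1.0], [0.0]])                   # n = 2, p = 1
Sig = np.array([[2.0, 1.0], [1.0, 1.0]])
K = np.array([[1.0]])
J = np.array([[0.0, 0.0]])
n, p = X.shape
k = K.shape[0]

Px = X @ np.linalg.pinv(X)
Xperp = np.eye(n) - Px

T = np.hstack([K, J @ Sig @ Xperp]) @ np.linalg.pinv(np.hstack([X, Sig @ Xperp])) - J
D_blup = T @ Sig @ T.T                         # D[phi - BLUP(phi)] by (43)
W = K @ np.linalg.pinv(X) - J @ Px
D_olsp = W @ Sig @ W.T                         # D[phi - OLSP(phi)] by (73)

diff = D_blup - D_olsp
G = np.block([[X.T @ X,          X.T @ Sig],
              [np.zeros((p, p)), X.T],
              [K - J @ X,        np.zeros((k, n))]])

r = np.linalg.matrix_rank
neg = int((np.linalg.eigvalsh((diff + diff.T) / 2) < -1e-8).sum())
assert neg == r(diff) == r(G) - 2 * r(X)       # formula (89)
```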

The following results follow from Theorems 5.3, 6.1, and 6.2.

#### Corollary 6.3

Assume that ϕ in (2) is predictable under (1), and let BLUP(ϕ) and OLSP(ϕ) be as given in (37) and (69), respectively. Then the following statements are equivalent:

1. BLUP(ϕ) = OLSP(ϕ), i.e., ϕ − BLUP(ϕ) = ϕ − OLSP(ϕ).

2. D[ϕ − BLUP(ϕ)] = D[ϕ − OLSP(ϕ)].

3. $\begin{array}{}r\left[\begin{array}{ll}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{\mathbf{X}}^{\prime }\mathbf{X}& {\mathbf{X}}^{\prime }\mathbf{\Sigma }\\ \phantom{\rule{1em}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\mathbf{0}& \phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}{\mathbf{X}}^{\prime }\\ \mathbf{K}-\mathbf{J}\mathbf{X}& \phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}\mathbf{0}\end{array}\right]=2r\left(\mathbf{X}\right),\phantom{\rule{thinmathspace}{0ex}}\phantom{\rule{thinmathspace}{0ex}}i.e.,\mathcal{R}\phantom{\rule{thinmathspace}{0ex}}\left[\begin{array}{l}\left(\mathbf{K}-\mathbf{J}\mathbf{X}{\right)}^{\prime }\\ \phantom{\rule{2em}{0ex}}\mathbf{0}\end{array}\right]\subseteq \phantom{\rule{thinmathspace}{0ex}}\mathcal{R}\left[\begin{array}{ll}{\mathbf{X}}^{\prime }\mathbf{X}& \mathbf{0}\\ \mathbf{\Sigma }\mathbf{X}& \mathbf{X}\end{array}\right].\end{array}$

#### Corollary 6.4

Assume that Kβ is estimable under (1). Then the following statistical facts are equivalent:

1. BLUE(Kβ) = OLSE(Kβ).

2. D[BLUE(Kβ)] = D[OLSE(Kβ)].

3. Cov{OLSE(Kβ), y} = Cov{BLUE(Kβ), y}.

4. Cov{OLSE(Kβ), y} = Cov{BLUE(Kβ), OLSE(Xβ)}.

5. Cov{OLSE(Kβ), OLSP(𝜺)} = 0.

#### Corollary 6.5

Under the assumption in (1), the following statistical facts are equivalent:

1. BLUE(Xβ) = OLSE(Xβ).

2. BLUP(𝜺) = OLSP(𝜺).

3. D[BLUE(Xβ)] = D[OLSE(Xβ)].

4. D[BLUP(𝜺)] = D[OLSP(𝜺)].

5. D[𝜺 − BLUP(𝜺)] = D[𝜺 − OLSP(𝜺)].

6. D(y) = D[OLSE(Xβ)] + D[OLSP(𝜺)].

7. D[𝜺 − OLSP(𝜺)] = D(𝜺) − D[OLSP(𝜺)].

8. Cov{OLSE(Xβ), y} = Cov{BLUE(Xβ), y}.

9. Cov{OLSE(Xβ), y} = Cov{BLUE(Xβ), OLSE(Xβ)}.

10. Cov{OLSE(Xβ), OLSE(Xβ)} = Cov{BLUE(Xβ), y}.

11. Cov{OLSE(Xβ), OLSE(Xβ)} = Cov{BLUE(Xβ), OLSE(Xβ)}.

12. Cov{OLSE(Xβ), y} = Cov{y, OLSE(Xβ)}.

13. Cov{OLSP(𝜺), y} = Cov{y, OLSP(𝜺)}.

14. Cov{OLSE(Xβ), OLSP(𝜺)} = Cov{OLSP(𝜺), OLSE(Xβ)}.

15. Cov{OLSE(Xβ), OLSP(𝜺)} + Cov{OLSP(𝜺), OLSE(Xβ)} = 0.

16. Cov{OLSE(Xβ), OLSP(𝜺)} = 0.

Some of the equivalent facts in Corollaries 6.4 and 6.5 were proved in [5, 58]. Furthermore, many interesting vector and matrix norm equalities characterizing when OLSEs are BLUEs can be established; we shall present them in another paper.

## 7 Conclusions

We have offered a predominantly theoretical account of statistical prediction and estimation by establishing two groups of standard results on exact algebraic expressions of the BLUPs and OLSPs of all unknown parameters and their fundamental properties under (1). We have also established a variety of exact algebraic formulas for calculating ranks and inertias of the matrices associated with the BLUPs and OLSPs, and have used the formulas to characterize many interesting and valuable equalities and inequalities for the dispersion matrices of the BLUPs and OLSPs under (1). The whole work contains a wealth of useful results on GLMs and can serve as a comprehensive account of rank/inertia theory in parameter prediction and estimation problems under GLMs.

Statistical theory and methods often require various mathematical computations with vectors and matrices. In particular, formulas and algebraic tricks for handling matrices play important roles in the derivation and characterization of estimators and predictors, and of their features and performance, under linear regression models; matrix theory thus provides a powerful tool set for addressing statistical problems. A long list of handbooks on matrix algebra for statistical analysis has been published since the 1960s; see, e.g., [2, 59–74], and new algebraic methods are developed regularly in matrix mathematics. But it is rarely the case that the algebraic techniques in matrix theory are ready-made to address statistical challenges, which is why the dialogue between matrix theory and statistics benefits both disciplines.

Although the ranks/inertias of matrices are a conceptual foundation of elementary linear algebra and are among the most significant integers reflecting intrinsic properties of matrices, it took a long time in the development of mathematics to establish analytical formulas for calculating ranks/inertias of matrices and to use these formulas, as demonstrated in the previous sections, in intuitive and rigorous derivations of matrix equalities and inequalities in the statistical analysis of GLMs. The present author has been devoted to this subject since the 1980s and has proved thousands of matrix rank/inertia formulas by means of skillful calculations with partitioned matrices. This work has advanced general algebraic and statistical methodology, and a state-of-the-art theory of matrix ranks/inertias with applications has been established. We can now use matrix rank/inertia formulas to describe many fundamental properties and features of matrices, such as establishing and simplifying complicated matrix expressions, deriving matrix equalities and inequalities that involve generalized inverses, characterizing definiteness and semi-definiteness of symmetric matrices, and solving matrix optimization problems in the Löwner sense. In the past decade, the present author has worked on applications of the matrix rank/inertia methodology in linear regression analysis; much experience in using rank/inertia formulas in the statistical inference of GLMs has been accumulated, and many fundamental mathematical and statistical features of predictors/estimators under GLMs have been obtained in this way. Some recent contributions on this topic by the present author and his collaborators are presented in [5–7, 9, 17, 22, 57, 58, 75–79], which contain a large number of useful results on GLMs.
The findings in these papers advance the algebraic methodology of statistical analysis and inference under GLMs, and can merge into an essential part of a unified theory of GLMs.

## Acknowledgements

The author thanks anonymous referees for their helpful comments and suggestions on an earlier version of this paper.

This work was supported by the National Natural Science Foundation of China (Grant No. 11271384).

## References

• [1]

Markiewicz A., Puntanen S., All about the ⊥ with its applications in the linear statistical models, Open Math., 2015, 13, 33–50 Google Scholar

• [2]

Puntanen S., Styan G.P.H., Isotalo J., Matrix Tricks for Linear Statistical Models: Our Personal Top Twenty, Springer, Berlin Heidelberg, 2011 Google Scholar

• [3]

Rao C.R., Mitra S.K., Generalized Inverse of Matrices and Its Applications, Wiley, New York, 1971 Google Scholar

• [4]

Tian Y., Equalities and inequalities for inertias of Hermitian matrices with applications, Linear Algebra Appl., 2010, 433, 263–296

• [5]

Tian Y., Some equalities and inequalities for covariance matrices of estimators under linear model, Stat. Papers, 2015,

• [6]

Tian Y., Guo W., On comparison of dispersion matrices of estimators under a constrained linear model, Stat. Methods Appl., 2016, 25, 623–649

• [7]

Tian Y., Jiang B., Matrix rank/inertia formulas for least-squares solutions with statistical applications, Spec. Matrices, 2016, 4, 130–140 Google Scholar

• [8]

Tian Y., Jiang B., Quadratic properties of least-squares solutions of linear matrix equations with statistical applications, Comput. Statist.,

• [9]

Dong B., Guo W., Tian Y., On relations between BLUEs under two transformed linear models, J. Multivariate Anal., 2014, 131, 279–292

• [10]

Lowerre J.M., Some simplifying results on BLUEs, J. Amer. Stat. Assoc., 1977, 72, 433–437

• [11]

Rao C.R., A lemma on optimization of matrix function and a review of the unified theory of linear estimation, In: Y. Dodge (ed.), Statistical Data Analysis and Inference, North-Holland, Elsevier, 1989, 397–417 Google Scholar

• [12]

Goldberger A.S., Best linear unbiased prediction in the generalized linear regression models, J. Amer. Stat. Assoc., 1962, 57, 369–375

• [13]

Marsaglia G., Styan G.P.H., Equalities and inequalities for ranks of matrices, Linear Multilinear Algebra, 1974, 2, 269–292

• [14]

Tian Y., More on maximal and minimal ranks of Schur complements with applications, Appl. Math. Comput., 2004, 152, 675–692

• [15]

Penrose R., A generalized inverse for matrices, Proc. Cambridge Phil. Soc., 1955, 51, 406–413

• [16]

Lange K., Chi E.C., Zhou H., A brief survey of modern optimization for statisticians, Internat. Stat. Rev., 2014, 82, 46–70

• [17]

Tian Y., A new derivation of BLUPs under random-effects model, Metrika, 2015, 78, 905–918

• [18]

Rao C.R., Unified theory of linear estimation, Sankhyā Ser. A, 1971, 33, 371–394 Google Scholar

• [19]

Rao C.R., Representations of best linear unbiased estimators in the Gauss–Markoff model with a singular dispersion matrix, J. Multivariate Anal., 1973, 3, 276–292

• [20]

Rao C.R., Toutenburg H., Shalabh, Heumann C., Linear Models and Generalizations Least Squares and Alternatives, 3rd ed., Springer, Berlin Heidelberg, 2008 Google Scholar

• [21]

Searle S.R., The matrix handling of BLUE and BLUP in the mixed linear model, Linear Algebra Appl., 1997, 264, 291–311

• [22]

Gan S., Sun Y., Tian Y., Equivalence of predictors under real and over-parameterized linear models, Comm. Stat. Theory Meth., 2016,

• [23]

Tian Y., Jiang B., A new analysis of the relationships between a general linear model and its mis-specified forms, J. Korean Stat. Soc., 2016,

• [24]

Baksalary J.K., Puntanen S., Characterizations of the best linear unbiased estimator in the general Gauss–Markov model with the use of matrix partial orderings, Linear Algebra Appl., 1990, 127, 363–370

• [25]

Baksalary J.K., Puntanen S., Styan G.P.H., A property of the dispersion matrix of the best linear unbiased estimator in the general Gauss–Markov model, Sankhyā Ser. A, 1990, 52, 279–296

• [26]

Isotalo J., Puntanen S., Styan G.P.H., The BLUE’s covariance matrix revisited: A review, J. Stat. Plann. Inference, 2008, 138, 2722–2737

• [27]

Puntanen S., Styan G.P.H., Tian Y., Three rank formulas associated with the covariance matrices of the BLUE and the OLSE in the general linear model, Econometric Theory, 2005, 21, 659–664

• [28]

Abdulle A., Wanner G., 200 years of least squares method, Elem. Math., 2002, 57, 45–60

• [29]

Farebrother R.W., Some early statistical contributions to the theory and practice of linear algebra, Linear Algebra Appl., 1996, 237/238, 205–224

• [30]

Paris Q., The dual of the least-squares method, Open J. Statist., 2015, 5, 658–664

• [31]

Stigler S.M., Gauss and the invention of least squares, Ann. Stat., 1981, 9, 465–474

• [32]

Graybill F.A., An Introduction to Linear Statistical Models, Vol. I, McGraw–Hill, New York, 1961

• [33]

Searle S.R., Linear Models, Wiley, New York, 1971

• [34]

Puntanen S., Styan G.P.H., The equality of the ordinary least squares estimator and the best linear unbiased estimator, with comments by O. Kempthorne, S.R. Searle, and a reply by the authors, Amer. Statistician, 1989, 43, 153–164

• [35]

Alalouf I.S., Styan G.P.H., Characterizations of the conditions for the ordinary least squares estimator to be best linear unbiased, in: Y.P. Chaubey, T.D. Dwivedi (Eds.), Topics in Applied Statistics, Concordia University, Montréal, 1984, 331–344

• [36]

Baksalary J.K., Criteria for the equality between ordinary least squares and best linear unbiased estimators under certain linear models, Canad. J. Stat., 1988, 16, 97–102

• [37]

Baksalary J.K., Kala R., An extension of a rank criterion for the least squares estimator to be the best linear unbiased estimator, J. Stat. Plann. Inference, 1977, 1, 309–312

• [38]

Baksalary J.K., Kala R., Simple least squares estimation versus best linear unbiased prediction, J. Stat. Plann. Inference, 1981, 5, 147–151

• [39]

Baksalary J.K., van Eijnsbergen A.C., Comparison of two criteria for ordinary-least-squares estimators to be best linear unbiased estimators, Amer. Statistician, 1988, 42, 205–208

• [40]

Baksalary O.M., Trenkler G., Between OLSE and BLUE, Aust. N.Z.J. Stat., 2011, 53, 289–303

• [41]

Baksalary O.M., Trenkler G., Liski E.P., Let us do the twist again, Stat. Papers, 2013, 54, 1109–1119

• [42]

Haslett S.J., Isotalo J., Liu Y., Puntanen S., Equalities between OLSE, BLUE and BLUP in the linear model, Stat. Papers, 2014, 55, 543–561

• [43]

Haslett S.J., Puntanen S., A note on the equality of the BLUPs for new observations under two linear models, Acta Comment. Univ. Tartu. Math., 2010, 14, 27–33

• [44]

Haslett S.J., Puntanen S., Equality of BLUEs or BLUPs under two linear models using stochastic restrictions, Stat. Papers, 2010, 51, 465–475

• [45]

Haslett S.J., Puntanen S., On the equality of the BLUPs under two linear mixed models, Metrika, 2011, 74, 381–395

• [46]

Herzberg A.M., Aleong J., Further conditions on the equivalence of ordinary least squares and weighted least squares estimators with examples, in: J. Lanke, G. Lindgren (Eds.), Contributions to Probability and Statistics in Honour of Gunnar Blom, University of Lund, 1985, 127–142

• [47]

Isotalo J., Puntanen S., A note on the equality of the OLSE and the BLUE of the parametric functions in the general Gauss–Markov model, Stat. Papers, 2009, 50, 185–193

• [48]

Jiang B., Sun Y., On the equality of estimators under a general partitioned linear model with parameter restrictions, Stat. Papers, 2016,

• [49]

Kruskal W., When are Gauss–Markov and least squares estimators identical? A coordinate-free approach, Ann. Math. Statist., 1968, 39, 70–75

• [50]

Liski E.P., Puntanen S., Wang S., Bounds for the trace of the difference of the covariance matrices of the OLSE and BLUE, Linear Algebra Appl., 1992, 176, 121–130

• [51]

McElroy F.W., A necessary and sufficient condition that ordinary least-squares estimators be best linear unbiased, J. Amer. Stat. Assoc., 1967, 62, 1302–1304

• [52]

Milliken G.A., Albohali M., On necessary and sufficient conditions for ordinary least squares estimators to be best linear unbiased estimators, Amer. Statistician, 1984, 38, 298–299

• [53]

Norlén U., The covariance matrices for which least squares is best linear unbiased, Scand. J. Statist., 1975, 2, 85–90

• [54]

Styan G.P.H., When does least squares give the best linear unbiased estimate?, in: D.G. Kabe, R.P. Gupta (Eds.), Multivariate Statistical Inference, North-Holland, Amsterdam, 1973, 241–246

• [55]

Tian Y., On equalities of estimations of parametric functions under a general linear model and its restricted models, Metrika, 2010, 72, 313–330

• [56]

Tian Y., On properties of BLUEs under general linear regression models, J. Stat. Plann. Inference, 2013, 143, 771–782

• [57]

Tian Y., Zhang J., Some equalities for estimations of partial coefficients under a general linear regression model, Stat. Papers, 2011, 52, 911–920

• [58]

Tian Y., Zhang X., On connections among OLSEs and BLUEs of whole and partial parameters under a general linear model, Stat. Prob. Lett., 2016, 112, 105–112

• [59]

Abadir K.M., Magnus J.R., Matrix Algebra, Cambridge University Press, 2005

• [60]

Banerjee S., Roy A., Linear Algebra and Matrix Analysis for Statistics, CRC Press, New York, 2014

• [61]

Bapat R.B., Linear Algebra and Linear Models, 3rd ed., Springer, Berlin Heidelberg, 2012

• [62]

Eldén L., Matrix Methods in Data Mining and Pattern Recognition, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 2007

• [63]

Fieller N., Basics of Matrix Algebra for Statistics with R, Chapman and Hall/CRC, 2015

• [64]

Gentle J.E., Numerical Linear Algebra for Applications in Statistics, Springer, Berlin Heidelberg, 1998

• [65]

Gentle J.E., Matrix Algebra: Theory, Computations, and Applications in Statistics, Springer, Berlin Heidelberg, 2007

• [66]

Graybill F.A., Matrices with Applications in Statistics, 2nd ed., Brooks/Cole, 2002

• [67]

Harville D.A., Matrix Algebra From a Statistician’s Perspective, Springer, New York, 1997

• [68]

Harville D.A., Matrix Algebra: Exercises and Solutions, Springer, New York, 2001

• [69]

Healy M.J.R., Matrices for Statistics, 2nd ed., Oxford University Press, 2000

• [70]

Magnus J.R., Neudecker H., Matrix Differential Calculus with Applications in Statistics and Econometrics, Revised edition of the 1988 original, Wiley, New York, 1999

• [71]

Rao C.R., Rao M.B., Matrix Algebra and Its Applications to Statistics and Econometrics, World Scientific, Singapore, 1998

• [72]

Schott J.R., Matrix Analysis for Statistics, 2nd ed., Wiley, Hoboken, NJ, 2005

• [73]

Searle S.R., Matrix Algebra Useful for Statistics, Wiley, New York, 1982

• [74]

Seber G.A.F., A Matrix Handbook for Statisticians, Wiley, New York, 2008

• [75]

Lu C., Gan S., Tian Y., Some remarks on general linear model with new regressors, Stat. Prob. Lett., 2015, 97, 16–24

• [76]

Tian Y., A matrix handling of predictions under a general linear random-effects model with new observations, Electron. J. Linear Algebra, 2015, 29, 30–45

• [77]

Tian Y., Jiang B., Equalities for estimators of partial parameters under linear model with restrictions, J. Multivariate Anal., 2016, 143, 299–313

• [78]

Tian Y., Jiang B., An algebraic study of BLUPs under two linear random-effects models with correlated covariance matrices, Linear Multilinear Algebra, 2016, 64, 2351–2367

• [79]

Zhang X., Tian Y., On decompositions of BLUEs under a partitioned linear model with restrictions, Stat. Papers, 2016, 57, 345–364

Accepted: 2017-01-03

Published Online: 2017-02-27

Citation Information: Open Mathematics, Volume 15, Issue 1, Pages 126–150, ISSN (Online) 2391-5455.

## Citing Articles

[1]
Nesrin Güler and Melek Eriş Büyükkaya
Iranian Journal of Science and Technology, Transactions A: Science, 2019
[2]
Nesrin Güler and Melek Eriş Büyükkaya
Communications in Statistics - Theory and Methods, 2019, Page 1
[3]
Yongge Tian and Jie Wang
Communications in Statistics - Theory and Methods, 2018, Page 1
[4]
Bo Jiang and Yongge Tian
Applied Mathematics and Computation, 2017, Volume 315, Page 400
[5]
Bo Jiang and Yongge Tian
Journal of the Korean Statistical Society, 2017
[6]
Yongge Tian and Cheng Wang
Statistics & Probability Letters, 2017, Volume 128, Page 52