
# Open Mathematics

### formerly Central European Journal of Mathematics

Editor-in-Chief: Gianazza, Ugo / Vespri, Vincenzo


Open Access | Online ISSN 2391-5455
Volume 16, Issue 1

# Simultaneous prediction in the generalized linear model

Chao Bai
• Corresponding author
• College of Finance and Statistics, Hunan University, Changsha, Hunan, 410006, China
• College of Science, Central South University of Forestry and Technology, Changsha, Hunan, 410000, China
Haiqi Li
Published Online: 2018-08-24 | DOI: https://doi.org/10.1515/math-2018-0087

## Abstract

This paper studies prediction based on a composite target function that allows simultaneous prediction of the actual and the mean values of the unobserved regressand in the generalized linear model. The best linear unbiased prediction (BLUP) of the target function is derived. Theoretical comparisons show that our BLUP is more efficient than the usual BLUP of the actual value when predicting the mean value, and more efficient than the simple projection predictor of the mean value when predicting the actual value. Simulations confirm its finite sample performance.

MSC 2010: 62M20; 62J12

## 1 Introduction

Generalized linear models have a long history in the statistical literature and have been used to analyze data from various branches of science on account of both mathematical and practical convenience. Consider the following generalized linear model:

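$\begin{pmatrix} y \\ y_0 \end{pmatrix} = \begin{pmatrix} X \\ X_0 \end{pmatrix}\beta + \begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix}$(1)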

where

y is the n-dimensional vector of observed data;

y0 is the m-dimensional vector of unobserved values that is to be predicted;

X and X0 are n × p and m × p known matrices of explanatory variables. Let rk(A) denote the rank of matrix A and suppose rk(X) ≤ p;

β is the p × 1 unknown vector of regression coefficients, and

ε and ε0 are random errors with zero mean and covariance matrix

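$\operatorname{Cov}\begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix} = \begin{pmatrix} \Sigma & V' \\ V & \Sigma_0 \end{pmatrix},$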

where Σ ≥ 0 and Σ0 ≥ 0 are known positive semi-definite matrices of arbitrary ranks.

The problem of predicting unobserved variables plays an important role in decision making and has received much attention in recent years. For the prediction of y0 in model (1), [1] obtained the best linear unbiased predictor (BLUP) when Σ > 0. The Bayes and minimax predictions were obtained by [2] when the random errors are normally distributed. [3] and [4] derived the linear minimax prediction under a modified quadratic loss function. [5] considered the optimal Stein-rule prediction. [6] reviewed the existing theory of minimum mean squared error predictors and extended it based on the principle of equivariance. [7] investigated the admissibility of linear predictors with inequality constraints under the mean squared error loss function. Another subject of interest is the prediction of the mean of y0, since [8] showed that the best predictor of y0 under the criterion of minimum mean squared error is the conditional mean. In model (1), prediction of the mean value of y0 (namely Ey0 = X0β) relates naturally to plug-in estimators of the parameter β. [9] proposed the simple projection predictor (SPP) of X0β by plugging in the best linear unbiased estimator (BLUE) of β. [10, 11] considered plugging in the prediction of β under the balanced loss function. The plug-in approach spawned a large literature on the derivation of combined predictions, see [12, 13, 14].

Generally, predictions are investigated either for y0 or for Ey0, one at a time. However, in fields such as medicine and economics, one sometimes wants to know the actual value of y0 and its mean value Ey0 simultaneously. For example, in financial markets, some investors may want to know the actual profit while others are more interested in the mean profit. In order to meet both requirements, the market manager should provide the prediction of the actual profit and the prediction of the mean profit simultaneously. Setting aside investors' demands, from the point of view of a decision maker, the market manager needs to determine which prediction should be preferred, or to provide a combined prediction of both the actual and the mean profit based on empirical data. [15] gave other examples of practical situations where one is required to predict both the mean and the actual values of a variable. Under these circumstances, we consider predictions of the following target function

$\delta = \lambda y_0 + (1-\lambda)Ey_0,$(2)

where λ ∈ [0, 1] is a non-stochastic weight scalar representing the preference between predicting the actual value and predicting the mean value of the study variable. Note that δ = y0 if λ = 1 and δ = Ey0 if λ = 0, so predicting δ covers the prediction of y0 and of Ey0 as special cases. If 0 < λ < 1, the prediction of δ balances the prediction of the actual and the mean value of y0. Moreover, since Eδ = Ey0, an unbiased predictor of δ is also an unbiased predictor of y0 and of Ey0. Therefore, δ is a more flexible and inclusive target to study.

Studies on the prediction of δ have been carried out in the literature from various perspectives. The properties of predictors obtained by plugging in Stein-rule estimators have been studied by [16, 17, 18]. [19] investigated the Stein-rule prediction for δ in the linear regression model when the error covariance matrix is positive definite yet unknown. [20] studied the admissible prediction of δ. [21, 22] and [23] considered predictors for δ in linear regression models with stochastic or non-stochastic linear constraints on the regression coefficients. The issues of simultaneous prediction in measurement error models have been addressed in [24] and [25]. [26] considered a scalar multiple of the classical prediction vector for the prediction of δ and discussed its performance properties.

For model (1), most earlier work concerned biased prediction under Σ > 0 (including the special case Σ = I), and did not discuss the value of the weight scalar λ in (2). In this paper, supposing Σ ≥ 0, we study the best linear unbiased prediction (BLUP) of δ and compare it with the usual BLUP of y0 and the SPP of Ey0. We also propose a method to choose the value of λ in (2), which gives a way to determine from finite sample data whether the prediction of δ, of y0, or of Ey0 should be provided.

The rest of the paper is organized as follows. In Section 2, we derive the BLUP of the target function (2) in the generalized linear model, and discuss the efficiency of our BLUP compared with the usual BLUP and SPP. Simulation studies are provided in Section 3 to illustrate the determination of the weight scalar in our BLUP and the performance of our proposed BLUP compared with the other two predictors. Concluding remarks are given in Section 4.

## 2 The BLUP of δ and its efficiency

Denote ℒℋ = {Cy | C is an m × n matrix} as the set of all homogeneous linear predictors of y0. Denote δ̂BLUP as the best linear unbiased predictor of δ in model (1). In this section, we first derive the expression of δ̂BLUP in ℒℋ, and then study its performance compared with the BLUP of y0 and the SPP of Ey0. All of the predictors discussed in this paper are derived under the criterion of minimum mean squared error. Some preliminaries and basic results are given as follows:

#### Definition 2.1

The predictor δ̂ of δ is unbiased if E δ̂ = E δ.

#### Definition 2.2

δ is linearly predictable if there exists a linear predictor Cy in ℒℋ such that Cy is an unbiased predictor of δ.

#### Lemma 2.3

In model (1), δ is linearly predictable if there exists a matrix C such that CX = X0, or equivalently ℳ($X_0'$) ⊆ ℳ(X′).

#### Proof

From Definitions 2.1 and 2.2, there exists a matrix C such that E(Cy) = Eδ for any β, namely CX = X0 or X′C′ = $X_0'$, which is equivalent to ℳ($X_0'$) ⊆ ℳ(X′). □

If not specified otherwise, the variables we aim to predict in this paper are all linearly predictable.

#### Lemma 2.4

([27]). Suppose the n × n matrix Σ ≥ 0 and let X be an n × p matrix, then

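$\begin{pmatrix} \Sigma & X \\ X' & 0 \end{pmatrix}^{-} = \begin{pmatrix} T^{+} - T^{+}X(X'T^{+}X)^{-}X'T^{+} & T^{+}X(X'T^{+}X)^{-} \\ (X'T^{+}X)^{-}X'T^{+} & (X'T^{+}X)^{-}(X'T^{+}X) - (X'T^{+}X)^{-} \end{pmatrix},$

where T = Σ + XX′, A⁻ denotes a generalized inverse and A⁺ the Moore-Penrose inverse of a matrix A. In particular, if Σ > 0, then

$\begin{pmatrix} \Sigma & X \\ X' & 0 \end{pmatrix}^{-} = \begin{pmatrix} \Sigma^{-1} - \Sigma^{-1}X(X'\Sigma^{-1}X)^{-}X'\Sigma^{-1} & \Sigma^{-1}X(X'\Sigma^{-1}X)^{-} \\ (X'\Sigma^{-1}X)^{-}X'\Sigma^{-1} & -(X'\Sigma^{-1}X)^{-} \end{pmatrix}.$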

#### Lemma 2.5

In model (1), the BLUP of y0 and the SPP of Ey0 are respectively

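$\tilde{y}_{0BLUP} = X_0\tilde{\beta} + VT^{+}(y - X\tilde{\beta}), \quad \text{and} \quad \tilde{y}_{0SPP} = X_0\tilde{\beta},$

where T = Σ + XX′ and β͠ = (X′T⁺X)⁻X′T⁺y is the best linear unbiased estimator (BLUE) of β in model (1).

If Σ > 0 and rk(X) = p in model (1), the BLUP of y0 and the SPP of Ey0 are respectively

$\hat{y}_{0BLUP} = X_0\hat{\beta}_{BLUE} + V\Sigma^{-1}(y - X\hat{\beta}_{BLUE}), \quad \text{and} \quad \hat{y}_{0SPP} = X_0\hat{\beta}_{BLUE},$

where β̂BLUE = (X′Σ⁻¹X)⁻¹X′Σ⁻¹y is the BLUE of β.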

#### Proof

BLUPs of y0 in Lemma 2.5 were derived by [1] and [28]. The SPPs of Ey0 were derived by [9]. □

The BLUPs and SPPs are presented here for further comparisons.

## 2.1 The best linear unbiased predictor of δ

#### Theorem 2.6

In model (1), the BLUP of δ in ℒℋ is

$\hat{\delta}_{BLUP} = X_0\tilde{\beta} + \lambda VT^{+}(y - X\tilde{\beta}),$

where T = Σ + XX′ and β͠ = (X′T⁺X)⁻X′T⁺y.

#### Proof

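Suppose δ̂ = Cy ∈ ℒℋ is unbiased; then by Lemma 2.3, CX = X0. Denote by R(δ̂; β) the risk of δ̂ and by tr(A) the trace of a square matrix A. We have

$\begin{aligned} R(\hat{\delta};\beta) &= E[(\hat{\delta}-\delta)'(\hat{\delta}-\delta)] \\ &= E[Cy-\lambda y_0-(1-\lambda)X_0\beta]'[Cy-\lambda y_0-(1-\lambda)X_0\beta] \\ &= E(Cy)'(Cy) - \lambda E(Cy)'y_0 - (1-\lambda)E(Cy)'X_0\beta - \lambda Ey_0'(Cy) + \lambda^2 Ey_0'y_0 + \lambda(1-\lambda)Ey_0'X_0\beta \\ &\quad - (1-\lambda)E(X_0\beta)'Cy + \lambda(1-\lambda)E(X_0\beta)'y_0 + (1-\lambda)^2E(X_0\beta)'X_0\beta \\ &= \operatorname{tr}(C\Sigma C') + \lambda^2\operatorname{tr}\Sigma_0 - 2\lambda\operatorname{tr}(CV') + \beta'(CX-X_0)'(CX-X_0)\beta. \end{aligned}$

Since CX = X0, the last term vanishes, and minimizing R(δ̂; β) is equivalent to solving the following constrained optimization problem in C:

$\arg\min_{C}\,[\operatorname{tr}(C\Sigma C') + \lambda^2\operatorname{tr}\Sigma_0 - 2\lambda\operatorname{tr}(CV')] \quad \text{s.t.} \quad CX - X_0 = 0.$

Let Λ be a p × m Lagrange multiplier and construct the Lagrange function as

$L(C,\Lambda) = \operatorname{tr}(C\Sigma C') + \lambda^2\operatorname{tr}\Sigma_0 - 2\lambda\operatorname{tr}(CV') + 2\operatorname{tr}[(CX-X_0)\Lambda].$

Setting ∂L/∂C = 0 and ∂L/∂Λ = 0, we have

$C\Sigma - \lambda V + \Lambda'X' = 0, \quad X'C' = X_0',$

namely

$\begin{pmatrix} \Sigma & X \\ X' & 0 \end{pmatrix}\begin{pmatrix} C' \\ \Lambda \end{pmatrix} = \begin{pmatrix} \lambda V' \\ X_0' \end{pmatrix},$(3)

and

$\begin{pmatrix} C' \\ \Lambda \end{pmatrix} = \begin{pmatrix} \Sigma & X \\ X' & 0 \end{pmatrix}^{-}\begin{pmatrix} \lambda V' \\ X_0' \end{pmatrix}.$

By Lemma 2.4, we obtain C = X0(X′T⁺X)⁻X′T⁺ + λVT⁺(I − X(X′T⁺X)⁻X′T⁺). Let β͠ = (X′T⁺X)⁻X′T⁺y; thus δ̂BLUP = Cy = X0β͠ + λVT⁺(y − Xβ͠). □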
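The closed form in Theorem 2.6 translates directly into a few lines of linear algebra. The following is a minimal NumPy sketch, not the authors' code: the function name `blup_delta` and its argument names are hypothetical, and the Moore-Penrose inverse `np.linalg.pinv` is used as one particular choice of the generalized inverse (X′T⁺X)⁻.

```python
import numpy as np

def blup_delta(y, X, X0, Sigma, V, lam):
    """Sketch of delta_hat = X0 b + lam * V T^+ (y - X b), with T = Sigma + X X'.

    y: (n,), X: (n, p), X0: (m, p), Sigma: (n, n), V: (m, n), lam in [0, 1].
    """
    T_pinv = np.linalg.pinv(Sigma + X @ X.T)   # Moore-Penrose inverse T^+
    XtTp = X.T @ T_pinv                        # X' T^+
    # Assumption: pinv serves as the generalized inverse (X' T^+ X)^-.
    beta_tilde = np.linalg.pinv(XtTp @ X) @ XtTp @ y
    resid = y - X @ beta_tilde                 # y - X beta_tilde
    return X0 @ beta_tilde + lam * (V @ T_pinv @ resid)
```

Calling the function with `lam=1.0` gives the usual BLUP of y0 and `lam=0.0` gives the SPP of Ey0 from Lemma 2.5, so a single routine covers all three predictors compared in this paper.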

#### Corollary 2.7

If Σ > 0 and rk(X) = p in model (1), then the BLUP of δ is

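$\hat{\delta}_{BLUP} = X_0\hat{\beta}_{BLUE} + \lambda V\Sigma^{-1}(y - X\hat{\beta}_{BLUE}),$

where β̂BLUE = (X′Σ⁻¹X)⁻¹X′Σ⁻¹y.

#### Proof

If Σ > 0 and rk(X) = p, then X′Σ⁻¹X is nonsingular. Since

$\begin{vmatrix} \Sigma & X \\ X' & 0 \end{vmatrix} = \begin{vmatrix} \Sigma & X \\ 0 & -X'\Sigma^{-1}X \end{vmatrix} = (-1)^{p}|\Sigma||X'\Sigma^{-1}X| \neq 0,$

the matrix $\begin{pmatrix} \Sigma & X \\ X' & 0 \end{pmatrix}$ is nonsingular. By Lemma 2.4,

$\begin{pmatrix} \Sigma & X \\ X' & 0 \end{pmatrix}^{-1} = \begin{pmatrix} \Sigma^{-1} - \Sigma^{-1}X(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} & \Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\ (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} & -(X'\Sigma^{-1}X)^{-1} \end{pmatrix}.$

With similar calculations as in the proof of Theorem 2.6, the solution of (3) gives

$C = X_0(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1} + \lambda V\Sigma^{-1}(I - X(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}),$

and therefore δ̂BLUP = X0β̂BLUE + λVΣ⁻¹(y − Xβ̂BLUE). □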

#### Theorem 2.8

For the prediction of (2) in model (1), Eδ̂BLUP = Ey͠0BLUP = Ey͠0SPP = Ey0 = X0β.

#### Proof

By Theorem 2.6, Eδ̂BLUP = E[X0β͠ + λVT⁺(y − Xβ͠)] = X0β = Ey0. From Lemma 2.5, it is easy to prove that Ey͠0BLUP = Ey͠0SPP = Ey0 = X0β. □

#### Remark 2.9

According to Definition 2.1 and Theorem 2.8, δ̂BLUP, y͠0BLUP and y͠0SPP are all unbiased predictors of y0 and Ey0. For λ = 1, δ̂BLUP = y͠0BLUP is the BLUP of y0; for λ = 0, δ̂BLUP = X0β͠ is the SPP of Ey0. This shows that the target function (2) can simultaneously predict the actual value of y0 and its mean value. Since δ̂BLUP = λy͠0BLUP + (1 − λ)y͠0SPP, δ̂BLUP can be viewed as a tradeoff between the BLUP of y0 and the SPP of Ey0. By using δ̂BLUP in practical applications, forecasters can provide a more comprehensive predictor by assigning different weights in δ̂BLUP.
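The convex-combination form used in Remark 2.9 can be checked in one line by substituting the expressions of Lemma 2.5 and comparing with Theorem 2.6:

$\lambda\tilde{y}_{0BLUP} + (1-\lambda)\tilde{y}_{0SPP} = \lambda[X_0\tilde{\beta} + VT^{+}(y - X\tilde{\beta})] + (1-\lambda)X_0\tilde{\beta} = X_0\tilde{\beta} + \lambda VT^{+}(y - X\tilde{\beta}) = \hat{\delta}_{BLUP}.$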

As for the choice of λ, the weight scalar should usually be given before predicting. Since λ represents the weight given to the prediction of y0 and is not a model parameter, it has no "true" value, only more or less suitable ones. One way to select λ is by the forecaster's subjective preference. For example, if the predictions of y0 and Ey0 are treated equally, then λ = 0.5. Another way to determine λ is to use the observed data (y, X) in model (1). In this paper we recommend the leave-one-out cross-validation technique. In order to determine λ, we take δ̂BLUP as the predictor of y0, which is justified by Theorem 2.8, since the true β in Ey0 = X0β is unknown. Define δ̂(−j)(λ) to be the predictor of yj when the jth case of (y, X) in (1) is deleted. Denote 𝒯 = {λi | 0 ≤ λi ≤ 1, i = 1, 2, ⋯}. The predicted residual sum of squares is defined as

$CV(\lambda) = \sum_{j=1}^{n}[y_j - \hat{\delta}_{(-j)}(\lambda)]^2.$

For each λi ∈ 𝒯, compute $\sum_{j=1}^{n}[y_j - \hat{\delta}_{(-j)}(\lambda_i)]^2$. The selected λ is the one that minimizes CV(λ) over 𝒯. Simulations in Section 3 indicate that the leave-one-out cross-validation technique for the selection of λ is feasible. Through the data-driven selection of λ, forecasters can determine which one of δ̂BLUP, y͠0BLUP and y͠0SPP is more "suitable" to be provided.
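The selection rule above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names `loo_components` and `select_lambda` are hypothetical, `np.linalg.pinv` stands in for the generalized inverse, and, because the text does not spell out V for the held-out case, the covariance between the deleted error and the retained errors is taken to be the jth row of Σ without its jth entry. The sketch exploits the fact that δ̂(−j)(λ) = a_j + λ b_j is affine in λ, so each case is refitted only once.

```python
import numpy as np

def loo_components(y, X, Sigma):
    """For each held-out case j return a[j], b[j] with delta_(-j)(lam) = a[j] + lam * b[j]."""
    n = len(y)
    a = np.empty(n)
    b = np.empty(n)
    for j in range(n):
        keep = np.arange(n) != j                       # delete the j-th case
        yk, Xk = y[keep], X[keep]
        T_pinv = np.linalg.pinv(Sigma[np.ix_(keep, keep)] + Xk @ Xk.T)
        XtTp = Xk.T @ T_pinv
        beta = np.linalg.pinv(XtTp @ Xk) @ XtTp @ yk   # BLUE from the retained cases
        a[j] = X[j] @ beta                             # SPP part: x_j' beta
        # Assumption: V for case j is row j of Sigma restricted to retained cases.
        b[j] = Sigma[j, keep] @ T_pinv @ (yk - Xk @ beta)
    return a, b

def select_lambda(y, X, Sigma, grid):
    """Return the grid value minimizing CV(lam) and the CV values on the grid."""
    a, b = loo_components(y, X, Sigma)
    resid = (y - a)[None, :] - grid[:, None] * b[None, :]  # rows: lam, cols: cases
    cv = (resid ** 2).sum(axis=1)
    return grid[int(np.argmin(cv))], cv
```

A call such as `select_lambda(y, X, Sigma, np.linspace(0.0, 1.0, 1001))` evaluates the criterion on a fine grid at the cost of n refits in total, rather than n refits per grid point. Note that when Σ is diagonal the correction term b_j is zero and CV(λ) does not depend on λ.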

## 2.2 Efficiency of δ̂BLUP

According to Theorem 2.8, δ̂BLUP, y͠0BLUP and y͠0SPP are all unbiased predictors of y0 and Ey0. Since all three are linear and unbiased, we compare the performance of δ̂BLUP with that of y͠0BLUP and y͠0SPP in what follows.

#### Theorem 2.10

For model (1),

$\operatorname{Cov}(\hat{\delta}_{BLUP}) \leq \operatorname{Cov}(\tilde{y}_{0BLUP}),$

and the equality holds if and only if (1 − λ²)VT⁺[I − T⁺X(X′T⁺X)⁻X′]T⁺V′ = 0.

#### Proof

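Denote by ε͠0 = VT⁺(y − Xβ͠) the predictor of ε0, so that δ̂BLUP = X0β͠ + λε͠0 and y͠0BLUP = X0β͠ + ε͠0. We have

$\operatorname{Cov}(\hat{\delta}_{BLUP}) = \operatorname{Cov}(X_0\tilde{\beta} + \lambda\tilde{\varepsilon}_0), \quad \operatorname{Cov}(\tilde{y}_{0BLUP}) = \operatorname{Cov}(X_0\tilde{\beta} + \tilde{\varepsilon}_0).$

Since Σ = T − XX′ and X′[I − T⁺X(X′T⁺X)⁻X′] = 0, then

$\begin{aligned} \operatorname{Cov}(X_0\tilde{\beta},\tilde{\varepsilon}_0) &= X_0(X'T^{+}X)^{-}X'T^{+}\Sigma[I - T^{+}X(X'T^{+}X)^{-}X']T^{+}V' \\ &= X_0(X'T^{+}X)^{-}X'T^{+}(T - XX')[I - T^{+}X(X'T^{+}X)^{-}X']T^{+}V' = 0. \end{aligned}$

Therefore, Cov(δ̂BLUP) − Cov(y͠0BLUP) = (λ² − 1)Cov(ε͠0) ≤ 0, and

$\operatorname{Cov}(\hat{\delta}_{BLUP}) \leq \operatorname{Cov}(\tilde{y}_{0BLUP}),$

and the equality holds if and only if (1 − λ²)Cov(ε͠0) = (1 − λ²)VT⁺[I − T⁺X(X′T⁺X)⁻X′]T⁺V′ = 0. □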

#### Corollary 2.11

If Σ > 0 and rk(X) = p in model (1), then

$\operatorname{Cov}(\hat{\delta}_{BLUP}) \leq \operatorname{Cov}(\hat{y}_{0BLUP}),$

and the equality holds if and only if (1 − λ²)VΣ⁻¹[I − Σ⁻¹X(X′Σ⁻¹X)⁻¹X′]Σ⁻¹V′ = 0.

#### Proof

Corollary 2.11 is easily proved by Lemma 2.4 and Theorem 2.10. □

#### Remark 2.12

Theorem 2.10 and Corollary 2.11 show that δ̂BLUP is better than y͠0BLUP under the criterion of covariance.

#### Theorem 2.13

For model (1), if DT⁺V′X0(X′T⁺X)⁻X′T⁺ + T⁺X(X′T⁺X)⁻$X_0'$VT⁺D ≥ 0, where D = I − X(X′T⁺X)⁻X′T⁺, then

$E(\tilde{y}_{0SPP} - X_0\beta)'(\tilde{y}_{0SPP} - X_0\beta) \leq E(\hat{\delta}_{BLUP} - X_0\beta)'(\hat{\delta}_{BLUP} - X_0\beta) \leq E(\tilde{y}_{0BLUP} - X_0\beta)'(\tilde{y}_{0BLUP} - X_0\beta).$

#### Proof

Denote

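$C_1 = X_0(X'T^{+}X)^{-}X'T^{+} + \lambda VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}], \quad C_2 = X_0(X'T^{+}X)^{-}X'T^{+} + VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}],$

then δ̂BLUP = C1y and y͠0BLUP = C2y. By the unbiasedness, C1X = X0 and C2X = X0. Therefore,

$E(\hat{\delta}_{BLUP} - X_0\beta)'(\hat{\delta}_{BLUP} - X_0\beta) - E(\tilde{y}_{0BLUP} - X_0\beta)'(\tilde{y}_{0BLUP} - X_0\beta) = (X\beta)'(C_1'C_1 - C_2'C_2)X\beta + \operatorname{tr}(C_1\Sigma C_1' - C_2\Sigma C_2').$(4)

Note that D is a symmetric idempotent matrix and

$\begin{aligned} C_1\Sigma C_1' &= C_1(T - XX')C_1' = X_0(X'T^{+}X)^{-}X_0' + \lambda^2VT^{+}DV' - X_0X_0', \\ C_2\Sigma C_2' &= C_2(T - XX')C_2' = X_0(X'T^{+}X)^{-}X_0' + VT^{+}DV' - X_0X_0', \end{aligned}$

then we have

$C_1\Sigma C_1' - C_2\Sigma C_2' = -(1-\lambda^2)VT^{+}DV' \leq 0, \quad \text{and} \quad \operatorname{tr}(C_1\Sigma C_1' - C_2\Sigma C_2') \leq 0.$(5)

Besides,

$\begin{aligned} C_1'C_1 - C_2'C_2 &= (\lambda-1)[DT^{+}V'X_0(X'T^{+}X)^{-}X'T^{+} + T^{+}X(X'T^{+}X)^{-}X_0'VT^{+}D] + (\lambda^2-1)DT^{+}V'VT^{+}D \\ &\leq (\lambda-1)[DT^{+}V'X_0(X'T^{+}X)^{-}X'T^{+} + T^{+}X(X'T^{+}X)^{-}X_0'VT^{+}D] \leq 0. \end{aligned}$(6)

Substituting (5) and (6) into (4), we have

$E(\hat{\delta}_{BLUP} - X_0\beta)'(\hat{\delta}_{BLUP} - X_0\beta) \leq E(\tilde{y}_{0BLUP} - X_0\beta)'(\tilde{y}_{0BLUP} - X_0\beta).$

Let λ = 0 in (2); then y͠0SPP = X0β͠ = arg $\min_{\hat{y}_0 \in \mathcal{L}\mathcal{H}}$ E(ŷ0 − X0β)′(ŷ0 − X0β) by Theorem 2.6. It follows that

$E(\tilde{y}_{0SPP} - X_0\beta)'(\tilde{y}_{0SPP} - X_0\beta) \leq E(\hat{\delta}_{BLUP} - X_0\beta)'(\hat{\delta}_{BLUP} - X_0\beta).$ □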

By Lemma 2.4 and Theorem 2.13, we have

#### Corollary 2.14

In model (1), if Σ > 0, rk(X) = p and DΣ⁻¹V′X0(X′Σ⁻¹X)⁻¹X′Σ⁻¹ + Σ⁻¹X(X′Σ⁻¹X)⁻¹$X_0'$VΣ⁻¹D ≥ 0, where D = I − X(X′Σ⁻¹X)⁻¹X′Σ⁻¹, then

$E(\hat{y}_{0SPP} - X_0\beta)'(\hat{y}_{0SPP} - X_0\beta) \leq E(\hat{\delta}_{BLUP} - X_0\beta)'(\hat{\delta}_{BLUP} - X_0\beta) \leq E(\hat{y}_{0BLUP} - X_0\beta)'(\hat{y}_{0BLUP} - X_0\beta).$

#### Remark 2.15

Theorem 2.13 and Corollary 2.14 show that δ̂BLUP is better than y͠0BLUP under the squared loss function as the predictor of Ey0.

#### Theorem 2.16

For model (1),

$E(\tilde{y}_{0BLUP} - y_0)'(\tilde{y}_{0BLUP} - y_0) \leq E(\hat{\delta}_{BLUP} - y_0)'(\hat{\delta}_{BLUP} - y_0) \leq E(\tilde{y}_{0SPP} - y_0)'(\tilde{y}_{0SPP} - y_0).$

#### Proof

Denote

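$\begin{aligned} C_1 &= X_0(X'T^{+}X)^{-}X'T^{+} + \lambda VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}], \\ C_2 &= X_0(X'T^{+}X)^{-}X'T^{+} + VT^{+}[I - X(X'T^{+}X)^{-}X'T^{+}], \\ C_3 &= X_0(X'T^{+}X)^{-}X'T^{+}, \end{aligned}$

then δ̂BLUP = C1y, y͠0BLUP = C2y and y͠0SPP = X0β͠ = C3y. By Lemma 2.3, C1X = X0, C2X = X0 and C3X = X0. Since

$\begin{aligned} E(C_iy - y_0)'(C_iy - y_0) &= \operatorname{tr}C_i\Sigma C_i' - 2\operatorname{tr}(C_iV') + \operatorname{tr}\Sigma_0, \\ E(C_iy - y_0)'(C_iy - y_0) - E(C_jy - y_0)'(C_jy - y_0) &= \operatorname{tr}(C_i\Sigma C_i' - C_j\Sigma C_j') - 2\operatorname{tr}(C_i - C_j)V', \quad 1 \leq i,j \leq 3, \text{ and } 0 \leq \lambda \leq 1, \end{aligned}$

we have

$\begin{aligned} E(C_1y - y_0)'(C_1y - y_0) - E(C_2y - y_0)'(C_2y - y_0) &= (\lambda-1)^2\operatorname{tr}VT^{+}DV' \geq 0, \\ E(C_1y - y_0)'(C_1y - y_0) - E(C_3y - y_0)'(C_3y - y_0) &= [(\lambda-1)^2 - 1]\operatorname{tr}VT^{+}DV' \leq 0, \end{aligned}$

which give that

$E(\tilde{y}_{0BLUP} - y_0)'(\tilde{y}_{0BLUP} - y_0) \leq E(\hat{\delta}_{BLUP} - y_0)'(\hat{\delta}_{BLUP} - y_0) \leq E(\tilde{y}_{0SPP} - y_0)'(\tilde{y}_{0SPP} - y_0).$ □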
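The ordering in Theorem 2.16 can be checked numerically from the trace formula used in the proof, without any simulation. The sketch below is illustrative only: the function names `coef_matrix` and `risk_actual` are hypothetical, `np.linalg.pinv` is used for both T⁺ and the generalized inverse, and the risk formula assumes CX = X0 (unbiasedness), as in the proof.

```python
import numpy as np

def coef_matrix(X, X0, Sigma, V, lam):
    """C such that the predictor is C y; lam = 1, 0 give the BLUP of y0 and the SPP."""
    n = X.shape[0]
    T_pinv = np.linalg.pinv(Sigma + X @ X.T)
    # G maps y to beta_tilde = (X' T^+ X)^- X' T^+ y (pinv as generalized inverse).
    G = np.linalg.pinv(X.T @ T_pinv @ X) @ X.T @ T_pinv
    return X0 @ G + lam * V @ T_pinv @ (np.eye(n) - X @ G)

def risk_actual(C, Sigma, V, Sigma0):
    """E(Cy - y0)'(Cy - y0) = tr(C Sigma C') - 2 tr(C V') + tr(Sigma0), valid when CX = X0."""
    return np.trace(C @ Sigma @ C.T) - 2.0 * np.trace(C @ V.T) + np.trace(Sigma0)
```

Evaluating `risk_actual` at λ = 1, at an intermediate λ, and at λ = 0 for any valid joint covariance should reproduce the chain of inequalities above, with the gap between λ and 1 equal to (λ − 1)² tr(VT⁺DV′).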

By Lemma 2.4 and Theorem 2.16, we have

#### Corollary 2.17

In model (1), if Σ > 0 and rk(X) = p, then

$E(\hat{y}_{0BLUP} - y_0)'(\hat{y}_{0BLUP} - y_0) \leq E(\hat{\delta}_{BLUP} - y_0)'(\hat{\delta}_{BLUP} - y_0) \leq E(\hat{y}_{0SPP} - y_0)'(\hat{y}_{0SPP} - y_0).$

#### Remark 2.18

Theorem 2.16 and Corollary 2.17 show that δ̂BLUP is better than y͠0SPP under the squared loss function as the predictor of y0.

## 3 Simulation studies

In this section, we conduct simulations to illustrate the selection of λ in δ̂BLUP and the finite sample performance of our simultaneous prediction compared with ŷ0BLUP and ŷ0SPP.

The data are generated from the following model:

$\begin{pmatrix} y \\ y_0 \end{pmatrix} = \begin{pmatrix} X \\ X_0 \end{pmatrix}\beta + \begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix}, \quad \begin{pmatrix} \varepsilon \\ \varepsilon_0 \end{pmatrix} \sim N(0, \Sigma),$(7)

where $\Sigma = \begin{pmatrix} 50 & 2 & \cdots & 2 \\ 2 & 50 & \cdots & 2 \\ \vdots & \vdots & \ddots & \vdots \\ 2 & 2 & \cdots & 50 \end{pmatrix}.$

We assume y is the observation with sample size n = 200 and y0 is to be predicted with sample size m = 1. In Section 3.1 we only need the sample data of y to determine λ, while in Section 3.2 we use all the sample data of y and y0 for comparison under various values of λ. The elements of the corresponding matrices X and X0 are generated from the uniform distribution on [1.1, 30.7].
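One realization of model (7) could be generated as in the following sketch. It is an illustration under assumptions, not the authors' simulation code: the function name `simulate_model7`, the seed handling, and the returned dictionary are hypothetical, and the displayed Σ is read as the joint (n + m) × (n + m) error covariance, which is then partitioned into the blocks Σ, V and Σ0 of model (1).

```python
import numpy as np

def simulate_model7(n=200, m=1, beta=(0.8,), diag=50.0, off=2.0, seed=0):
    """Draw one data set from model (7) and return its pieces in a dictionary."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    N = n + m
    # Assumption: joint covariance with `diag` on the diagonal and `off` elsewhere.
    cov = np.full((N, N), off) + (diag - off) * np.eye(N)
    X_all = rng.uniform(1.1, 30.7, size=(N, beta.size))   # regressors for y and y0
    eps = rng.multivariate_normal(np.zeros(N), cov)       # joint normal errors
    y_all = X_all @ beta + eps
    return {
        "y": y_all[:n], "y0": y_all[n:],
        "X": X_all[:n], "X0": X_all[n:],
        "Sigma": cov[:n, :n],      # covariance of the observed errors
        "V": cov[n:, :n],          # covariance between unobserved and observed errors
        "Sigma0": cov[n:, n:],     # covariance of the unobserved errors
    }
```

For example, `simulate_model7(beta=(1.0, 0.8, 0.2))` would correspond to the three-parameter setting of Section 3.2, and repeating the call with different seeds gives independent realizations.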

## 3.1 Selection of λ in δ̂BLUP

We set β to be a one-dimensional parameter with the true value 0.8. The number of simulated realizations for choosing λ is 1000. In each simulation, we let λ vary from 0 to 1 with step size 0.001 and use the leave-one-out cross-validation technique (see Section 2.1) to determine λ. Let λ* be the selected value of λ, then

$\lambda^{\star} = \arg\min_{0 \leq \lambda \leq 1} CV(\lambda) = \arg\min_{0 \leq \lambda \leq 1}\sum_{j=1}^{200}[y_j - \hat{\delta}_{(-j)}(\lambda)]^2.$

Simulations show that the relationship between CV(λ) and λ varies across realizations. Three of the simulations are presented in Figure 1 to illustrate the relation between λ and log CV(λ). Subfigure (a) indicates that λ* = 1, so ŷ0BLUP should be provided when predicting; (b) indicates that λ* = 0, so ŷ0SPP should be preferred; (c) indicates that λ* = 0.315, so δ̂BLUP should be provided when predicting. Accordingly, three kinds of λ* occur in our simulations. Table 1 shows that among 1000 simulations, 267 give λ* = 0, 332 give λ* = 1 and 401 give 0 < λ* < 1. The simulation results show that the leave-one-out cross-validation technique for the selection of λ is feasible and gives a way to answer the question "which one of δ̂BLUP, ŷ0BLUP and ŷ0SPP is preferred given the observations".

Fig. 1

Relationships between λ and log[CV(λ)] in three simulations (a),(b) and (c) and the corresponding selection of λ

Table 1

Frequency of occurrences of three kinds of λ* in 1000 simulations

## 3.2 Finite sample performance of the predictors

Let n = 200, m = 1, p = 3 and the true β = (1, 0.8, 0.2)′ in (7). The weight λ in δ̂BLUP varies on a grid from 0.1 to 0.9. For each λ, the number of simulations is 1000. In each simulation, we compare δ̂BLUP, ŷ0BLUP and ŷ0SPP. The sample means (sms), the standard deviations (stds) and the mean squares (mss) of the prediction errors δ̂BLUP − y0, ŷ0BLUP − y0 and ŷ0SPP − y0 are reported in Table 2. The sms, the stds and the mss of δ̂BLUP − X0β, ŷ0BLUP − X0β and ŷ0SPP − X0β are presented in Table 3.

Table 2

Finite sample performance about forecast precision of ŷ0BLUP, δ̂BLUP (with different λ) and ŷ0SPP

Table 3

Finite sample performance about goodness of fit of the model of ŷ0BLUP, δ̂BLUP (with different λ) and ŷ0SPP

From Table 2 and Table 3, we make the following observations:

1. As for the prediction precision, no matter what λ is set to be, the sample means (sms) of the prediction errors of ŷ0BLUP, δ̂BLUP and ŷ0SPP are all small. Comparisons of the sms cannot tell which one of the three predictors is better, yet the standard deviations (stds) and the mean squares (mss) of δ̂BLUP − y0 are less than those of ŷ0SPP − y0.

2. No matter what λ is set to be, the sample means (sms) of ŷ0BLUP − X0β, δ̂BLUP − X0β and ŷ0SPP − X0β are all small. Comparisons of the sms cannot determine which predictor is better, yet the standard deviations (stds) and the mean squares (mss) of δ̂BLUP − X0β are less than those of ŷ0BLUP − X0β.

The above facts are consistent with δ̂BLUP, ŷ0BLUP and ŷ0SPP all being unbiased predictions of y0 and Ey0 for any λ ∈ (0, 1). δ̂BLUP is more efficient than ŷ0SPP = X0β̂BLUE when predicting the actual value, and is more efficient than ŷ0BLUP when predicting the mean value. The simulation results agree with the theoretical results in Section 2.2.

## 4 Conclusion

In this paper, we study prediction based on a composite target function that allows simultaneous prediction of the actual and the mean values of the unobserved regressand in the generalized linear model. The BLUP of the target function is derived when the model error covariance is positive semi-definite. This BLUP is also an unbiased prediction of the actual and the mean values of the unobserved regressand. We propose the leave-one-out cross-validation technique to determine the value of the weight scalar in our prediction, which can help to provide a suitable prediction. Regarding the efficiency of the proposed BLUP, we show that it is better than the usual BLUP under the criterion of covariance and dominates it as a prediction of the mean value of the regressand. Besides, the proposed BLUP is better than the SPP as a prediction of the actual value of the regressand. Simulation studies illustrate the selection of the weight scalar in the proposed BLUP and support these comparisons in finite samples. Further research on simultaneous prediction is in progress.

## Acknowledgement

The authors are grateful to the responsible editor and the anonymous reviewers for their valuable comments and suggestions, which have greatly improved this paper. This research is supported by the Scientific Research Fund of Hunan Provincial Education Department (13C1139), the Youth Scientific Research Foundation of Central South University of Forestry and Technology of China (QJ2012013A) and the Natural Science Foundation of Hunan Province (2015JJ4090).

## References

• [1] Goldberger A. S., Best linear unbiased prediction in the generalized linear regression model, Journal of the American Statistical Association, 1962, 57(298), 369–375

• [2] Bolfarine H., Zacks S., Bayes and minimax prediction in finite populations, J. Statist. Plann. Infer., 1991, 28, 139–151

• [3] Yu S. H., The linear minimax predictor in finite populations with arbitrary rank under quadratic loss function, Chin. Ann. Math., 2004, 25, 485–496

• [4] Xu L. W., Wang S. G., The minimax predictor in finite populations with arbitrary rank in normal distribution, Chin. Ann. Math., 2006, 27, 405–416

• [5] Gotway C. A., Cressie N., Improved multivariate prediction under a general linear model, J. Multivariate Anal., 1993, 45, 56–72

• [6] Teunissen P. J. G., Best prediction in linear models with mixed integer/real unknowns: theory and application, Journal of Geodesy, 2007, 81(12), 759–780

• [7] Xu L. W., Admissible linear predictors in the superpopulation model with respect to inequality constraints, Comm. Statist. Theory Methods, 2009, 38, 2528–2540

• [8] Searle S. R., Casella G., McCulloch C. E., Variance components, 1992, New York: Wiley

• [9] Bolfarine H., Rodrigues J., On the simple projection predictor in finite populations, Australian Journal of Statistics, 1988, 30(3), 338–341

• [10] Hu G. K., Li Q. G., Yu S. H., Optimal and minimax prediction in multivariate normal populations under a balanced loss function, J. Multivariate Anal., 2014, 128, 154–164

• [11] Hu G. K., Peng P., Linear admissible predictor of finite population regression coefficient under a balanced loss function, J. Math., 2014, 34, 820–828

• [12] Diebold F. X., Lopez J. A., Forecast evaluation and combination, Handbook of statistics, 1996, 14, 241–268

• [13] Hendry D. F., Clements M. P., Pooling of forecasts, Econometrics Journal, 2002, 5, 1–26

• [14] Timmermann A., Forecast combinations, Handbook of economic forecasting, 2006, 1, 135–196

• [15] Shalabh, Performance of Stein-rule procedure for simultaneous prediction of actual and average values of study variable in linear regression models, Bull. Internat. Statist. Inst., 1995, 56, 1357–1390

• [16] Chaturvedi A., Singh S. P., Stein rule prediction of the composite target function in a general linear regression model, Statist. Papers, 2000, 41(3), 359–367

• [17] Chaturvedi A., Kesarwani S., Chandra R., Simultaneous prediction based on shrinkage estimator, in: Shalabh, C. Heumann (Eds.), Recent Advances in Linear Models and Related Areas, Springer, 2008, 181–204

• [18] Shalabh, Heumann C., Simultaneous prediction of actual and average values of study variable using Stein-rule estimators, in: K. Kumar, A. Chaturvedi (Eds.), Some Recent Developments in Statistical Theory and Application, Brown Walker Press, USA, 2012, 68–81

• [19] Chaturvedi A., Wan A. T. K., Singh S. P., Improved multivariate prediction in a general linear model with an unknown error covariance matrix, J. Multivariate Anal., 2002, 83(1), 166–182

• [20] Bai C., Li H., Admissibility of simultaneous prediction for actual and average values in finite population, J. Inequal. Appl., 2018, 2018(1), 117

• [21] Toutenburg H., Shalabh, Predictive performance of the methods of restricted and mixed regression estimators, Biometrical J., 1996, 38(8), 951–959

• [22] Toutenburg H., Shalabh, Improved predictions in linear regression models with stochastic linear constraints, Biom. J., 2000, 42(1), 71–86

• [23] Dube M., Manocha V., Simultaneous prediction in restricted regression models, J. Appl. Statist. Sci., 2002, 11(4), 277–288

• [24] Shalabh, Paudel C. M., Kumar N., Simultaneous prediction of actual and average values of response variable in replicated measurement error models, in: Shalabh, C. Heumann (Eds.), Recent Advances in Linear Models and Related Areas, Springer, 2008, 105–133

• [25] Garg G., Shalabh, Simultaneous predictions under exact restrictions in ultrastructural model, Journal of Statistical Research (in Special Volume on Measurement Error Models), 2011, 45(2), 139–154

• [26] Shalabh, A revisit to efficient forecasting in linear regression models, J. Multivariate Anal., 2013, 114, 161–170

• [27] Wang S. G., Shi J. H., Introduction to the linear model, 2004, Science Press, Beijing

• [28] Yu S. H., Xu L. W., Admissibility of linear prediction under quadratic loss, Acta Mathematicae Applicatae Sinica, 2004, 27, 385–396

Accepted: 2018-07-17

Published Online: 2018-08-24

Citation Information: Open Mathematics, Volume 16, Issue 1, Pages 1037–1047, ISSN (Online) 2391-5455,
