# Best linear unbiased estimation for varying probability with and without replacement sampling

• Stephen Haslett
From the journal Special Matrices

## Abstract

When sample survey data from a complex design (stratification, clustering, unequal selection or inclusion probabilities, and weighting) are used to fit linear models, estimating the model parameters and their covariance matrices becomes complicated. Standard fitting techniques for sample surveys either model conditionally on the survey design variables, or use only design weights based on inclusion probabilities, which in effect assumes zero error covariance between all pairs of population elements. Design properties that link pairs of units are not used. However, if the population error structure is correlated, an unbiased estimate of the linear model error covariance matrix for the sample is needed for efficient parameter estimation. By making simultaneous use of the sampling structure and design-unbiased estimates of the population error covariance matrix, the paper develops best linear unbiased estimation (BLUE) type extensions to standard design-based and joint design- and model-based estimation methods for linear models. The analysis covers both with-replacement and without-replacement sample designs. It recognises that estimation for with-replacement designs requires generalized inverses when any unit is selected more than once. This requirement, and the use of Hadamard products to link sampling and population error covariance matrix properties, are central topics of the paper. Model-based linear model parameter estimation is also discussed.
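The need for generalized inverses under with-replacement sampling can be illustrated with a small numerical sketch. The code below is not the paper's estimator: the population size, the exchangeable covariance `V`, the draw probabilities `p`, and the weights `w_s` are all hypothetical values chosen for illustration. It shows only that selecting a unit twice makes the sample error covariance matrix singular, so that a generalized inverse (here the Moore-Penrose inverse) replaces the ordinary inverse, and that a Hadamard (elementwise) product with a rank-one weight matrix equals pre- and post-multiplication by a diagonal weight matrix.

```python
import numpy as np

# Hypothetical population of N = 4 units with an exchangeable
# error covariance matrix (unit variances, common correlation rho).
N = 4
rho = 0.3
V = (1 - rho) * np.eye(N) + rho * np.ones((N, N))

# A with-replacement sample of n = 3 draws in which unit 2 is drawn twice.
draws = [0, 2, 2]
S = np.eye(N)[draws]      # n x N selection matrix, one row per draw
V_s = S @ V @ S.T         # error covariance matrix of the drawn sample

# Rows 2 and 3 of V_s are identical, so V_s is singular.
rank = np.linalg.matrix_rank(V_s)

# The ordinary inverse does not exist; the Moore-Penrose inverse does.
V_s_pinv = np.linalg.pinv(V_s)

# Hadamard product with hypothetical weights: for a rank-one weight
# matrix w w', (w w') o V_s equals diag(w) V_s diag(w).
p = np.array([0.1, 0.2, 0.3, 0.4])   # assumed draw probabilities
w_s = 1.0 / p[draws]                 # assumed weights for the drawn units
H = np.outer(w_s, w_s) * V_s         # '*' is elementwise in NumPy

print("rank of V_s:", rank, "for n =", len(draws))
```

Any generalized inverse satisfying `V_s @ G @ V_s == V_s` would serve the same illustrative purpose; `np.linalg.pinv` is used here only because it is readily available.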
