Nicole M. Warrington, Kate Tilling, Laura D. Howe, Lavinia Paternoster, Craig E. Pennell, Yan Yan Wu, Laurent Briollais
August 22, 2014
Genome-wide association studies have been successful in uncovering novel genetic variants that are associated with disease status or cross-sectional phenotypic traits. Researchers are beginning to investigate how genes play a role in the development of a trait over time. Linear mixed effects models (LMM) are commonly used to model longitudinal data; however, it is unclear if the failure to meet the models distributional assumptions will affect the conclusions when conducting a genome-wide association study. In an extensive simulation study, we compare coverage probabilities, bias, type 1 error rates and statistical power when the error of the LMM is either heteroscedastic or has a non-Gaussian distribution. We conclude that the model is robust to misspecification if the same function of age is included in the fixed and random effects. However, type 1 error of the genetic effect over time is inflated, regardless of the model misspecification, if the polynomial function for age in the fixed and random effects differs. In situations where the model will not converge with a high order polynomial function in the random effects, a reduced function can be used but a robust standard error needs to be calculated to avoid inflation of the type 1 error. As an illustration, a LMM was applied to longitudinal body mass index (BMI) data over childhood in the ALSPAC cohort; the results emphasised the need for the robust standard error to ensure correct inference of associations of longitudinal BMI with chromosome 16 single nucleotide polymorphisms.