In this paper, we explore competition performance in decathlon based on competition, training, and personal data. Our data set comprises 3103 competition results from the decathlon world's best performance lists from 1998 to 2009. The aim of our analysis is to estimate latent factors describing the performance results and, at the same time, to model the effects of age, season, and year of the competition on the results. To this end, we apply a new statistical method, semi-parametric latent variable models (LVMs), which can be seen as a synthesis of classical factor analysis and semi-parametric regression. LVMs are especially well suited for modeling decathlon data because (i) they permit the assumption of latent factors and therefore take the correlation structure between the ten performance results into account, and (ii) they enable us to model (potentially non-linear) relationships between response variables and covariates, contrary to classical factor analysis. In our analysis, we apply LVMs with a semi-parametric predictor allowing for non-linear covariate effects on the latent factors. We thereby obtain readily interpretable results: four latent factors standing for sprint, jumping, throwing, and endurance abilities, as well as non-linear effects of age and season on these latent factors. We also compare our results from LVMs to those obtained from classical factor analysis.
JQAS, an official journal of the American Statistical Association, publishes research on the quantitative aspects of professional and collegiate sports. Articles deal with subjects such as measurements of player performance, tournament structure, and the frequency and occurrence of records. Additionally, the journal serves as an outlet for professionals in the sports world to raise issues and ask questions that relate to quantitative sports analysis.