As indicated by equation (1) we have assumed a common underlying aging function, *f*(*x*), across players as the basis for our fixed-effects regressions. In this specification, differences between players are assumed to be captured by player-specific fixed effects (some players are better than others) and by random errors. However, as emphasized by Albert (1999, 2009) for baseball, individual aging trajectories vary across players – some players peak earlier than others.

We interpret our earlier results as describing only a representative or typical player. We would say, for example, that forwards typically peak in scoring performance between 27 and 28, but recognize that some forwards will peak sooner than others. However, it is still important to consider player-specific aging functions for two reasons. First, it is valuable as a robustness check to assess whether allowing for individual aging functions would lead to results that are inconsistent with or at least different from our fixed-effect regression results. Second, it is of interest to ask how much variability in the age-performance relationship is present across players.

In this section we use a method suggested by a reviewer that allows for variation across players in the age-performance relationship. The basic method is to estimate an aging function for each player [using points per game (PPG) as the performance measure]. Thus, for each player *i* we estimate

$$\text{PPG}_{it} = f_{i}(x_{it}) + e_{it} \qquad (3)$$

Thus player *i* has his own aging function *f*_{i}, and there is no fixed effect in the regression. To save space we report results only for a quadratic functional form for *f*_{i}(*x*), which can be written as *f*_{i}(*x*_{it}) = β_{0i} + β_{1i}*x*_{it} + β_{2i}*x*_{it}^{2}, with the error term *e*_{it} entering as in equation (3).^{8} Thus each player has his own coefficients and hence his own aging trajectory. From the estimated quadratic function for player *i* we then obtain his estimated peak age, the age at which the fitted quadratic reaches its maximum. From the distribution of peak ages we can determine the mean to compare with the peak age estimated using fixed-effects regressions and other methods. We can also assess the extent of variation across players in the age of peak performance.
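The per-player estimation step can be sketched as follows. This is a minimal illustration, not the estimation code used for the results: the function name `fit_quadratic_peak` and the data are hypothetical, and ordinary least squares via `numpy.polyfit` is assumed. For a concave fit (β_{2i} < 0) the peak age is the stationary point −β_{1i}/(2β_{2i}).

```python
import numpy as np

def fit_quadratic_peak(ages, ppg):
    """Fit PPG = b0 + b1*age + b2*age^2 by least squares.

    Returns (b0, b1, b2, peak_age). The peak_age is the stationary point
    of the fitted quadratic; it is a maximum only when b2 < 0.
    """
    ages = np.asarray(ages, dtype=float)
    ppg = np.asarray(ppg, dtype=float)
    # np.polyfit returns coefficients from the highest power down.
    b2, b1, b0 = np.polyfit(ages, ppg, deg=2)
    peak_age = -b1 / (2.0 * b2)
    return b0, b1, b2, peak_age

# Hypothetical player: points lie exactly on a concave quadratic
# constructed to peak at age 28 (ages measured as of January 1).
ages = np.array([23.5, 24.5, 25.5, 26.5, 27.5, 28.5, 29.5, 30.5])
ppg = 0.80 - 0.01 * (ages - 28.0) ** 2

b0, b1, b2, peak_age = fit_quadratic_peak(ages, ppg)
print(f"b2 = {b2:.4f}, estimated peak age = {peak_age:.2f}")
```

With real data the points scatter around the fitted curve, so the sign of the estimated curvature has to be checked before the stationary point is read as a peak.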

This method is a simplified version of the method used in Albert (1999), Albert (2009), and Lavieri et al. (2012). In those papers, individual aging functions are jointly estimated for all subjects in the data under the assumption that the regression parameters of the individual aging functions are drawn from a particular common prior distribution. Imposing this prior distribution has the effect of smoothing individual trajectory estimates toward the average trajectory while incurring the cost of imposing more external structure on the estimation process. The simpler version we use is sufficient to provide a good indication of the extent of individual variation and the effect of allowing for individual variation on estimated peak ages.

The primary difficulty with this method is that for many players we have only a few observations – not enough to estimate a quadratic function with much reliability. For example, some players first played in 2010–2011 and then played again in 2011–2012 – the final year of our sample period. As we have only two observations for such a player, we cannot meaningfully estimate an individual quadratic age-performance relationship for him. STATA will provide an estimate – defaulting to a straight-line relationship that “fits” the two points perfectly. If the player improves over this period, the implied peak age is then infinite.

We clearly need some minimum number of years for each player that will allow meaningful estimation of a quadratic function. Requiring more years provides more reliable individual estimates but reduces the number of eligible players. We find that 5 years balances this trade-off effectively and there is little change in the results if we use 4 years or 6.
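The eligibility rule can be sketched as a simple filter on player-season records. The record layout `(player_id, age, ppg)` and the function name `eligible_players` are assumptions made for illustration; only the 5-year threshold comes from the text.

```python
from collections import defaultdict

def eligible_players(records, min_seasons=5):
    """Group player-season records and keep players with enough seasons.

    records: iterable of (player_id, age, ppg) tuples, one per player-season.
    Returns a dict mapping player_id to a list of (age, ppg) sorted by age.
    """
    seasons = defaultdict(list)
    for player_id, age, ppg in records:
        seasons[player_id].append((age, ppg))
    return {p: sorted(obs) for p, obs in seasons.items() if len(obs) >= min_seasons}

# Hypothetical records: player "A" has 5 seasons, player "B" only 2.
records = [("A", 23.5 + k, 0.50 + 0.02 * k) for k in range(5)]
records += [("B", 20.5, 0.30), ("B", 21.5, 0.35)]

kept = eligible_players(records)
print(sorted(kept))
```

Re-running the filter with `min_seasons=4` or `min_seasons=6` is one way to repeat the sensitivity check described above.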

With a minimum time requirement of 5 years, player trajectories for most players can be reasonably estimated. However, there are some players whose trajectories cause a problem. There are several ways to handle such trajectories and they are few enough in number that our results are not significantly affected regardless of the method we use. In addition, as a check, we use the approach suggested in Lavieri et al. (2012) of comparing estimated peak ages with the actual ages at which players achieve their peak performance. Most estimated peak ages are within 2 years of the actual peak age.

There are three general classes of problem cases. One such class arises when the estimated quadratic function is convex rather than concave, implying a peak age that approaches infinity. A closely related problem arises if the estimated trajectory is concave and increasing but is close enough to being linear that a player’s implied peak age is absurdly large – perhaps in his 50s or 60s. The third possible problem is that the estimated trajectory may be concave but decreasing, implying an absurdly young peak age. For all three of these types of problem we use the player’s actual age of peak performance instead of the estimated age of peak performance. (Alternatively, we can just drop all these cases, which yields very similar results.)
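The three problem classes and the fallback rule can be sketched as below. The cutoffs `min_age` and `max_age` are assumptions introduced for illustration; the text describes peak ages only as "absurdly" young or old and does not state numerical thresholds. The function name and data are likewise hypothetical.

```python
import numpy as np

def peak_age_with_fallback(ages, ppg, min_age=18.0, max_age=40.0):
    """Return (peak_age, label) for one player.

    Uses the estimated quadratic peak when it is plausible; otherwise
    falls back to the age of the best observed season.
    min_age and max_age are illustrative cutoffs, not values from the text.
    """
    ages = np.asarray(ages, dtype=float)
    ppg = np.asarray(ppg, dtype=float)
    actual_peak = float(ages[np.argmax(ppg)])
    b2, b1, _ = np.polyfit(ages, ppg, deg=2)
    if b2 >= 0:
        # Convex (or linear) fit: no interior maximum.
        return actual_peak, "convex"
    estimated = -b1 / (2.0 * b2)
    if estimated > max_age:
        # Concave but nearly linear and increasing.
        return actual_peak, "implausibly old"
    if estimated < min_age:
        # Concave but decreasing over the observed ages.
        return actual_peak, "implausibly young"
    return float(estimated), "estimated"

# Hypothetical trajectories, one per case.
a = np.array([22.5, 23.5, 24.5, 25.5, 26.5, 27.5])
typical = 0.80 - 0.01 * (a - 26.0) ** 2                        # peaks inside the range
near_linear = 0.30 + 0.06 * (a - 22.5) - 0.001 * (a - 22.5) ** 2  # stationary point at 52.5
declining = 0.70 - 0.08 * (a - 22.5) + 0.005 * (a - 22.5) ** 2    # slightly convex

for name, y in [("typical", typical), ("near_linear", near_linear), ("declining", declining)]:
    print(name, peak_age_with_fallback(a, y))
```

Dropping the flagged players instead of substituting the observed peak age corresponds to the alternative mentioned in parentheses above.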

Figure 7 shows two typical player trajectories for points per game. The dots show the actual performances and the solid lines show the estimated quadratic performance trajectories. Brendan Morrison (red dots) was a very good player who broke into the league at age 23, improved over the next few years, had his best year at age 27 (when he was 27.4 as of January 1) and then declined. His estimated peak age is 27.97. Kevyn Adams (blue) was a good player but not as successful as Morrison, entering the league later, retiring earlier, and scoring less. His estimated peak age is, however, very similar at 27.60.

Figure 7: Typical performance trajectories.

Figure 8 illustrates two types of trajectory that cause a problem in estimating a peak age. Eric Dazé (blue) was a young player at the start of our sample period (22.5 as of Jan. 1, 1998) who improved steadily over the next few years. However, he had continuing back problems, with three back surgeries in 5 years. He played a full year at age 27.5 but his worsening back problems allowed him to play only a small number of games the following season – not enough to be in our data set for that season – and he retired shortly after. His trajectory is only slightly concave, leading to an implausible implied peak age of 54.36. We therefore use the age of observed peak performance (26.50) instead of the predicted peak age.

Figure 8: Atypical performance trajectories.

The other player is Marek Svatos (red), regarded as a future star during his rookie season at age 23.5 (as of January 1). His second season was disappointing, partly attributed to nagging groin injuries, but his performance continued to be disappointing and he was dropped from being a “top 6” forward (top two lines) to being in the “bottom 6” (3rd and 4th lines) with correspondingly reduced ice time. He retired from the league after his 5th season. His estimated trajectory is slightly convex so there is no interior peak age. There are also a very few players who, like Svatos, have a downward-sloping trajectory, but whose estimated trajectories are slightly concave instead of slightly convex, which implies a finite but absurdly young peak age. For both these types of trajectory – convex estimated aging functions or absurdly young peak ages – we use the age of actual peak performance in the peak age distribution analysis.

A reader might ask why we do not use the full distribution of actual peak ages instead of the estimated peak ages. In fact, the distribution of actual peak ages is not very different from the distribution of estimated peak ages. However, using estimated ages is better for two reasons. First, the actual peak age uses just one data point for each player. To the extent that one particularly good year (a “career year”) is partly the result of good luck (a positive error), this apparent peak age can be misleading. The *estimated* peak age is based on a player’s entire career trajectory, which is at least 5 years and is often 10 years or more. Thus the estimated peak ages are based on much more information and are less prone to random errors.

This advantage of using estimated peak age is illustrated by Kevyn Adams in Figure 7. It is unlikely that Adams’ true ability rose sharply from age 25 to 26 then fell sharply at age 27, then rose again at ages 28 and 29. It is much more likely that his underlying ability followed a path close to the smooth estimated trajectory and that deviations from this trajectory were the results of random events – a few lucky or unlucky bounces, playing with better or worse teammates, changes in how the coach used him, etc.

Second, using estimated ages allows the use of more players. One problem with actual ages is that players in the data might not yet have reached their peak. For example, a player who enters the league at age 20 and plays 5 years – through age 24 – up to the end of our sample period (2011–2012) will typically not yet have reached his peak. In our data his apparent peak age will necessarily be 24 or less. Similarly, someone who was past his peak in the first year of our sample period (1997–1998) will have an apparent peak age that is older than his true peak age. To guard against this we would have to drop a large number of players. Using estimated trajectories has the advantage of allowing us to “predict” a peak age that has not yet been achieved or that was achieved before the sample period began.

Table 9 summarizes the main characteristics of the distribution of peak scoring ages for forwards and defencemen and of peak save percentage performance for goalies.

| Position | Number | 25th percentile | Median | Mean | 75th percentile | Std. dev. |
|---|---|---|---|---|---|---|
| Forwards | 566 | 25.2 | 27.6 | 27.7 (0.143) | 29.9 | 3.32 |
| Defence | 309 | 25.6 | 28.0 | 28.2 (0.254) | 31.0 | 3.88 |
| Goalies | 71 | 26.0 | 28.6 | 29.0 (0.444) | 31.7 | 3.89 |

Table 9: Age of peak performance based on individual aging functions. Bootstrapped standard errors for the means are shown in parentheses.
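A bootstrapped standard error for a mean peak age can be sketched as follows. The resampling scheme is an assumption: a simple nonparametric bootstrap that resamples players with replacement, which may differ from the procedure behind the reported figures. The peak ages are simulated, with only the sample size, mean, and standard deviation chosen to resemble the forwards row.

```python
import numpy as np

def bootstrap_se_of_mean(values, n_boot=2000, seed=0):
    """Standard deviation of the sample mean across bootstrap resamples."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n = values.size
    # Each row of idx is one resample of n players drawn with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = values[idx].mean(axis=1)
    return float(boot_means.std(ddof=1))

# Simulated peak ages (hypothetical): n = 566, mean 27.7, s.d. 3.32.
rng = np.random.default_rng(1)
peak_ages = rng.normal(27.7, 3.32, size=566)

se_boot = bootstrap_se_of_mean(peak_ages)
se_analytic = peak_ages.std(ddof=1) / np.sqrt(peak_ages.size)
print(f"bootstrap SE = {se_boot:.3f}, analytic SE = {se_analytic:.3f}")
```

For a plain sample mean the bootstrap and the usual s/√n formula should agree closely; the bootstrap becomes more useful when the resampling also repeats the per-player estimation step.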

Table 9 shows that the average ages of peak performance obtained when allowing for individual aging functions are very similar to those obtained using fixed-effect regression methods. The results reinforce the finding that forwards peak in points per game between 27 and 28. For defencemen this individual player method provides an estimated mean peak age of 28.2 rather than the estimate of 28.9 obtained with our fixed-effects quadratic regression.

This approach indicates a difference in the trajectories of forwards and defencemen that is broadly consistent with our earlier analysis. There are small differences between these positional categories at the 25th percentile and at the median, and the differences open up more at the 75th percentile, capturing the fact that quite a few defencemen peak in their 30s, while that pattern is much less common for forwards. In Figure 9 we show the distributions of estimated peak ages for forwards and defencemen.

Figure 9: Histograms of estimated scoring peak ages for forwards and defencemen. (The dark line is the estimated kernel density for the peak age distribution.)

Estimated individual player trajectories also allow us to revisit the question of whether better players peak later. We consider each player’s peak scoring performance, estimated peak age, and position. The estimated regression is PeakPPG = 0.251 + 0.0062 (Peak Age) + 0.252 (Forward), where Forward is a dummy variable that takes on value 1 for forwards. The *t*-statistic for Peak Age is 2.45, indicating statistical significance at better than the 0.05 level. Thus peak age has a small but positive and statistically significant effect on maximum performance. (We obtain essentially the same result using a player’s average scoring performance over his career as the dependent variable instead.)
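As a rough illustration of the size of this effect, the estimated coefficient can be applied to a hypothetical difference in peak age. The 3-year gap and the 82-game season length are assumptions chosen for illustration, not figures from the analysis.

$$\Delta\text{PeakPPG} = 0.0062 \times 3 = 0.0186 \text{ points per game}, \qquad 0.0186 \times 82 \approx 1.5 \text{ points per season}$$

By comparison, the Forward coefficient implies a peak PPG that is 0.252 higher for a forward than for a non-forward with the same estimated peak age, so the peak-age effect is small relative to the positional difference.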

This finding is also consistent with the results from the participation method (Section 3). The participation method identifies the peak age using the performance of marginal players. If the weakest NHL players (i.e., the marginal players) peak earliest, this partially explains why the participation method suggests younger peak ages than the regression methods do.

There are many additional things that could be done to investigate individual variation.^{9} However, our analysis is sufficient to show that allowing for individual aging functions yields estimated peak ages similar to those from our fixed-effect regression methods. To the extent that the methods do differ, the fixed-effect regressions have the advantage of being able to use more data. The alternative of estimating individual player functions, as done in this section, requires enough observations for each player to estimate such a function – and we settled on 5 years as the best number. This 5-year requirement forces us to drop just over half the players in the data, taking us from an analysis using 2033 players to one using just 946. In principle, a player who plays for only 2 or 3 years provides relevant information regarding how much a player improves from 1 year to the next. The fixed-effects methods allow us to include all such information in the analysis.
