Temporal disaggregation or interpolation models are used to produce high-frequency estimates benchmarked on a low-frequency time series using high-frequency indicators. In particular, these models are currently used in a number of countries (e.g. France, Italy, Spain, Portugal, Switzerland) to compute quarterly national accounts from annual estimates and quarterly or monthly indicators.
The models currently used are mostly static, i.e. do not include lags of the unobserved high-frequency variable. Generalizing the maximum likelihood solution for static models to dynamic models (i.e. models including lags of the unobserved high-frequency variable) poses some technical difficulties due to unobservability. In this paper I show that these difficulties can be solved and that the likelihood of the model can be expressed in closed form and maximized (without using a Kalman filter).
Because the left-hand side of the model is observed only at low frequency, the estimated high-frequency residual has non-standard properties. These properties depend on the autocorrelation structure assumed for the residual, and some assumptions imply inconvenient statistical artifacts. [1] In this paper, I investigate the statistical properties of the high-frequency residual in the case of dynamic models and reach results and recommendations similar to those of the static case. I show (recall) that it is impossible to find a residual that fits the stochastic structure assumed for it. Ex-post, the dynamic structure of the residual combines the assumption made for it with the autocorrelation structure of the model. This is a property of optimal signal extraction in unobserved component models. It does not disqualify the estimation procedure but should be kept in mind when using this method. The same recommendation as with static models ensues: for practical reasons, even in dynamic models, errors should not be assumed to be white noise. Estimating a model with high autocorrelation of the residual limits the impact of the unexplained component on the high-frequency profile of the result.
As I show with an example, an appealing application of dynamic models is the production of quarterly accounts of stock variables such as productive capital, benchmarked on annual accounts. In this application, I interpolate annual stocks of computers and communication equipment in non-financial corporations in France.
Temporal disaggregation or interpolation problems were studied by Friedman (1962) in the case of commercial banks’ stock of money, but were more formally introduced in a seminal paper by Chow and Lin (1971). The question of the dynamic structure of the model’s residual has been central in the development of this literature by Bournay and Laroque (1979), Litterman (1983), and Fernandez (1981). More recent developments (Grégoir 1994; Salazar, Smith, and Weale 1997; Santos Silva and Cardoso 2001; Proietti 2006) allow for richer dynamics in the model by introducing lags of the unobserved high-frequency variable (dynamic models). However, solving dynamic models within the Chow-Lin framework is no easy task. Di Fonzo (2003) notes for the earliest solutions that the algorithms needed to calculate the estimates and their standard errors seem rather complicated and not straightforward to implement in a computer program. Further discouraging the potential user, Liu and Hall (2001) show on US data that the gains from added complexity may in fact be limited. Santos Silva and Cardoso (2001) overcome this difficulty and provide a straightforward solution to dynamic models under the assumption of white noise residuals. Proietti (2006) provides another solution based on state-space models and the numerical approximation of the likelihood by the Kalman filter. This tool is flexible enough to encompass most of the pre-existing models and to allow further developments such as models in logarithms.
The present paper describes a general solution for dynamic models where the residual can be autocorrelated but the likelihood can still be expressed in closed form. In this sense, the method proposed fills a gap between Santos Silva and Cardoso (2001) and Proietti (2006).
The remainder of this paper is organized as follows: Section 2 presents a general solution to dynamic models, Section 3 investigates the ex post stochastic properties of the model’s residual and Section 4 presents an application of this method on National Accounts data.
2 A Framework for Dynamic Models
2.1 Dynamic Models: An AR(1) Introduction with Stocks
Dynamic models are an intuitive framework to describe the dynamics of stocks. Macroeconomists, for instance, describe the dynamics of the capital stock with the following equation:
with the flow of investment and a depreciation rate.
An econometrician can adapt such dynamics to compute stocks at a higher frequency:
with a proxy for the flows in and out of this stock [2] and a depreciation rate ().
Let f denote the periodicity of the flows (e.g. 12 months or 4 quarters) and N the number of years for which the stocks are measured. By assumption, only is known (i.e. the stock every f periods), while flows are measured every period . The purpose of stock interpolation is to estimate (every period) using eq. (2) and , given the sub-sample .
To estimate the model by maximum likelihood, one needs to abstract from the unobserved variable (the high-frequency capital stock) and summarize the constraint imposed on the residuals by the model. To do so, one can iterate eq. (2) in the following way: (3)(4)
Let denote , one then isolates a combination of residuals on the one side and a combination of observed variables on the other:
Matrix notation. Let denote the vector of annual stocks (from year 0 to N) and the vectors of flows. Let denote the vector . And let denote the Kronecker product of by . Equation (5) becomes:
with the identity matrix of size . With straightforward notations, I can write (7)
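The iteration and its matrix form can be checked numerically. The following is a minimal sketch under assumed notation, none of which is taken from the text above: `rho` stands for one minus the depreciation rate, `beta` for the coefficient on the flow proxy `x`, `eps` for the high-frequency residual, and `c` for the weight vector (rho^(f-1), ..., rho, 1) whose Kronecker product with the identity matrix aggregates the f sub-periods of each year. All numerical values are illustrative.

```python
import numpy as np

def aggregation_matrix(rho, f, n_years):
    """Return I_N kron c', with c = (rho**(f-1), ..., rho, 1) (assumed notation)."""
    c = rho ** np.arange(f - 1, -1, -1)
    return np.kron(np.eye(n_years), c[None, :])

rng = np.random.default_rng(0)
f, n_years = 4, 6                 # quarterly data, six years
rho, beta = 0.95, 0.8             # illustrative parameter values
T = f * n_years

x = rng.normal(10.0, 1.0, size=T)     # high-frequency flow indicator
eps = rng.normal(0.0, 0.5, size=T)    # high-frequency residual

# Simulate the high-frequency stock: k_t = rho * k_{t-1} + beta * x_t + eps_t
k = np.empty(T + 1)
k[0] = 100.0
for t in range(1, T + 1):
    k[t] = rho * k[t - 1] + beta * x[t - 1] + eps[t - 1]

k_annual = k[::f]                 # the stock is observed every f periods only
C = aggregation_matrix(rho, f, n_years)

# Iterating the recursion f times links consecutive annual stocks
lhs = k_annual[1:] - rho**f * k_annual[:-1]
rhs = C @ (beta * x + eps)

# Observable combination of residuals: u = C eps
u = lhs - C @ (beta * x)
print(np.allclose(lhs, rhs), np.allclose(u, C @ eps))
```

The vector `u` is the only information on the residual that can be recovered from the annual stocks and the indicator, given the parameters.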
Maximum likelihood. To compute the likelihood of the model, one should assume a stochastic structure for the high-frequency residual . Let denote its variance-covariance matrix. The maximization of the likelihood of the model is:
with the vector of and the simplifying notation . Note also that depends on the parameters of the model and the observed variables.
A formal solution for
For any value of the parameters to be estimated, the optimization with respect to yields the following solution: (10)
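As an illustration, assume that the inner problem is the minimization of the quadratic form e' V^{-1} e subject to the linear constraint C e = u, where `V` is the assumed variance-covariance matrix of the residual, `C` the aggregation matrix and `u` the observed annual combination (all three names are assumptions of this sketch). The standard solution of such a problem is V C' (C V C')^{-1} u. The sketch below computes it and cross-checks it against a direct solve of the first-order conditions.

```python
import numpy as np

def smooth_residual(C, V, u):
    """Minimise e' V^{-1} e subject to C e = u: returns V C' (C V C')^{-1} u."""
    VCt = V @ C.T
    return VCt @ np.linalg.solve(C @ VCt, u)

f, n_years, rho, phi = 4, 5, 0.9, 0.6   # illustrative values
T = f * n_years
c = rho ** np.arange(f - 1, -1, -1)
C = np.kron(np.eye(n_years), c[None, :])

# Example covariance: stationary AR(1) with unit innovation variance
idx = np.arange(T)
V = phi ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - phi**2)

rng = np.random.default_rng(1)
u = rng.normal(size=n_years)
e_hat = smooth_residual(C, V, u)

# Cross-check with the first-order conditions of the Lagrangian:
#   V^{-1} e - C' lam = 0   and   C e = u
kkt = np.block([[np.linalg.inv(V), -C.T],
                [C, np.zeros((n_years, n_years))]])
sol = np.linalg.solve(kkt, np.concatenate([np.zeros(T), u]))
print(np.allclose(C @ e_hat, u), np.allclose(sol[:T], e_hat))
```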
Concentrated likelihood of this problem. Knowing the functional form of , to estimate the model’s parameters, problem (8) can be summarized into: (11)
with the variance-covariance matrix of .
In problem (11), depends on the low-frequency variables and the parameters and , is a variance-covariance matrix which is by assumption nonsingular and depends on a limited set of parameters (e.g. the innovation’s variance and autocorrelation) and depends on and . As a consequence, the analytical formula for the likelihood can be written and maximized directly. It is therefore unnecessary to resort to a state-space representation and to the numerical approximation of the likelihood by the Kalman filter to find the optimal values for the parameters. Note that, although it is close to the likelihood function of , the objective function to be maximized is slightly different (the determinant is that of not ).
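A direct maximization of this kind can be sketched as follows. The sketch assumes the AR(1) stock model with a Gaussian AR(1) residual, so that the observed annual combination u is treated as Gaussian with covariance C V C'. The function name `concentrated_loglik`, the parametrization (`rho`, `beta`, `phi`, `sigma`) and the simulated data are assumptions made for illustration, and a coarse grid search stands in for a proper numerical optimizer.

```python
import numpy as np

def ar1_cov(T, phi, sigma):
    """Covariance matrix of a stationary AR(1) with innovation s.d. sigma."""
    idx = np.arange(T)
    return sigma**2 * phi ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - phi**2)

def concentrated_loglik(rho, beta, phi, sigma, k_annual, x, f):
    """Gaussian log-likelihood of u = C eps, whose covariance is C V C'."""
    n_years = len(k_annual) - 1
    c = rho ** np.arange(f - 1, -1, -1)
    C = np.kron(np.eye(n_years), c[None, :])
    u = k_annual[1:] - rho**f * k_annual[:-1] - C @ (beta * x)
    omega = C @ ar1_cov(f * n_years, phi, sigma) @ C.T
    _, logdet = np.linalg.slogdet(omega)      # determinant of C V C', not of V
    quad = u @ np.linalg.solve(omega, u)
    return -0.5 * (n_years * np.log(2.0 * np.pi) + logdet + quad)

# Simulated data (illustrative values only)
rng = np.random.default_rng(2)
f, n_years = 4, 30
true_rho, true_beta = 0.95, 0.8
T = f * n_years
x = rng.normal(10.0, 1.0, size=T)
eps = rng.normal(0.0, 0.5, size=T)
k = np.empty(T + 1)
k[0] = 100.0
for t in range(1, T + 1):
    k[t] = true_rho * k[t - 1] + true_beta * x[t - 1] + eps[t - 1]
k_annual = k[::f]

# Coarse grid search over rho, the other parameters being held fixed
grid = np.linspace(0.80, 0.99, 20)
ll = [concentrated_loglik(r, true_beta, 0.0, 0.5, k_annual, x, f) for r in grid]
rho_hat = grid[int(np.argmax(ll))]
print(rho_hat)
```

In practice all parameters would be optimized jointly. The point of the sketch is only that each evaluation requires nothing more than the determinant and the inverse of an N-by-N matrix.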
2.2 General Solution
The use of dynamic models in the case of flow variables may seem less intuitive than for stock variables. However, models in first differences or growth rates tend to be more accurate in some flow cases (Eurostat 2013, 5.C31), so a dynamic framework can also be useful for flows.
The dynamic model for flows can be treated in the same framework as presented above. In this section, I derive the general case for a dynamic model with lags which can be applied to either stocks or flows.
with an account and high-frequency indicators. By assumption, only is known (i.e. the cumulated flow every f periods), while the indicators are measured every period .
Using vector notations
Isolating the LHS vector in the RHS of eq. (13) yields the following:
With straightforward notations,
Summing the sub-periods of each year with matrix makes it possible to isolate a combination of high-frequency residuals as a function of the observed variables and the parameters.
Equation (18) summarizes the constraint imposed by the model on the residuals. From this expression, propositions 1 and 2 can be directly applied to estimate all parameters (, and the variance and autocorrelation of ) and both the high-frequency account and residual.
Passing from flows to stocks requires defining the low-frequency data not as an aggregation of the high-frequency ones but as an observation of the high-frequency data every periods. This is simply done by replacing matrix by in eq. (17) to define the aggregation constraint (18). Propositions 1 and 2 can then be applied to estimate the parameters of either model by maximum likelihood.
In the introductory example with a stock variable and only one lag, the initial points () can be taken from the observed low-frequency data. With a flow case or more than one lag, the initialization values are unobserved and have to be passed as additional parameters, a situation identical to standard moving average models.
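The difference between the flow and the stock constraint can be illustrated numerically. The sketch below assumes a model with p lags written as y_t = sum_k rho_k y_{t-k} + beta x_t + eps_t and stacks it as A y = beta x + eps + d, where `A` is a lower-triangular lag matrix and `d` collects the contribution of the pre-sample values. The names `lag_matrix`, `aggregator`, `pre` and `d` are hypothetical and all values are illustrative.

```python
import numpy as np

def lag_matrix(rhos, T):
    """Lower-triangular A with ones on the diagonal and -rho_k on the k-th subdiagonal."""
    A = np.eye(T)
    for k, r in enumerate(rhos, start=1):
        A -= r * np.eye(T, k=-k)
    return A

def aggregator(f, n_years, kind):
    """I_N kron 1' for flows (annual sum), I_N kron (0, ..., 0, 1) for stocks (last period)."""
    w = np.ones(f) if kind == "flow" else np.eye(f)[-1]
    return np.kron(np.eye(n_years), w[None, :])

rng = np.random.default_rng(3)
f, n_years = 4, 5
T = f * n_years
rhos, beta = [0.5, 0.2], 1.5          # two lags, illustrative values
p = len(rhos)
x = rng.normal(size=T)
eps = rng.normal(scale=0.3, size=T)
pre = np.array([0.8, 1.0])            # p pre-sample values, chronological order

# Simulate the high-frequency series recursively
y_ext = np.concatenate([pre, np.zeros(T)])
for t in range(T):
    i = t + p
    y_ext[i] = sum(r * y_ext[i - k] for k, r in enumerate(rhos, start=1)) + beta * x[t] + eps[t]
y = y_ext[p:]

# Contribution of the pre-sample values to the first p equations
d = np.zeros(T)
for t in range(p):
    for k in range(t + 1, p + 1):
        d[t] += rhos[k - 1] * y_ext[p + t - k]

A = lag_matrix(rhos, T)
checks = {}
for kind in ("flow", "stock"):
    C = aggregator(f, n_years, kind)
    Y = C @ y                                         # observed low-frequency data
    lhs = C @ np.linalg.solve(A, eps)                 # combination of residuals
    rhs = Y - C @ np.linalg.solve(A, beta * x + d)    # observables and parameters
    checks[kind] = bool(np.allclose(lhs, rhs))
print(checks)
```

Only the aggregation matrix changes between the two cases. The rest of the estimation machinery is unaffected.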
3 Properties of the Estimated Residual at High Frequency
3.1 Theoretical Properties
Once the model is estimated, only some information at low frequency on the residual (the combination ) is known. Equation (10) makes it possible to smooth this residual at the higher frequency. As is an exogenous and unexplained component, in practice one may want to limit its influence on the profile of the interpolated stock within the years. Otherwise, interpretation or econometric results based on interpolated stocks could be discredited by potential statistical artifacts. The typical issue for quarterly national accounts is a residual with a jump every first quarter. As I show in Section 3.1 and with the examples in Section 3.2, the statistically optimal distribution of the high-frequency residual in dynamic models can induce undesirable features such as this jump every first quarter.
Theoretical variance-covariance matrix of the estimated residual. Given formula (10), the estimated residual is equal to while by definition of , . Thus the theoretical variance-covariance matrix of the estimated process is:
It has the following property:
Hence, does not have the same stochastic properties as , but combines the assumption made for (through the matrix ) with the dynamics of the model ().
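This can be verified numerically under the assumption that the estimated residual is the linear transform P e of the true residual, with P = V C' (C V C')^{-1} C, where `V` is the assumed covariance of the residual and `C` the aggregation matrix (assumed notation). The sketch below uses an AR(1) covariance with illustrative values and checks that the covariance of the estimate has rank N, differs from V, and reproduces the covariance of the observed annual combination.

```python
import numpy as np

f, n_years, rho, phi = 4, 5, 0.9, 0.5   # illustrative values
T = f * n_years
c = rho ** np.arange(f - 1, -1, -1)
C = np.kron(np.eye(n_years), c[None, :])

idx = np.arange(T)
V = phi ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - phi**2)   # assumed AR(1) covariance

omega = C @ V @ C.T                              # covariance of u = C e
P = V @ C.T @ np.linalg.solve(omega, C)          # estimated residual = P e
V_hat = P @ V @ P.T                              # covariance of the estimate
V_hat_closed = V @ C.T @ np.linalg.solve(omega, C @ V)

print(np.allclose(V_hat, V_hat_closed))          # both expressions coincide
print(np.linalg.matrix_rank(V_hat))              # rank N, whereas V has rank T
print(np.allclose(C @ V_hat @ C.T, omega))       # aggregated covariance is preserved
print(np.allclose(V_hat, V))                     # the estimate does not have covariance V
```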
An application of the Wiener-Kolmogorov optimal signal extraction theory. The previous result can be linked to the signal extraction framework developed by Wiener and Kolmogorov.
Conditional on the values of the parameters to be estimated, the problem can be written as follows: (22)
observed every period is the sum of signals (Figure 1).
On a two-sided infinite sample, the optimal filter to extract each of these signals takes the form:
with , the lag operator for annual data.
We know from signal extraction theory (e.g. Whittle 1963) that depends on the covariance generating functions of the processes .
In the simplest case where is an i.i.d. white noise of variance the formula simplifies to:
Given the i.i.d. hypothesis, this result is identical to the solution in finite sample detailed in Section 3.2.1.
Since if then , the autocovariance generating function of is:
(29) (30) from which the property is not true in general.
The covariance generating function of with is: (31) (32) (33)
and here again, the property is not true in general.
In particular in the white noise case, even though we assumed that there is no correlation between sub-periods within and across the years (), the estimated process shows correlation within the years ().
This property does not invalidate the estimation procedure but is a general result of optimal signal extraction in unobserved component models. This property is also not due to the sample size.
3.2 Three Examples for the High-Frequency Residual
To estimate the value of the depreciation rate and the vector , one should assume a stochastic structure for the high-frequency residual in order to maximize the concentrated likelihood eq. (11).
3.2.1 White Noise Residuals ε
It is straightforward to see that is a linear combination of . It is independent of when is white noise (see Figure 1).
If one assumes , then . The log-likelihood to be maximized then becomes:
Given the annual discrepancies , the optimal values for the residual at high frequency () are given by eq. (10), which simplifies into:
As a consequence, the sequence of is
which is identical to the result in infinite sample eq. (28).
Ex-post, the high-frequency residuals, although assumed to be white noise, are not stationary within each year (but follow a geometric series with coefficient ). Indeed, the variance-covariance matrix of the process once reconstructed at high frequency reads
One can recognize the Kronecker product of the variance-covariance matrix of an i.i.d. white noise at low frequency and that of an autocorrelated but non-stationary process at high frequency.
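The white noise case can be reproduced in a few lines. The sketch assumes the AR(1) stock model with aggregation weights c = (rho^(f-1), ..., rho, 1) and a residual covariance equal to sigma^2 times the identity (assumed notation, illustrative values). In that case the smoothed residual within year n is u_n c / (c'c), so it follows a geometric profile within each year and breaks at each change of year.

```python
import numpy as np

f, n_years, rho, sigma = 4, 3, 0.9, 1.0      # illustrative values
c = rho ** np.arange(f - 1, -1, -1)
C = np.kron(np.eye(n_years), c[None, :])
u = np.array([1.0, -2.0, 0.5])               # annual discrepancies (made up)

# With V = sigma^2 I, sigma^2 cancels out of V C' (C V C')^{-1} u
e_hat = C.T @ np.linalg.solve(C @ C.T, u)
expected = np.kron(u, c / (c @ c))           # year by year: u_n * c / (c'c)

# Covariance of the estimate and its Kronecker form
cov_hat = sigma**2 * C.T @ np.linalg.solve(C @ C.T, C)
kron_form = sigma**2 * np.kron(np.eye(n_years), np.outer(c, c) / (c @ c))

# Geometric profile inside each year: consecutive ratios all equal 1 / rho
within_year = e_hat.reshape(n_years, f)
ratios = within_year[:, 1:] / within_year[:, :-1]
print(np.allclose(e_hat, expected), np.allclose(cov_hat, kron_form))
print(np.allclose(ratios, 1.0 / rho))
```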
Moreover, estimated shocks exhibit breaks every periods. The magnitude of these breaks increases with the variance of . Figure 2 illustrates this undesired property on a simulated sample of 15 years.
Figure 3 shows that the same result holds for the optimal method applied to flow variables in a static model (when ), which justified the developments proposed by Bournay and Laroque (1979), Fernandez (1981), and Litterman (1983). Figure 3 also shows that even in a dynamic model for flows the solution with a white noise hypothesis exhibits jumps every first quarter. Using state-space models on Swiss data to estimate a monthly GDP, Cuche and Hess (2000) may have wrongfully disregarded dynamic specifications for this reason: they only consider white noise residuals in these cases and find that the outcome of such models generates an unexplained cyclical pattern. Although residuals are white noise in both cases, Figure 3.a is different from Figure 2 because in a stock model only is observed while in a flow model is. In particular, one can check that when the dynamic component of the model diminishes (), the treatment of the discrepancy tends to simply divide the annual residual by the number of sub-periods (4 quarters or 12 months), as is the case for the static model. In other words, it is only with highly autocorrelated models that the impact at higher frequency of the estimated residual on the final estimates can be minimized.
3.2.2 Autocorrelated Residuals (AR(1), Generalization of (Chow and Lin 1971))
I now assume that the high-frequency residual follows an AR(1) process:
with a white noise.
Figure 4 shows that when the high-frequency residual is assumed to follow an AR(1) process, the estimation yields a much smoother result. However, while the underlying AR(1) is slightly autocorrelated (), the estimated AR(1) is markedly so. As expected from formula (19), the estimated shocks at high frequency encompass both dynamics, that of the theoretical shock and that of the model.
3.2.3 Random Walk Residual, Generalization of (Fernandez 1981)
I assume that:
with a stationary process of innovations. In this particular case, the likelihood to be maximized should be modified into: (41)
with the variance–covariance matrix of the innovation and the first difference operator.
The proof of this proposition is given in the appendix; see also Insee’s Quarterly National Accounts methodology (Insee, Quarterly National Accounts Division 2012).
A tractable approximation for this solution is to use a square version of so that is invertible:
in which case the solution for is:
Following Fernandez (1981), I assume to be white noise, , so that this solution simplifies into:
Fernandez (1981) shows in the case of static models that assuming the residuals follow a random walk is equivalent to distributing an annual discrepancy as in Denton (1971). Equation (46), analogous to Denton’s result, provides a similar conclusion for dynamic models. The practical advantage of this assumption is that, by construction of the method proposed by Denton (1971), it minimizes the variations of the residual from period to period (see Figure 5). Hence, ex-post, the profile of the high-frequency estimate is impacted as little as possible by the unexplained component of the model.
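The smoothing property of the random walk assumption can be illustrated as follows. The sketch assumes a square first-difference matrix `D` whose first row keeps the first value unchanged, so that a random walk started at zero has covariance proportional to (D'D)^{-1}, and it reuses the AR(1) stock aggregation weights (assumed notation, illustrative values). The resulting estimate minimizes the sum of squared period-to-period variations, including the square of the first value, among all residuals satisfying the annual constraint, and is therefore at least as smooth as the white noise solution by that measure.

```python
import numpy as np

def smooth_residual(C, V, u):
    """Minimise e' V^{-1} e subject to C e = u: returns V C' (C V C')^{-1} u."""
    VCt = V @ C.T
    return VCt @ np.linalg.solve(C @ VCt, u)

f, n_years, rho = 4, 4, 0.9                  # illustrative values
T = f * n_years
c = rho ** np.arange(f - 1, -1, -1)
C = np.kron(np.eye(n_years), c[None, :])
u = np.array([1.0, 1.5, 0.5, 2.0])           # annual discrepancies (made up)

# Square first-difference matrix: row 1 keeps e_1, row t gives e_t - e_{t-1}
D = np.eye(T) - np.eye(T, k=-1)
V_rw = np.linalg.inv(D.T @ D)                # random walk, unit innovation variance

e_rw = smooth_residual(C, V_rw, u)           # random walk assumption
e_wn = smooth_residual(C, np.eye(T), u)      # white noise assumption

def roughness(e):
    """Sum of squared first differences (including the squared first value)."""
    return float(np.sum((D @ e) ** 2))

print(roughness(e_rw), roughness(e_wn))
```

Both estimates satisfy the same annual constraint. They differ only in how the discrepancy is spread within and across years.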
3.2.4 ARIMA(1,1,0), Generalization of (Litterman 1983)
I assume that:
with a stationary process of innovations and the first difference operator.
Following Litterman (1983), I assume that the residual follows an ARIMA(1,1,0), so that the solution for E is:
with the covariance matrix of a first-order autocorrelated process. A simulation result is presented in Figure 6.
4 Example: Non-financial Corporations’ Capital in Computers and Communication Equipment
Using annual data for non-financial corporations from the annual accounts and quarterly investment from the quarterly accounts, I estimate the following model:
with the stock of capital in computers and communication equipment and the gross fixed capital formation (GFCF) in computers, electronic and optical products.
Parameter is equal to one minus a constant quarterly depreciation rate.
Parameter accounts for the fact that investment and capital are taken in slightly different nomenclatures and also that there is depreciation within the first period when a piece of equipment is bought. Parameters and are the autocorrelation of the high-frequency residual (or its first difference in the I(1) case) and the standard error of its innovation.
Rationale for testing a constant depreciation rate. The annual accounts for capital are built using the perpetual inventory method at a very detailed level. This method assumes that the service life of a piece of equipment follows a truncated log-normal distribution calibrated to verify some information on the average duration of the equipment. Between its purchase and its destruction, the equipment is also assumed to be linearly depreciated. Although this method does not assume a constant depreciation rate, the combination of the two preceding hypotheses yields depreciation coefficients which are decreasing and convex and thus can be approximated by a geometric series.
For computers and communication equipment, the assumptions made in the perpetual inventory method in France are best approximated using an OLS estimation with a quarterly depreciation rate of and respectively.
Using quarterly accounts for investment, it is not possible to rely on a perpetual inventory method since the data are not available at the same detailed level. Also, this method requires initializing the time series by “sacrificing” as many points as the maximum duration of the equipment (20 years for communication equipment and 10 years for computers) while with the present method the quarterly time series is initialized by the first annual value available.
I estimated the dynamic model with the following constraints on the parameters: , , and .
The results from three different specifications of the error term in eq. (49) are gathered in Table 1. Based on the likelihood criterion, the favoured model is (ii), with I(1) errors and autocorrelation in its innovations (see optimisation results in Appendix B). The estimated time series is displayed in Figure 7. The depreciation coefficient equals which is consistent with the value estimated for communication equipment from the annual hypothesis of the perpetual inventory method. The standard deviation of the errors is smaller than 3 (million euro of 2005) while, for comparison, the standard error of the quarterly changes in investment is larger than 40 (million euro of 2005). Figure 8 shows the contribution of the error and the investment to quarterly changes in capital. The contributions correspond to the three terms in the general solution (16), respectively the contributions of the flow indicator (investment), the residual and the initial value. The contribution of the residual is consistently negative, yet the underlying innovation has a mean that is not statistically different from 0. This contribution is due in part to the high autocorrelation of the residual but also to the high autocorrelation of the model, which affects the contribution through the multiplication by the inverse of in eq. (16). As for the contribution of the initial value, similarly to a first-order moving average, it gradually declines to 0. It is compared with a counter-factual interpolation without indicator. As expected, the error accounts for a smaller share of the result and of its volatility than investment.
Quarterly investment in divisions 26 and 27 of NAF rev2 (classification of products) is 5 to 10% smaller than the annual investment time series corresponding to computers and communication equipment (assets AN.11321 and AN.11322 in Eurostat’s classification of assets). One would thus expect to be larger than one. Yet the estimated value of is close to one, indicating that there is depreciation of the equipment during the first period when it is purchased, which is consistent with the method used for the annual counterpart. Macroeconomists should keep this result in mind when writing the dynamics of capital in a model.
Proof of Proposition 1. The optimization program can be rewritten:
The first-order condition of this program reads, with the corresponding Lagrange multiplier:
From this system, one can isolate and consequently .
since and are symmetric.
Proof of Proposition 3. The program can be rewritten:
The Lagrangian of this program is:
with the first column of , i.e. a vector of and the other columns. The minimization yields:
From this system, one can isolate and . Once and are known, it is straightforward to compute using matrix .
Chow, Gregory C., and An-loh Lin. 1971. “Best Linear Unbiased Interpolation, Distribution, and Extrapolation of Time Series by Related Series.” The Review of Economics and Statistics 53 (4): 372–375.
Cuche, N., and M. Hess. 2000. “Estimating Monthly GDP in a General Kalman Filter Framework: Evidence from Switzerland.” Economic and Financial Modelling: 153–193.
Denton, Frank T. 1971. “Adjustment of Monthly or Quarterly Series to Annual Totals: An Approach Based on Quadratic Minimization.” Journal of the American Statistical Association: 99–102.
Di Fonzo, T. 2003. “Temporal Disaggregation of Economic Time Series: Towards a Dynamic Extension.” European Commission (Eurostat) Working Papers and Studies. http://www.oecd.org/std/21781422.pdf.
Grégoir, S. I. 1994. “Note on Temporal Disaggregation with Simple Dynamic Models.” Workshop on Quarterly National Accounts Proceedings 141–166. http://ec.europa.eu/eurostat/documents/3888793/5815741/KS-AN-03-014-EN.PDF/284f1001-fd36-4999-b007-a22033e8aaf9.
Insee, Quarterly National Accounts Division. 2012. “Méthodologie des Comptes Trimestriels.” Insee méthode 126. https://www.insee.fr/fr/information/2571301.
Salazar, E., R. Smith, and M. Weale. 1997. “Interpolation Using a Dynamic Regression Model: Specification and Monte Carlo Properties.” National Institute of Economic and Social Research Discussion Papers 126. https://www.niesr.ac.uk/publications/interpolation-using-dynamic-regression-model-specification-and-monte-carlo-properties.
Whittle, P. 1963. Prediction and Regulation by Linear Least-Square Methods. English Universities Press. http://www.jstor.org/stable/10.5749/j.ctttsphx.
[1] The best-known artifact arises with white noise residuals: a jump every first quarter, as recalled in Figure 3b.
[2] may encompass more than one flow indicator (e.g. new registrations of trucks and a measure of purchases of new machinery) and can imperfectly measure the flows (hence the vector of coefficients in eq. 2).
[3] If both and were observed at high frequency, one would use eq. (2) to isolate the residual and maximize its likelihood. In an interpolation (or disaggregation) problem, only eq. (5) can be known about the residual and its likelihood maximization has to be adapted accordingly.