On correlated measurement errors in the Schwartz-Smith two-factor model

Abstract: The Schwartz-Smith two-factor model is commonly used for pricing derivatives in commodity markets. For estimating and forecasting the term structures of futures prices, the logarithm of the commodity spot price is represented as the sum of short-term and long-term factors, which are unobservable state variables. The futures prices, derived as functions of the spot price, lead to a simultaneous set of measurement equations, which is used for joint estimation of the unobservable state variables and the model parameters through a filtering procedure. We propose a modified model where the error terms in the measurement equations are assumed to be serially correlated. In addition, for comparative analysis, the modelling of the logarithmic returns of futures prices is also considered. The out-of-sample prediction performance of the two proposed models is illustrated using European Union Allowance (EUA) futures prices from January 2017 to April 2021. Historically, this period corresponds to the second half of Phase III and the beginning of Phase IV of the European Union Emission Trading System (EU-ETS).


Introduction
Stochastic processes have been commonly used for pricing commodity derivatives for almost 50 years. The risk-neutral pricing theory for commodity derivatives was first developed in [7], within what has become known as the Black-Scholes-Merton framework, where the commodity spot price is represented as a geometric Brownian motion (GBM); for further details, see [8] and [21]. The principles of the Black-Scholes-Merton framework laid the foundation for asset pricing theory. Since then, many models have been developed by treating a number of factors as stochastic processes that reflect the specifics of the commodity market. The mean-reverting process, or the Ornstein-Uhlenbeck (O-U) process, is often used for pricing commodity derivatives. For example, in the two-factor oil contingent claims pricing model in [15], a mean-reverting factor and a GBM were employed to model the convenience yield and the correlated oil spot price, respectively.
In the Schwartz-Smith two-factor model [23], the logarithm of the spot price of a commodity is equal to the sum of a short-term factor and a long-term factor, which capture short-term deviations and the long-term equilibrium price level, respectively. The short-term factor is assumed to revert towards zero, as it reflects short-term variations in prices arising from temporary changes in demand, supply, and other current market conditions, which are corrected as the market responds over time. The long-term factor is assumed to follow a Brownian motion with drift, reflecting expected permanent changes in the equilibrium price level, which can be explained by advances in production technology or by regulatory changes. The spot price of a commodity is then used to price futures contracts of different maturities jointly, under the risk-neutral probability measure. Studies in [5] and [12] develop models under the Schwartz-Smith framework assuming that both latent factors follow an O-U process, with an additional constraint that remedies the parameter identification problem.
In this article, we present a model that incorporates dependence between futures contracts with different maturities. The novelty of our approach lies in introducing correlations between the measurement errors of different futures contracts, as well as allowing for serial correlation in each marginal measurement error. The correlations of the measurement errors, along with the other unknown model parameters, are estimated jointly with the state variables using the Kalman filter. For illustration, we use the daily prices of European Union Allowance (EUA) futures contracts from January 2017 to April 2021, which were obtained using the Macquarie University access to Refinitiv Datascope.
The European Union Emission Trading System (EU-ETS) was launched in 2005 with the aim of reducing greenhouse gas emissions from a variety of sectors, such as agriculture, aviation, energy, and manufacturing industries across registered European nations. The system obliges those sectors to surrender one unit of EUA in order to emit one tonne of CO$_2$ or equivalent gases. The history of the EUA market is relatively short compared with other classic commodity markets such as crude oil, metal, and gas. We choose the selected period to study the recent dynamics of the EUA futures market. The selected time period covers the second half of Phase III and the beginning of Phase IV of the EU-ETS. Data from the initiation of the EU-ETS and from Phase II were used in [4,25]. In addition, the study in [13] used intra-phase and inter-phase futures data and accommodated the specifics of each type of contract with continuous-time diffusion models with jumps.
The remaining sections are organised as follows. Section 2 reviews previous studies on modifications of the Schwartz-Smith two-factor model used for pricing commodity derivatives, and studies on different approaches to pricing EUA derivatives. Section 3 presents the main model, which deals with both serial correlations and inter-correlations in the measurement errors of the logarithms of futures prices or their logarithmic returns. Section 4 summarises the results of a simulation study, in which we validate our approach to estimating the parameters and state variables when both inter-correlations and serial correlations between measurement errors of different contracts are present. In Section 5, we present the results of the calibration of the two proposed models relative to the extended Schwartz-Smith model using historical daily EUA futures prices. Section 6 concludes with an overall discussion of the results of this study. For reference, the detailed setup of the Kalman filtering procedure in the Schwartz-Smith two-factor modelling framework is presented in Appendix A.
where the Kalman filter procedure has been modified to incorporate heteroscedasticity of prices and to estimate a time-varying risk premium. For pricing of agricultural commodities, the performance of the Schwartz-Smith two-factor model has been studied in [24], using Fourier series as a seasonal component. That study also attempted to estimate covariances of measurement errors using a parametrised function of the time to maturity, but reported that no substantial improvement in the model could be seen. The study in [2] extended the two-factor modelling framework by incorporating explanatory variables (a regression structure) into the drift terms of the latent factors. A three-factor model was studied in [14], which allowed a deterministic seasonal component in the volatility of the latent factors and used a function of inverse inventory as the third state variable. A step function was used in [22] as a seasonal component for the calibration of commodity spot and futures prices in a general multi-factor model, and a multi-factor model of commodity futures with stochastic seasonality has been developed in [19]. Under the same setup as in [23] and [24], instead of optimising the sample likelihood function, the study in [16] proposed a different estimation method, a so-called two-step least-squares estimation method, which involves minimising the sum of squared residuals from the state equation.
For pricing of EUA futures, the non-compliance event in terms of the total normalised emission was considered in [9], along with the level of penalty. The authors used the digital nature of the terminal allowance price as the basis for modelling the spot price process, and hence for pricing European options on EUA futures. The study in [26] developed a bivariate model in state-space form for parameter estimation through the Kalman filter, using December-maturity futures contracts from 2005 to 2012. A recent study [3] evaluated the term structure of EUA futures prices and compared the performance of the single-factor GBM model of [1] with that of the original Schwartz-Smith two-factor model.

Main model
In this section, we introduce the modifications for modelling of the logarithmic prices and logarithmic returns of futures prices, incorporating serial correlation and inter-correlation in measurement errors of different contracts, within the Schwartz-Smith two-factor modelling framework.
The risk-neutral dynamics of the short-term and long-term factors, denoted by $\chi_t$ and $\xi_t$ at time $t$, are expressed as the following stochastic differential equations:

$$d\chi_t = (-\kappa \chi_t - \lambda_\chi)\,dt + \sigma_\chi\,dW_t^{\chi *},$$

$$d\xi_t = (\mu_\xi - \gamma \xi_t - \lambda_\xi)\,dt + \sigma_\xi\,dW_t^{\xi *},$$

where $\kappa, \gamma > 0$ are the speeds of mean reversion for $\chi_t$ and $\xi_t$, respectively, $\mu_\xi$ is the drift parameter of $\xi_t$, $\sigma_\chi, \sigma_\xi > 0$ are the instantaneous volatilities of the two latent variables, and $\lambda_\chi$ and $\lambda_\xi$ are the risk premia adjustments for $\chi_t$ and $\xi_t$ that appear after transforming the model from the real probability measure to the risk-neutral measure (note that the risk-neutral process is used for deriving the futures price). $W_t^{\chi *}$ and $W_t^{\xi *}$ are correlated standard Brownian motions with $dW_t^{\chi *}\,dW_t^{\xi *} = \rho_{\chi\xi}\,dt$, where $\rho_{\chi\xi}$ is the correlation coefficient of the two stochastic processes.

By setting up the pricing model as a linear state-space model, the two latent variables are expressed in the state equation, and the relationship between the state variables and futures prices is expressed in the measurement equation. We then implement the Kalman filter to estimate the values of the latent variables and the marginal likelihood function, which are used for the estimation of the model parameters. Readers are referred to [5] for a detailed setup under the assumption that measurement errors are independent for each contract.
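To make the state equation concrete, the following is a minimal sketch of the exact discretisation of two correlated O-U factors over a time step `dt`. It assumes real-measure dynamics obtained by dropping the risk premia from the risk-neutral equations, and a drift parameter `mu_xi` for the long-term factor. The function names and the parameter values in the usage note are illustrative and are not taken from the paper.

```python
import math
import random


def discretise(kappa, gamma, mu_xi, sigma_chi, sigma_xi, rho, dt):
    """Return (c, G, W) for x_t = c + G x_{t-1} + w_t, w_t ~ N(0, W).

    Assumes d(chi) = -kappa*chi dt + sigma_chi dW and
    d(xi) = (mu_xi - gamma*xi) dt + sigma_xi dW (real measure, an assumption).
    """
    a = math.exp(-kappa * dt)
    b = math.exp(-gamma * dt)
    c = [0.0, mu_xi / gamma * (1.0 - b)]
    G = [[a, 0.0], [0.0, b]]
    w11 = sigma_chi ** 2 * (1.0 - a * a) / (2.0 * kappa)
    w22 = sigma_xi ** 2 * (1.0 - b * b) / (2.0 * gamma)
    w12 = rho * sigma_chi * sigma_xi * (1.0 - a * b) / (kappa + gamma)
    W = [[w11, w12], [w12, w22]]
    return c, G, W


def simulate_factors(n, x0, c, G, W, seed=0):
    """Simulate n steps of the two factors, starting from x0 = [chi_0, xi_0]."""
    rng = random.Random(seed)
    # Cholesky factor of the 2x2 covariance matrix W.
    l11 = math.sqrt(W[0][0])
    l21 = W[1][0] / l11
    l22 = math.sqrt(W[1][1] - l21 ** 2)
    path = [list(x0)]
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        chi, xi = path[-1]
        path.append([
            c[0] + G[0][0] * chi + l11 * z1,
            c[1] + G[1][1] * xi + l21 * z1 + l22 * z2,
        ])
    return path
```

For daily data one might call `discretise(1.5, 0.5, 1.0, 0.5, 0.3, -0.3, 1/260)` and pass the result to `simulate_factors`; these numbers are placeholders chosen only to respect the constraint that `kappa` is at least `gamma`.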

Correlations in measurement errors
Consider the following linear state-space form for the Schwartz-Smith model:

$$x_t = c + G x_{t-1} + w_t,$$

$$y_t = d_t + F_t x_t + v_t,$$

$$v_t = \phi v_{t-1} + \epsilon_t, \tag{4}$$

where, for $t = 1, 2, \dots, n$ and for $N$ contracts with different maturities $T_1 < T_2 < \dots < T_N$, $x_t = (\chi_t, \xi_t)'$ is the state vector, $y_t$ is the vector of logarithms of the $N$ futures prices observed at time $t$, and $\epsilon_t$ is the innovation of the measurement error process. Here, $\phi$ is a diagonal matrix that consists of the autoregressive (AR) coefficients for each marginal measurement error of the different contracts, and $\Delta t$ is the time difference in years between $t-1$ and $t$. We may generalise the AR process in the measurement errors $v_t$ to order $p$ by introducing additional matrices $\phi_m$ for $m = 1, \dots, p$ in (4). The state and measurement errors, denoted by $w_t$ and $v_t$, are assumed to be independent of each other, with zero-mean Gaussian distributions and covariance matrices $W$ and $V$, respectively.

We denote by $s_{jj}^2$ the variances of the measurement errors for contract $j$, and by $s_{jk}$ the covariances of the measurement errors between contracts $j$ and $k$, where $j, k = 1, \dots, N$ and $j \neq k$. For estimation of the covariances of measurement errors, we follow the estimation method in [18], where we estimate the correlation coefficients of measurement errors between different contracts and convert them back to covariances. For the correlation coefficients, we use the estimation approach introduced in [17], which is often used in credit risk modelling. Let $z_j$ be the normalised prices of contract $j$, so that the vector of prices consists of $(z_1, \dots, z_N)$ for $N$ contracts. We assume a factor representation for $z_j$, as in [17], which determines the structure of the correlation matrix $R$ of the measurement errors. With $V = DRD$, where $D$ is a diagonal matrix that consists of the volatilities of the measurement errors, the estimation involves estimating both $D$ and $R$. Hence, the modified covariance matrix $V$ is applied in the Kalman filter and in the estimation procedure.
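The conversion from correlations back to covariances can be sketched as follows. Purely for illustration, the sketch assumes a one-factor structure of the kind common in credit risk modelling, in which the off-diagonal correlation between contracts $j$ and $k$ is the product of two loadings, `loadings[j] * loadings[k]`. The loadings, the function names and the numerical values are hypothetical, and the exact parametrisation used in [17] may differ.

```python
def factor_correlation(loadings):
    """Correlation matrix R with R[j][k] = loadings[j]*loadings[k] for j != k.

    Each loading is assumed to lie in (-1, 1), so that R is a valid
    correlation matrix with ones on the diagonal.
    """
    n = len(loadings)
    return [
        [1.0 if j == k else loadings[j] * loadings[k] for k in range(n)]
        for j in range(n)
    ]


def covariance_from(vols, R):
    """Covariance matrix V = D R D, where D = diag(vols)."""
    n = len(vols)
    return [[vols[j] * R[j][k] * vols[k] for k in range(n)] for j in range(n)]


# Illustrative values for three contracts (not estimates from the paper).
R = factor_correlation([0.9, 0.8, 0.7])
V = covariance_from([0.02, 0.03, 0.04], R)
```

Under this assumption, $N$ loadings and $N$ volatilities describe the full $N \times N$ covariance matrix, so the number of parameters grows linearly rather than quadratically with the number of contracts.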

Modelling of the logarithmic returns
In this section, we develop the measurement equations for the logarithmic returns on futures prices. Since the logarithmic returns are differences of the logarithmic prices at times $t$ and $t-1$, we set up a linear state-space model in the following way.
For $t = 1, 2, \dots, n-1$, the state and measurement equations are built from the quantities of the original model: the constant vectors, transition matrices, and measurement errors are those of the original model setup shown in (5) and (7), with $I$ being the identity matrix. Note that $V^r$ is the covariance matrix of the measurement errors of the logarithmic returns, instead of the logarithmic prices. If $v_t^r$ follows an AR process, then we can also set up the measurement errors as in Section 3.1. We can then proceed with the standard Kalman filter and maximise the likelihood function using the new notation accordingly.
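One way to write such a return model, offered here as a sketch rather than as the authors' exact formulation, is to stack two consecutive state vectors. The tilde notation and the symbol $r_t$ are assumptions introduced for this illustration; $c$, $G$, $d_t$, $F_t$ and $w_t$ are the quantities of the price model.

```latex
% Log-return between t and t+1, from y_t = d_t + F_t x_t + v_t:
r_t = y_{t+1} - y_t
    = (d_{t+1} - d_t) + F_{t+1} x_{t+1} - F_t x_t + v_t^r .

% Augmented state \tilde{x}_t = (x_{t+1}', x_t')' and its transition:
\tilde{x}_t =
\begin{pmatrix} c \\ 0 \end{pmatrix}
+ \begin{pmatrix} G & 0 \\ I & 0 \end{pmatrix} \tilde{x}_{t-1}
+ \begin{pmatrix} w_{t+1} \\ 0 \end{pmatrix},

% Measurement equation for the returns:
r_t = (d_{t+1} - d_t)
    + \begin{pmatrix} F_{t+1} & -F_t \end{pmatrix} \tilde{x}_t
    + v_t^r ,
\qquad v_t^r \sim N(0, V^r).
```

The identity block $I$ simply copies $x_t$ into the lower half of the augmented state, which is why the standard Kalman recursions apply unchanged once the matrices are redefined.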

Parameter estimation
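The unknown parameter set $\psi = (\kappa, \sigma_\chi, \lambda_\chi, \gamma, \mu_\xi, \sigma_\xi, \lambda_\xi, \rho_{\chi\xi}, \phi, V)$ is estimated by optimising the log-likelihood function of $y$, obtained from the joint distribution of $(y_1, \dots, y_n)$:

$$\ell(\psi; y) = \sum_{t=1}^{n} \ln p(y_t \mid y_1, \dots, y_{t-1}; \psi). \tag{14}$$

Under Gaussian errors, (14) can be re-expressed as follows:

$$\ell(\psi; y) = -\frac{nN}{2}\ln 2\pi - \frac{1}{2}\sum_{t=1}^{n}\left[\ln \det L_t + e_t' L_t^{-1} e_t\right], \tag{15}$$

where $e_t$ is the one-step-ahead prediction error of $y_t$ and $L_t$ is its covariance matrix. We maximise (15) with respect to $\psi$ jointly. To obtain the quantities in the log-likelihood function, the latent state vectors and their covariance matrices at each time $t$ need to be estimated through the Kalman filter. The study in [5] identified a parameter identification problem within the log-likelihood function in the Kalman filter, and hence the constraint $\kappa \geq \gamma$ is imposed in the optimisation procedure.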
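The following sketch shows how a Gaussian log-likelihood of this kind can be accumulated from Kalman filter prediction errors, with a full (non-diagonal) measurement covariance `V`. It is a simplified illustration under stated assumptions: `d` and `F` are held constant over time, the measurement errors are serially uncorrelated, and all names are hypothetical. It uses only the standard library, with matrices stored as nested lists.

```python
import math


def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, col)) for col in zip(*B)] for r in A]


def transpose(A):
    return [list(r) for r in zip(*A)]


def add(A, B, sign=1.0):
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def cholesky(S):
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_spd(S, B):
    """Solve S X = B for symmetric positive-definite S; also return ln det S."""
    L = cholesky(S)
    n, m = len(S), len(B[0])
    Y = [[0.0] * m for _ in range(n)]
    for i in range(n):  # forward substitution: L Y = B
        for q in range(m):
            Y[i][q] = (B[i][q] - sum(L[i][k] * Y[k][q] for k in range(i))) / L[i][i]
    X = [[0.0] * m for _ in range(n)]
    for i in reversed(range(n)):  # back substitution: L' X = Y
        for q in range(m):
            X[i][q] = (Y[i][q] - sum(L[k][i] * X[k][q] for k in range(i + 1, n))) / L[i][i]
    return X, 2.0 * sum(math.log(L[i][i]) for i in range(n))


def kalman_loglik(ys, c, G, W, d, F, V, x0, P0):
    """Log-likelihood of observations ys under x_t = c + G x_{t-1} + w_t,
    y_t = d + F x_t + v_t, with w_t ~ N(0, W) and v_t ~ N(0, V)."""
    col = lambda v: [[u] for u in v]
    x, P, ll, N = col(x0), P0, 0.0, len(d)
    for y in ys:
        # Prediction step.
        x = add(col(c), matmul(G, x))
        P = add(matmul(matmul(G, P), transpose(G)), W)
        # Prediction error e and its covariance S.
        e = add(col(y), add(col(d), matmul(F, x)), sign=-1.0)
        S = add(matmul(matmul(F, P), transpose(F)), V)
        PFt = matmul(P, transpose(F))
        # Solve S Z = [e | F P] in one pass.
        Z, logdet = solve_spd(S, [er + fr for er, fr in zip(e, transpose(PFt))])
        Sinv_e = [[r[0]] for r in Z]
        Sinv_FP = [r[1:] for r in Z]
        quad = matmul(transpose(e), Sinv_e)[0][0]
        ll += -0.5 * (N * math.log(2.0 * math.pi) + logdet + quad)
        # Update step.
        x = add(x, matmul(PFt, Sinv_e))
        P = add(P, matmul(PFt, Sinv_FP), sign=-1.0)
    return ll
```

In an estimation routine, a function like this would be wrapped in a numerical optimiser over the parameter vector, with the constraint on the mean-reversion speeds enforced through the parameter bounds. Handling AR(1) measurement errors would require an additional step, such as augmenting the state with the error process, which is not shown here.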

Simulation study
In this section, we perform a simulation study to validate the new approach described in Section 3.1. We focus on validating serial correlation and inter-correlation assumptions in measurement error considered under the Schwartz-Smith model framework, to see how our novel approach performs at estimating parameters and state variables.
The steps for this simulation study are as follows.
(1) Set the true values of the model parameters. Then, obtain the state and measurement variables $x_t$ and $y_t$ by simulating the error terms $w_t$ and $v_t$, where $v_t$ is assumed to follow an AR(1) process.
(2) Choose appropriate initial values, and determine appropriate feasible bounds for each parameter.
(3) Conduct the optimisation procedure through the Kalman filter, which involves jointly estimating the state variables and the parameters. Obtain the parameter estimates and the estimated values of the state variables.
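The error simulation in the first step can be sketched as follows: measurement errors that follow an AR(1) process across time and are correlated across contracts through the covariance matrix of their innovations. This is a minimal illustration in Python rather than the authors' MATLAB code. The function name, the zero starting value and the numerical values are assumptions.

```python
import math
import random


def cholesky(S):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def simulate_ar1_errors(n, phi, V, seed=0):
    """Simulate v_t = diag(phi) v_{t-1} + eps_t with eps_t ~ N(0, V).

    phi holds one AR(1) coefficient per contract; V is the covariance
    matrix of the innovations. The process is started at v_0 = 0.
    """
    rng = random.Random(seed)
    L = cholesky(V)
    N = len(phi)
    v = [0.0] * N
    out = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(N)]
        # Correlated innovations: eps = L z has covariance L L' = V.
        eps = [sum(L[j][k] * z[k] for k in range(j + 1)) for j in range(N)]
        v = [phi[j] * v[j] + eps[j] for j in range(N)]
        out.append(v)
    return out


# Illustrative call: two contracts with highly correlated innovations.
errors = simulate_ar1_errors(1000, [0.8, 0.5], [[1.0, 0.9], [0.9, 1.0]], seed=1)
```

Adding these errors to $d_t + F_t x_t$ for a simulated state path gives the simulated measurements $y_t$.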
We simulate for $n = 200, 500, 1{,}000, 2{,}000, 5{,}000$, with five different futures contracts maturing in 1, 2, 3, 4, and 5 months. The parameter estimates are presented in Tables 1-3, with standard errors computed using a Monte Carlo approach, to assess the accuracy of the parameter estimates. The MATLAB code for the simulation study is available at https://github.com/Junee1992/EUA_Futures_Pricing/tree/main/Serial-Correlation.
Overall, the parameter estimates are quite close to their true values, except for $\kappa$, $\lambda_\chi$ and $\lambda_\xi$. The estimates of $\kappa$, $\gamma$, and $\rho_{\chi\xi}$ tend to fluctuate for $n \leq 1{,}000$; however, they stabilise as the sample size increases. The estimated AR coefficients, volatilities, and correlation coefficients are close to the true values, indicating that the model is able to capture dependencies between measurement errors of contracts with different maturities as well as their serial correlations. The estimation errors of the first eight parameters are summarised in Figure 1.
The simulated and estimated state variables are shown in Figure 2, along with the mean absolute errors (MAE) calculated for the two latent variables in each panel, showing that the estimation of the state variables improves as $n$ increases.

Figure 3 illustrates the historical daily futures price dynamics through the trading phases of the EU-ETS (Phases I-IV). The graph illustrates the need for sufficiently versatile models that can accommodate the intricacies of EUA futures price dynamics across the different phases, including Phase IV. In addition, Figure 4 shows the term structure of the prices of EUA futures, which mature in December each year. The main difference between commonly traded commodities and EUA is that EUA futures curves tend to increase smoothly as the maturity of the futures contracts increases.
In [23] and [5], measurement errors are assumed to be independent. However, based on the belief that the price movement of each contract must be correlated with that of other contracts available during the same period within the same commodity market, we assume that the measurement errors have a full covariance matrix. After investigating the statistics of the measurement errors for EUA futures prices, we found that measurement errors of contracts with different maturities are highly correlated with each other, with correlations close to 1, as seen in the correlation matrix of the measurement errors of EUA futures prices obtained using Model 1. We also observe that each series of measurement errors follows an AR(1) process, with all AR coefficients being highly significant (with p-value <0.0001). Hence, the price data of EUA futures contracts are suitable for testing the model developed in this study.
For comparative analysis, we use two different models, which estimate the inter-correlation between measurement errors in different settings:
Model 1: Modelling of logarithmic returns;
Model 2: Modelling of logarithmic returns with serially correlated measurement errors.

Table 3: Estimates of standard deviations and factors of correlation coefficients of the measurement errors.

For goodness-of-fit assessment, we present the performance of each model using the root mean-squared error (RMSE). The results are summarised in Table 4 for futures contracts $C_j$, $j = 1, 2, \dots, 7$. In both models, the logarithmic returns are converted to the logarithms of prices. In terms of RMSE alone, Model 1 performs better than Model 2. However, according to the model selection criteria (the Akaike and Bayesian information criteria), Model 2 is preferable to Model 1.
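The goodness-of-fit measures mentioned above have simple closed forms, sketched below with the standard definitions. The function names are illustrative. `k` denotes the number of estimated parameters and `n` the number of observations, and lower values of AIC and BIC indicate the preferred model.

```python
import math


def rmse(actual, fitted):
    """Root mean-squared error between two equally long sequences."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2.0 * loglik
```

Because Model 2 has additional AR coefficients, its larger `k` is penalised by both criteria. A lower AIC or BIC for Model 2 therefore means that the gain in log-likelihood outweighs the extra parameters, even when the RMSE alone favours Model 1.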
Next, we provide the results of out-of-sample predictions using 30-day and 50-day out-of-sample windows in Tables 5 and 6, respectively. In each setting, we considered four different scenarios, where we repeat the parameter estimation and out-of-sample prediction every 1 day, 5 days, 10 days, and 30 or 50 days. We use the deseasonalised price data from January 30, 2017, to February 18, 2021, that is, $n = 1{,}040$ business days. From Table 5, we found that, in general, Model 2 performed better than Model 1, although the differences in RMSE were minimal. We used the Diebold-Mariano test to detect a significant difference in out-of-sample predictions between the two models. In Table 6, we found that Model 1 performed better for out-of-sample predictions for $h = 1, 10$ days, whereas Model 2 obtained the lower RMSE for $h = 5, 50$. The Diebold-Mariano test again showed no significant difference in the 50-day out-of-sample predictions between the two models. This pattern persisted in other subsets of data from Phases III and IV.
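For reference, a basic version of the Diebold-Mariano statistic can be computed as in the sketch below. It assumes a squared-error loss and the asymptotic standard normal reference distribution, with autocovariances of the loss differential included up to lag `h - 1`. The paper does not state which loss function or variant was used, so these choices and the function name are assumptions.

```python
import math


def diebold_mariano(e1, e2, h=1):
    """Return (statistic, two-sided p-value) for equal predictive accuracy.

    e1, e2: forecast errors of the two models over the same periods.
    h: forecast horizon; autocovariances up to lag h-1 enter the variance.
    """
    d = [a * a - b * b for a, b in zip(e1, e2)]  # squared-error loss differential
    T = len(d)
    dbar = sum(d) / T

    def autocov(k):
        return sum((d[t] - dbar) * (d[t - k] - dbar) for t in range(k, T)) / T

    # Long-run variance estimate; with h > 1 it can be non-positive in small
    # samples, in which case the statistic is undefined.
    lrv = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    stat = dbar / math.sqrt(lrv / T)
    normal_cdf = 0.5 * (1.0 + math.erf(abs(stat) / math.sqrt(2.0)))
    return stat, 2.0 * (1.0 - normal_cdf)
```

A negative statistic indicates that the first model has the smaller average squared error. A large p-value, as reported for the 50-day window, means the difference in accuracy is not statistically distinguishable from zero.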

Conclusion
In this article, we have presented the modified Schwartz-Smith two-factor model, which can be used for modelling of logarithmic returns of futures contracts, incorporating serial correlation and inter-correlation of measurement errors. We compared the two different models that use the logarithm of futures prices and the logarithmic returns, applying them to EUA futures price data from Phase III and Phase IV of EU-ETS.
The simulation study has illustrated that our approach is able to jointly estimate both the parameters and the state variables when serial correlations and inter-correlations are present in the measurement errors. We have illustrated that the estimates of the parameters and state variables converge as the sample size increases. The maximum likelihood method was used for the estimation of the coefficients of the Gaussian AR processes used for modelling serial correlations in the measurement errors.
Finally, we discussed the results of the comparative analysis of the two models in the context of EUA futures data. The goodness-of-fit and out-of-sample performances in predicting futures prices were discussed for each model. Overall, the models for logarithmic returns performed well in terms of RMSE for out-of-sample predictions. The full Model 2, for logarithmic returns with serial correlations, performed better than its reduced form, Model 1, for calibration of the data in terms of AIC and BIC, and for prediction over the 30-day out-of-sample window.
The empirical results of the study emphasise the need to consider both serial correlations and inter-correlations of measurement errors when modelling futures prices. For parameter estimation in Model 2, we used a two-step approach. First, we obtained the parameter estimates as in Model 1; then, we estimated the parameters of the AR processes used for modelling serial correlations. By incorporating serial correlations in the measurement errors and applying the models to real data, we showed that the proposed models for logarithmic returns capture the price movement more reliably.

Once we obtain the log-likelihood function, we maximise it to obtain the relevant parameter estimates. Depending on the model assumption, the elements in $c$, $G$, $W$, $d_t$, $F_t$, and $V$ will differ.