Numerical Development and Evaluation of an Energy Conserving Conceptual Stochastic Climate Model

Abstract In this study we aim to present the successful development of an energy conserving conceptual stochastic climate model based on the inviscid 2-layer Quasi-Geostrophic (QG) equations. The stochastic terms have been systematically derived and introduced in such away that the total energy is conserved. In this proof of concept studywe give particular emphasis to the numerical aspects of energy conservation in a highdimensional complex stochastic system andwe analyzewhat kind of assumptions regarding the noise should be considered in order to obtain physical meaningful results. Our results show that the stochastic model conserves energy to an accuracy of about 0.5% of the total energy; this level of accuracy is not affected by the introduction of the noise, but is mainly due to the level of accuracy of the deterministic discretization of the QG model. Furthermore, our results demonstrate that spatially correlated noise is necessary for the conservation of energy and the preservation of important statistical properties, while using spatially uncorrelated noise violates energy conservation and gives unphysical results. A dynamically consistent spatial covariance structure is determined through Empirical Orthogonal Functions (EOFs). We find that only a small number of EOFs is needed to get good results with respect to energy conservation, autocorrelation functions, PDFs and eddy length scale when comparing a deterministic control simulation on a 512 × 512 grid to a stochastic simulation on a 128 × 128 grid. Our stochastic approach has the potential to seamlessly be implemented in comprehensive weather and climate prediction models.


Introduction
The dynamics of the atmosphere and the oceans are by nature complex. Processes with di erent time and length scales interact with each other a ecting the system as a whole. While climate and ocean models have considerably improved over the last few decades, we still cannot resolve all important physical scales and processes, see for instance [20,6,22]. The discretization of the continuous governing equations of motion is limited by the model resolution, which determines the size of the smallest resolvable scale. Despite the continued increase of computer power and, thus, of resolution, there are still many important processes in the atmosphere and in the oceans that cannot be explicitly resolved. These include turbulent motions with scales ranging from a few centimeters to the size of the model grid box, as well as processes that occur at a molecular scale, like condensation and evaporation. Any numerical forecaster or modeller has to make a decision, based on the targeted objectives, regarding the spatial and temporal scales to resolve. As a consequence of this decision, each numerical scheme inevitably fails to resolve subgrid-scale processes.
These unresolved processes and scales cause many of the observed di erences between models and observations. In order to represent these unresolved processes, so-called parameterizations are necessary which take into account the in uence that the unresolved have on the resolved processes, if they would be resolved in high-resolution simulations [20]. Most parameterizations, however, are damping and do not take account of the energy and momentum uxes from the unresolved to the resolved scales [34,57]. This is a likely source of many of the observed biases in climate and ocean models. Without the added dissipation, however, energy and enstrophy would accumulate at the truncation scale and lead to a blow up in nite time. Hence, it is of fundamental importance to nd systematic ways to parameterize the unresolved scales and processes of models, and to improve the model performance and reduce the model biases at coarser resolutions.
As suggested already in 1976 by Hasselmann [31], fast waves can be considered as noise with respect to the slowly evolving large-scale modes and, therefore, can be parameterized by stochastic processes [23]. Hence, to ameliorate this problem of too large damping, stochastic parameterization schemes have been developed (see recent reviews [20,6,22,27]). There are several advantages in using stochastic parameterizations; the most important are: gain in computational time compared to higher resolved simulations; reduction of model errors and systematic representation of uncertainties and model errors. Most operational stochastic parameterization schemes are rather ad hoc developments [7] and do not conserve energy or momentum. As a consequence, current schemes have the disadvantage of the forfeiture of conservation laws and a likely loss of important symmetries in the model equations. For climate simulations conservation properties are of importance because they are leading to stable and realistic climate simulations, and should be considered also for stochastic parameterizations, not only for stability reasons, but also to respect the underlying dynamics of geophysical uid ows.
From a mathematical perspective, there has been a growing interest over the last few decades in nding appropriate techniques to develop systematic methods to accurately and e ciently represent fast variables in multi-scale systems. Much fundamental work has already been done, e.g. Mémin [37] derived energy conserving geophysical uid equations assuming that the velocity can be written as a mean state plus some perturbations, while Holm [32] used stochastic variational principles to obtain new stochastic uid equations conserving helicity and the Kelvin circulation theorem. Numerical models following these theoretical approaches have been developed and show good performance and improved results with respect to the deterministic counterpart at the same resolution, see for example [51,52] for numerical implementations and results of [37] and [12,11] for applications of [32].
Furthermore, Majda, Timofeyev and Vanden-Eijnden [43,42,44,39,38] developed a systematic strategy for stochastic mode reduction starting from the assumption that the explicit nonlinear self-interaction of the fastest scales involved can be represented by a linear stochastic operator. This procedure is mathematically justi ed only for large time scale separation but showed good performances also in case of a less pronounced time scale separation. For its application to more complex atmospheric models see also [26,19,18,50,41]. A less theoretical, but still e cient, approach is given by the stochastic kinetic energy backscatter scheme (SKEBS) which is often used to represent model uncertainty arising from unresolved subgrid-scale processes and their interactions with larger scales [56,5,22,47,21]. Connected to the idea of backscatter, di erent deterministic and stochastic parameterizations aiming at representing the upscale turbulent cascades in eddypermitting simulations have been developed. Among others, noticeable examples are given by [33,49,62,28,15], which also showed that the stochastic backscatter is, in general, a more e cient eddy parameterization than its deterministic counterpart.
In this paper we systematically further develop the projector approach outlined by Frank and Gottwald [16]. Here the stochastic noise is projected onto the energy manifold. This approach has the main advantage that it can be straight forwardly implemented in existing models while the above approaches [37,32,43] derive new equations of motions which are harder to implement in already existing and operational models. Frank and Gottwald [16] tested their scheme with a 4-dimensional toy-model for the large-scale dynamics of the atmosphere by means of a Lagrangian description of the dynamics. Here instead we apply it to a high-dimensional conceptual climate model, i.e. the inviscid 2-layer QG model, in a Eulerian framework. The purpose of our study is a proof of concept whether this scheme can also be applied to high-dimensional models. Thus, the main research questions we aim to address in this study is: (i) Can we accurately conserve energy in a high-dimensional QG model? and (ii) What conditions do we need to impose on the spatial noise covariance matrix for this? Hence, in this work we focus on the technical aspects of this approach, analyzing potential issues due to the discretization of the continuous equations or to the numerical implementation in general. With this intention, we choose to apply this projection operator scheme to an energy conserving QG model as a hard numerical test case. While our particular set up might not seem interesting from a geophysical uid dynamics point of view, we still consider it numerically challenging and hence a valuable benchmark in testing the numerical aspects and accuracy of our stochastic system in a high-dimensional geophysical model.
The remainder of this paper is organized as follows: in Section 2 we present the inviscid 2-layer energy conserving QG model both in its deterministic and stochastic formulations. Details on the numerics, like the choice of the numerical solvers and the de nition of the spatio-temporal noise structure are provided in Section 3. Section 4 displays and discusses the outcomes of our stochastic model experiments. Finally in Section 5, we give a brief summary of our ndings and outlook of future research directions.

The 2-Layer QG Model . The deterministic equations
As mentioned above, we start from the non-dimensional inviscid 2-layer QG equations presented in [59] on a β-plane with double-periodic boundary conditions where q B = q B (x, t), ψ B = ψ B (x, t) and q T = q T (x, t), ψ T = ψ T (x, t) represent, respectively, potential vorticity (PV) and streamfunction of the barotropic and baroclinic mode on the horizontal plane x ∈ R at time t ∈ R, ∇ is the horizontal gradient, ∇ the Laplacian and J the Jacobian operator Since we employ a non-dimensional description, the domain has been rescaled to a π × π square. Double periodic QG models on a β-plane have been widely used in theoretical studies [8,9,29,45]. Here we consider layers of equal thickness and the parameter k d (given by the relation k d / = f /Nh where N = . · − is the Brunt-Väisälä frequency, h the mean depth of the layers and f ≈ f + βy the approximate Coriolis term where f = and β = . ) determines the strength of the shear between the two layers and hence also the intensity of the baroclinic instability. For most simulations in this study we are using a deformation radius of about .
non-dimensional units and, thus, are in an ocean like eddy-permitting regime. In this setting, one non-dimensional time unit corresponds to roughly . days.
Since we want to focus on the energy conservation properties of our numerical scheme we do not introduce terms accounting for external forcing or eddy dissipation, and instead consider an initial value problem. We want to stress, that we want to focus in this study on the numerical and accuracy aspects of energy conservation in a stochastic setting and not on geophysical ow realism (see section 2.2 below).
The system given by equations (1)-(2) conserves its total energy E and enstrophy Z: Details about conservation properties and how to derive them can be found in [59].
The Hamiltonian H of the system is given by its total energy, thus it reads It can be shown that For a general review of Hamiltonian mechanics and its application to geophysical uid dynamics see [2] and [55]. The following notation will be employed A : B = a ij b ij = Tr(AB T ) .

. The stochastic formulation
In this section we derive a stochastic energy conserving version of the 2-layer QG equations. In our formulation we include unresolved fast sub-grid processes by means of a stochastic forcing, modeled as an Ornstein-Uhlenbeck process, which we assume to act rst on the baroclinic mode and then, because of the coupling between the two modes, to a ect also the slower barotropic mode. For this choice we relate to the idea of backscatter, where energy goes from the smaller scales back into the larger processes. Therefore, we add a simple 2-dimensional stochastic eld only to the fast baroclinic mode. The source terms so introduced would lead the dynamics to leave the manifold of constant energy on which the deterministic model (1)-(2) evolves. In order to balance the stochastic uctuations that would a ect the aforementioned manifold, we introduce an auxiliary 2-dimensional stochastic process dY t . This procedure follows the method introduced by Frank and Gottwald in [16], with the di erence that we consider a high-dimensional system in a Eulerian framework, and not a 4-dimensional system with a Lagrangian description. The following set of equations is therefore proposed: where dW t denotes a 2-dimensional Wiener process, Γ, Σ, S t ∈ R × and B t ∈ R . The choice of adding the stochastic terms on the equation of the baroclinic PV not only reconnects to the concept of backscatter, but allows potentially also the application of stochastic mode reduction, as Frank and Gottwald did in their work [16]. For instance, one could derive a reduced order stochastic model for just the barotropic modes [43,42,44,19,18]. Instead of dealing with two di erent stochastic processes, we want to write S t and B t as functions of Σ and Γ. For that purpose, we write the increment of H as a sum of two parts, a deterministic part µ H including all the terms multiplied by dt, and a stochastic part σ H containing those with the Wiener process. By Ito's theorem we have where the transposed is denoted by the superscript T , and Since we want to conserve the total energy, dH has to be zero. Therefore we impose both µ H and σ H to be zero. Following the reasoning outlined in [16], the auxiliary stochastic process dY t should not perturb the dynamics on the tangent space and should be constructed only to counterbalance those components of the OU process which are orthogonal to the manifold of constant energy, thus we de ne a projection operator P ∈ R × . Since the Wiener process a ects only the evolution equation of the baroclinic PV, it will be su cient to project with respect to the manifold of constant baroclinic energy: where I ∈ R × stands for the identity matrix. Since P (∇q T H) = , P projects onto the tangent space of the baroclinic kinetic energy surface. Consequently, we want S t and B t to satisfy From the assignment σ H = we can easily determine S t . In fact, since ∇q T H is in the kernel of P, imposing σ H = is equivalent to requiring Σ + S t = P(Σ + S t ). Thus, using Eq. (6), we obtain Substituting Eq. (7) into Eq. (5) and considering only the terms arising from the inclusion of the stochastic processes into the deterministic set of equations (since the deterministic model conserves energy, the other terms do not contribute to the variation of total energy), we can determine B t from the requirement µ H = : Placing Eq. (7) and Eq. (8) into Eq. (4), after some manipulations, we get the following set of equations Equations (2)-(9) constitute our stochastic energy conserving 2-layer QG system. A detailed derivation of equations (8) and (9) is reported in Appendix A. As can be seen, the resulting set of equations contains multiplicative noise and nonlinear damping, due to the speci c de nition of the projection operator. The multiplicative noise is in fact a correlated additive multiplicative (CAM) noise [38,53,17].

Numerical implementation
Since we aim to analyze possible applications of this approach to climate and ocean models, which are typically formulated in terms of nite volumes or nite di erences, we discretize equations (2)-(9) in terms of nite di erences in the framework of a grid-point based code. Our discretization of the QG model is based on the energy and enstrophy conserving discretization scheme by Arakawa [1]. This scheme ensures that energy and enstrophy are conserved for all truncations. Especially this scheme does not require any numerical di usion or dissipation for numerical stability. This property a ects the energy and enstrophy cascades by automatically redistributing the energy and enstrophy at the truncation scales, making the model simulations unrealistic. However, using this discretization scheme will allow us to focus on the accuracy of the energy conservation of our stochastic approach.
For the time stepping we employ explicit Runge-Kutta (RK) methods (whose order will be a matter of discussion in the following section), and we use a Fast Fourier Transform (FFT) to invert the Laplacian operator and obtain the streamfunctions from the corresponding PV. Since FFT is an exact numerical method and the Arakawa scheme is designed to conserve energy and enstrophy for any truncation, the only spurious e ect on the energy due to the numerics is given by the RK method, which is known to be to a small extent dissipative in time. When dealing with the stochastic terms, we integrate them using either the Euler-Maruyama or Milstein schemes [48]. Finally, the initial distributions of the barotropic and baroclinic streamfunctions are generated using a pseudo-random number generator, i.e. no a-priori structure is given as input, and we de ne the corresponding PV by equations (2). We do not change the seed when de ning the initial condition, thus all simulations at resolution × start from the same initial condition. Once the initial condition is de ned, we set a new seed for the noise generator. Even though we do not have any forcing the model does not settle into a barotropic regime; the baroclinic modes are still active with active barotropic and baroclinic mode interactions (not shown). Furthermore, the probability density functions of the barotropic-baroclinic energy transfer terms are symmetric (not shown), suggesting an active interaction between barotropic and baroclinic modes.

. Deterministic model
Before moving to the stochastic set of equations and related results, we test di erent orders of accuracy of our numerical scheme in the implementation of the deterministic 2-layer QG model (equations (1)-(2)) in order to nd the optimal balance between accuracy and computational time.
In our code, we solve the above evolution equations (1) for the PVs and then we compute the corresponding streamfunctions through equations (2). We use explicit Runge-Kutta nd and th order methods for the time integration, Arakawa nd and th order discretizations of the Jacobian [1] and a Fast Fourier Transform to invert the Laplacian operator. While performing our tests, we also consider di erent values of the mean depth of the uid h; in particular we consider the cases h = , , . These tests are performed over a × spatial grid with a time step of ∆t = − . We do not report here all the statistics and energy graphs obtained with the di erent combinations of solvers, but show only those motivating our choice to employ RK4 and Arakawa 4 th order in the following.      It can be seen that also after the initial spin up period, which has been here neglected, energy increases in time when solving with RK2 and 4 th order Arakawa. Figure 1 shows that the lower order combination of RK2 with Arakawa 2 nd order does not reproduce accurately the autocorrelation function (ACF) in the case h = and that just increasing the order of Arakawa's discretization is enough to capture correctly the ACF. However, when combined with RK2 it does not conserve energy in the case h = also after the initial spin up period (Fig. 2, where the spin up period has been neglected). More generally, we found that RK4 with Arakawa 4 th order is more reliable and that the scenario with h = has a less discernible spin up period and it is the best reproduced case study with all the con-sidered solvers, therefore we decided to employ this higher order numerical implementation and we xed h = . As a consequence of this choice, the Rossby deformation radius /k d is approximately . and we are in a ocean-like regime with small eddies. Furthermore, while the energy uctuates around a mean value the uctuations are relatively small; the energy uctuation amplitude is less than 1% of the total energy. With the chosen numerical solvers also enstrophy is conserved by the system to a similar accuracy, as in the continuous scenario (not shown).

. Stochastic equations
As in the deterministic case, we rst solve the stochastic evolution equations (9) for the PVs and then we get the corresponding streamfunctions through equations (2). As a consequence of the analyses in the previous paragraph, we use an explicit RK th order method for the time integration, Arakawa th order discretization of the Jacobian [1] and FFT to invert the Laplacian operator. The stochastic terms are integrated using either the Euler-Maruyama or Milstein schemes [48]. Later we will analyze di erences in the outcomes due to the stochastic solver. In the stochastic simulations we employ a × spatial grid with a time step of size ∆t = − . As a consequence ∆x ≈ . and the model is in the eddy permitting regime. As we will demonstrate below, for the dynamical consistency between deterministic and stochastic models it is crucial to consider spatially correlated noise. To demonstrate this, we consider two scenarios: in the rst we assume that the noise on each grid point behaves as independent and identically distributed (iid) random variables, while in the second we allow for correlation between di erent grid points. In the following subsection a more detailed description of how the correlation matrix of the noise is constructed can be found. Finally, in order to generate the noise, we rst produce uniformly distributed random numbers using the Mersenne-Twister algorithm [46], and then we reshape them through the Box-Muller procedure in such a fashion that they are sampled from a Gaussian distribution with the desired mean and variance, which in our case is N µ = , σ = ∆t . We compare the outcomes with a reference solution given by a deterministic simulation with × grid points and ∆t = − . For a fair comparison, we project the ne grid data onto a grid with the same resolution as for the stochastic simulations.

. Spatial noise structure
For allowing spatial correlations among di erent grid points, we need to determine the elements of the matrix Σ. We do this using eigenvectors obtained from a dimension reduction technique. Here we employ Empirical Orthogonal Functions (EOFs) [58]. We derive the EOFs from the high-dimensional deterministic control simulation.
Once the eigenvectors and the corresponding eigenvalues are computed, we select a number of EOFs and de ne Σ as a convex combination of the chosen eigenvectors E i . Such technique has been applied already in [25,24]. A more general linear combination could be used and it would be easy to modify this constraint in order to attribute a stronger (or weaker) amplitude to the noise. The weights ω i are selected as uniformly distributed random numbers, i.e., where the eigenvectors are matrices with dimensions equal to the grid size. Coe cients related to the redundant eigenvectors are set to be zero. In view of the fact that the noise is only in the equation of the baroclinic mode, we use ψ T -EOFs computed with respect to the L norm using the data of the high resolution run projected onto the coarser grid. Considering that the weights ω i are chosen randomly in each simulation, no particular direction is preferred with respect to the others reducing possible biases in the results due to how the EOFs have been computed.

Results of the stochastic simulations . Space-time independent noise
For reasons that will become evident later, in this scenario we perform the stochastic integration only with the Euler-Maruyama scheme. As assessment criteria, we rst look at the conservation of energy and then at other statistical properties like the autocorrelation function (ACF) and probability density function (PDF).
Here we consider iid white noise with zero mean and variance equal to the time step. Therefore, the matrices Σ and Γ can be written as where I ∈ R × stands for the identity matrix and σ, γ ∈ R. We show the results for the case σ = and γ = .

. . Numerical results
In this case, after a positive jump at the beginning of the simulation (because of the forcing, the system moves to the closest stable state, which has a di erent amount of total energy), energy is conserved (see Figure 3a) with very small uctuations in time (see Figure 3b). In fact, when compared to a deterministic run at the same resolution (see Figure 3c), quite unexpectedly, variations in the total energy graph are smaller in the stochastic simulation. Still, looking at Figure 3b, it is possible to notice a small decreasing trend suggesting that the parameterization is damping in the long run which is likely due to the Runge-Kutta scheme, which is known to be damping for.  A less reassuring result is given by the contour plot of the baroclinic mode. In Figure 4 we show the baroclinic streamfunction (but a very similar result can be observed also for the baroclinic PV) at time t = (left) and the plot of the same eld at the same time given by our reference solution (right). What immediately stands out is the di erent pattern presented by the two gures. Furthermore, at a closer look it can also be noticed that the di erent colors in Figure 4a represent di erences in the order of − and they become even smaller when looking at the contour plot for later times (not shown), meaning that the eld is moving towards a constant state in space. This explains the smaller amplitude of the uctuations in the energy graph with respect to the deterministic scenario, and it is re ected also in the ACF and in the PDF (see Figures 5-6). The former displays longer decorrelation times in general and, more speci cally, the baroclinic streamfunction T at t=200  and by the high resolution deterministic simulation (right). The reader will immediately notice the di erent patterns displayed by the two pictures and, at a closer look, that the colors in the left graph represent di erences in the order of − , which get even smaller with the developing of the simulation, implying that the eld is moving towards a constant state in space.
seems to require a longer decorrelation time with respect to the barotropic streamfunction which is in contrast with the physics. The latter instead shows no Gaussian distribution for the baroclinic mode and smaller variance for the barotropic mode.   In their paper [16] Frank and Gottwald used iid noise obtaining conservation of energy and physically reasonable outcomes. As already stated earlier, they employ a Lagrangian discretization of the system, while we consider the dynamics from a Eulerian point of view. Since Lagrangian descriptions of motion follow the trajectories of the single particles, and not the uid as a whole in a xed domain, in this frame the main purpose of the noise is simply to perturb the trajectory while remaining on the manifold of constant energy. Hence any spatial information added to the noise is not strictly necessary. On the other hand, Eulerian descriptions focus on what happens inside a well-de ned domain and do not care about the behavior of the single particles. Thus in this framework spatial iid noise means that each grid point does not feel the in uence of its  neighbors, hence the noise would represent phenomena which fully evolve and decorrelate inside the cell; the gap between the large resolved scales and such small phenomena is too big to be correctly resolved by the numerics. Therefore it appears crucial to de ne a spatial structure of the noise in order to characterize how the noise should behave inside the domain and interact with the deterministic dynamics. The next section discusses in more detail our results.

. Space-time correlated noise
To ease computations, we neglected the Γ term in equation 9b, i.e. Γ = . For both the Euler-Maruyama and Milstein schemes, we run an ensemble of 40 simulations using a convex combination of the rst two EOFs to build the covariance matrix Σ. We tried also combinations with a di erent number of EOFs. When considering up to the rst EOFs, similar results to those we report here are obtained. With or more EOFs we noted slightly worse performances of the scheme. Because of the constraint i ω i = , when considering relatively many EOFs, each of them has a small amplitude and then the patterns contrast with each other resulting in a not well-de ned structure. On the other hand, a combination of a smaller number of EOFs can still maintain the individual patterns while allowing interaction with each other. In what follows, we opted for using only the rst two main patterns.
As evaluation criteria, we employ the same analyses as before. Regarding the PDF, we also compute the rst and second moments of the centers in order to investigate the ensemble variance. In addition we compare to the reference solution: the total variance and eddy length (computed through space correlations as presented in [3]).

. . Energy conservation
In each simulation the total energy uctuates in time around a constant value. Di erently from the previous case, there is no jump to a di erent stable state at the beginning of the time integration, meaning that our stochastic system keeps its evolution on the manifold de ned by the initial condition. In both ensembles, if we compare the amplitude of the anomalies A Anom with respect to the mean value of the energy µ En , we see that A Anom is, for most of the running time, around . % of µ En with spikes no greater than . %. We would like to point out that, even though for each individual simulation the evolution of the eld variables is different (and this is shown by the fact that the PDFs of each individual run are centered in di erent locations), the total energy of the system is almost the same at each time step, with di erences in the order of − , for each ensemble member. This implies that, starting from the same initial condition the model is exploring di erent possible con gurations available with the de ned amount of total energy. In Figure 7 we show the time-evolution of the energy anomalies for an individual Euler-Maruyama (Milstein) ensemble member together with the total energy anomalies graph of a deterministic simulation at the same resolution. It can be seen that uctuations are roughly of the same amplitude for both the deterministic and the stochastic system. Thus, the energy uctuations are mainly a result of the deterministic numerical scheme and not of the used stochastic scheme. This shows that the projection operator works very well in high-dimensional models and suggests that one should improve the discretization of the deterministic part in order to ameliorate energy conservation of the stochastic system.

. . ACF
Every ensemble member shows roughly the same ACF pattern, independent of the stochastic solver. Di erently from the previous case with iid white noise, in both ensembles we obtain decorrelation times very close to the reference. The barotropic streamfunction displays a longer decorrelation time with respect to the baroclinic streamfunction, suggesting that in future work a stochastic mode reduction might be performed for eliminating the baroclinic modes and having a stochastic barotropic model as in [43,42,44,19,18]. In Figure  8 we show the ACF for one stochastic simulation of the Euler-Maruyama (Milstein) ensemble together with the reference solution.

. . PDF
In contrast to the case with iid noise, in each run of both ensembles we recover the Gaussian behavior of the baroclinic mode displayed by the reference solution and more variance for the barotropic, see Figure 9 for the PDF graph of an individual Euler-Maruyama (Milstein) ensemble member and the reference solution. On the other hand, the PDF of an individual stochastic run shows less variance with respect to the reference solution except for the barotropic streamfunction which, in the comparison, shows more (this can be noticed in Figure  9). Hence we decided to investigate the variance of the ensemble by looking at the rst and second moment of the center of the PDFs of the ensemble members. While we are not too much interested in the exact value taken by the rst moment, due to the chaotic nature of the system, we would like to point out that, the Milstein ensemble displays more variance with respect to the Euler-Maruyama scheme. In Table 1 we report the %  con dence interval (CI) of the centers of the PDFs; those related to the Milstein scheme span a wider range of values. We would like to remind the reader that the Euler-Maruyama and Milstein schemes have the same order of weak convergence (i.e. it is for both schemes), but di erent order of strong convergence ( . for Euler-Maruyama and for Milstein). Hence, since we are considering long time simulations, statistical properties of the eld variables are more sensitive to weak convergence, while the evolution of trajectory paths is more sensitive to strong convergence. This explains why both ensembles catch the right shape of the PDF but at the same time the Milstein ensemble displays more variance. It could be argued that . might not be a meaningful di erence; on the other side, high order stochastic integration methods include complicated correcting terms which might be hard to implement, see for instance [54] for an example of necessary conditions that have to be satis ed by a class of stochastic integration methods with (strong) order . . Hence here we tried to analyze two of the most likely employed methods in complex climate models. We also checked the total variance of the stochastic ensemble and compared to a deterministic run at the same resolution and to the high resolution deterministic simulation. Unfortunately our stochastic parameterization is still not able to mend for the variance lost with the coarsening of the grid (see Table 1).   Table 1).

. . Eddy length
When looking at the contour plots for the baroclinic (but also the barotropic) mode, we can see that in this scenario the stochastic simulations reproduce a similar pattern as for the case of the reference solution (see Figure 10 for a representation of the barotropic and baroclinic streamfunctions in the di erent setups). In order to have a more objective comparison criterion, we computed the eddy length for the two streamfunctions as a measure of the correlation among di erent grid points at a xed time and looked at the e-folding scale; see for instance [3] for a more detailed description. In Table 1 we report the outcomes for the reference solution, a coarse deterministic simulation, Euler-Maruyama and Milstein ensemble. When considering the baroclinic mode, our stochastic parameterization does not improve the eddy length. In both ensembles it is about half of the baroclinic eddy length displayed by the high resolution simulation, remaining close to the outcome of a coarse deterministic run (see Table 1). A di erent conclusion is valid for the barotropic mode. In fact, in the case of the low resolution deterministic run we get an eddy length of circa . both in zonal and meridional direction, while in the stochastic Euler-Maruyama (Milstein) ensemble it can vary between ≈ .
). Comparing these results with the reference solution, we can notice that we still did not manage to reproduce the high resolution eddy length (≈ . − . ) but we obtained an improvement of circa % with respect to the low resolution deterministic simulation (see Table 1). This result suggests that the perturbation induced by the noise is still not strong enough, but we are heading in the right direction. Furthermore, as has already been shown in [60] and references therein, the error dynamics considerably depends on the speci c scale at which it is introduced, with a faster growth when located at small spatial scales. Here the noise structure is built with the rst two EOFs, hence it can be regarded as a perturbation on the large spatial scales, which is in good agreement with the improved eddy length for the slower mode. Nevertheless, as stated in [58], if the rst EOF can be associated with a de nite physical process, this is more di cult already with the second (and even harder for higher-order) EOF because of the orthogonality constraint. On the other hand, real-world processes might not have orthogonal patterns. In fact, the patterns that most e ciently represent variance do not necessarily have anything to do with the underlying dynamical structure.

Conclusions and perspectives
We described the numerical implementation and evaluation of an energy conserving high-dimensional stochastic conceptual climate model. Our main focus here is the proof of concept whether the projection operator approach [16] can be applied to high-dimensional complex geophysical ow systems in a Eulerian setting ( [16] developed this approach in a Lagrangian setting for a low-order model). Furthermore we investigate which assumptions regarding the noise should be considered in order to obtain not only energy conservation but also dynamically consistent results. For this purpose we used our QG model without forcing, dissipation and hyperdi usion. Even though, the resulting circulations are less realistic when compared with the atmosphere and the oceans, this setup provides an ideal test bed for the numerical evalution of energy conservation of our numerical scheme.
In their paper, Frank and Gottwald [16] derived an energy conserving stochastic formulation for a 4dimensional multi-scale toy model of the atmosphere. In order to preserve the conservation of energy they projected the noise with respect to the manifold of constant energy in such a fashion that those components of the noise, which would lead the trajectory to leave this manifold, are eliminated. In this paper we brought forward this approach and applied it to the high-dimensional 2-layer QG model through its Hamiltonian formulation. With the idea of analyzing the applicability of this procedure, not just to simple models but to a wider range of models with di erent degrees of complexity, we discretized the evolving equations in a Eulerian framework by means of nite di erences, i.e. in a similar setup as most climate and ocean models. We could also introduce a time-scale separation parameter ε, depending on the di erent time scales of barotropic and baroclinic modes, to account for the time scale separation between the two modes. Even though here we   Table 1).
focused on other issues and did not consider any time scale separation in our numerical simulations, stochastic mode reduction is a possible research direction to be followed. In particular, we investigated the delicate step from a continuous to a discrete formulation and found that the numerics can be sensitive to the mean depth of the uid, and hence chose solvers that reproduce correctly the properties of the system, e.g. conservation of energy, in the most general scenario. Once this aspect had been settled, we analyzed the e ects on the system dynamics and statistics due to the introduction of the stochastic process. Mainly we compared the results for two di erent scenarios: in the rst, the noise of each grid point behaves like an iid random variable while, in the second, we considered spatio-temporal correlations. We found that employing iid noise leads to either that energy is not conserved or to unphysical results and hence de ning a spatio-temporal structure is important to respect the underlying dynamics of geophysical ows and for the conservation of energy and the preservation of important statistical properties, e.g. PDF. This is due to the Eulerian nature of our implementation. Frank and Gottwald employed a Lagrangian description which follows the trajectories of the single elements. Hence in their model the noise had the unique purpose of perturbing the trajectories while remaining on the manifold of constant energy. On the other hand, a Eulerian point of view looks at a well-de ned domain and considers the uid as a whole. Therefore in this frame the noise should perturb the dynamics while conserving energy and preserving the main properties of the uid.
In the present work a convex combination of the rst two EOFs, computed on the data of a high resolution deterministic run, have been used to de ne the spatio-temporal correlations consistent with the behavior of the deterministic system; other dimension reduction techniques, such as [4,10,14], could be used too. We did not recover the same amount of variance as in the high resolution simulation, but the eddy length in the barotropic mode is improved. This suggests that the stochastic perturbations are not strong or spatially Table 1: Summary of some of the previously discussed analyses in the cases of, from left to right: the reference solution, a deterministic run at the coarse resolution, Euler-Maruyama and Milstein ensembles. In particular we report: location of the center of the PDF and, in case of the ensembles, its % CI, total variance and eddy length (the latter computed only for barotropic and baroclinic streamfunctions).  ] coherent enough. Another possible explanation is the lack of temporal memory in our scheme [27]. Memory terms have been included in many techniques, such as multi-level regression models [36,35,40,30], showing encouraging results. However they might be rather complicated to implement and might lead to unstable and diverging simulations as reported in [13] in the case of the Wouters and Lucarini parameterization [61]. Investigating the impact of spatial coherence and memory in the noise will be part of our future research. Two basic arguments are that the constraint i ω i = for the noise amplitude was arbitrary and it could be changed in order to attribute a stronger (or weaker) amplitude to the noise; moreover we computed the EOFs with respect to the Euclidian norm and not to the total energy norm. A more philosophical discussion regards instead the usage of the EOF technique itself. In fact it is sensible that using the rst two EOFs improves the dynamics of the large scales since the rst EOFs can be easily associated to large scale dynamics. Going down the ladder, because of the orthogonality constraint, it becomes harder and harder to associate EOFs to well-de ned physical phenomena and hence also to the smaller scales [58]. As has already been shown in [60] and references therein, the error dynamics is considerably dependent on the speci c scale at which it is introduced, with a faster growth when located at small spatial scales. In spectral models this obstacle is easily resolved, since the wavenumber where the noise should be introduced (choosing therefore its spectral properties) can be selected directly. In a grid-point framework this is not the case. Further studies in this direction will be done in order to gain this ability also when using a grid-point discretization since most climate and ocean models are based on this type of numerics and will be reported elsewhere.  (2), de nes our stochastic 2-layer QG system.