Skip to content
BY-NC-ND 3.0 license Open Access Published by De Gruyter Open Access June 2, 2016

Application of multivariate storage model to quantify trends in seasonally frozen soil

  • Jonathan Woody EMAIL logo , Yan Wang and Jamie Dyer
From the journal Open Geosciences

Abstract

This article presents a study of the ground thermal regime recorded at 11 stations in the North Dakota Agricultural Network. Particular focus is placed on detecting trends in the annual ground freeze process portion of the ground thermal regime’s daily temperature signature. A multivariate storage model from queuing theory is fit to a quantity of estimated daily depths of frozen soil. Statistical inference on a trend parameter is obtained by minimizing a weighted sum of squares of a sequence of daily one-step-ahead predictions. Standard errors for the trend estimates are presented. It is shown that the daily quantity of frozen ground experienced at these 11 sites exhibited a negative trend over the observation period.

1 Introduction

Climate change has been shown to be disproportionately significant during the cold season in the mid and high latitudes[1]. Indeed, many studies indicate that the northern high latitudes are experiencing climate change in the form of increasing near-surface atmospheric temperatures [1] and decreased snow pack [2]. Such changes could have impacts on cool season ecosystem health as well as warm season water availability due to snow runoff; therefore, assessment of changes to these environments as a result of climate variability is critical. While 55% of the unglaciated land area in the northern hemisphere is associated with frozen ground, more detailed methods of analysis are needed to determine the effect of climate change on seasonally frozen ground [3].

Periodic time series models may be used to model the dynamics in the ground thermal regime. Seasonally frozen ground is defined as soil that freezes and thaws annually. The phrase "seasonally frozen ground" is a bit ambiguous in that it may reference two distinct processes, one of which occurs in the summer, and one of which occurs in the winter. First, the active layer of permafrost is defined as the layer of soil that thaws during the warmer summer months. Modeling the seasonal dynamics and spatial trends in the active layer has been the subject of several recent studies [4]. Long term increases in active layer depth are associated with permafrost declines from the top down [5-7]. Such melting has broader ramifications as it introduces large quantities of methane gas into the atmosphere [8] and may also reintroduce long frozen strains of bacteria into the environment [9].

A second possible meaning of seasonally frozen ground references a soil thermal regime in the absence of permafrost. In this scenario seasonally frozen ground is defined as soil that only freezes in winter. For example, frost that may form during winter months in more temperate climates may be referenced as seasonally frozen ground. The frozen portion of the soil thermal regime has impacts on anthropogenic construction and agriculture. As such, a more detailed study of the process is suggested in this article.

For the remainder of this article, the term seasonally frozen soil will refer specifically to soil that only freezes in winter, not the active layer in permafrost. With proper modifications, the methods developed below would be suitable for a study on the active layer of permafrost, as will be discussed in the final section of this study.

Currently, the method for studying seasonally frozen ground relies on a single observation for each year. The maximum annual Seasonal Freeze Depth (SFD) has been used to quantify long-term trends in seasonally frozen ground [10]. The SFD is dependent on multiple factors such as temperature forcing, warm season precipitation, site specific soil types, moisture content, and snow cover [11]. Therefore, SFD may be used as a indicator of environmental changes associated with global climatic change [12].

This paper introduces a new metric and methodology to provide the first data driven statistically detailed study to assess trends in seasonally frozen ground. Specifically, a time series of daily observations of the quantity of frozen soil is constructed. As the total quantity of frozen soil may be considered a stored quantity, a storage model from queuing theory is adopted. In particular, a periodic Markov storage model is fitted to a time series of daily estimates of the quantity of frozen soil. Similar stochastic storage type models have been used to model time series of daily snow depths [13, 14]. The measured quantity of frozen soil constructed here is different than the SFD, in that this model allows the soil to alternate between frozen and thawed, perhaps multiple times, at a fixed time, as measurement depth increases. Thus, the total quantity of frozen soil at a given observation site and time is simply the cumulative sum of all layers of frozen soil. This method has potential to capture more of dynamics of the soil thermal regime than the SFD. It is noted that the methods here need not only be applied to quantities of soil below freezing. Indeed, any temperature range of interest may be examined, such as the quantity of soil exceeding some temperature threshold, which may be of interest for future studies in forest and agricultural resources.

This paper focuses on data from surface observing stations in North Dakota for the analysis of seasonally frozen soil. The climate of North Dakota is described as a continental climate subject to cold winters and hot summers. The Köppen classification of the climate is simi-arid in the western half of the state and transitions to humid continental climate in the eastern portion of the state. Seasonally frozen ground appears throughout the state during the colder months of the year.

The remainder of this paper proceeds as follows. Section 2 describes the data set. Section 3 details data preparation. Section 4 introduces the multivariate storage model of the soil freeze process. Section 5 discusses parameter inference, and Section 6 discusses the results, and Section 7 gives concluding remarks.

2 The NDAWN data

The North Dakota Agricultural Weather Network (NDAWN) is a collection of 72 observation stations located across the state and bordering regions in the United States. The reported variables are atmospheric temperature, rainfall, wind speed, wind direction, and solar radiation. In addition, deep soil measurements are recorded at eleven NDAWN stations. The deep soil temperature will be the focus of this study. Table 1 summarizes the location information for the eleven sites in North Dakota that report deep soil temperatures.

Table 1

Summary Statistics for the eleven stations reporting deep soil termperatures in North Dakota.

Station LocationLatitudeLongitudeElevation
Bottineau48°49’38”N100°26’44”W1637 ft
Carrington47°26’58”N99°07’34”W1588 ft
Dickinson46°52’23”N102°47’23”W2411 ft
Fargo46°52’38”N96°47’23”W904 ft
Grand Forks47°55’31”N97°01’58”W843 ft
Harvey47°46’11”N99°56’07”W1598 ft
Hettinger47°26’58”N99°07’34”W2697 ft
Langdon48°45’36”N98°22’05”W1611 ft
Minot48°13’59”N101°17’32”W1621 ft
Streeter46°39’28”N99°21’28”W1923 ft
Williston48°08’49”N103°37’04”W1877 ft

Data are reported each day, and an initial quality control check identifies missing and erroneous errors for each reading. The process is described at the NDAWN website, which may be found at www.ndawn.ndsu.nodak.edu/help-overview.html. Although initial QC methods are applied, the data is not guaranteed for commercial use.

The deep soil data are measured by type T thermocouples (extension grade, copper/constantan, duplex, ANSI type TX) and are set at varying depths at each station [15]. The measurement depths and the length of the data record are listed in Table 2. Daily temperature observations are recorded from a depth of 1 cm to a maximum depth of 1170cm. Note that the data record for the Fargo station is longer and measurements are taken at deeper depths than the other sites.

Table 2

The duration and depths of the data record at the eleven NDAWN stations recording deep soil temeratures.

North Dakota Agricultural Weather Network Meta-data
StationIndexSampling DurationSampling Depths (cm)
Bottineau = 112/01/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Carrington = 208/18/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Dickinson = 307/09/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Fargo = 409/09/1992~12/31/20121,5,10,20,30,40,50,60,80,100,125,150,175,
200,250,270,300,370,470,570,770,970,1170
Grand Forks = 507/03/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Harvey = 610/21/1994~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Hettinger = 708/14/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Langdon = 808/28/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Minot = 908/18/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Streeter = 1009/03/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175
Williston = 1109/24/1993~12/31/20125,10,20,30,40,50,60,80,100,125,150,175

3 The stored quantity of frozen soil

Previous investigations of seasonally frozen ground discuss the SFD. This metric only models the extremes of the soil thermal process however, in that the SFD only records one observation-the maximum freeze depth each year. Since the SFD measures the yearly maximum depth, and trend assessment may be sensitive to outliers, a new method for the assessment of trends in the seasonal ground freeze process is posited. In this study, frozen soil is defined as soil that is at or below zero degrees Celsius. The Stored Quantity of Frozen Soil (SQFS) will be the sum of the thickness of all layers of frozen ground. This section will develop the SQFS time series.

First, we note that the ground freeze portion of the soil thermal regime is limited to the late fall to the subsequent spring in the study region. As such, the yearly soil freeze process occurs over the course of two calendar years. For ease of exposition, we will index the yearly soil thermal regime in "winter-centered years". To this end, we set July 1st to be the first day of the winter-centered year and June 30 to be the last day of this year. We will deal with leap years by removing the June 30 observation in the calendar year that February 29 occurs. There are no recorded years in which any portion of the soil is at or below zero degrees Celsius on June 30 for any site in this study.

Let {St(d)} be a time series of daily soil temperatures at depth d of location = 1,…,11. The site index defines the station index which is listed in Table 2. The temperatures are sampled at times t for t = 1, …, n where t = 1 corresponds to July 1, 1992 (the beginning of the winter centered year with the first recorded data) and n = 7665 corresponds to June 29th, 2012. Since the ground thermal regime is assumed periodic, we introduce the periodic seasonal representation for each time t as t = k × T + ν, where T = 365, ν = 1, …, T represents the day of the winter centered year, and 0 ≤ kf− 1 represents the number of full years since t = 1. Note that f = 21 years of observations are studied is this article. The first day an observation is recorded (at the Fargo station) occurs on September 9, 1992, which is indexed t = 71, where in seasonal indexing ν = 71, and k = 0.

Now, the set of possible depths, d, for each station, and the station index are listed in Table 2. As an example, at = 4, and d = 10 on day t, then St4(10) is the soil temperature at a depth of 10cm on day t = ν+k × T at the Fargo station.

Figure 1 displays the soil temperature at 10 of the twenty-four different depths recorded at the Fargo station from June 15, 2005 to June 20, 2006 (10 are displayed to avoid unnecessary visual clutter of presenting all 24). Note the soil temperature observations close to the surface experience a larger range of values than the deepest observations, which seem to lack any apparent yearly seasonality. Figure 1 also depicts a line at zero degrees Celsius. Here we quantify the total quantity of frozen soil at or below zero degrees Celsius. In order to estimate the total quantity, one must identify the depth(s) at which the temperature crosses zero degrees Celsius. Towards estimating such a depth, a linear interpolation of the temperature between successive observation depths is constructed. That is, the temperature on day t, location , at depth d* such that d1d*≤d2 is set to be

St(d)=St(d1)+dd1d2d1St(d2)St(d1).(1)
Figure 1 Soil temperature time series at various depths.
Figure 1

Soil temperature time series at various depths.

Figure 4 Frozen Ground Thermal Series, 11 Sites.
Figure 4

Frozen Ground Thermal Series, 11 Sites.

Then, after finding all the depths d* such that St(d*)=0, check the depths at which the sign of the temperature in degrees Celsius changes at depth d*. Consider such a depth as the boundary between soil that is frozen and not frozen.

Figure 2 depicts the soil thermal regime to a depth of 200 cm at the Fargo station. The top panel represents the soil temperatures recorded on February 4, 1998. This is a typical temperature mid-winter temperature signature in that the coldest regions of the soil are at the surface, where nighttime low temperatures induce very cold temperatures on the surface of the soil. The second panel depicts a common temperature pattern for the spring frozen soil. As the higher daily temperatures warm the soil, the upper layers appear to thaw. Note however that a layer of frozen soil is still present from a depth of 11cm to a depth of 29.5 cm.

Figure 2 Frozen soil thermal process in mid-winter verses early spring. The stored quantity of frozen soil on day t is the depth of all layers of frozen soil. Typically, only one such layer is present, but may be below soil that is not frozen. There are few cases where multiple layers are present in the data, and the SQFS adds the depths of each.
Figure 2

Frozen soil thermal process in mid-winter verses early spring. The stored quantity of frozen soil on day t is the depth of all layers of frozen soil. Typically, only one such layer is present, but may be below soil that is not frozen. There are few cases where multiple layers are present in the data, and the SQFS adds the depths of each.

Figure 3 Fargo Quantity of Frozen Ground Time Series 1992-2012.
Figure 3

Fargo Quantity of Frozen Ground Time Series 1992-2012.

On occasion, multiple layers of frozen soil may be observed. This typically occurs in spring as near-surface air temperatures oscillate about 0oC. If the upper layer (most shallow portion) of the frozen soil is appropriately deep, so that a late season cold snap refreezes the top portion of the soil, but not to the depth of this uppermost layer of frozen soil already present, then a set of two distinct layers of frozen soil is produced. This is not observed very often in the data (perhaps once or twice a year), but this is allowed for in the construction of the SQFS values below. A model for the frozen soil process that allows for such dynamics is now posited.

Let Ut(1) denote the depth of the upper depth of the first layer of frozen ground, and Lt(1) the lower depth of the first layer of frozen ground on day t at location . Here, the number 1 denotes the most shallow layer of frozen soil. By convention, Ut(1)=0 cm if the ground is frozen at the most shallow soil measurement depth. Noting that L1(t)>U1(t) by construction, the term (Lt(1)Ut(1)) approximates the stored quantity of the first layer of frozen ground. Letting N(t) denote the number of frozen layers, the total depth of frozen ground on day t at location may be written

Xt=r=1N(t)[Lt(r)Ut(r)],(2)

where N(t) denotes the number of distinct layers of frozen soil at location on day t. The SQFS values presented in Figure 2 are thus

X20434=37.50=37.5

and

X20944=29.511=18.5

centimeters on February 4th, 1998, and March 26th, 1998, respectively. Let the sequence {Xt} denote the time series of the Stored Quantity of Frozen Soil (SQFS) at location . The next section introduces a stochastic storage model that will be used to assess trends in the stored quantity of frozen soil.

4 The model

This section introduces a discrete time multivariate periodic storage model for the SQFS process. Let {Xtp×1} be a sequence of p × 1 random vectors containing each of the SQFS values of Xt for 1 ≤ p on day t. Since the number of stations in this study is 11, p = 11. A stochastic storage model for the SQFS process {Xtp×1} is

[Xt1Xtp]=[max{Xt11+Ct1,0}max{Xt1p+Ctp,0}],(3)

where Ct for 1 ≤ ≤11 quantifies the change in SQFS from day t − 1 to day t. For notational convenience, let the vector

Ctp×1=[Ct1Ctp](4)

summarize the statistical changes, Ct , at each of the p = 11 sites. We assume that Ctp×1 is a multivariate white noise process that is independent of Xt1p×1,Xt2p×1, with periodic mean Mtp×1(Θ)=EΘ[Ctp×1] where EΘ[·] denotes statistical expectation as a function of Θ. For computational convenience, the distribution of Ctp×1 is assumed to be multivariate normal. The expected value EΘ[Ctp×1] will have trend components as discussed below. The periodic covariance function is Σt=Cov(Ctp×1) . The variance of {Ctp×1} is assumed to be periodic and is indexed in seasonal notation as

Cov(Ck×T+νp×1)=Σν,(5)

noting the temporal dependence of Σt on t may be expressed only in terms of the day of year ν. Further discussion on modeling the structure of Σν is provided below. Note that {Xtp×1} is a Markov chain (in a periodic sense), thus allowing for correlation in {Xtp×1} . The maximum in (3) prevent {Xtp×1} from being negative so as to preclude the possibility of observing a negative value of Xt . This is a key model feature for processes involving quantities that are seasonally absent.

Now, returning to Mtp×1(Θ) , let the seasonal dynamics of the mean process be modeled by

EΘ[Ctp×1]=Mtp×1(Θ)=Mt1(Θ1)Mtp(Θp).(6)

The seasonal mean function of location in Equation (6) at time t is

Mt=𝟙(t,)(A,0+k=1r[A,kcos(kwt)+B,ksin(kwt)].+.Θt(7)

where the summation quantifies a rth order Fourier fit of the seasonal mean, Θ quantifies the linear trend in Ct (and is the parameter of interest in this study), and 𝟙(t,) is an indicator function that is defined to be zero at times of the year when frozen ground is not likely present, and will also handle missing and quality controlled data below. The next section discusses estimation of the parameters in Equation (7).

5 Model estimation

Parameter estimation for the model in Equation (7) is now considered in detail. Since the model in (7) is a Markov Chain, one could pursue a likelihood approach. However, to ease numerical difficulties, one can reduce the overall model parameterization with a few simple assumptions presented below. However, these modeling assumptions are not particularly conducive to a likelihood approach. Therefore, a conditional expectation approach is considered whereby the trend parameter estimates are obtained by minimizing a sum of squares of one-step-ahead prediction errors.

Note that if the parsimonious first order Fourier approximation (r = 1) model is adopted in Equation (7), then 5 × 11 = 55 parameters must be estimated for the mean function in (7). The large number of parameters leads to difficulty in numerically estimating the trend parameters of interest. Therefore, a simplifying modeling method is considered in which the seasonality is removed before trend estimates are obtained. Specifically, inference for A,k, B,k, w for = 1,…,11 and k = 0,1 in Equation (7) are first fit to each of the 11 stations individually during what will be defined as the winter freeze season. Estimation of the 11 trend parameters is then performed by minimizing a sum of squares of one-step-ahead prediction errors with respect to only the trend components in Equation (7).

Now, when a positive store of SQFS is present at each station, one may approximate the day by day change in the SQFS as

CtXtXt1.(8)

The probability distribution of both sides of Equation (8) are nearly equivalent when the SQFS is so large that a thaw is not likely drive Xt towards the state space boundary of zero at any location . This is typical of the mid winter soil thermal regimes in this study, and as evidenced in Figure 1, for example. Additionally, if the approximation in Equation (8) is assumed, then one must be wary of the physical interpretation of the behavior of {Ct} during the spring, summer, and fall months. Consider the behavior of Cn, when ν represents a mid-summer’s day. Typically, no frozen soil is present, and any contribution to the mean change in frozen soil is thus zero, leading to the conclusion EΘ[Cνp×1]=0 when ν occurs during the summer season. However, this doesn’t seem physically meaningful, for if the quantity of frozen soil is positive (Xν>0) at some location during summer season at day ν, then one would the expect a SQFS to thaw more rapidly towards zero in the coming days given typical summer temperatures. Indeed, a more natural model would consider a warm season freeze as an outlier, or a large deviation from the expected deep soil temperature signature.

In order to address this, the indicator function in Equation (7) is put to zero at station = 1, … ,11 during times of the year when the quantity soil considered frozen is positive in less than 50% of the observed years. Define the days of year ν when the indicator function is set to 1 at location as the freeze season at this location. In particular, set the start of the freeze season at station as the first time in the winter centered year when the SQFS is positive more than 50% of the observed years, and likewise set the end date of the freeze season as the last day of the year in which the SQFS is positive in 50% of the observed years. Figure 5 graphically depicts this selection process at the Fargo station. This method is then applied to station = 1,…,11 to determine the freeze season at each station. The results of the freeze season’s estimated starting and ending dates are depicted in Table 3. Note that the freeze season runs from the end of November to the start of the subsequent late March or Early April by this paradigm for the stations in this study.

Figure 5 Fargo station mean daily change in the quantity of frozen soil, with a first order Fourier fit.
Figure 5

Fargo station mean daily change in the quantity of frozen soil, with a first order Fourier fit.

Table 3

Freeze season starting and ending dates as estimated by times with frozen soil present at least 50 percent of the time.

Station LocationStartEnd
BottineauNov 25Apr 25
CarringtonNov 25Apr 15
DickinsonNov 28Mar 27
FargoNov 27Apr 11
Grand ForksNov 24Apr 7
HarveyNov 30Apr 13
HettingerNov 25Apr 1
LangdonNov 21Apr 27
MinotNov 25Apr 9
StreeterNov 30Apr 14
WillistonNov 23Mar 28

Once the freeze season is selected at each station, a first order fit to each of the parameters A,k, B,k, w is estimated. For station = 1,…,11, let s() and e() represent the start and end dates of the freeze season respectively, again, with respect to the winter centered year. Next, obtain the average change from day ν − 1 to day ν during the freeze season at station overall each year of the study. Specifically, let SMν represent the seasonal mean daily change at station on day ν such that s() < νe() and evaluate this as

SMν=k=0f1XkT+νXkT+ν1f.(9)

Next, the MATLAB function "cfit" with a selection of "fourier1" is used to fit a first order fit to the series {SMν}ν=s()+1e() . The results of this fit are depicted for the Fargo station in Figure 6. The results of this fit visually look to be quite natural, as the estimated curve appears to track the data well. This process is repeated for the other 10 stations in this study. The estimates for the SMν for = 1, … ,11 are depicted in Figure 7.

Figure 6 Fargo station mean daily change in the quantity of frozen soil, with a first order Fourier fit.
Figure 6

Fargo station mean daily change in the quantity of frozen soil, with a first order Fourier fit.

Figure 7 Estimated Daily Change in SQFS, all 11 sites.
Figure 7

Estimated Daily Change in SQFS, all 11 sites.

Interestingly, the maximum of maximum values depicted in Figure 7 occurs at the Dickinson station, and the minimum of maximum values at the Grand Forks station, corresponding to the second highest, and lowest elevations respectively. This is not surprising, as given all else constant, one would anticipate deeper freezing associated with cooler conditions occurring at higher altitudes.

Estimation of the trend parameters in (7) is now considered. Let Θ = [Θ1, …, Θp]T be the 11 × 1 column vector of the model trend parameters in Equation (7), which model the trend in mean daily change of the SQFS process. Now, set

g(ν)^=A^,0+A^,1cos(w^t)+B^,1sin(w^t),(10)

where A^,0,A^,1,w^, and B^,1 are also estimated by MATLAB as discussed above for = 1,…,11. One may now write the model in Equation (7) as

Mt=𝟙(t,)(g(ν)^+Θt).(11)

In order to keep count of all 11 values of the indicator function in (11) define the p × p diagonal projection matrix

Pt=diag(𝟙(t,1),,𝟙(t,p)).(12)

Noting that Pt2=Pt , a sum of squares based on the one-step-ahead predictors as a function of Θ is

t=1nXtp×1X^tp×1(Θ)TPtXtp×1X^tp×1(Θ).(13)

Here, the term X^tp×1(Θ)=EΘ[Xtp×1|Xt1p×1,,X1p×1] where EΘ[X|Y] indicates the conditional expectation of X given Y. Since {Xtp×1} is modeled as a periodic Markov chain, the SQFS predicted today, depends only on the SQFS yesterday; that is, X^tp×1(Θ)=EΘ[Xtp×1|Xt1p×1] and the terms Xt–2, …, X1 are dropped from the conditional expectation which define the one-step-ahead predictions. A weighted least squares function is now introduced since the SQFS process exhibits periodic characteristics in both mean and variance (estimation of the periodic variance is discussed further below). In particular, prediction errors are scaled during each day of the freeze season, at each location, by an estimate of its variability. Thus the sum of squares function that we will minimize may be written as

S(θ)=t=1n[Xtp×1X^tp×1(Θ)]TPtΣt1Pt[Xtp×1X^tp×1(Θ)](14)
=k=0f1ν=1TXkT+νX^kT+ν(Θ)TPtΣν1PtXkT+νX^kT+ν(Θ),(15)

where inference for Σν is discussed below. It is noted here that precise values for Σν are not essential as optimizing a version of S(θ) without any seasonal weights gives asymptotically consistent estimators of all parameters; however, these estimators will not have a minimal variance (see[16] for example). We next make the reasonable assumption that the correlation matrix R of the first order differenced series XtXt-1 is nonseasonal, in that for the seasonal representation of t, t = kT + ν, the correlation matrix

R=Corr(XkT+νXkT+ν1,XkT+νXkT+ν1)

is not a function of the day of the year ν. Estimation of R is restricted maximal extent of the freeze season, all days of the year ν such that min111{s()}νmax111{e()}. Assuming no seasonality in the correlation matrix R is distinct from assuming the covariance matrix Σν is nonseasonal. Indeed, this is clearly not the case, and we allow for seasonality in the covariance matrix below.

Now, an estimate of the correlation matrix R^ is obtained by employing the corr function in MATLAB on the first order differenced series {XtXt1}t=2n (for all days where Xt and Xt1 are both not missing and within the maximal extent of the freeze season defined above). Seasonality is now introduced into the covariance structure. To this end, rescale the correlation matrix to a seasonal covariance matrix with the matrix product

Σν=Y(ν)RY(ν),(16)

where Y(ν) are p × p diagonal matrices, with the ith diagonal entry set to the seasonal variance at station i on day ν.

Here, Y(ν) approximates the seasonal variance of the first order differences of XkT+vXkT+v–1 at location on day ν. To obtain an estimate of Y(ν) at location on day ν, first compute the simple seasonal sample variance

σν=1f1k=0f1XkT+νXkT+ν12.(17)

Next, fit the sampled values (σν)2 for ν = s() + 1, …, e() with a first order Fourier approximation using the same methodology that was used to fit the seasonal mean curve SMν in Equation (10). Let the sequence SVν denote the smoothed seasonal variance obtained by the first order Fourier fit. The smoothed daily variance values are depicted by the smooth curve in Figure 8. Next, note that during times of the year when the SQFS is frequently zero (up to 50% of the time during the freeze season), the estimated variance of the day by day change in the SQFS XnT+νXnT+ν1 will frequently be zero, specifically because the zero modified support set of Xt. Hence, the daily variance at this time is small. In order to keep any early and late season outliers from producing extreme residuals and unduly influencing trend parameter estimates, the early and late season variance estimates are therefore assigned a weighting factor. This is done by adding

(1Pr(ν))×max1i365{(σν)2}
Figure 8 Fargo station: Variance in the mean daily change in the quantity of frozen soil, with a first order Fourier fit.
Figure 8

Fargo station: Variance in the mean daily change in the quantity of frozen soil, with a first order Fourier fit.

to SVν, where Pr(ν) is defined as the proportion of time that the SQFS is positive on day ν at location . Let wt denote the smoothed daily variance now be defined it as

wt=SVν+(1Pr(ν))×max1ν365{(σν)2}.(18)

The resulting daily variance estimate is now depicted in Figure 9. It is noted here that the trend estimates will be consistent regardless of the weighting factor [14].

Figure 9 The top panel depicts a typical winter daily variance estimate at the Fargo station. The bottom panel juxtaposes the daily variance estimates at all 11 sites. The Langdon stations stands out as being an order of magnitude larger than the others.
Figure 9

The top panel depicts a typical winter daily variance estimate at the Fargo station. The bottom panel juxtaposes the daily variance estimates at all 11 sites. The Langdon stations stands out as being an order of magnitude larger than the others.

The top panel in Figure 9 shows the estimated daily variances at the Fargo Station. The bottom panel depicts the estimated daily variances at all 11 locations. Notice that when the daily variances are for all 11 sites are juxtaposed in the bottom panel, all but one of the daily variance estimates are roughly of the same magnitude as depicted in Figure 9. Specifically, series corresponding to Langdon appears to have about an order of magnitude greater variance than that of the other stations, as depicted by the outlying, higher valued curve in the panel.

A closer inspection of the series at Langdon depicts an unusual temperature series in that inordinately large variances are recorded in several years of the observed SQFS. This is illustrated in Figure 10. On original inspection, it was surmised that this increased variance in the soil thermal regime was a function of some geophysical properties, perhaps something such as the mineral composition of the soil or perhaps existence or lack of a snowpack. However, the series seems to experience a period of much lower variance in the SQFS after the winter of 07-08. This is depicted by noting the difference in the jagged nature of the SQFS at the Langdon site depicted in the Top Panel (Winter 97-98) vs the Middle Panel (Winter 07-08) of Figure 10. Indeed, all winter seasons after 06-07 exhibit a less jagged structure, and hence a lower variance. It is therefore posited that some change of machinery, observer, or both may have occurred at this site during this time frame. Note that the day by day variance in the SQFS process in the middle panel is much more in line with a typical SQFS series depicting the winter of 97-98 at the Fargo station.

Figure 10 Fargo station mean daily change in the quantity of frozen soil, with a first order Fourier fit.
Figure 10

Fargo station mean daily change in the quantity of frozen soil, with a first order Fourier fit.

In order to quality control this issue, a moving average of order 4 was applied to each of the SQFS series processes throughout the duration of the study period. The smoothing brought the daily variance estimates for the Langdon site much more in line with that of the other sites as depicted in Figure 11. The First two panels of Figure 11 depict the daily variance estimates of the Langdon site on the same scale. Note that the daily variance estimates have decreased by an order of magnitude. The third panel depicts the daily variance estimates after the moving average is applied to each site. Note that the variances all seem to depict the same yearly pattern, with roughly the same values. Finally, for numerical stability, the seasonal variance is not allowed to be less than unity for any day of the freeze season. Now that the daily variance estimates have been smoothed, an estimate of the seasonal variance function is obtained as

Σ^ν=Y^(ν)R^Y^(ν),(19)
Figure 11 Langdon Site: Smoothing Examined.
Figure 11

Langdon Site: Smoothing Examined.

where the diagonal entries of Y(ν) are set to Fourier approximations to the (σν)2 estimates constructed from the smoothed data as discussed above.

Now that the variances have been estimated, inference on the trend components Θ for = 1,…,11 in Equation (10) is considered.

First, following the derivation provided in the appendix of Woody et al (2009), one may compute the one-step-ahead predictions at each location as

EΘ[Xt+1|Xt]={Xt+Mt+1(Θ)}{1Φ(XtMt+1(Θ)wt+1)}+wt+1ϕ(Xt+Mt+1(Θ)wt+1),

where Φ(x) is the Cumulative Distribution Function (C.D.F.) of a standard normal random variable, Φ(x) is the Probability Density Function (P.D.F.) of a standard normal random variable ((x) = Φ'(x)).

Estimates of the trend vector Θ = [Θ1, …, Θp]T may now be obtained by minimizing the sum of squares S(Θ) in Equation (15) with respect to Θ.

Inference theory for this problem is considered in detail in Klimko and Nelson (1978) [17]. Specifically, it is shown that the parameter estimate Θ that minimizes S(Θ) is consistent and with asymptotically distributional (weak) convergence

θ^DMVN(θ,V/f),(20)

as the number of periods f, tends to infinity. The matrix ν is a positive definite covariance matrix and MVN identifies the distribution as multivariate normal. Since the vector Ctp×1 in Equation (3) is assumed to be normally distributed, note the estimators in Equation (20) are normally distributed regardless of sample size if the max in Equation (3) where omitted. Here, approximately 10% of observations are zero, so the normality assumption in Equation (20) is likely robust to apperance of the the max term in Equation (3).

With respect to the covariance matrix in (20), Klimko and Nelson (1978) showed that ν/f is approximated by the inverse of the second derivative matrix of S(Θ) evaluated at θ=θ^:

V/f[2S(θ)θθ']1|θ=θ^(21)

This is important as this relation allows one to compute standard errors for the model parameter estimates. In particular, the standard errors are the positive square roots of the diagonal components of ν/f. The focus of this study is assessment of a linear trend in the SQFS process, therefore the null hypothesis is set to

H0:Θ1==Θp=0,(22)

and represents the trend being zero at each station. The alternative hypothesis is that Θi ≠ 0 for some i = 1,…,11. The results of this hypothesis test are presented below.

6 Results

This section presents the trend fit for the storage model to the 11 NDAWN deep soil temperature series and tests the null hypothesis in (22). Before discussing the final results, three points are addressed.

First, if any of the trend parameters Θ is in fact negative, then there will be a last time in which any frozen soil is present at location , thus one can not expect to estimate Θ consistently in the limit. However, as a practical matter, this would likely occur over a long enough time-line so that reasonable estimates of Θ could be documented before this hypothetical last freeze occurs.

Secondly, missing data is handled by the indicator functions 𝟙(t,). Since inference is performed on the sequence XtX^t where X^t=EΘ[Xt|Xt1], observations at location must not be missing on both day t and on day t− 1. If either observation is missing, the indicator function 𝟙(t,) is set to zero, and Pt in Equation (15) is likewise updated.

Third, an additional Quality Control (QC) method is applied to the {Xt} series at each location = 1,…,11 separately. The objective is to identify questionable observed changes from day t− 1 to day t in {Xt} at each location . The QC method is specified as follows: assuming data is not missing on day t and t − 1 at location , flag the Xt if

|([XtXt1]SMν)wt|4.(23)

That is, if the day to day change from day t − 1 to day t is more than four standard deviations (with respect to seasonal variance) from SMν, flag the data. If the QC method does flag the observed value Xt, set the indicator function 𝟙(t,) is set to zero, and update Pt in Equation (15) by setting Pt=0.

The results are now obtained by applying the methods described in this article to the QC’ed data. The MATLAB function fminunc is used to numerically minimize S(Θ). The results are depicted in Table 4. Note the estimated trend at 9 of the 11 sites is negative. One needs to be careful in assigning too much significance to this however, due to the correlation of the estimates.

Table 4

Trend estimates (alongside standard errors) and z-scores for the 11 stations.

Station LocationTrend EstimatesZ-scores
Bottineau-0.8969 (0.1377)-4.3197
Carrington-1.4817 (0.1047)-14.2575
Dickinson-0.7076 (0.2656)-0.8940
Fargo-0.5350 (0.1494)-1.1664
Grand Forks-0.1311 (0.1507)2.7512
Harvey-0.8580 (0.2479)-2.0568
Hettinger-1.2317 (0.1752)-6.0304
Langdon-0.9044 (0.1452)-5.4690
Minot-0.9512 (0.1932)-1.7663
Streeter-0.6005 (0.1878)0.9801
Williston-0.9783 (0.2650)-1.5619

The null hypothesis in Equation (22) may be evaluated as follows. First, set

W=V/f[2S(θ)θθ']1|θ=θ^.(24)

Next, one may now obtain standardized the results in Table4 by writing

Z=(Θ^Θ0)W1/2MVN(0,I11)(25)

where I11 is an 11 × 11 identity matrix. The identity matrix in Equation (25) indicates that individual values Zi of Z for i = 1,…,11 are independent normal random variables, with mean Θ0 = 0 under the assumption the null hypothesis is true(see [18] for example). Table 4 also lists the 11 z-scores for the 11 elements of Z.

Note that 4 of the individual z-scores in Table 4 are more than 4 standard deviations below 0. Now, testing the null hypothesis in Equation(22), equates to testing a null hypothesis of

H0:E[Z1]==E[Z11]=0(26)

vs the alternative of not H0 however, since an incorrect scale parameter Θ0 in Equation (25), results in an incorrect scale parameter E[Z] = 0.

The trend estimates in Table 4 are all of the same sign. A chi-squared test statistic is constructed by taking the inner product

χ2=ZTZ=i=111Zi2.

The resulting value is χ2 = 308.6926 where the degrees of freedom 11. The resulting p-value for the above hypothesis test is therefore essentially zero. For a more conservative p-value, note that if the degrees of freedom is set to 55 (total number of parameters in the model), the p-value for the hypothesis test remains approximately zero. We can now conclude there is sufficient evidence to reject the null hypothesis that E[Zi] = 0 for i = 1,…,11. Note that this claim is not that the trends are all negative, only that the trend is not zero for all 11 sites. However, combining the rejection of H0 the fact that all 11 trend estimates in Table (4) are negative enforces that at least one of the trends is indeed negative.

To asses the model fit, we provide a visual diagnostic. Figure 12 depicts the actual SQFS vs the predicted SQFS for the winter of 2009-2010. Note that the one-step-ahead predictions track the data quite well. It is therefore posited that the model seems to have captured the overall dynamics of the SQFS process reasonably well.

Figure 12 SQFS with one-step-ahead predictions.
Figure 12

SQFS with one-step-ahead predictions.

7 Comments

Here, a new quantity termed the daily Stored Quantity of Frozen Soil is defined as the sum of frozen soil layers during the winter months as determined from deep soil temperatures at 11 NDAWN stations. This article presented a detailed statistical study aimed at assessing trends in this quantity.

Although changes in the SQFS process are possibly nonlinear functions of time, linear trends may be considered as the average rate of change in the quantity of interest over the observation period, and are therefore considered here.

A point along this line of reasoning is that confounding correlation with trend estimates also needs to be considered. While correlation can potentially effect trend estimates in the short term, as the number of observed periods tends to infinity, this effect should be mitigated. Regardless of whether the negative trends detected in this paper are symptomatic of a long term trend, or sequentially correlated climatological conditions, the average SQFS decreased over the study period, and thus the results are considered of interest as a proxy for assessment of climatologically changing conditions.

Finally, while the trend estimates presented here are for the SQFS process which is defined by the quantity of soil at or below zero degrees Celsius, any temperature range of interest may be investigated. Indeed, trends in different soil temperature ranges in the soil thermal regime may be of interest for agricultural purposes. One would need only compute the values of (2) for the stored quantity of interest.

References

[1] Serreze M., Framcos J., The Arctic amplification debate, Clim. Chang., 2006, 76, 241-264.10.1007/s10584-005-9017-ySearch in Google Scholar

[2] Brown R., Robinson D., Northern Hemisphere spring snow cover variability and change over 1922–2010 including an assessment of uncertainty, The Cryosphere, 2011, 5, 219-229.10.5194/tc-5-219-2011Search in Google Scholar

[3] Shiklomanov N., Non-climatic factors and long-term, continental-scale changes in seasonally frozen ground, Environ. Res. Lett., 2012, 7, 011003(3pp)10.1088/1748-9326/7/1/011003Search in Google Scholar

[4] Riseborough D., Shiklomanov N., Etzelműller B., Gruber S., Marchenko S., Recent advances in permafrost modelling, Permafrost Periglacial Process., 2008, 19, 137-156.10.1002/ppp.615Search in Google Scholar

[5] Nelson F., Shiklomanov N., Mueller G., Hinkel K., Walker D., Bockheim J., Estimating Active-Layer Thickness over a Large Region: Kuparuk River Basin, Alaska, U.S.A.,Arct., Antarc., Alp. Res., 1997, 29, 367-378.10.2307/1551985Search in Google Scholar

[6] Brown J., Hinkel K., Nelson F., The circumpolar active layer monitoring (calm) program: Research designs and initial results, Polar Geography, 2000, 24, 166-258.10.1080/10889370009377698Search in Google Scholar

[7] Osterkamp T., Characteristics of the recent warming of permafrost in Alaska, J. Geophys. Res., 2007, 112, F02S0210.1029/2006JF000578Search in Google Scholar

[8] Torben R. et al., Thawing sub-arctic permafrost: Effects on vegetation and methane emissions, Geophys. Res. Lett., 2004, 31, L0450110.1029/2003GL018680Search in Google Scholar

[9] Legendre M. et al., Thirty-thousand-year-old distant relative of giant icosahedral DNA viruses with a pandoravirus morphology. PNAS, 2014, 111, 4274-4279.10.1073/pnas.1320670111Search in Google Scholar PubMed PubMed Central

[10] Zhang T., Barry R., Gilichinsky D., Bykhovets S., Sorokovikov V., Ye J., 2001, An Amplified Signal of Climatic Change in Soil Temperatures During the Last Century at Irkutsk, Russia, Climatic Change, 2001, 49, 41-76.10.1023/A:1010790203146Search in Google Scholar

[11] Zhang T., Influence of the seasonal snow cover on the ground thermal regime: An overview, Rev. Geophys., 2005, 43, RG400210.1029/2004RG000157Search in Google Scholar

[12] Frauenfeld O., Zhang T., An observational 71-year history of seasonally frozen ground changes in the Eurasian high latitudes, Environ. Res. Lett., 2011, 6(4):04402410.1088/1748-9326/6/4/044024Search in Google Scholar

[13] Perona P., Porporato A., Ridolfi L., A stochastic process for the interannual snow storage and melting dynamics, J. Geophys. Res., 2007, 112, D0810710.1029/2006JD007798Search in Google Scholar

[14] Woody J., Lund R., Grundstein A., Mote T., A storage model approach to the assessment of snow depth trends, Water Resour. Res., 2009, 45, W1042610.1029/2009WR007996Search in Google Scholar

[15] Akyüz F., Ewens M., Carcoana R., Mullins B., NWS Frost depth observation with liquid-in probes performance: Two-year review, Journal of Service Climatology, 2008, 2, 1-1010.46275/JoASC.2009.06.001Search in Google Scholar

[16] Fuller W., Introduction to Statistical Time Series, Second Edition, John Wiley, New York, 1996.10.1002/9780470316917Search in Google Scholar

[17] Klimko L., Nelson P., On Conditional Least Squares Estimation for Stochastic Processes, Ann. Stat., 1978, 6, 629-642.10.1214/aos/1176344207Search in Google Scholar

[18] Casella G., Berger R., Statistical Inference, Second Edition, Duxbury Press, 2002.Search in Google Scholar

Received: 2015-12-3
Accepted: 2016-2-28
Published Online: 2016-6-2
Published in Print: 2016-6-1

© 2016 J. Woody et al., published by De Gruyter Open

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.

Downloaded on 2.3.2024 from https://www.degruyter.com/document/doi/10.1515/geo-2016-0036/html
Scroll to top button