# When is discretionary fiscal policy effective?

• Steven M. Fazzari, James Morley and Irina Panovska

## Abstract

We investigate the effects of discretionary changes in government spending and taxes using a medium-scale nonlinear vector autoregressive model with policy shocks identified via sign restrictions. Tax cuts and spending increases have larger stimulative effects when there is excess slack in the economy, while they are much less effective, especially in the case of government spending increases, when the economy is close to potential. We find that contractionary shocks have larger effects than expansionary shocks across the business cycle, but this is much more pronounced during deep recessions and sluggish recoveries than in robust expansions. Notably, tax increases are highly contractionary and largely self-defeating in reducing the debt-to-GDP ratio when the economy is in a deep recession. The effectiveness of discretionary government spending, including its state dependence, appears to be almost entirely due to the response of consumption. The responses of both consumption and investment to discretionary tax changes are state dependent, but investment plays the larger quantitative role.

JEL classification: E32; E62; C32

## 1 Introduction

Since the Global Financial Crisis, there has been a worldwide resurgence in the use of discretionary fiscal policy. Both stimulus and austerity have been enacted in many countries, with policies often being a mix of tax and spending changes. By its nature, “discretionary” implies choice, including choice about timing. Thus, if the effects of discretionary fiscal policy depend nonlinearly on economic conditions at the time when the policy is undertaken, it opens up important questions about when different policies would be comparatively more or less effective, questions that would simply not be relevant under linearity.

This paper addresses three key questions about the timing and type of discretionary fiscal policy: (i) When do discretionary government spending increases and tax cuts provide more or less effective stimulus to the economy? (ii) Do the effects of government spending differ from the effects of taxes? (iii) Is austerity more or less effective than stimulus? In answering these questions, we make three contributions to the literature on nonlinear state-dependent effects of fiscal policy.

First, we examine the exact nature and robustness of state dependence in the effectiveness of fiscal policy by considering an informationally-sufficient medium-scale threshold vector autoregressive (TVAR) model and by comparing and testing many different possible threshold variables. Using U.S. data and identifying government spending and tax shocks via sign restrictions, we find strong empirical support for nonlinearity related to economic slack in the relationship between government spending and aggregate output and between taxes and aggregate output, both when considering dollar-for-dollar and cumulative multiplier responses. The measure of economic slack that we find most closely relates to the nonlinear relationship between fiscal policy and aggregate output is the model-averaged output gap developed by Morley and Panovska (2019) based on earlier research by Morley and Piger (2012). The model averaging approach addresses uncertainty about the appropriate forecasting model for aggregate output by averaging implied estimates of the output gap from forecast-based trend-cycle decompositions for a large set of similarly-fitting reduced-form time series models. However, it is important to note that we find generally robust results in terms of the timing and implications of the nonlinearity for various conventionally used measures of slack. Meanwhile, we show that structural shocks identified from our medium-scale model pass the conventional informational sufficiency tests. Thus, our results suggest that previous findings in favor of nonlinearity are not simply due to omitted variables or a failure to account for fiscal foresight.

Second, using evolving-regime generalized impulse response analysis, we demonstrate that tax cuts and government spending increases have similarly large expansionary effects during deep recessions and sluggish recoveries, but they are much less effective, especially in the case of government spending increases, when the economy is in a robust expansion. Meanwhile, tax increases and government spending cuts are most contractionary during deep recessions, and, as a result, are largely self-defeating if the goal of fiscal austerity implemented during an economic crisis is to bring down the debt-to-GDP ratio. Overall, we find that austerity has larger dollar-for-dollar effects on output than stimulus across the business cycle.

Third, we investigate and determine the roles of consumption and investment in driving the effects of both government spending and taxes on aggregate output. Our results imply that the effectiveness of discretionary government spending shocks, including its state dependence, is almost entirely due to the response of consumption. The responses of both consumption and investment to discretionary tax changes are state dependent, but investment plays the larger quantitative role.

The rest of our paper is organized as follows. Section 2 reviews the relevant previous literature. Section 3 presents our empirical model. Section 4 examines the evidence for nonlinearity and state dependence in the effects of government spending and taxes on aggregate output. Section 5 reports evolving-regime impulse response analysis to investigate when discretionary changes in government spending and taxes are comparatively more or less effective. Section 6 explores the roles of consumption and investment in driving the state-dependent effects of fiscal policy on aggregate output. Section 7 concludes.[1]

## 2 Previous literature

### 2.1 Nonlinear effects of fiscal policy

Recent theoretical research highlights potential channels through which fiscal policy shocks transmit nonlinearly. Michaillat (2014) shows that public employment can have much larger multiplier effects when the unemployment rate is high than when it is low. Canzoneri et al. (2016) emphasize the role of the credit channel, and McManus et al. (2018) emphasize its importance when credit constraints are occasionally binding and endogenous. Gali et al. (2007) show that the fiscal multiplier can be large when the share of rule-of-thumb consumers is large and stimulus policies work primarily through the consumption channel. Leeper et al. (2017) show that fiscal multipliers can be persistently high when government spending interacts favorably with consumer preferences.

The empirical literature on fiscal policy multipliers has grown rapidly and has many different strands. Our analysis builds on and merges several of these. Most closely related, a number of studies with smaller-scale nonlinear vector autoregressive (VAR) models find state-dependent effects of discretionary changes in government spending – see, for example, Auerbach and Gorodnichenko (2012), Auerbach and Gorodnichenko (2013), Bachmann and Sims (2012), Baum et al. (2012), Caggiano et al. (2015), Candelon and Lieb (2013), Fazzari et al. (2015) (FMP henceforth), and Mumtaz and Sunder-Plassmann (2019). However, the existence of state dependence does not seem settled. Studies using a narrative approach and military spending shocks to identify the effects of government spending often find little support for state dependence – see, for example, Owyang et al. (2013) and Ramey and Zubairy (2018).

Some recent empirical studies have also considered asymmetries in the effects of stimulus versus austerity, mostly in a sign-dependent framework. Jones et al. (2015) find that tax cuts have significant positive effects on US output, while tax increases have no substantial negative effects, but these results are reversed for the UK. Barnichon and Matthes (2017) find that government spending cuts have larger effects than increases, with the results driven primarily by very strong negative responses of output to government spending decreases during recessions. Guajardo et al. (2014) and Jorda and Taylor (2016) find large decreases in output in response to exogenous fiscal consolidations. Alesina et al. (2015) show that for a panel of 16 OECD countries, fiscal consolidations based on spending cuts are less costly in terms of output loss than consolidations based on tax increases. In a state-dependent framework, Klein (2017) finds that austerity has large negative effects on output when the level of private debt is high.

### 2.2 Shock identification

Different strands of the fiscal literature have also taken varied approaches to shock identification. The three most popular are the timing approach, the narrative approach, and the sign restriction approach.

Variations of the timing approach are used by, for example, Blanchard and Perotti (2002), Auerbach and Gorodnichenko (2012), and FMP. The timing approach entails imposing restrictions such as, for example, government spending not responding to business cycle shocks within a quarter.

The narrative approach uses government spending shocks or tax shocks constructed by examining historical announcements about changes in government spending and taxes unrelated to the business cycle or overall economic conditions. Ramey (2011), Owyang et al. (2013), Ramey and Zubairy (2018), Cloyne (2013), Romer and Romer (2010), and Jones et al. (2015), inter alia, use the narrative approach, sometimes combined with timing restrictions. However, many studies that use the narrative approach consider only military spending shocks or narrative measures limited to large consolidations. This means that many of the observations for the narrative shock series are equal to zero for a large part of the sample, which makes exploring state dependence challenging econometrically.

The sign restriction approach defines the number of structural shocks of interest (which can be smaller than the number of variables in the VAR model) and restricts the sign of the response of variables over particular horizons. This approach is usually considered more agnostic than the timing approach because it effectively nests the timing restrictions. In a linear setting, Mountford and Uhlig (2009) find that deficit-financed tax cuts increase output more than deficit-financed increases in government spending. Candelon and Lieb (2013) extend the model to a nonlinear setting, and find that there is strong evidence of nonlinearity in the response of output to government spending shocks, but that the multipliers are always lower than one.

Despite the rapidly growing fiscal literature, few studies have formally considered whether both the effects of government spending and taxes are nonlinear in a joint model or whether state dependence could imply sign asymmetry. Notably, a theoretical model with endogenous credit constraints would imply both state dependence and sign dependence. For example, in a model in the spirit of McManus et al. (2018), a cut in transfers to impatient households that keeps the households constrained would have larger effects than an increase. Because our medium-scale model described in the next section embeds detailed information about fiscal and other macroeconomic variables, we are able to address a number of potential problems in identifying discretionary government spending and tax change shocks separately, while the evolving-regime generalized impulse responses presented in Section 5 allow us to consider the presence of sign asymmetry under state dependence.

## 3 A medium-scale TVAR model

### 3.1 Reduced-form model and estimation method

We construct a TVAR model and consider different possible threshold variables. Let $Y_t$ denote the vector containing the endogenous variables. The TVAR model splits the stochastic process for $Y_t$ into two different regimes. Within each regime, the process for $Y_t$ is linear, but $Y_t$ can evolve endogenously between regimes. Let $q_{t-d}$ denote the threshold variable that determines the prevailing regime, where the integer $d$ is the delay lag for a regime switch. If the threshold variable $q_{t-d}$ crosses $c$ at time $t-d$, the dynamics of the TVAR model change at time $t$. Defining an indicator function $I[\cdot]$ that equals 1 when $q_{t-d}$ exceeds the threshold $c$ and 0 otherwise, the full model can be written in a single equation as

$$Y_t = \Phi_0^1 + \Phi_1^1(L) Y_{t-1} + \left(\Phi_0^2 + \Phi_1^2(L) Y_{t-1}\right) I[q_{t-d} > c] + \varepsilon_t. \tag{1}$$

The dynamics of the system when $q_{t-d}$ is below $c$ are given by $\Phi_0^1$ and the lag polynomial matrix $\Phi_1^1(L)$; when $q_{t-d}$ is above $c$, the terms $\Phi_0^2$ and $\Phi_1^2(L)$ are added. The disturbances $\varepsilon_t$ are assumed to be normally and independently distributed (nid) with mean zero and variance-covariance matrix $\Sigma$ that is assumed fixed across regimes.[2] For our benchmark specification, $Y_t$ includes nine variables: log real federal consumption and investment spending, log real federal transfer payments to persons, log real federal interest payments on debt, log real transfer taxes, log of other tax revenues in real terms, log real GDP, a measure of slack, an interest rate (measured using the Federal Funds Rate or the Wu and Xia (2016) shadow rate during the zero-lower-bound period), and inflation (calculated using the GDP deflator). The sample period for the benchmark model is 1967Q1–2015Q4. All fiscal variables are converted to real terms using the GDP deflator, and all nominal series were obtained from NIPA-BEA.
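To make the regime-switching mechanics of Eq. (1) concrete, the following is a minimal simulation sketch. It assumes a single lag, a threshold variable that is itself one element of the endogenous vector, and purely hypothetical parameter values; the function name `simulate_tvar` and its arguments are illustrative and are not taken from the paper.

```python
import numpy as np

def simulate_tvar(phi0_1, phi1_1, phi0_2, phi1_2, c, d, q_index, sigma, T, seed=0):
    """Simulate a one-lag, two-regime TVAR with the structure of Eq. (1).

    phi0_1, phi1_1 : intercept (n,) and lag matrix (n, n) that always apply.
    phi0_2, phi1_2 : terms added only when the threshold variable exceeds c.
    q_index        : position in Y of the threshold variable (assumption: the
                     threshold variable is one of the endogenous variables).
    d              : delay lag, assumed to be an integer >= 1.
    """
    rng = np.random.default_rng(seed)
    n = len(phi0_1)
    chol = np.linalg.cholesky(sigma)        # Sigma is fixed across regimes
    Y = np.zeros((T, n))
    regime = np.zeros(T, dtype=int)
    for t in range(max(1, d), T):
        above = Y[t - d, q_index] > c       # indicator I[q_{t-d} > c]
        mean = phi0_1 + phi1_1 @ Y[t - 1]
        if above:
            mean = mean + phi0_2 + phi1_2 @ Y[t - 1]
        Y[t] = mean + chol @ rng.standard_normal(n)
        regime[t] = int(above)
    return Y, regime

# Hypothetical two-variable example: the second variable plays the role of slack.
Y, regime = simulate_tvar(
    phi0_1=np.zeros(2), phi1_1=0.5 * np.eye(2),
    phi0_2=np.array([0.2, 0.0]), phi1_2=-0.2 * np.eye(2),
    c=0.0, d=1, q_index=1, sigma=np.eye(2), T=200,
)
```

Because the threshold variable is part of the system, shocks to any variable can move the economy across the threshold, which is what allows regimes to evolve endogenously.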

By focusing on federal variables only, we are able to trace out the impact on public debt, a variable of obvious interest in debates about fiscal policy. In particular, if the total federal debt at time $t$ is $D_t$, then

$$D_t = D_{t-1} + G_t + G_t^{transfer} + G_t^{interest} - T_t^{transfer} - T_t^{other}$$

and the debt-to-GDP ratio can be calculated as

$$d_t = d_{t-1} \cdot \frac{Y_{t-1}}{Y_t} + \frac{G_t + G_t^{transfer} + G_t^{interest} - T_t^{transfer} - T_t^{other}}{Y_t},$$

where $d_t$ is the debt-to-GDP ratio at time $t$.[3] Fiscal stimulus or austerity that involves discretionary changes in government spending is usually implemented by changes in government consumption or investment. Nonetheless, transfer and interest payments may have sizable effects on the debt-to-output ratio. Furthermore, government transfer payments to persons are strongly affected by the state of the business cycle and respond to movements in output. Changes in transfer payments are occasionally used as a fiscal policy tool (a notable example is the extension of unemployment benefits during the Great Recession), although most movements in transfer payments are likely to be endogenous.[4] Likewise, it is important to split taxes into two sub-components: transfer taxes, which depend on the state of the business cycle and are rarely used as a discretionary fiscal policy tool, and federal tax receipts net of transfer taxes (the federal equivalent of Blanchard and Perotti’s (2002) tax series).
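The debt-to-GDP recursion above is simple to trace numerically. The sketch below is only an illustration: the function name, argument names, and the numbers in the example are hypothetical, and all series are assumed to be in levels (not logs) and measured in the same units.

```python
def debt_to_gdp_path(d0, y, g, g_transfer, g_interest, t_transfer, t_other):
    """Iterate d_t = d_{t-1} * Y_{t-1} / Y_t + primary-plus-interest deficit / Y_t.

    d0 is the initial debt-to-GDP ratio and y[0] the initial level of output.
    The fiscal flows are indexed like y; their entries at index 0 are not used.
    """
    d = [d0]
    for t in range(1, len(y)):
        deficit = g[t] + g_transfer[t] + g_interest[t] - t_transfer[t] - t_other[t]
        d.append(d[-1] * y[t - 1] / y[t] + deficit / y[t])
    return d

# Hypothetical numbers: output falls from 100 to 98 while the deficit is 1.
path = debt_to_gdp_path(
    d0=0.60, y=[100.0, 98.0],
    g=[0.0, 10.0], g_transfer=[0.0, 8.0], g_interest=[0.0, 2.0],
    t_transfer=[0.0, 7.0], t_other=[0.0, 12.0],
)
```

The first term shows why a policy that lowers output can raise the ratio even if it shrinks the deficit: a fall in $Y_t$ relative to $Y_{t-1}$ scales up the inherited debt ratio.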

We estimate the parameters $\Phi_i^j$, the threshold $c$, the delay lag $d$, and the number of lags included in the TVAR model using Bayesian methods (technical details are provided in Appendix A). A Bayesian approach has two advantages in highly parametrized models such as the TVAR model. First, conventional frequentist tests can be severely underpowered. The Bayesian approach circumvents this problem by allowing us to directly compare the linear to the nonlinear model using marginal likelihoods. The marginal likelihoods are calculated using the Chib and Jeliazkov (2001) algorithm, and models are evaluated using the implied Bayes factors. In addition, motivated by concerns described by Campolieti et al. (2014), we also report the expected posterior likelihoods and the highest posterior density for all of the models. Second, the impulse responses for the endogenously evolving system have nonstandard distributions that will be highly non-Gaussian and depend on the history and the size or sign of the shocks, even when the true values of parameters are known. The Bayesian sampler conveniently produces the entire posterior distribution for $c$, $\Phi_i^j$, and $\Sigma$ conditional on the data and the entire posterior distribution of the impulse responses.
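As a small illustration of how log marginal likelihoods translate into model comparison, the sketch below computes a log Bayes factor and the implied posterior model probability. The helper names and the equal prior odds are assumptions made for illustration; the two input values are the log marginal likelihoods reported later in Table 3 for the linear model and for the TVAR with the MAOG as threshold variable.

```python
import math

def log_bayes_factor(log_ml_a, log_ml_b):
    """Log Bayes factor of model A against model B from log marginal likelihoods."""
    return log_ml_a - log_ml_b

def posterior_prob_a(log_ml_a, log_ml_b, prior_a=0.5):
    """Posterior probability of model A (assumption: only A and B are considered)."""
    log_odds = log_bayes_factor(log_ml_a, log_ml_b) + math.log(prior_a / (1.0 - prior_a))
    if log_odds >= 0:                       # branch to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)

# TVAR with MAOG threshold (-450.37) against the linear model (-833.73).
lbf = log_bayes_factor(-450.37, -833.73)    # about 383.36 log points
prob_tvar = posterior_prob_a(-450.37, -833.73)
```

A log Bayes factor of this size means the posterior odds in favor of the nonlinear model are overwhelming under any reasonable prior odds.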

### 3.2 Impulse responses and shock identification

The main empirical questions we consider are whether the effects of government spending differ across regimes defined by economic slack and whether, conditional on any state dependence, austerity has effects that are significantly different from a mirrored effect of stimulus of the same magnitude. Rejecting linearity using Bayesian model comparison directly implies that at least one of the impulse responses to at least one identified structural shock is different across regimes. However, the nature and degree of this asymmetry can be evaluated only by looking at the impulse response functions themselves.

The main impulse responses that we consider reflect, after appropriate conversion, the dollar-for-dollar responses of a variable of interest (for example, output) to a one-time policy shock. However, these may provide an incomplete picture of the overall effects of a policy shock. If a researcher is interested in calculating multipliers in a more conventional sense, the cumulative change in output scaled by the cumulative change in government spending or taxes may be the more appropriate measure.[5] The dollar-for-dollar responses and the cumulative multipliers provide related, but slightly different, pieces of information. In particular, the dollar-for-dollar responses address the question of how output responds today (or at some future horizon) to a policy change today. They are close to what, for example, the Congressional Budget Office releases, and what the media report, when estimating the effects of a policy at a given horizon (although the CBO also calculates and reports the cumulative net effects). For that reason, and because the literature is not unanimous about reporting cumulative or dollar-for-dollar responses, we consider both.
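The distinction between the two measures can be sketched as follows. This is an illustration under an assumed conversion: responses of log output and log spending are turned into dollar terms by multiplying by a fixed output-to-spending ratio `y_over_g`. That conversion, the function names, and the numbers are assumptions for the example, not a description of the exact conversion used in the paper.

```python
import numpy as np

def dollar_for_dollar(log_y_resp, log_g_impact, y_over_g):
    """Output response at each horizon per dollar of the initial policy change."""
    return np.asarray(log_y_resp, dtype=float) / log_g_impact * y_over_g

def cumulative_multiplier(log_y_resp, log_g_resp, y_over_g):
    """Cumulative output change scaled by the cumulative policy change."""
    num = np.cumsum(np.asarray(log_y_resp, dtype=float))
    den = np.cumsum(np.asarray(log_g_resp, dtype=float))
    return num / den * y_over_g

# Hypothetical responses over two quarters, with Y/G assumed equal to 5.
log_y = [0.002, 0.001]
log_g = [0.010, 0.005]
dfd = dollar_for_dollar(log_y, log_g[0], y_over_g=5.0)
cum = cumulative_multiplier(log_y, log_g, y_over_g=5.0)
```

In this example the dollar-for-dollar response falls after impact while the cumulative multiplier stays constant, because spending itself is also persistent; this is the sense in which the two measures carry related but different information.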

In constructing impulse responses (or functions of the impulse responses, in the case of multipliers), the structural shocks need to be identified using a plausible orthogonal decomposition of the variance-covariance matrix Σ. When imposing sign restrictions, we take an approach that is similar in spirit to Mountford and Uhlig (2009). However, following recent developments in the time-series literature that show the penalty function approach used by Mountford and Uhlig (2009) may bias the impulse responses and lead to artificially narrow credibility intervals, we construct the impulse responses using the efficient sampler proposed by Arias et al. (2018).[6] Our focus is on four structural shocks identified using sign restrictions on the impulse responses, summarized in Table 1.

Table 1:

Sign Identification.

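| Shock | G | TransPay | IntPay | TransTax | OtherTax | Y | slack | i | π |
|---|---|---|---|---|---|---|---|---|---|
| G | +++ | ? | ? | ? | ? | +/? | ? | ? | ? |
| T | ? | ? | ? | ? | +++ | ? | ? | ? | ? |
| BC | ? | ? | ? | ? | +++ | +++ | +++ | ? | ? |
| MP | ? | ? | ? | ? | ? | ?/− − − | ? | +++ | − − − |

1. Rows are shocks and columns are response variables. Question marks indicate that the sign is left unrestricted.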

The four identified shocks are a government spending shock, a tax shock, a “business cycle” shock, and a monetary policy shock. The first sign in each cell of Table 1 shows the assumed direction of the effect of a shock on the response variable on impact; the second and third signs in each cell are the assumed signs in the first and second quarter following the shock. All shocks are assumed to be orthogonal to one another, which differs from Mountford and Uhlig (2009), who do not impose the restriction that tax shocks are orthogonal to government spending shocks.[7]

A positive business cycle shock is restricted to increase output, tax revenues, and the measure of slack on impact and for two quarters following the shock.[8] Meanwhile, a positive monetary policy shock is specified to increase the interest rate contemporaneously and for the subsequent two quarters, while decreasing inflation on impact and for the subsequent two quarters. That is, a “positive” monetary shock is contractionary in the sense of having a disinflationary and negative liquidity effect. However, because there is conflicting evidence from the monetary policy literature (see, for example, Lo and Piger 2005; Alpanda and Zubairy 2019) about whether the responses of output to monetary policy can vary and possibly be insignificant at some points of the business cycle, we do not impose the restriction that output falls in response to a contractionary monetary policy shock, although our main results do not change when we impose this restriction. A positive tax shock is assumed to increase tax revenues contemporaneously and for two quarters following the shock. Similarly, a positive government spending shock increases government consumption and investment contemporaneously and for two quarters following the shock.[9] Following previous results from the fiscal literature, we also impose the restriction that output increases on impact in response to a positive government spending shock (see, for example, FMP and Auerbach and Gorodnichenko 2013) and that exogenous tax increases decrease output on impact (see, for example, Romer and Romer 2010). Even studies that find no evidence of state dependence or studies that find that government spending multipliers decline sharply after the first quarter find positive multipliers on impact (for example, see Ramey and Zubairy 2018). The responses of output are then left unrestricted after impact.[10]
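The logic of sign-restricted identification can be sketched with a basic accept/reject scheme. This is a deliberately simplified illustration, not the Arias et al. (2018) sampler or the authors' implementation: it assumes a linear VAR(1) with known coefficients, draws random orthogonal rotations of the Cholesky factor of $\Sigma$, and keeps a draw only if the restricted responses have the required sign on impact and in the two following quarters. All function names and parameter values are hypothetical.

```python
import numpy as np

def draw_rotation(n, rng):
    """Random orthogonal matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))          # flip columns so that diag(R) > 0

def irf_var1(phi, impact, horizons):
    """Responses at horizons 0..H for a VAR(1); shape (H + 1, n_vars, n_shocks)."""
    out = [impact]
    for _ in range(horizons):
        out.append(phi @ out[-1])
    return np.stack(out)

def satisfies(irf, restrictions):
    """restrictions maps (shock, variable) to +1 or -1, checked for quarters 0 to 2."""
    return all(
        np.all(sign * irf[:3, var, shock] > 0)
        for (shock, var), sign in restrictions.items()
    )

def sample_sign_restricted(phi, sigma, restrictions, n_draws=1000, seed=0):
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(sigma)
    accepted = []
    for _ in range(n_draws):
        impact = chol @ draw_rotation(len(sigma), rng)   # impact @ impact.T == sigma
        if satisfies(irf_var1(phi, impact, horizons=2), restrictions):
            accepted.append(impact)
    return accepted

# Hypothetical three-variable system with three sign-restricted shocks.
phi = 0.5 * np.eye(3)
sigma = np.array([[1.0, 0.3, 0.1], [0.3, 1.0, 0.2], [0.1, 0.2, 1.0]])
restrictions = {(0, 0): 1, (1, 1): 1, (2, 2): -1}
accepted = sample_sign_restricted(phi, sigma, restrictions, n_draws=500, seed=1)
```

Every accepted impact matrix reproduces the same reduced-form covariance matrix, so the data cannot distinguish among them; the sign restrictions are what narrow the set.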

The responses to negative shocks are restricted to have the opposite signs to those shown in Table 1. In the case where we consider the evolving-state impulse responses, the responses are constructed assuming that the economy evolves endogenously from one regime to another, with an orthogonalization accepted if the sign restrictions hold for two quarters even if the economy evolves from one regime to another. The technical details of the impulse response calculation are discussed in Appendix A.
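A simulation-based generalized impulse response with endogenously evolving regimes can be sketched as below. The sketch assumes a one-lag TVAR with known parameters and a threshold variable that is an element of the endogenous vector; the names `tvar_step` and `girf`, and the parameter dictionary, are hypothetical. The procedure actually used in the paper is described in its Appendix A and also integrates over parameter uncertainty, which this sketch omits.

```python
import numpy as np

def tvar_step(y_lags, eps, p):
    """One step of a one-lag TVAR; y_lags holds past vectors, most recent last."""
    above = y_lags[-p["d"]][p["q_index"]] > p["c"]      # regime set by q_{t-d}
    mean = p["phi0_1"] + p["phi1_1"] @ y_lags[-1]
    if above:
        mean = mean + p["phi0_2"] + p["phi1_2"] @ y_lags[-1]
    return mean + eps

def girf(history, shock, p, horizon=8, n_sims=500, seed=0):
    """Average difference between shocked and baseline paths from a given history.

    history : list of at least d past vectors (the initial conditions).
    shock   : vector added to the disturbances in the impact period only.
    """
    rng = np.random.default_rng(seed)
    n = len(shock)
    chol = np.linalg.cholesky(p["sigma"])
    total = np.zeros((horizon + 1, n))
    for _ in range(n_sims):
        eps = rng.standard_normal((horizon + 1, n)) @ chol.T
        base, shocked = list(history), list(history)
        for h in range(horizon + 1):
            extra = shock if h == 0 else 0.0
            base.append(tvar_step(base, eps[h], p))
            shocked.append(tvar_step(shocked, eps[h] + extra, p))
            total[h] += shocked[-1] - base[-1]          # regimes may differ by path
    return total / n_sims
```

Because the regime is re-evaluated along each simulated path, the response depends on the starting history and on the size and sign of the shock; calling `girf` with `shock` and with `-shock` will generally not give mirror images unless the second-regime terms are zero.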

## 4 Evidence for nonlinearity and state dependence

Because the main goal of this paper is to explore when different types of discretionary fiscal policy are effective in the presence of state dependence, we first need to establish what evidence there is for nonlinearity when considering a large enough model to ensure informational sufficiency. To do this, we consider the choice of threshold variable, assess the evidence of state dependence in the dollar-for-dollar and cumulative multipliers, and then perform formal tests to demonstrate that our shocks can be considered “structural” in the sense they are orthogonal both to survey forecasts and to information from other macroeconomic variables.

### 4.1 Choice of threshold variable

Two key issues complicate the choice of threshold variable. First, any proposed measure of slack may not accurately capture the true degree of under (or over) utilization of resources in the economy. Second, economic slack may not actually be what triggers nonlinear responses of output to fiscal policy.[11]

#### 4.1.1 Measures of slack

Even focusing on the output gap (i.e., the difference between actual and potential log real GDP) as a measure of slack, large discrepancies arise when using different models to estimate the output gap (see, inter alia, Morley and Piger 2012; Morley and Panovska 2019; Perron and Wada 2016). To address this model uncertainty, the measure of slack that we use in our benchmark TVAR model is the model-averaged output gap (MAOG) from Morley and Panovska (2019).[12] The MAOG is calculated using equal weights on estimated output gaps from a large set of linear and nonlinear time series models (we refer the reader to the original study for technical details). Morley and Panovska (2019) show the MAOG approach performs very well in matching business cycle dates and correspondence to narrower measures of slack, not just for the US, but for a large group of OECD countries.

We note that the extant nonlinear fiscal spending multiplier literature has used many different observed variables as potential proxies of slack. For example, in FMP, we considered capacity utilization. Meanwhile, a large number of studies use the CBO output gap (see, for example, Baum et al. 2012). Auerbach and Gorodnichenko (2012, 2013) use various combinations of moving averages of output growth rates, whereas Ramey and Zubairy (2018) use the unemployment rate. Given this variety of slack measures, two immediate questions arise: First, which measure of slack drives possible nonlinearity in our medium-scale TVAR model? Second, which measure of slack is the “right” measure when modeling the macroeconomy as a whole? To address these questions, we consider three sets of models. The first set is based on our benchmark specification, but using different measures of slack in the VAR and as a threshold variable. This set of models helps us assess which measure of slack drives the nonlinearity in our model. The second set of models also covers different measures of slack, but corresponds to a smaller VAR that excludes fiscal instruments and only includes output, inflation, interest rates, and the measure of slack. The third set of models also covers different measures of slack, but corresponds to a small-scale fiscal model that includes federal consumption and investment, federal revenues, output, inflation, interest rates, and the measure of slack.

For the three sets of models, Table 2 reports the various threshold estimates, and different measures of fit, including the log marginal likelihood, across different measures of slack in the VAR and as threshold variables.[13] The implied Bayes factors strongly favor the TVAR model over the linear counterpart in all cases. This result is particularly notable for the benchmark medium-scale model because evidence of nonlinearity for the smaller models could have been due to omitted variables included in our larger model. Meanwhile, the MAOG is the preferred measure of slack in almost every case, both for the benchmark model and for the smaller models. Furthermore, while the threshold estimates for some of the other variables change across different models, the estimated thresholds for the MAOG as the threshold variable are fairly robust across the different specifications. Taken together, these results suggest that the MAOG is a good measure of economic slack and driver of nonlinearities in macroeconomic dynamics.

Table 2:

Model comparison: Linear versus nonlinear models with different measures of slack.

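Benchmark VAR (columns: measure of slack in the VAR)

| Threshold variable | Statistic | Capacity utilization | Unemployment rate | CBO gap | MAOG |
|---|---|---|---|---|---|
| Linear (none) | Log likelihood (ML) | −2670.03 | −2525.25 | −2430.87 | −2394.05 |
| | Expected posterior log likelihood | −2676.38 | −2523.81 | −2429.42 | −2389.91 |
| | Log marginal likelihood | −822.73 | −507.72 | −506.87 | −833.73 |
| Capacity utilization | Log likelihood (ML) | −2238.54 | −2137.32 | −2028.08 | −1938.70 |
| | Expected posterior log likelihood | −2239.00 | −2029.01 | −2029.47 | −1938.02 |
| | Log marginal likelihood | −339.26 | −339.26 | −344.21 | −568.77 |
| | Threshold (90% interval) | −1.35 (−1.66, −1.04) | −2.00 (−2.63, −0.28) | −0.30 (−1.79, 0.23) | −1.33 (−1.82, −0.91) |
| Unemployment rate | Log likelihood (ML) | −2294.43 | −2152.21 | −1764.52 | −1958.42 |
| | Expected posterior log likelihood | −2293.77 | −2151.00 | −1761.30 | −1958.63 |
| | Log marginal likelihood | −339.26 | −362.84 | −341.39 | −542.63 |
| | Threshold (90% interval) | 0.13 (−0.05, 0.38) | 0.59 (−0.06, 0.73) | 0.06 (−0.29, 0.38) | 0.88 (0.06, 1.04) |
| CBO gap | Log likelihood (ML) | −2231.54 | −2089.74 | −1597.20 | −1916.56 |
| | Expected posterior log likelihood | −2231.70 | −2091.77 | −1598.00 | −1918.22 |
| | Log marginal likelihood | −303.37 | −329.92 | −312.61 | −487.05 |
| | Threshold (90% interval) | −1.64 (−1.95, −0.84) | −1.66 (−1.95, −0.66) | −2.00 (−2.16, −1.30) | −1.64 (−1.94, −1.06) |
| MAOG | Log likelihood (ML) | −1937.81 | −2085.69 | −1579.30 | −1873.11 |
| | Expected posterior log likelihood | −1937.00 | −2084.27 | −1578.33 | −1870.00 |
| | Log marginal likelihood | −269.66 | −329.92 | −309.65 | −450.37 |
| | Threshold (90% interval) | −0.69 (−0.74, −0.52) | −0.74 (−0.80, −0.39) | −0.71 (−0.82, −0.39) | −0.74 (−0.86, −0.51) |

VAR with output, inflation, interest rates, and a measure of slack (columns: measure of slack in the VAR)

| Threshold variable | Statistic | Capacity utilization | Unemployment rate | CBO gap | MAOG |
|---|---|---|---|---|---|
| Linear (none) | Log likelihood (ML) | −667.09 | −510.15 | −442.52 | −365.37 |
| | Expected posterior log likelihood | −664.22 | −509.23 | −441.52 | −364.22 |
| | Log marginal likelihood | −194.27 | −162.52 | −405.22 | −240.38 |
| Capacity utilization | Log likelihood (ML) | −569.70 | −423.05 | −339.42 | −247.23 |
| | Expected posterior log likelihood | −567.99 | −422.99 | −339.40 | −244.22 |
| | Log marginal likelihood | −84.37 | −64.22 | −110.95 | −91.11 |
| | Threshold (90% interval) | −3.03 (−4.84, −0.86) | −3.03 (−3.51, −0.82) | −3.02 (−3.42, −1.52) | −3.03 (−3.38, −0.47) |
| Unemployment rate | Log likelihood (ML) | −592.15 | −428.39 | −94.83 | −274.90 |
| | Expected posterior log likelihood | −592.91 | −426.22 | −92.11 | −276.21 |
| | Log marginal likelihood | −109.22 | −62.19 | −63.59 | −90.23 |
| | Threshold (90% interval) | 0.74 (−0.04, 1.03) | 0.73 (−0.07, 0.87) | 1.20 (−0.05, 1.39) | 0.73 (−0.06, 0.89) |
| CBO gap | Log likelihood (ML) | −585.11 | −420.45 | −237.37 | −268.55 |
| | Expected posterior log likelihood | −584.00 | −420.40 | −235.22 | −266.211 |
| | Log marginal likelihood | −111.96 | −54.11 | −89.65 | −80.29 |
| | Threshold (90% interval) | −1.64 (−1.96, −0.22) | −0.69 (−2.71, −0.21) | −2.17 (−2.42, −0.41) | −0.41 (−1.96, 0.14) |
| MAOG | Log likelihood (ML) | −569.55 | −408.14 | −50.10 | −211.84 |
| | Expected posterior log likelihood | −567.01 | −407.22 | −49.99 | −210.22 |
| | Log marginal likelihood | −84.62 | −44.07 | −35.09 | −58.57 |
| | Threshold (90% interval) | −1.02 (−1.34, −0.02) | −1.03 (−1.34, −0.60) | −1.49 (−1.51, −0.51) | −1.21 (−1.34, −0.95) |

VAR with federal spending, federal revenues, output, inflation, interest rates, and a measure of slack (columns: measure of slack in the VAR)

| Threshold variable | Statistic | Capacity utilization | Unemployment rate | CBO gap | MAOG |
|---|---|---|---|---|---|
| Linear (none) | Log likelihood (ML) | −1490.90 | −1335.36 | −1243.82 | −1196.54 |
| | Expected posterior log likelihood | −1488.20 | −1333.00 | −1242.01 | −1194.24 |
| | Log marginal likelihood | −433.76 | −459.44 | −466.73 | −363.01 |
| Capacity utilization | Log likelihood (ML) | −1266.93 | −1118.90 | −11036.46 | −936.63 |
| | Expected posterior log likelihood | −1262.90 | −1118.00 | −1134.00 | −930.10 |
| | Log marginal likelihood | −182.07 | −139.18 | −174.33 | −317.02 |
| | Threshold (90% interval) | −3.00 (−3.50, −0.89) | −2.90 (−3.52, −1.71) | −3.01 (−3.45, −2.52) | −3.13 (−3.47, −0.94) |
| Unemployment rate | Log likelihood (ML) | −1336.69 | −1185.00 | −772.61 | −1028.60 |
| | Expected posterior log likelihood | −1334.92 | −1184.50 | −771.51 | −1022.51 |
| | Log marginal likelihood | −176.22 | −223.16 | −223.16 | −261.95 |
| | Threshold (90% interval) | 0.79 (−0.16, 0.99) | 0.80 (−0.00, 1.08) | 1.09 (0.09, 1.21) | 0.43 (−0.07, 0.96) |
| CBO gap | Log likelihood (ML) | −1296.73 | −1147.56 | −917.49 | −990.20 |
| | Expected posterior log likelihood | −1292.00 | −1148.00 | −915.00 | −986.02 |
| | Log marginal likelihood | −170.02 | −128.33 | −374.00 | −320.58 |
| | Threshold (90% interval) | −1.91 (−2.01, −1.01) | −1.56 (−2.40, −0.72) | −1.86 (−2.41, −1.05) | −1.64 (−1.95, 0.76) |
| MAOG | Log likelihood (ML) | −1253.99 | −1110.35 | −613.42 | −942.67 |
| | Expected posterior log likelihood | −1253.80 | −1109.00 | −612.00 | −942.02 |
| | Log marginal likelihood | −161.22 | −104.56 | −99.52 | −263.00 |
| | Threshold (90% interval) | −1.16 (−1.51, −0.36) | −1.02 (−2.22, −0.37) | −1.11 (−1.25, −0.55) | −0.74 (−2.01, −0.31) |

1. For each threshold variable, the rows report the log likelihood obtained from maximum likelihood estimation, the expected posterior log likelihood obtained from Bayesian estimation, the log marginal likelihood, and the threshold estimate with its 90% credibility interval (in parentheses) obtained from the Bayesian posterior distribution. The best model fit for each measure of slack is reported in bold.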

#### 4.1.2 Slack versus alternative threshold variables

To account for the possibility that nonlinearity could actually be driven not so much by the degree of slack in the economy, but more by the direction of change in economic activity or fiscal policy, we also consider the following possible threshold variables: a 4-quarter moving average change in log output, a 4-quarter moving average change in the log of government spending and consumption, and a 4-quarter moving average change in log tax revenues (net of transfer taxes).[14] For completeness, we also consider the level of the ex-ante real interest rate (based on static expectations) as a threshold variable to allow for the possibility that any asymmetry is related more to the stance of monetary policy rather than fiscal policy. Figure 1 plots all of the possible threshold variables that relate to economic slack, growth rates, and policy changes. The left panels display the measures of slack and the right panels display the additional threshold variables related to growth rates and policy changes.
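The growth-rate threshold variables are 4-quarter moving averages of log differences. A minimal sketch of that transformation is given below; the function name is hypothetical, and the scaling by 100 (so that values read as percent per quarter) is an assumption rather than something stated in the text.

```python
import numpy as np

def ma_log_growth(levels, window=4, scale=100.0):
    """Moving average over `window` quarters of the quarterly log difference."""
    dlog = scale * np.diff(np.log(np.asarray(levels, dtype=float)))
    kernel = np.ones(window) / window
    return np.convolve(dlog, kernel, mode="valid")   # first value uses quarters 1..window

# Hypothetical series growing at a constant 1% per quarter.
levels = 100.0 * 1.01 ** np.arange(10)
growth = ma_log_growth(levels)
```

The same function can be applied to real output, government consumption and investment, or tax revenues net of transfer taxes to build each candidate threshold series.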

Figure 1:

Threshold variables and estimated thresholds.

Recent developments in the fiscal and monetary literature have also indicated that household debt or household debt overhang could be a potential channel for nonlinear transmission of policy (Alpanda and Zubairy 2019; Bernardini and Peersman 2018; Klein 2017). Furthermore, the empirical literature has also related the effectiveness of fiscal policy to the level of the government debt (see, inter alia, the now highly controversial study by Reinhart and Rogoff (2009)). Thus, we also consider household and federal debt-to-output ratios as potential threshold variables. Figure 2 plots the debt levels and the debt overhang. As shown in the figure, both debt ratios are clearly nonstationary. For the sake of direct comparison with previous studies on federal debt, we consider using the ratios in levels (while fully acknowledging that this could be problematic and could lead to incorrect inference, as discussed in detail in Supplementary Appendix B). We also consider their overhang levels, which are stationary. Following the previous literature (inter alia, Klein 2017 or Alpanda and Zubairy 2019), overhang is defined as the difference between the ratio and its long-run trend, where the trend in this case is estimated using the low-frequency output of the HP filter with $\lambda = 10^4$.
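The overhang construction can be sketched with a direct implementation of the HP filter, which solves $(I + \lambda D'D)\tau = y$ for the trend $\tau$, where $D$ is the second-difference matrix. The function names are hypothetical and the dense-matrix solution is chosen for transparency rather than efficiency; only the smoothing parameter $\lambda = 10^4$ comes from the text.

```python
import numpy as np

def hp_trend(y, lam=1e4):
    """HP-filter trend: minimizes sum (y - tau)^2 + lam * sum (second diff of tau)^2."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    D = np.zeros((T - 2, T))
    for i in range(T - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]    # second-difference operator
    return np.linalg.solve(np.eye(T) + lam * D.T @ D, y)

def overhang(ratio, lam=1e4):
    """Debt overhang as the deviation of a debt ratio from its HP trend."""
    ratio = np.asarray(ratio, dtype=float)
    return ratio - hp_trend(ratio, lam)
```

A large value of $\lambda$ makes the trend very smooth, so slow-moving swings in the debt ratio are assigned to the overhang rather than to the trend.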

Figure 2:

Household and federal debt-to-GDP ratios.

In Figure 1, the threshold variables are adjusted for any structural breaks. The estimated threshold (median) and 90% credibility intervals from Tables 2 and 3 are also displayed.

Table 3:

Model comparison: slack versus growth versus debt as threshold variables in a TVAR model with MAOG as the measure of slack.

| Threshold variable | Log likelihood (ML) | Expected posterior log likelihood | Log marginal likelihood | Threshold estimate | 90% credibility interval |
|---|---|---|---|---|---|
| Linear model (none) | −2394.05 | −2389.91 | −833.73 | | |
| Moving Average Output Growth | −1879.45 | −1878.00 | −454.37 | −1.51 | (−1.67, −0.89) |
| Moving Average Government Spending Growth | −1966.62 | −1962.79 | −503.65 | 3.93 | (1.06, 4.45) |
| Moving Average Taxes Growth | −1924.22 | −1923.16 | −490.12 | −3.31 | (−4.79, 1.34) |
| Real Interest Rate | −1953.92 | −1953.00 | −515.22 | −0.21 | (−0.62, 0.87) |
| Household Debt Ratio | −1958.25 | −1956.22 | −600.22 | 47.25 | (n/a) |
| Federal Debt Ratio | −1992.85 | −1990.11 | −751.00 | 34.58 | (n/a) |
| Household Debt Overhang | −1995.11 | −1994.22 | −671.00 | −0.75 | (−0.42, 0.85) |
| Federal Debt Overhang | −1939.31 | −1933.22 | −688.95 | −1.385 | (−2.02, 2.10) |
| MAOG | **−1873.11** | **−1870.00** | **−450.37** | −0.74 | (−0.86, −0.51) |

1. The measure of slack in the VAR model is the MAOG in every row. Each row reports the log likelihood obtained using maximum likelihood estimation, the expected posterior log likelihood obtained using Bayesian estimation, and the log marginal likelihood, followed by the threshold estimate and its 90% credibility interval obtained from the posterior Bayesian distribution. The growth rate threshold variables are 4-quarter moving averages of log differences. The real interest rate is an ex ante measure given static expectations. The best model fit is reported in bold.

In Figure 2, the trends are estimated using an HP filter with $\lambda = 10^4$. The estimated threshold (median) and 90% credibility intervals from Tables 3 and 4 are also displayed.

Table 4:

Output multipliers: cumulative scaled responses at selected horizons.

| Horizon | Government spending: excess slack | Government spending: close to potential | Taxes: excess slack | Taxes: close to potential |
|---|---|---|---|---|
| 1 year | 1.15 | 1.24 | −6.24 | −2.22 |
| 2 years | 1.21 | 0.72 | −5.37 | −3.19 |
| 3 years | 1.23 | 0.38 | −4.51 | −3.68 |
| 4 years | 1.18 | 0.28 | −4.28 | −4.07 |
| 5 years | 1.15 | 0.26 | −4.50 | −4.33 |

Table 3 reports the results of a model comparison for a specification where slack (namely the MAOG) is used as the threshold variable versus specifications with policy or debt threshold variables. Similar to Table 2, the MAOG is strongly preferred as the threshold variable. Looking back at Figure 1, the threshold estimates for all measures of slack split the sample up into periods of excess slack (recessions and their immediate recoveries) and “normal” times when the economy is closer to potential. Related to this delineation of the sample and, as can be seen by comparing Tables 2 and 3, any of the slack variables is strongly preferred as a threshold variable compared to the policy variables, while the MAOG is preferred over the moving average of output growth, even though both identify similar dates for the regimes, as shown in Figure 1. The models in which debt ratios are included as threshold variables in levels perform worse than any of the models that include a measure of slack as a potential threshold variable. Furthermore, the estimates are quite imprecise when considering policy measures as threshold variables. In the case of debt overhang, the nonlinear model outperforms the linear model, but, again, all model selection criteria strongly prefer the model with the MAOG as the threshold variable. Therefore, in our remaining analysis, we focus on the model with the MAOG as a measure of slack and as a threshold variable.[15]
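To make the strength of this ranking concrete, the sketch below converts the log marginal likelihoods in Table 3 into log Bayes factors relative to the best-fitting specification. The values are copied from Table 3; reading a difference in log marginal likelihoods as a log Bayes factor assumes equal prior model probabilities, and the variable names are illustrative.

```python
# Log marginal likelihoods copied from Table 3 (threshold variable -> value)
log_ml = {
    "Linear model (none)": -833.73,
    "Moving Average Output Growth": -454.37,
    "Moving Average Government Spending Growth": -503.65,
    "Moving Average Taxes Growth": -490.12,
    "Real Interest Rate": -515.22,
    "Household Debt Ratio": -600.22,
    "Federal Debt Ratio": -751.00,
    "Household Debt Overhang": -671.00,
    "Federal Debt Overhang": -688.95,
    "MAOG": -450.37,
}

best = max(log_ml, key=log_ml.get)
ranked = sorted(log_ml.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranked:
    # Log Bayes factor of each model against the best model (0 for the best itself)
    print(f"{name:45s} {value - log_ml[best]:9.2f}")
```

The closest competitor is the moving average of output growth, about 4 log points behind the MAOG, while every policy or debt threshold variable trails by roughly 40 log points or more.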

### 4.2 Fixed-state responses

Before considering evolving-regime responses to both government spending and tax shocks in the next section, we first establish that our identified government spending and tax shocks have state-dependent effects. Figure 3 plots the dollar-for-dollar responses and Table 4 summarizes the responses for the cumulative multipliers at select horizons.[16] The left panels in Figure 3 display the fixed-regime responses for the “excess slack” regime (defined by the estimated threshold), the middle panels display fixed-regime responses when the economy is close to potential, and the right panels display the posterior differences for the responses between the two regimes. Both median responses and 90% credibility intervals are reported.[17]

Figure 3:

Dollar-for-dollar effects of government spending and taxes on output.

In the excess slack regime, output responds with a large and persistent increase to a positive shock to government spending. By contrast, when the economy is close to potential, an increase in government spending temporarily increases output on impact, but the response dies out and becomes negative after two years. Tax cuts have similar state dependence. Flipping the sign of the displayed response to a tax hike, the fixed-regime impulse responses suggest that a tax cut would increase output in both regimes. A tax cut in the excess slack regime increases output by $2 (dollar for dollar) and is significant for 13 quarters, whereas the response is smaller when the economy is close to potential, peaking at $1.3 and becoming insignificant after 7 quarters.[18] The posterior differences in the right panels make it clear that the state dependence is significant. Meanwhile, the cumulative tax multipliers in Table 4 are larger in magnitude than the cumulative spending multipliers, especially at longer horizons, with both spending and tax multipliers exhibiting clear state dependence. The spending multipliers are larger and more persistent in the excess slack regime than when the economy is close to potential. The spending multipliers peak early when the economy is close to potential, and start declining after less than a year. Similarly, the tax multipliers are larger in magnitude in the excess slack regime than when the economy is close to potential.

Figure 3 displays the fixed-regime responses of output to a government spending shock (first row) and a tax shock (second row) in the excess slack regime (left column) and close to potential regime (middle column) for the benchmark model. The right column plots the posterior differences in responses across the two regimes. Posterior medians and 90% credibility intervals are reported.

### 4.3 Informational sufficiency

The impulse responses in Figure 3 are presented under the assumption that the VAR model includes sufficient information to correctly identify the fiscal shocks. However, if shocks are correlated with other information available to economic agents that is not included in the VAR model, the estimated impulse responses can be biased. For example, Ramey (2011) shows possible “fiscal foresight” about shocks is a problem for a small Blanchard-Perotti type VAR model for which the shocks identified from the VAR model can be Granger-caused by forecasts of those shocks and are, therefore, likely to be anticipated by economic agents.

We follow Forni and Gambetti (2014) to assess whether the shocks from our TVAR model are unanticipated.[19] We regress the structural shocks for government spending, in our case from each iteration of the Bayesian sampler for our benchmark model, on the Survey of Professional Forecasters’ forecasts of Real Federal Government Consumption Expenditures and Gross Investment in the subsequent four quarters, calculated based on mean responses and taking into account compounding. Our assumption with this test is that the SPF forecasters aggregate relevant information about the anticipated component of government spending. Meanwhile, because the residuals for a Bayesian VAR model may not necessarily be orthogonal to the VAR information set, we also perform a Granger causality test where we regress the structural shocks on a model that includes both the right-hand-side variables from Equation (1) and the SPF forecasts. In either case, we are unable to reject the null of informational sufficiency. For the case where we do not include the conditioning variables, the p-values range from 0.21 to 0.98 across different draws of the sampler, with the median p-value being 0.56. For the case where we include the conditioning variables, the p-values range from 0.47 to 1.00, with the median value being 0.83. Therefore, our medium-scale TVAR model appears informationally sufficient when identifying government spending shocks.

We also conduct orthogonality tests by checking whether either the spending or tax shocks are correlated with lags of principal components extracted from a large macroeconomic dataset that proxies for information available to economic agents. In particular, we use the FRED-MD Stock and Watson dataset (see McCracken and Ng 2016 for details).[20] Different conventional tests indicate that the relevant number of principal components for the dataset is between 1 and 7. Table 5 reports the results for 2 lags of each principal component at a time and for 2 lags of all seven principal components at the same time. We cannot reject the null of orthogonality for any of the principal components. The p-values are especially large for the TVAR models. This result is consistent with Forni and Gambetti (2016), who show that, while a small fiscal VAR model is insufficient to identify fiscal shocks, a larger VAR model that includes forward-looking variables such as inflation and interest rates (and exchange rates in their case) is sufficient. Therefore, our shocks appear to be “structural” in the sense that they are not correlated with other information at time t about the macroeconomy and are thus possibly also “structural” in the conventional SVAR sense.

Table 5:

Orthogonality tests.

| Shocks | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC1-7 |
|---|---|---|---|---|---|---|---|---|
| G (linear VAR) | 0.872 | 0.953 | 0.935 | 0.620 | 0.354 | 0.785 | 0.162 | 0.919 |
| T (linear VAR) | 0.854 | 0.997 | 0.861 | 0.122 | 0.225 | 0.275 | 0.090 | 0.181 |
| G (TVAR) | 0.533 | 0.432 | 0.196 | 0.698 | 0.898 | 0.327 | 0.456 | 0.881 |
| T (TVAR) | 0.562 | 0.267 | 0.790 | 0.851 | 0.145 | 0.596 | 0.899 | 0.736 |

1. Each cell reports the p-value for an F-test where the null is that the highest posterior density estimates of shocks are orthogonal to the lags of the principal components.

## 5 Evolving-regime impulse response analysis

The responses reported in Figure 3 and in Table 4 embed three different sources of uncertainty: uncertainty about the threshold estimate, uncertainty about the TVAR parameters, and uncertainty about the orthogonalization matrix that identifies the shocks. Even though we account for all of these different sources of uncertainty and we use 90% credibility bands, which are conservative in the fiscal literature, there is clear evidence of state dependence in the responses of output to fiscal policy. Notably, the posterior differences in the right columns of Figure 3 are large in magnitude and are highly likely to be different from zero. The cumulative multipliers exhibit similar state dependence.

In this section, we turn to exploring implications of state dependence in more realistic scenarios for government spending and tax shocks in which the economy is allowed to evolve endogenously from one regime to another. This approach allows us to consider possible sign and size asymmetries and to determine when discretionary fiscal policy is comparatively more or less effective. While the fixed-regime responses are useful for testing state dependence across regimes, the responses within a regime are linear by construction, i.e., they are proportional to the size and sign of a shock. However, if the economy is allowed to evolve across regimes, threshold models allow (but do not impose) the possibility that negative shocks can have different proportional effects than positive shocks or that shocks of different magnitudes have non-proportional effects.
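A toy example may help show why regime switching generates sign and size asymmetries. The sketch below uses a univariate threshold autoregression with hypothetical coefficients (more persistence in the slack regime) and, as a simplifying assumption, sets all future shocks to zero rather than averaging over them as a generalized impulse response would. None of the names or numbers come from the paper's model.

```python
def simulate(y0, shock, horizon=8, c=0.0, rho_slack=0.9, rho_normal=0.5):
    """Path of a toy threshold AR(1); the regime depends on the lagged value of y."""
    path, y = [], y0
    for h in range(horizon):
        rho = rho_slack if y <= c else rho_normal  # slack regime when y is at or below c
        y = rho * y + (shock if h == 0 else 0.0)   # the shock hits only in the first period
        path.append(y)
    return path

def scaled_response(y0, shock, **kwargs):
    """Shocked path minus baseline path, divided by the shock (sign and size)."""
    base = simulate(y0, 0.0, **kwargs)
    shocked = simulate(y0, shock, **kwargs)
    return [(s - b) / shock for s, b in zip(shocked, base)]

start = -0.2  # hypothetical history: mild slack, just below the threshold
for label, size in [("small positive", 0.1), ("large positive", 3.0),
                    ("small negative", -0.1), ("large negative", -3.0)]:
    resp = scaled_response(start, size)
    print(f"{label:15s}", [round(r, 3) for r in resp[:4]])
```

In this toy setting a large positive shock pushes the system across the threshold into the low-persistence regime, so its scaled response decays faster than that of a small positive shock or of any negative shock, which keep the system in the slack regime. Within a fixed regime, by contrast, all four scaled responses would coincide.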

The evolving-regime analysis requires specification of the history of the economy prior to the shock because the effects of the shock will depend on the system’s proximity to the threshold. For our generalized impulse response calculations, we focus on three particular histories of interest from a policy perspective: a strong expansion, a deep recession, and a sluggish recovery, defined as follows:

1. 1996Q1: a robust expansion, when the economy is usually classified as being above or close to potential according to our various threshold variables and threshold estimates;

2. 2008Q3: a deep recession, when the economy is clearly classified as being in the excess slack regime;

3. 2012–2014: a sluggish recovery, when the economy is close to the estimated threshold for at least some of our threshold variables and threshold estimates.

For each history, we calculate the responses to an increase in government spending and taxes and to decreases in government spending and taxes.[21] The shocks are scaled to 1% of GDP and 3% of GDP to consider possible size asymmetries. Sign restrictions are simply reversed for negative shocks. To address the computational burden when calculating the generalized impulse response functions, we abstract from the parameter uncertainty and fix the parameters at their highest posterior density values, although we know from fixed-regime responses in the previous section and as noted above that there is evidence of state dependence even when taking this parameter uncertainty into account.

Figure 4 plots the responses of output to changes in government spending and taxes when the economy starts in robust expansion and Figure 5 plots the responses of output when the economy starts in a deep recession. In both cases, a shock scaled to 1% of GDP is considered. The top panels of the figures display the responses to government spending, the bottom panels plot the responses to tax changes. The left column displays the responses to positive shocks (higher government spending or higher taxes), the middle panel displays the response to a negative shock (scaled by −1 for ease of comparison), and the right panel displays the difference between the scaled response to a contractionary shock and the response to an expansionary shock.

Figure 4:

Sign-dependent effects of government spending and taxes on output in a robust expansion.

Figure 5:

Sign-dependent effects of government spending and taxes on output in a deep recession.

The results in Figure 4 show that contractionary shocks (i.e., cuts in government spending or tax increases) have somewhat larger effects, on average, than expansionary shocks when the economy starts in a robust expansion. However, the difference is economically small and not significant. Tax cuts appear to be more efficient at stimulating output than increases in government spending ($1.7 vs. $0.6 after one year), which is consistent with Mountford and Uhlig (2009). The magnitude of the peak responses to tax shocks is also in line with, for example, the responses obtained by Romer and Romer (2010). However, our results for tax increases stand in contrast to the findings of Jones et al. (2015), who find that tax increases do not affect output, but decreases have a strong positive effect. Our results, by contrast, indicate that tax increases have a strong contractionary effect on output across the business cycle.

Meanwhile, as shown in Figure 5, the effects of contractionary shocks are more persistent and larger than the effects of expansionary shocks when the economy starts in a deep recession. Cuts in government spending decrease output by $1.7 after 9 quarters. Tax increases decrease output by almost $3 after 10 quarters. The responses to stimulative shocks are smaller than the responses to austerity shocks. Both increases in taxes and decreases in government spending significantly decrease output (the response is different from zero at all horizons for tax increases and for two years for spending cuts). These results indicate that, if the aim of discretionary policy is to stimulate the economy, either government spending increases or tax cuts could be used in deep recessions, but tax cuts should be used when the economy is in a robust expansion.

Size and sign asymmetries might be particularly relevant in a sluggish recovery when the economy is close to the threshold and different shocks can influence the probability of crossing it. Figure 6 plots the dollar-for-dollar responses of output to “small” (1% of GDP) and “large” (3% of GDP) shocks, both positive and negative, when the economy starts in a sluggish recovery. Figure 7 then plots posterior differences between responses to positive and negative shocks, responses to large positive and large negative shocks, responses to small and large negative shocks, and responses to small and large positive shocks.

Figure 6:

Sign-dependent and size-dependent effects of government spending and taxes on output in a sluggish recovery.

Figure 7:

Differences in sign and size effects of government spending and taxes on output in a sluggish recovery.

Figure 4 displays the evolving-regime responses of output to a government spending shock (first row) and a tax shock (second row) for the benchmark model in a robust expansion. The left column plots the responses to a positive shock, the middle column plots the response to a negative shock (scaled by −1 for ease of comparison), and the right column plots the difference in magnitude of (scaled) responses for positive and negative shocks. The shocks are equal to 1% of GDP. Posterior medians and 90% credibility intervals are reported.

Figure 5 displays the evolving-regime responses of output to a government spending shock (first row) and a tax shock (second row) for the benchmark model in a deep recession. The left column plots the responses to a positive shock, the middle column plots the response to a negative shock (scaled by −1 for ease of comparison), and the right column plots the difference in magnitude of the scaled responses for positive and negative shocks. The shocks are equal to 1% of GDP. Posterior medians and 90% credibility intervals are reported.

Figures 6 and 7 illustrate that the responses to contractionary shocks are larger than the responses to expansionary shocks. This is particularly pronounced when we consider large shocks. Large contractionary shocks have very persistent effects. By contrast, large expansionary shocks have positive effects in the short run that quickly die out as the economy gets closer to potential. The responses to large expansionary shocks (positive government spending shocks or negative tax shocks) are proportionally smaller than the responses to smaller expansionary shocks. In particular, when the economy is above the threshold, the dollar-for-dollar responses are dampened. By contrast, the responses to different sized cuts in government spending or tax increases are proportionally more similar.

We also use evolving-regime impulse responses to trace out the effects of different shocks on government debt. While contractionary shocks have stronger effects on output, dollar for dollar, than expansionary shocks, the difference for government spending is much smaller in a robust expansion. This implies both that austerity could sometimes be self-defeating and that different fiscal policies of the same magnitude could have different effects on the debt-to-GDP ratio. Figure 8 plots the effects of stimulus on the debt-to-GDP ratio, while Figure 9 plots the effects of austerity. The top rows display the responses to a change in government spending and the bottom rows display the responses to tax changes. The left columns display the evolving responses when the economy starts in a deep recession, the middle columns display the responses when the economy starts in a robust expansion, and the right columns display the posterior differences.
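The self-defeating mechanism can be illustrated with back-of-the-envelope arithmetic. The sketch below assumes a purely static accounting in which a spending cut lowers debt one for one and lowers GDP by the multiplier times the cut, ignoring revenue feedback, interest payments, and dynamics. The function name and all numbers are hypothetical and are not estimates from the paper.

```python
def debt_ratio_after_cut(debt, gdp, cut, multiplier):
    """Static accounting: debt falls by the cut, GDP falls by multiplier * cut."""
    return (debt - cut) / (gdp - multiplier * cut)

debt, gdp, cut = 80.0, 100.0, 1.0  # hypothetical levels; the cut equals 1% of GDP
before = debt / gdp
for label, m in [("small multiplier", 0.5), ("large multiplier", 2.0)]:
    after = debt_ratio_after_cut(debt, gdp, cut, m)
    direction = "falls" if after < before else "rises"
    print(f"{label}: ratio {direction} from {before:.4f} to {after:.4f}")

# In this accounting the ratio rises whenever multiplier > gdp / debt
print("break-even multiplier:", gdp / debt)
```

Under these assumptions the debt-to-GDP ratio rises after a cut whenever the multiplier exceeds the inverse of the initial debt ratio, which is why the same austerity measure can lower the ratio when multipliers are small and raise it when they are large.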

Figure 8:

State-dependent effects of fiscal stimulus on the debt-to-GDP ratio with evolving regimes.

Figure 9:

State-dependent effects of austerity on the debt-to-GDP ratio with evolving regimes.

Figure 8 shows that a decrease in taxes immediately raises the debt-to-GDP ratio and this increase is significant, albeit temporary. Because the increase in output is larger when taxes are cut in a deep recession, the rise in the debt-to-GDP ratio from a tax cut is significantly smaller than in a robust expansion.

As Figure 9 shows, even though there is a substantial amount of uncertainty associated with the responses of the debt-to-GDP ratio to cuts in government spending, the posterior differences indicate that, at medium horizons at least, government spending cuts implemented in a robust expansion decrease the debt-to-GDP ratio more than government spending cuts implemented in a deep recession. The posterior difference peaks at 0.5% of GDP at intermediate horizons (Quarter 6 through Quarter 12). Similarly, increases in taxes reduce the debt-to-GDP ratio more if implemented in a robust expansion, but the difference is most pronounced at short horizons.

Figure 6 displays evolving-regime responses of output to a government spending shock (first row) and a tax shock (second row) for the benchmark model in a sluggish recovery. The first column plots the responses to a positive shock equal to 1% of GDP, and the second column plots responses to a positive shock equal to 3% of GDP, scaled by 1/3 for ease of comparison. The third column plots responses to a negative shock equal to 1% of GDP (scaled by −1), and the fourth column plots responses to a negative shock equal to 3% of GDP (scaled by −1/3). Posterior medians and 90% credibility intervals are reported.

Figure 7 displays the differences in evolving-regime responses of output to a government spending shock (first row) and a tax shock (second row) for the benchmark model in a sluggish recovery. The first column plots the difference in magnitude of (scaled) responses for positive and negative shocks equal to 1% of GDP, the second column plots the difference in magnitude of (scaled) responses for positive and negative shocks equal to 3% of GDP, the third column plots the difference in magnitude of (scaled) responses for a negative shock equal to 3% of GDP and a negative shock equal to 1% of GDP, and the fourth column plots the difference in magnitude of (scaled) responses for a positive shock equal to 3% of GDP and a positive shock equal to 1% of GDP. Posterior medians and 90% credibility intervals are reported.

Figure 8 displays the evolving-regime responses of the debt-to-GDP ratio to a positive government spending shock (first row) and a negative tax shock (second row) for the benchmark model in a deep recession (left column) and in a robust expansion (middle column). The right column plots the difference between the responses in a deep recession and a robust expansion. The shocks are equal to 1% of GDP. Posterior medians and 90% credibility intervals are reported.

Figure 9 displays the evolving-regime responses of the debt-to-GDP ratio to a negative government spending shock (first row) and a positive tax shock (second row) for the benchmark model in a deep recession (left column) and in a robust expansion (middle column). The right column plots the difference between the responses in a deep recession and a robust expansion. The shocks are equal to 1% of GDP. Posterior medians and 90% credibility intervals are reported.

## 6 Responses of consumption and investment

In this section, we disaggregate the responses of output by considering the separate responses of consumption and investment to government spending and tax shocks. This allows us to consider different channels through which policy effects propagate. The impulse responses are calculated following the same approach used to calculate the impulse responses for output, although the impact responses of consumption and investment to fiscal shocks are intentionally left unrestricted, as there is less consensus about the underlying sources of the output response than about the overall response itself.

Figures 10 and 11 plot the fixed-regime responses of consumption and investment, respectively, while Table 6 summarizes the results for the cumulative multipliers at select horizons.[22] There is strong evidence in Figure 10 for state dependence in the response of consumption to government spending shocks, which is also confirmed in the cumulative multipliers in Table 6. A clear implication of this result is that consumption drives much of the state dependence in the response of output documented previously. In particular, consumption responds strongly in the excess slack regime, while the response is much smaller, although still significant, when the economy is closer to potential. Similarly, the response of consumption to tax shocks is stronger in the excess slack regime. The response pattern of consumption to government spending shocks is consistent with theoretical models that incorporate a time-varying share of rule-of-thumb consumers, models featuring habit formation in which government spending and consumption are complements, as in Leeper et al. (2017) and the endogenous credit constraint model by McManus et al. (2018).

Figure 10:

Dollar-for-dollar effects of government spending and taxes on consumption.

Figure 11:

Dollar-for-dollar effects of government spending and taxes on investment.

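Table 6:

Consumption and investment multipliers: cumulative scaled responses at selected horizons.

| Variable | Horizon | Government spending: excess slack | Government spending: close to potential | Taxes: excess slack | Taxes: close to potential |
|---|---|---|---|---|---|
| Consumption | 1 year | 0.67 | 0.43 | −0.80 | −0.30 |
| Consumption | 2 years | 0.99 | 0.36 | −1.19 | −0.80 |
| Consumption | 3 years | 1.29 | 0.32 | −1.44 | −0.90 |
| Consumption | 4 years | 1.49 | 0.32 | −0.29 | −0.84 |
| Consumption | 5 years | 1.52 | 0.32 | −0.12 | −0.83 |
| Investment | 1 year | 0.03 | −0.16 | −1.61 | −0.25 |
| Investment | 2 years | −0.02 | −0.22 | −1.30 | −0.51 |
| Investment | 3 years | −0.09 | −0.25 | −0.39 | −0.30 |
| Investment | 4 years | −0.13 | −0.27 | −0.14 | −0.36 |
| Investment | 5 years | −0.14 | −0.28 | −0.23 | −0.41 |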

There is no evidence in Figure 11 of state dependence in the response of investment to government spending shocks, again confirmed in Table 6. However, there is evidence that investment responds very differently to government spending and tax shocks, dollar for dollar, and some evidence of state dependence in the responses of investment to tax shocks. Investment does not increase significantly in response to government spending shocks, but it responds significantly to tax changes. Thus, the responses of output to discretionary changes in taxes appear to be driven by both consumption and investment.

Figure 10 displays the fixed-regime responses of consumption to a government spending shock (first row) and a tax shock (second row) in the excess slack regime (left column) and close to potential regime (middle column) for the benchmark model. The right column plots the posterior differences in responses across the two regimes. Posterior medians and 90% credibility intervals are reported.

Figure 11 displays the fixed-regime responses of investment to a government spending shock (first row) and a tax shock (second row) in the excess slack regime (left column) and close to potential regime (middle column) for the benchmark model. The right column plots the posterior differences in responses across the two regimes. Posterior medians and 90% credibility intervals are reported.

## 7 Conclusions

Our analysis has considered when discretionary changes in government spending and taxes are comparatively more or less effective. We have found strong and robust empirical evidence in favor of nonlinearity and state dependence in the relationship between both types of fiscal policy and aggregate output. In particular, estimates from a threshold structural vector autoregressive model imply different responses of the economy both to government spending and to taxes during periods of excess slack compared to when the economy is closer to potential.

If the aim of discretionary fiscal policy is to stimulate the economy during periods of excess slack, both government spending multipliers and tax multipliers are high and work primarily through a consumption channel. However, when the economy is closer to potential, tax cuts have larger effects than government spending increases and work primarily through an investment channel. Notably, austerity designed to lower the government debt-to-GDP ratio is largely self-defeating in deep recessions. In particular, if the aim of austerity is to reduce the debt-to-GDP ratio, our results suggest it will have smaller negative effects on economic activity if pursued when the economy is in a robust expansion.

Research since the influential study by Auerbach and Gorodnichenko (2012) has debated the importance of state dependence and nonlinearity in the response of the economy to fiscal policy. While many studies find nonlinearity, important questions have been raised about robustness of the evidence, such as those in Ramey and Zubairy (2018). By exploring these issues with a relatively large model that nests much of the earlier research and allows us to look extensively at robustness to various choices of threshold variables and shock identification, we find strong, robust support for nonlinearity in the effects of fiscal shocks that can help guide the timing and structure of future fiscal interventions.

Corresponding author: Irina Panovska, University of Texas at Dallas, School of Economics, Political, and Policy Sciences, 800 W Campbell Road GR 31, Richardson, TX 75080, USA, E-mail:

Award Identifier / Grant number: Discovery Grant DP130102950

## Acknowledgments

We thank participants at the 2016 SNDE Symposium, 2017 CEF Conference, 2017 NBER NSF SBIES Conference, 2018 SEA Conference, the 2019 EEA Conference and in seminars at the University of Texas at Dallas, the Federal Reserve Bank of Philadelphia, the Australian Treasury, Monash University, the Reserve Bank of New Zealand, and the Bank for International Settlements.

1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

2. Research funding: Morley acknowledges financial support from the Australian Research Council (Discovery Grant DP130102950 on “Estimating the Effects of Fiscal Policy”).

3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

## Appendix A: Bayesian estimation and impulse response calculation

### A.1 Estimation

For the linear VAR model, we assume conventional normal-inverse Wishart conjugate priors. The prior for the conditional mean parameters and the autoregressive lag polynomial parameters conditional on $\Sigma_{lin}$ is multivariate normal with mean zero and variance $100 \cdot I_n$, where $n$ is the total number of parameters in the autoregressive lag polynomial, and the prior for the variance matrix $\Sigma_{lin}$ conditional on the VAR parameters is an inverse Wishart distribution with mean $I_9$ and scale parameter 25. Because these priors are conjugate, the posterior for the VAR parameters is normal and the posterior for the variance-covariance matrix $\Sigma_{lin}$ is an inverse Wishart distribution. Let $\Phi_{lin}^{(mh)}$ and $\Sigma_{lin}^{(mh)}$ denote the $mh$-th draw from the posterior distributions.

To ensure that the sign restrictions do not artificially bias the impulse responses and to ensure that we are sampling from the correct posterior distribution, we follow the algorithm from Arias et al. (2018). For a draw $\Phi_{lin}^{(mh)}$ and $\Sigma_{lin}^{(mh)}$, we generate an orthonormal matrix $Q$ sampled from a uniform orthogonal distribution following Theorem 4 in Arias et al. (2018). We then use $Q$ as a rotation matrix to obtain the impact matrix and the impulse responses to the shocks. For each of the 4 response criteria in Table 1, we check if there is exactly one of the nine shocks satisfying the restrictions. If yes, we keep the draw, and the sampler moves on to $mh + 1$. If the sign restrictions are not satisfied or if there is more than 1 shock that satisfies the restrictions, the sampler immediately moves to the $mh + 1$ iteration, and the impulse responses from the $mh$-th draw are discarded. The acceptance rate for the linear model was 12%. We used 200,000 MH iterations.
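The accept/reject logic can be sketched as follows, assuming a three-variable system, restrictions on impact responses only, and a uniformly distributed orthogonal matrix obtained by Gram-Schmidt orthogonalization of standard normal vectors. The covariance matrix, the restriction patterns, and all function names are hypothetical; the sketch does not reproduce Table 1, and it omits the sign normalization and longer-horizon restrictions that the full algorithm of Arias et al. (2018) handles.

```python
import math
import random

def cholesky(s):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low

def haar_orthogonal(n, rng):
    """Gram-Schmidt on i.i.d. N(0,1) vectors yields a uniformly distributed Q."""
    cols = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in cols:
            proj = sum(a * b for a, b in zip(u, v))
            v = [a - proj * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        cols.append([a / norm for a in v])
    return [[cols[j][i] for j in range(n)] for i in range(n)]  # cols[j] is column j

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def accept(impact, restrictions):
    """Return {restriction: shock index} if each pattern fits exactly one shock, else None."""
    n = len(impact)
    chosen = {}
    for name, pattern in restrictions.items():
        hits = [j for j in range(n)
                if all(impact[i][j] * sign > 0 for i, sign in pattern.items())]
        if len(hits) != 1:
            return None  # no shock or several shocks satisfy the pattern: discard the draw
        chosen[name] = hits[0]
    return chosen

rng = random.Random(0)
sigma = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.1], [0.2, 0.1, 1.0]]      # hypothetical covariance
restrictions = {"spending": {0: 1, 2: 1}, "tax": {1: 1, 2: -1}}  # hypothetical sign patterns
chol = cholesky(sigma)
kept = sum(accept(matmul(chol, haar_orthogonal(3, rng)), restrictions) is not None
           for _ in range(2000))
print(f"kept {kept} of 2000 rotation draws")
```

Each candidate impact matrix reproduces the same reduced-form covariance, since the rotation is orthogonal, so the restrictions only select among observationally equivalent structural representations.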

Figure A1 plots the impulse responses to government spending and tax shocks for the linear model. These are the impulse responses implied by the priors for the nonlinear TVAR model, which are discussed next.

Figure A1:

Responses of output to government spending and tax shocks for linear VAR model.

Let $\Phi_{NL}$ denote $[vec(\Phi_0^1)\ vec(\Phi_1^1)\ vec(\Phi_0^2)\ vec(\Phi_1^2)]$. For the nonlinear TVAR model, we assume that the prior for the autoregressive and conditional mean parameters given $\Sigma_{NL}$ and $c$ is multivariate normal centered around the parameters for the linear model. In particular, $[vec(\Phi_0^2)\ vec(\Phi_1^2)] \mid \Sigma_{NL}, c$ is centered around zero, implying no nonlinearity. The variance for $\Phi_{NL} \mid \Sigma_{NL}, c$ is $100 \cdot I_{n_1}$, where $n_1$ is the number of VAR parameters in the TVAR model. The prior for the variance matrix $\Sigma_{NL} \mid \Phi_{NL}, c$ is an inverse Wishart distribution centered around the variance-covariance matrix for the linear model, with scale parameter 25. The prior for the threshold parameter $c$ is uniform and covers the middle 80% of the observations for $q_{t-d}$.

It is important to note that the priors for the nonlinear model are centered around the linear model, but they are very diffuse, with variance-covariance $100 \cdot I_{n_1}$. However, to ensure that the inference and the impulse responses are not affected by the choice of priors, we report both the marginal likelihood and the expected posterior density for all models we considered, and we also consider an alternative specification for the benchmark model where the priors for the TVAR parameters conditional on $\Sigma_{NL}, c$ are centered around zero.

Conditional on $c^{mh}$, the model is linear in $\Phi_{NL}$ and $\Sigma_{NL}$. Given $c$ and $\Sigma_{NL}$, the posterior for $\Phi_{NL}$ is Gaussian. Given $c$ and $\Phi_{NL}$, the posterior for $\Sigma_{NL}$ is inverse Wishart, so those parameters can be sampled directly. The posterior distribution for $c$ is unknown, but it can be sampled using a Metropolis-Hastings step. We use a Student-t distribution with mean $c^{mh-1}$ and variance equal to $std(q_{t-d})$ as the proposal density for sampling $c$. After sampling $c^{mh}$, $\Phi_{NL}^{mh}$, and $\Sigma_{NL}^{mh}$, we generate an orthonormal matrix $Q$ drawn from a uniform distribution. We then (within the iteration) compute the fixed-state responses and the generalized impulse responses as follows:

1. Check whether exactly one of the shocks satisfies the sign restrictions in both the fixed excess slack regime and the fixed close to potential regime, using the linear impulse responses for each regime computed with the orthogonalization $chol(\Sigma_{NL}) \cdot Q$. The responses to negative shocks are assumed to have the opposite signs from the responses to positive shocks. If the responses satisfy the sign restrictions, move to step 2. If not, discard the responses and move to the next iteration $mh + 1$;

2. To generate the evolving responses, we follow the standard approach proposed by Koop, Pesaran, and Potter (1996), with the key difference being the orthogonalization scheme. Pick a history of interest $\Psi_{t-1}$ (this is the actual value of the lagged endogenous variables at time $t$). Given $\Phi^{1,2}$, $\Sigma_{NL}$, $c$, and $Q$:

   (a) Draw a sequence of forecast errors $\varepsilon_{t+k}$ from $N(0, \Sigma_{NL})$ for $k = 0, 1, \ldots, 20$. We then generate two paths from the TVAR. The first path is the baseline path, where the TVAR is simulated using the randomly drawn shocks. The second path is a path with a structural shock at time zero and the randomly drawn shocks for times $1, \ldots, 20$;

   (b) For the baseline path, using $\Psi_{t-1}$ and $\varepsilon_{t+k}$, simulate the evolution of $Y_{t+k}$ over 21 periods using Equation (1). Denote the resulting path by $Y_{t+k}(\varepsilon_{t+k}, \Psi_{t-1})$;

   (c) For the path with the shock, the structural shock at time zero is $e_j$, where $e_j$ is a vector with 1 in element $j$ and zeroes for all other entries. Use the inverse of $\Sigma_{NL}$, $Q$ to get the reduced-form shocks at time zero, $\varepsilon_0$. The shocks at times $k = 1, 2, \ldots, 20$ are the same as the shocks for the baseline path: $\varepsilon_{t+k}^{shock} = \varepsilon_{t+k}$ for $k \geq 1$. Simulate the evolution of $Y_{t+k}$ over 21 periods using Equation (1). Denote the simulated evolution of $Y_{t+k}$ as $Y_{t+k}(\varepsilon_{t+k}^{shock}, \Psi_{t-1})$ for $k = 0, \ldots, 20$;

   (d) Construct a draw of a sequence of impulse responses as $Y_{t+k}(\varepsilon_{t+k}^{shock}, \Psi_{t-1}) - Y_{t+k}(\varepsilon_{t+k}, \Psi_{t-1})$ for $k = 0, 1, \ldots, 20$;

   (e) Repeat steps 2.a through 2.d $B = 500$ times to obtain the average responses of $Y_t$ conditional on $c$, $\Phi$, $\Sigma$, $Q$. To obtain the average response for a subset of histories, repeat step 2 for each history and report the distribution averaged across histories;

3. Repeat Steps 1 (and 2 if the restrictions are satisfied) for each draw of the sampler.
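The simulation in step 2 can be illustrated with a small sketch. It is not the authors' implementation and does not reproduce Equation (1): it assumes a one-lag TVAR in which the second-regime terms switch on when one element of the lagged vector exceeds $c$, and it reads step 2.c as setting the time-zero reduced-form shock to $chol(\Sigma_{NL}) \cdot Q \cdot e_j$. The function names, the `q_index` argument, and the regime direction are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_tvar(y_prev, shocks, phi0_1, phi1_1, phi0_2, phi1_2, c, q_index=0):
    """Simulate a one-lag TVAR path given a starting history and shocks.

    Assumption: the regime-2 terms are added when y_prev[q_index] > c.
    """
    path = np.empty_like(shocks)
    y = np.asarray(y_prev, dtype=float)
    for k in range(shocks.shape[0]):
        in_regime_2 = float(y[q_index] > c)
        y = phi0_1 + phi1_1 @ y + in_regime_2 * (phi0_2 + phi1_2 @ y) + shocks[k]
        path[k] = y
    return path

def girf(history, params, sigma, q, j, c, horizon=20, n_rep=500, rng=None):
    """Average difference between shocked and baseline paths for one history."""
    rng = np.random.default_rng() if rng is None else rng
    n = sigma.shape[0]
    impact = np.linalg.cholesky(sigma) @ q
    total = np.zeros((horizon + 1, n))
    for _ in range(n_rep):
        # Step (a): forecast errors for k = 0, ..., horizon.
        eps = rng.multivariate_normal(np.zeros(n), sigma, size=horizon + 1)
        # Step (c): replace the time-zero shock with the one implied by e_j.
        eps_shock = eps.copy()
        eps_shock[0] = impact[:, j]
        base = simulate_tvar(history, eps, *params, c)           # step (b)
        shocked = simulate_tvar(history, eps_shock, *params, c)  # step (c)
        total += shocked - base                                  # step (d)
    return total / n_rep                                         # step (e)
```

Calling `girf` once per history in a chosen subset and averaging the results would correspond to the averaging across histories mentioned in step 2.e. When the second-regime parameters are zero, the sketch collapses to a linear VAR and the result approximates the usual linear impulse response up to simulation noise.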

### A.2 Sensitivity to the choice of priors

To account for the fact that the priors are centered around the linear model estimates, we use priors that are dispersed, with variance-covariance $100 \cdot I_n$. In addition to the marginal likelihood, we also report the expected posterior log likelihood, which has been shown to be less sensitive to the choice of priors (Campolieti et al. 2014). We use the slightly informative prior to speed up the sampler. However, it is important to assess whether the posteriors are sensitive to the choice of priors. We perform two experiments. In the first experiment, we center the priors for the nonlinear model, $\Phi_{NL} \mid \Sigma_{NL}, c$, around zero. In the second experiment, we center the priors for the linear model around the HPD of the posterior obtained for the linear model (the second experiment is included only for completeness). Table A1 reports the results. We report the log likelihood obtained from frequentist maximum likelihood estimation (which is obviously not affected by the choice of priors), the expected posterior log likelihood, and the marginal likelihood. We also report the estimated threshold for the TVAR model. The numbers on the diagonal match the numbers from Table 2 in the paper; the numbers on the off-diagonal are from the additional experiments. Again, the data prefer the nonlinear model, and the threshold estimate is not affected by the choice of the prior. In Figure A2, we report the impulse responses for each state for the TVAR model used in the paper (reported in Figure 3) and the impulse responses for a TVAR model where the priors are centered around zero. The responses for a prior centered around zero are virtually identical to the original responses. The only differences are at the second or third decimal place (less than 1% difference between the two sets of responses given the scale of the responses). In short, because the priors are very diffuse, they do not affect our inference or the threshold estimate.

Table A1:

MLE, posterior density, and marginal likelihood comparisons.

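| Model | Statistic | Prior centered around zero | Prior centered around linear model |
|---|---|---|---|
| Linear model | Log likelihood (MLE) | −2394.05 | −2394.05 |
| | Expected posterior log likelihood | −2389.91 | −2388.01 |
| | Log marginal likelihood | −833.73 | −801.03 |
| TVAR | Log likelihood (MLE) | −1873.11 | −1873.11 |
| | Expected posterior log likelihood | −1871.02 | −1870.00 |
| | Log marginal likelihood | −430.23 | −450.37 |
| | Threshold estimate (90% credibility interval) | −0.73 (−0.85, −0.49) | −0.74 (−0.86, −0.51) |

The column headings refer to the prior for the VAR or TVAR coefficients.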

Each cell reports the log likelihood obtained using maximum likelihood estimation, the expected posterior log likelihood obtained from Bayesian estimation, and the log marginal likelihood (top, middle, bottom). The second entry for the TVAR model is the threshold estimate, including 90% credibility intervals, obtained from the Bayesian posterior distribution.
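As a rough worked check, and assuming the bottom entry of each cell is the log marginal likelihood as the note states, the difference in log marginal likelihoods between the TVAR and the linear model (the log Bayes factor) can be computed for each prior specification. The symbol $\ln BF$ and its subscripts are introduced here for illustration only.

$$
\ln BF_{\text{zero}} = -430.23 - (-833.73) = 403.50, \qquad
\ln BF_{\text{linear}} = -450.37 - (-801.03) = 350.66
$$

Both differences are large and positive, which is consistent with the statement that the data prefer the nonlinear model under either prior.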

Figure A2:

Impulse responses for TVAR models with informative and non-informative priors.

The figure displays the fixed-regime responses of output to a government spending shock (first row) and a tax shock (second row) in the excess slack regime (left column) and close to potential regime (right column) for the benchmark model and for the model where the priors for both states are centered around zero. Posterior medians and 90% credibility intervals are reported.

## References

Alesina, A., C. Favero, and F. Giavazzi. 2015. “The Output Effects of Fiscal Consolidations.” Journal of International Economics 96: 19–42. https://doi.org/10.1016/j.jinteco.2014.11.003.

Alpanda, S., and S. Zubairy. 2019. “Household Debt Overhang and Transmission of Monetary Policy.” Journal of Money, Credit and Banking 51: 1265–307. https://doi.org/10.1111/jmcb.12548.

Arias, J., J. Rubio-Ramirez, and D. Waggoner. 2018. “Inference Based on SVARs Identified With Sign and Zero Restrictions: Theory and Applications.” Econometrica 86: 685–720. https://doi.org/10.3982/ecta14468.

Auerbach, A., and Y. Gorodnichenko. 2012. “Measuring the Output Responses to Fiscal Policy.” American Economic Journal: Economic Policy 4 (2): 1–27. https://doi.org/10.1257/pol.4.2.1.

Auerbach, A., and Y. Gorodnichenko. 2013. “Output Spillovers From Fiscal Policy.” American Economic Review Papers and Proceedings 103 (3): 141–6. https://doi.org/10.1257/aer.103.3.141.

Bachmann, R., and E. Sims. 2012. “Confidence and the Transmission of Government Spending Shocks.” Journal of Monetary Economics 59: 235–49. https://doi.org/10.1016/j.jmoneco.2012.02.005.

Barnichon, R., and C. Matthes. 2017. Stimulus Versus Austerity: The Asymmetric Government Spending Multiplier. CEPR. Discussion Paper 10584.

Baum, A., M. Poplawski-Ribeiro, and A. Weber. 2012. Fiscal Multipliers and the State of the Economy. International Monetary Fund. Working Paper No. 12.286. https://doi.org/10.5089/9781475565829.001.

Bernardini, M., and G. Peersman. 2018. “Private Debt Overhang and the Government Spending Multiplier: Evidence for the United States.” Journal of Applied Econometrics 33: 485–508. https://doi.org/10.1002/jae.2618.

Blanchard, O., and R. Perotti. 2002. “An Empirical Characterization of the Dynamic Effects of Changes in Government Spending and Taxes on Output.” Quarterly Journal of Economics 117 (4): 1329–68. https://doi.org/10.1162/003355302320935043.

Caggiano, G., E. Castelnuovo, V. Colombo, and G. Nodari. 2015. “Estimating Fiscal Multipliers: News From a Nonlinear World.” Economic Journal 125: 746–76. https://doi.org/10.1111/ecoj.12263.

Campolieti, M., D. Gefang, and G. Koop. 2014. “Time Variation in the Dynamics of Worker Flows: Evidence From North America and Europe.” Journal of Applied Econometrics 29: 265–90. https://doi.org/10.1002/jae.2296.

Candelon, B., and L. Lieb. 2013. “Fiscal Policy in Good and Bad Times.” Journal of Economic Dynamics and Control 37: 2679–94. https://doi.org/10.1016/j.jedc.2013.09.001.

Canzoneri, M., F. Collard, H. Dellas, and B. Diba. 2016. “Fiscal Multipliers in Recessions.” Economic Journal 126: 75–108. https://doi.org/10.1111/ecoj.12304.

Chib, S., and I. Jeliazkov. 2001. “Marginal Likelihood From the Metropolis-Hastings Output.” Journal of the American Statistical Association 96: 270–81. https://doi.org/10.1198/016214501750332848.

Cloyne, J. 2013. “Discretionary Tax Changes and the Macroeconomy: New Narrative Evidence From the United Kingdom.” American Economic Review 103 (4): 1507–28. https://doi.org/10.1257/aer.103.4.1507.

Favero, C., and F. Giavazzi. 2012. “Measuring Tax Multipliers: The Narrative Method in Fiscal VARs.” American Economic Journal: Economic Policy 4 (2): 69–94. https://doi.org/10.1257/pol.4.2.69.

Fazzari, S. M., J. Morley, and I. Panovska. 2015. “State-Dependent Effects of Fiscal Policy.” Studies in Nonlinear Dynamics and Econometrics 19 (3): 285–315. https://doi.org/10.1515/snde-2014-0022.

Forni, M., and L. Gambetti. 2014. “Sufficient Information in Structural VARs.” Journal of Monetary Economics 66: 124–36. https://doi.org/10.1016/j.jmoneco.2014.04.005.

Forni, M., and L. Gambetti. 2016. “Government Spending Shocks in Open Economy VARs.” Journal of International Economics 99: 68–84. https://doi.org/10.1016/j.jinteco.2015.11.010.

Gali, J., D. Lopez-Salido, and J. Valles. 2007. “Understanding the Effects of Government Spending on Consumption.” Journal of the European Economic Association 5: 227–60. https://doi.org/10.1162/jeea.2007.5.1.227.

Guajardo, J., D. Leigh, and A. Pescatori. 2014. “Expansionary Austerity? International Evidence.” Journal of the European Economic Association 12: 949–68. https://doi.org/10.1111/jeea.12083.

Jones, P., E. Olson, and M. Wohar. 2015. “Asymmetric Tax Multipliers.” Journal of Macroeconomics 43: 38–48. https://doi.org/10.1016/j.jmacro.2014.08.006.

Jorda, O., and A. M. Taylor. 2016. “The Time for Austerity: Estimating the Average Treatment Effect of Fiscal Policy.” The Economic Journal 66: 219–55. https://doi.org/10.1111/ecoj.12332.

Klein, M. 2017. “Austerity and Private Debt.” Journal of Money, Credit, and Banking 49: 1555–85. https://doi.org/10.1111/jmcb.12424.

Koop, G., M. H. Pesaran, and S. M. Potter. 1996. “Impulse Response Analysis in Nonlinear Multivariate Models.” Journal of Econometrics 74: 119–47. https://doi.org/10.1016/0304-4076(95)01753-4.

Leeper, E., N. Traum, and T. Walker. 2017. “Clearing Up the Fiscal Multiplier Morass.” American Economic Review 107: 2409–54. https://doi.org/10.1257/aer.20111196.

Lo, M., and J. Piger. 2005. “Is the Response of Output to Monetary Policy Asymmetric? Evidence From a Regime-Switching Coefficients Model.” Journal of Money, Credit, and Banking 37: 865–86. https://doi.org/10.1353/mcb.2005.0054.

McCracken, M., and S. Ng. 2016. “FRED-MD: A Monthly Database for Macroeconomic Research.” Journal of Business and Economic Statistics 34: 574–89. https://doi.org/10.1080/07350015.2015.1086655.

McManus, R., F. Gulcin Ozkan, and D. Trzeciakiewicz. 2018. Why Are Fiscal Multipliers Countercyclical? The Role of Credit Constraints. Working Paper. https://doi.org/10.1111/ecca.12340.

Michaillat, P. 2014. “A Theory of Countercyclical Government Multiplier.” American Economic Journal: Macroeconomics 6: 190–217. https://doi.org/10.1257/mac.6.1.190.

Morley, J., and I. Panovska. 2019. “Is Business Cycle Asymmetry Intrinsic in Industrialized Economies?” Macroeconomic Dynamics 24 (6): 1403–36. (forthcoming). https://doi.org/10.1017/S1365100518000913.

Morley, J., and J. Piger. 2012. “The Asymmetric Business Cycle.” The Review of Economics and Statistics 91: 208–21. https://doi.org/10.1162/rest_a_00169.

Mountford, A., and H. Uhlig. 2009. “What Are the Effects of Fiscal Policy Shocks?” Journal of Applied Econometrics 24: 960–92. https://doi.org/10.1002/jae.1079.

Mumtaz, H., and L. Sunder-Plassmann. 2019. Non-linear Effects of Government Spending Shocks in the US. Evidence From State-Level Data. Working Paper. https://doi.org/10.1002/jae.2800.

Owyang, M., V. Ramey, and S. Zubairy. 2013. “Are Government Spending Multipliers Greater During Periods of Slack? Evidence From 20th Century Historical Data.” American Economic Review Papers and Proceedings 103 (3): 129–34. https://doi.org/10.1257/aer.103.3.129.

Perron, P., and T. Wada. 2016. “Measuring Business Cycles with Structural Breaks and Outliers: Applications to International Data.” Research in Economics 70: 281–303. https://doi.org/10.1016/j.rie.2015.12.001.

Ramey, V. 2011. “Identifying Government Spending Shocks: It’s All in the Timing.” Quarterly Journal of Economics 126: 1–50. https://doi.org/10.1093/qje/qjq008.

Ramey, V., and S. Zubairy. 2018. “Government Spending Multipliers in Good Times and in Bad: Evidence From U.S. Historical Data.” Journal of Political Economy 126: 850–901. https://doi.org/10.1086/696277.

Reinhart, C. M., and K. Rogoff. 2009. This Time Is Different: Eight Centuries of Financial Folly. Princeton, New Jersey: Princeton University Press. https://doi.org/10.1515/9781400831722.

Romer, C., and D. Romer. 2010. “The Macroeconomic Effects of Tax Changes: Estimates Based on a New Measure of Fiscal Shocks.” American Economic Review 100: 763–801. https://doi.org/10.1257/aer.100.3.763.

Wu, J. C., and F. D. Xia. 2016. “Measuring the Macroeconomic Impact of Monetary Policy at the Zero Lower Bound.” Journal of Money, Credit, and Banking 48: 253–91. https://doi.org/10.1111/jmcb.12300.