Abstract
Is academic research anticipating economic shake-ups or merely reflecting the past? Exploiting the corpus of articles published in the Journal of Economics and Statistics (Jahrbücher für Nationalökonomie und Statistik) for the years 1949 to 2010, this pilot study proposes a quantitative framework for addressing this question. The framework comprises two steps. First, methods from computational linguistics are used to identify relevant topics and their relative importance over time. In particular, Latent Dirichlet Allocation (LDA) is applied to the corpus after some preparatory work. Second, for those topics that are closely related to specific economic indicators, the developments of topic weights and indicator values are compared in dynamic regression and vector autoregressive (VAR) models. The results indicate that for some topics of interest, the discourse in the journal leads developments in the real economy, while for other topics it is the other way round.
Acknowledgments
We want to thank Andreas Schiermeier, Peter Reifschneider, and Christoph Funk for assistance with obtaining and preprocessing the data. We are grateful for the funding provided by Lucius & Lucius and to Jochen Kothe at Niedersächsische Staats- und Universitätsbibliothek Göttingen, Georg-August-Universität Göttingen for providing access to digizeitschriften.de. Last but not least, we want to thank three anonymous referees for their comments and the editor Joachim Wagner for facilitating the refereeing process.
References
Banerjee, A., I.S. Dhillon, J. Ghosh, S. Sra (2005), Clustering on the Unit Hypersphere Using Von Mises-Fisher Distributions. Journal of Machine Learning Research 6: 1345–1382.
Blei, D.M., J.D. Lafferty (2007), A Correlated Topic Model of Science. Annals of Applied Statistics 1 (1): 17–35.
Blei, D.M., J.D. Lafferty (2009), Topic Models. 71–94 in: A.N. Srivastava, M. Sahami (Eds.), Text Mining: Classification, Clustering, and Applications. Data Mining and Knowledge Discovery Series, Chap. 4. Boca Raton, Florida, USA: CRC Press.
Blei, D.M., A.Y. Ng, M.I. Jordan (2003), Latent Dirichlet Allocation. Journal of Machine Learning Research 3: 993–1022.
Buntine, W. (2009), Estimating Likelihoods for Topic Models. In: Advances in Machine Learning. Vol. 5828. Lecture Notes in Computer Science, 51–64.
Burret, H.T., L.P. Feld, E.A. Köhler (2013), Sustainability of Public Debt in Germany – Historical Considerations and Time Series Evidence. Journal of Economics and Statistics (Jahrbücher für Nationalökonomie und Statistik) 233 (3): 291–335.
Chang, J., J. Boyd-Graber, C. Wang, S. Gerrish, D.M. Blei (2009), Reading Tea Leaves: How Humans Interpret Topic Models. Neural Information Processing Systems.
Deerwester, S.C., S.T. Dumais, T.K. Landauer, G.W. Furnas, R.A. Harshman (1990), Indexing by Latent Semantic Analysis. Journal of the American Society for Information Science 41 (6): 391–407.
Dhillon, I.S., D.S. Modha (2001), Concept Decompositions for Large Sparse Text Data Using Clustering. Machine Learning 42 (1): 143–175.
Griffiths, T.L., M. Steyvers (2004), Finding Scientific Topics. Proceedings of the National Academy of Sciences 101 (Suppl 1): 5228–5235.
Grün, B., K. Hornik (2011), Topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software 40 (13): 1–30.
Hall, D., D. Jurafsky, C.D. Manning (2008), Studying the History of Ideas Using Topic Models. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. EMNLP ’08, 363–371.
Hansen, S., M. McMahon, A. Prat (2014), Transparency and Deliberation within the FOMC: A Computational Linguistics Approach. CEP Discussion Papers dp1276. Centre for Economic Performance.
Hofmann, T. (1999), Probabilistic latent semantic indexing. In: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 50–57.
Hornik, K., I. Feinerer, M. Kober, C. Buchta (2012), Spherical k-means Clustering. Journal of Statistical Software 50 (10): 1–22.
Hornik, K., B. Grün (2014), MovMF: An R Package for Fitting Mixtures of von Mises-Fisher Distributions. Journal of Statistical Software 58 (10): 1–31.
Lancichinetti, A., M. Irmak Sirer, J.X. Wang, D. Acuna, K. Körding, L.A.N. Amaral (2015), High-Reproducibility and High-Accuracy Method for Automated Topic Classification. Physical Review X 5: 011007.
Larsen, V.H., L.A. Thorsrud (2015), The Value of News. Working Papers 0034. Centre for Applied Macro- and Petroleum economics (CAMP), BI Norwegian Business School.
Lütkepohl, H. (2007), New Introduction to Multiple Time Series Analysis. Berlin, Springer.
Rahlf, T. (2016), The German Time Series Dataset 1834–2012. Journal of Economics and Statistics (Jahrbücher für Nationalökonomie und Statistik) 236 (1): 129–143.
Roberts, M.E., B.M. Stewart, E.M. Airoldi (2016), A Model of Text for Experimentation in the Social Sciences. Journal of the American Statistical Association. DOI: http://dx.doi.org/10.1080/01621459.2016.1141684
Salton, G., M.J. McGill (1986), Introduction to Modern Information Retrieval. New York, NY, McGraw-Hill.
Turner, A., A. Haldane, P. Woolley, S. Wadhwani, A. Smithers, A. Large, J. Kay, M. Wolf, P. Boone, S. Johnson, R. Layard (2010), The Future of Finance: The LSE Report. London School of Economics & Political Science. Available at: https://harr123et.wordpress.com/.
Wallach, H.M., I. Murray, R. Salakhutdinov, D. Mimno (2009), Evaluation Methods for Topic Models. In: Proceedings of the 26th Annual International Conference on Machine Learning. ICML ’09, 1105–1112.
Welling, M., Y.W. Teh, B. Kappen (2008), Hybrid Variational/MCMC Inference in Bayesian Networks. In: Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence.
Winker, P. (2000), Optimized Multivariate Lag Structure Selection. Computational Economics 16 (1-2): 87–103.
Appendices
A German stopwords
The following stopwords are removed from the vocabulary. The list is supplied by the R package tm.
aber alle allem allen aller alles als also am an ander andere anderem anderen anderer anderes anderm andern anderr anders auch auf aus bei bin bis bist da damit dann der den des dem die das daß derselbe derselben denselben desselben demselben dieselbe dieselben dasselbe dazu dein deine deinem deinen deiner deines denn derer dessen dich dir du dies diese diesem diesen dieser dieses doch dort durch ein eine einem einen einer eines einig einige einigem einigen einiger einiges einmal er ihn ihm es etwas euer eure eurem euren eurer eures für gegen gewesen hab habe haben hat hatte hatten hier hin hinter ich mich mir ihr ihre ihrem ihren ihrer ihres euch im in indem ins ist jede jedem jeden jeder jedes jene jenem jenen jener jenes jetzt kann kein keine keinem keinen keiner keines können könnte machen man manche manchem manchen mancher manches mein meine meinem meinen meiner meines mit muss musste nach nicht nichts noch nun nur ob oder ohne sehr sein seine seinem seinen seiner seines selbst sich sie ihnen sind so solche solchem solchen solcher solches soll sollte sondern sonst über um und uns unse unsem unsen unser unses unter viel vom von vor während war waren warst was weg weil weiter welche welchem welchen welcher welches wenn werde werden wie wieder will wir wird wirst wo wollen wollte würde würden zu zum zur zwar zwischen.
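The removal step can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which relies on the R package tm; the function name `remove_stopwords`, the tokenization rule, and the shortened stopword set are assumptions made for illustration only.

```python
import re

# Short excerpt of the stopword list above (assumption: the full list
# would be used in practice).
GERMAN_STOPWORDS = {
    "aber", "als", "auch", "auf", "das", "der", "die", "ein", "eine",
    "für", "ist", "mit", "nicht", "ohne", "und", "von", "zu",
}


def remove_stopwords(text, stopwords=GERMAN_STOPWORDS):
    """Lowercase the text, split it into alphabetic tokens, drop stopwords."""
    # [^\W\d_]+ matches runs of letters, including umlauts and ß.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [token for token in tokens if token not in stopwords]


sample = "Die Geldpolitik ist nicht ohne Einfluss auf die Inflation"
print(remove_stopwords(sample))  # ['geldpolitik', 'einfluss', 'inflation']
```

Lowercasing before the lookup matters because the list contains only lowercase forms, whereas German text capitalizes sentence-initial words and nouns.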
B Tables
List of volumes.
Vol | Year | Vol | Year | Vol | Year | Vol | Year | Vol | Year | Vol | Year | Vol | Year |
1 | 1863 | 38 | 1882 | 74 | 1900 | 111 | 1918 | 147 | 1938 | 184 | 1970 | 221 | 2001 |
2 | 1864 | 39 | 1882 | 75 | 1900 | 112 | 1919 | 148 | 1938 | 185 | 1971 | 222 | 2002 |
3 | 1864 | 40 | 1883 | 76 | 1901 | 113 | 1919 | 149 | 1939 | 186 | 71/72 | 223 | 2003 |
4 | 1865 | 41 | 1883 | 77 | 1901 | 114 | 1920 | 150 | 1939 | 187 | 72/73 | 224 | 2004 |
5 | 1865 | 42 | 1884 | 78 | 1902 | 115 | 1920 | 151 | 1940 | 188 | 1975 | 225 | 2005 |
6 | 1866 | 43 | 1884 | 79 | 1902 | 116 | 1921 | 152 | 1940 | 189 | 1975 | 226 | 2006 |
7 | 1866 | 44 | 1885 | 80 | 1903 | 117 | 1921 | 153 | 1941 | 190 | 75/76 | 227 | 2007 |
8 | 1867 | 45 | 1885 | 81 | 1903 | 118 | 1922 | 154 | 1941 | 191 | 76/77 | 228 | 2008 |
9 | 1867 | 46 | 1886 | 82 | 1904 | 119 | 1922 | 155 | 1942 | 192 | 77/78 | 229 | 2009 |
10 | 1868 | 47 | 1886 | 83 | 1904 | 120 | 1923 | 156 | 1942 | 193 | 1978 | 230 | 2010 |
11 | 1868 | 48 | 1887 | 84 | 1905 | 121 | 1923 | 157 | 1943 | 194 | 1979 | ||
12 | 1869 | 49 | 1887 | 85 | 1905 | 122 | 1924 | 158 | 1943 | 195 | 1980 | ||
13 | 1869 | 50 | 1888 | 86 | 1906 | 123 | 1925 | 159 | 1944 | 196 | 1981 | ||
14 | 1870 | 51 | 1888 | 87 | 1906 | 124 | 1926 | 160 | 1944 | 197 | 1982 | ||
15 | 1870 | r | 1888 | 88 | 1907 | 125 | 1926 | 161 | 1949 | 198 | 1983 | ||
16 | 1871 | 52 | 1889 | 89 | 1907 | 126 | 1927 | 162 | 1950 | 199 | 1984 | ||
17 | 1871 | 53 | 1889 | 90 | 1908 | 127 | 1927 | 163 | 1951 | 200 | 1985 | ||
18 | 1872 | 54 | 1890 | 91 | 1908 | 128 | 1928 | 164 | 1952 | 201 | 1986 | ||
19 | 1872 | 55 | 1890 | 92 | 1909 | 129 | 1928 | 165 | 1953 | 202 | 1986 | ||
20 | 1873 | 56 | 1891 | 93 | 1909 | 130 | 1929 | 166 | 1954 | 203 | 1987 | ||
21 | 1873 | 57 | 1891 | 94 | 1910 | 131 | 1929 | 167 | 1955 | 204 | 1988 | ||
22 | 1874 | 58 | 1892 | 95 | 1910 | 132 | 1930 | 168 | 1956 | 205 | 1988 | ||
23 | 1874 | 59 | 1892 | 96 | 1911 | 133 | 1930 | 169 | 1958 | 206 | 1989 | ||
24 | 1875 | 60 | 1893 | 97 | 1911 | 134 | 1931 | 170 | 1958 | 207 | 1990 | ||
25 | 1875 | 61 | 1893 | 98 | 1912 | 135 | 1931 | 171 | 1959 | 208 | 1991 | ||
26 | 1876 | 62 | 1894 | 99 | 1912 | r | 1931 | 172 | 1960 | 209 | 1992 | ||
27 | 1876 | 63 | 1894 | 100 | 1913 | 136 | 1932 | 173 | 1961 | 210 | 1992 | ||
28 | 1877 | 64 | 1895 | 101 | 1913 | 137 | 1932 | 174 | 1962 | 211 | 1993 | ||
29 | 1877 | 65 | 1895 | 102 | 1914 | 138 | 1933 | 175 | 1963 | 212 | 1993 | ||
30 | 1878 | 66 | 1896 | 103 | 1914 | 139 | 1933 | 176 | 1964 | 213 | 1994 | ||
31 | 1878 | 67 | 1896 | 104 | 1915 | 140 | 1934 | 177 | 1965 | 214 | 1995 | ||
32 | 1879 | 68 | 1897 | 105 | 1915 | 141 | 1935 | 178 | 1965 | 215 | 1996 | ||
33 | 1879 | 69 | 1897 | 106 | 1916 | 142 | 1935 | 179 | 1966 | 216 | 1997 | ||
34 | 1879 | 70 | 1898 | 107 | 1916 | 143 | 1936 | 180 | 1967 | 217 | 1998 | ||
35 | 1880 | 71 | 1898 | 108 | 1917 | 144 | 1936 | 181 | 67/68 | 218 | 1999 | ||
36 | 1881 | 72 | 1899 | 109 | 1917 | 145 | 1937 | 182 | 68/69 | 219 | 1999 | ||
37 | 1881 | 73 | 1899 | 110 | 1918 | 146 | 1937 | 183 | 69/70 | 220 | 2000 |
Notes on the list of volumes
181 Issue 4 was the first to appear in 1968 (March)
182 Issue 4–5 was the first to appear in 1969 (March)
183 Issue 5 was the first to appear in 1970 (February)
184 appeared entirely in 1970
185 appeared entirely in 1971
186 Issue 3 was the first to appear in 1972 (February)
187 Issue 2 was the first to appear in 1973 (January)
188 Issue 1 appeared in 1973 (December), Issues 2–5 appeared in 1974 (January to November), Issue 6 appeared in 1975 (February)
190 All issues appeared in 1976 (contrary to the available meta data)
191 Issue 4 was the first to appear in 1977 (February)
192 Issue 5 was the first to appear in 1978
C Topic probabilities
Figure 4 shows the development of the probabilities of the key topics between 1948 and 2010.

Figure 4: Topic probabilities.
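One way to obtain yearly topic probabilities of this kind is to average the per-document topic weights of all articles published in a given year. The sketch below assumes this simple mean; the aggregation actually used for Figure 4 is not stated in this appendix, and the function name and toy numbers are hypothetical.

```python
def yearly_topic_probabilities(doc_years, doc_topic_weights):
    """Average per-document topic weights within each publication year.

    doc_years: list of publication years, one per document.
    doc_topic_weights: list of topic-weight vectors, one per document.
    Returns a dict mapping year -> list of mean topic weights.
    """
    sums = {}
    counts = {}
    for year, weights in zip(doc_years, doc_topic_weights):
        if year not in sums:
            sums[year] = [0.0] * len(weights)
            counts[year] = 0
        for k, weight in enumerate(weights):
            sums[year][k] += weight
        counts[year] += 1
    return {
        year: [total / counts[year] for total in sums[year]]
        for year in sorted(sums)
    }


# Toy data (made up): three documents, two topics.
years = [1949, 1949, 1950]
weights = [[0.5, 0.5], [0.25, 0.75], [1.0, 0.0]]
print(yearly_topic_probabilities(years, weights))
# {1949: [0.375, 0.625], 1950: [1.0, 0.0]}
```

Because each document's weights sum to one, the yearly averages also sum to one and can be read as the share of the journal's discourse devoted to each topic in that year.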
D Further topics
The following pages show additional topics identified by the LDA algorithm. In addition to the key topics used in the analysis, there are further topics in the fields of inflation (Figure 5), trade (Figure 6), debt (Figure 7), unemployment (Figure 8), and interest rates (Figure 9). This list of fields is far from exhaustive. A variety of other topics are discussed in the journal (see examples in Figure 10). Some of them are not easily operationalized, such as the discussion of capitalism and Marxism (Topic 100), while others may not be very interesting from an economic point of view (e.g. “terms describing a table” in Topic 165).

Figure 5: Estimated topics related to inflation.

Figure 6: Estimated topics related to trade.

Figure 7: Estimated topics related to debt.

Figure 8: Estimated topics related to unemployment.

Figure 9: Additional estimated topic related to interest rates.

Figure 10: Examples of “unrelated topics” estimated by the algorithm.
While Topic 144, which we used in the analysis, is narrowly focused on inflation and the inflation rate, there are further topics related to inflation (Figure 5). Topic 119 is concerned with “geldpoliti” [en: monetary policy] as well as money supply and expansionary policy. Topic 134 is concerned with shocks, with inflation being a prominent term. Topic 142 is the English-language equivalent of Topic 119 (monetary policy). Figure 6 shows further topics associated with international trade. The German equivalent (Topic 36) of the topic we selected (Topic 1) is centered around “ausland” and “inland” [en: foreign and domestic] and is not as narrow as the English original. Topic 44 is loosely concerned with trade, with the terms “handelspoliti” [en: trade policy] and “aussenhandelstheori” [en: theory of international trade] standing out. Price differentiation [ger: preisdifferenzier], product [ger: erzeugnis], and terms relating to foreign and domestic are at the center of Topic 86. Figures 7 and 8 show additional topics related to debt and unemployment, respectively. Apart from Topic 191, which is concerned with interest rates in the narrow sense and is consequently used in our analysis, only Topic 120 (Figure 9) appears to be somewhat related, but it deals more with central banking.
In the regression analysis, it would be possible to combine two or more topics, which would broaden the analysis. Our earlier experiments showed that this does not improve the results. A plausible explanation is that narrow topics are best at reflecting narrow economic ideas.
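As a rough illustration of what combining topics could mean in practice, the following Python sketch sums the yearly weights of several topics into one broader series that could then enter a regression. Summation is an assumption (other aggregations are conceivable), the function name is hypothetical, and the numbers are made up rather than taken from the estimated model.

```python
def combine_topics(yearly_weights, topic_ids):
    """Sum the weights of the selected topics for each year.

    yearly_weights: dict mapping year -> dict mapping topic id -> weight.
    topic_ids: iterable of topic ids to merge into one broader topic.
    Returns a dict mapping year -> combined weight.
    """
    topic_ids = list(topic_ids)
    return {
        year: sum(weights[topic] for topic in topic_ids)
        for year, weights in yearly_weights.items()
    }


# Made-up weights for Topics 144 (inflation), 119 (monetary policy), and
# 191 (interest rates); the values are illustrative only.
yearly_weights = {
    1970: {144: 0.25, 119: 0.125, 191: 0.5},
    1971: {144: 0.5, 119: 0.25, 191: 0.125},
}
print(combine_topics(yearly_weights, [144, 119]))
# {1970: 0.375, 1971: 0.75}
```

A combined series built this way mixes the dynamics of its components, which is one reason a broader topic need not track a single economic indicator more closely than a narrow one.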
©2016 by De Gruyter Mouton