Clark Glymour, Joseph D. Ramsey, Kun Zhang
January 18, 2019
A central ambition of Western philosophy was to find formal methods that could reliably discover empirical relationships and their explanations from data assembled from experience. As a philosophical project, that ambition was abandoned in the 20th century and generally dismissed as impossible. It was replaced in philosophy by neo-Kantian efforts at reconstruction and justification, and in professional statistics by the more limited ambition of estimating a small number of parameters in pre-specified hypotheses. The influx of “big data” from climate science, neuropsychology, biology, astronomy, and elsewhere has implicitly called for a revival of the grander philosophical ambition. Search algorithms are meeting that call, but they pose a problem: how are their accuracies to be assessed in domains where experimentation is limited or impossible? Increasingly, the answer is through simulation of data from models of the kind of process found in the domain. In some cases, this strategy requires rethinking how the accuracy and informativeness of inference methods can be assessed. Focusing on causal inference, we give an example from neuroscience; to show that the model/simulation strategy is not confined to causal inference, we also consider two classification problems from astrophysics: identifying exoplanets and identifying concentrations of dark matter.