# To Explain or to Predict

@article{Shmueli2011ToEO, title={To Explain or to Predict}, author={Galit Shmueli}, journal={arXiv: Methodology}, year={2011} }

Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been…

#### 1,365 Citations

Explanatory Versus Predictive Modeling

- Medicine
- PM & R : the journal of injury, function, and rehabilitation
- 2014

The differences between explanatory and predictive modeling, which affect every aspect of model building and evaluation, are reviewed.

What Can We Learn from Predictive Modeling?

- Computer Science, Mathematics
- Political Analysis
- 2017

The central benefits of predictive modeling are reviewed from a perspective uncommon in the existing literature: it is focused on how predictive modeling can be used to complement and augment standard associational analyses.

A Unified Statistical Framework for Evaluating Predictive Methods

- Computer Science
- ICIS
- 2014

A unified statistical framework is presented for evaluating predictive methods that can be applied to most problems and datasets and has the theoretical advantage that it is not necessary to assume a normal distribution.

Choosing Prediction Over Explanation in Psychology: Lessons From Machine Learning

- Psychology, Medicine
- Perspectives on psychological science : a journal of the Association for Psychological Science
- 2017

It is proposed that principles and techniques from the field of machine learning can help psychology become a more predictive science, and that an increased focus on prediction, rather than explanation, can ultimately lead to greater understanding of behavior.

A note on the interpretation of tree-based regression models.

- Mathematics, Medicine
- Biometrical journal. Biometrische Zeitschrift
- 2020

If the generating model contains chains of direct and indirect effects, then the typical variable importance measures suggest selecting as important mainly the background variables, which have a strong indirect effect, disregarding the variables that directly influence the response.

Model averaging in ecology: a review of Bayesian, information-theoretic, and tactical approaches for predictive inference

- Computer Science
- Ecological Monographs
- 2018

It is shown that model averaging is particularly useful if the predictive error of contributing model predictions is dominated by variance, and if the covariance between models is low, and for noisy data, which predominate in ecology, these conditions will often be met.

A Model Must Be Wrong to be Useful: The Role of Linear Modeling and False Assumptions in Theoretical Explanation

- Mathematics
- 2010

It is true that many times relationships in the real world do not fall into a linear pattern. Nevertheless, even if the true causal structure of the phenomenon under study is not linear, it does not…

Enhancing Validity in Observational Settings When Replication Is Not Possible

- Mathematics
- 2018

We argue that political scientists can provide additional evidence for the predictive validity of observational and quasi-experimental research designs by minimizing the expected prediction error or…

The Need for More Emphasis on Prediction: A “Nondenominational” Model-Based Approach

- Computer Science
- 2014

It is argued that the performance of a prediction procedure in repeated application is important and should play a significant role in its evaluation.

Reflection on modern methods: generalized linear models for prognosis and intervention—theory, practice and implications for machine learning

- Computer Science, Medicine
- International journal of epidemiology
- 2020

Five primary ways in which generalized linear models for prediction differ from GLMs for causal inference are identified, which will help ensure that both prediction and causal modelling are used appropriately and to greatest effect in health research.

#### References

To Explain or To Predict?

- Computer Science
- 2010

The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.

Causation, Prediction, and Search, 2nd Edition

- Mathematics
- 2001

What assumptions and methods allow us to turn observations into causal knowledge, and how can even incomplete causal knowledge be used in planning and prediction to influence and control our…

Prediction Versus Accommodation and the Risk of Overfitting

- Computer Science
- The British Journal for the Philosophy of Science
- 2004

A new approach to the vexed problem of understanding the epistemic difference between prediction and accommodation is presented, floating the hypothesis that accommodation is a defective methodology only when the methods used to accommodate the data fail to guard against the risk of overfitting.

Instrumentalism, Parsimony, and the Akaike Framework

- Mathematics
- Philosophy of Science
- 2002

Akaike’s framework for thinking about model selection in terms of the goal of predictive accuracy and his criterion for model selection have important philosophical implications. Scientists often…

The art of causal conjecture

- Computer Science
- 1996

The Art of Causal Conjecture shows that causal ideas can be equally important in theory and by bringing causal ideas into the foundations of probability allows causal conjectures to be more clearly quantified, debated, and confronted by statistical evidence.

Bayes model averaging with selection of regressors

- Mathematics
- 2002

When a number of distinct models contend for use in prediction, the choice of a single model can offer rather unstable predictions. In regression, stochastic search variable selection with Bayesian…

The Hierarchy Principle in Designed Industrial Experiments

- Computer Science
- 2005

The general question of appropriate criteria for the development of models from a designed experiment is considered, and the broader research question revolves around the choice of criteria for model building, variable selection and model discrimination.

Studies in the Logic of Explanation

- Philosophy
- Philosophy of Science
- 1948

To explain the phenomena in the world of our experience, to answer the question "why?" rather than only the question "what?", is one of the foremost objectives of all rational inquiry; and…

Predictive Analytics in Information Systems Research

- Computer Science
- MIS Q.
- 2011

To show that predictive analytics and explanatory statistical modeling are fundamentally disparate, it is shown that they differ in each step of the modeling process and that these differences translate into different final models, so that a pure explanatory statistical model is best tuned for testing causal hypotheses and a pure predictive model is best in terms of predictive power.

Statistical modeling: The two cultures

- Computer Science
- 2001

If the goal as a field is to use data to solve problems, then the statistical community needs to move away from exclusive dependence on data models and adopt a more diverse set of tools.