Sunday, April 6, 2008

Note from Dean Keith Simonton

In response to some of my recent posts, I received a nice note from Dean Keith Simonton, in which he applied the concept of suppressor variables to the three traditional criteria for causality. With Dean's permission, here is his comment. -- Alan

I was reading your posting when I came across the following page, where you state that there are three criteria for the inference of causality, the first being correlation. Correlation is specified as a necessary but not sufficient standard for causal inference.

This I believe is incorrect. When I teach causal modeling I emphasize a paradoxical version of the commonplace statement that "correlation does not prove causation," namely that "no correlation does not prove no causation." Both are equally true.

The problem is this: Not only can third variables generate spuriously non-zero correlations but they can also generate spuriously zero correlations. Only after we control for these attenuating effects will we discover that the (partial) correlation (or regression coefficient) is actually non-zero. Not only can this happen, but we even have a name for this consequence: suppression. Third variables that, once controlled, enlarge rather than reduce the association between two variables are called suppressor variables.

Admittedly, suppression is often seen as something to be avoided. This is especially true when suppression yields standardized partial regression coefficients that are greater than one (or less than minus one) or when the relationship between the two variables changes sign (e.g., from a significantly positive correlation to a significantly negative beta). Often such effects can be seen as artifacts of poor measurement or design (e.g., excessive collinearity among the independent variables measured by tests with numerous shared items).

Yet it can also happen that suppression leads to a superior understanding of the underlying causal process. Sometimes the true model operates in such a fashion that it produces a zero bivariate correlation between two variables that are actually causally related. In such instances, no correlation does not prove no causation.

The example I use in class is the equation I've been developing over the years to predict the greatness assessments of US presidents.* It turns out that one of the best predictors in a 6-variable multiple regression equation is whether or not a president was assassinated while in office. Yet assassination does not have a significant zero-order correlation. How can this be? Well, another major predictor of leader performance is duration of tenure in office, and this variable quite understandably has a negative correlation with assassination. On the average, assassinated presidents have shorter tenures. So the positive association between tenure duration and the global leadership assessment masks the positive impact of assassination. Only when both are put into the same equation will the causal impact of assassination emerge. In addition, the predictive power of tenure duration is increased because its true effect size is no longer obscured by assassination. In the literature, this is sometimes called "cooperative suppression" (a term that seems inappropriate in the current example!).
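The masking pattern in this example can be reproduced with simulated data. The following is only a minimal sketch under made-up assumptions: `x`, `z`, and `y` are hypothetical standardized variables (loosely playing the roles of assassination, tenure, and rated greatness), the coefficients are chosen for illustration, and none of it is the actual presidential data or the published equation.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def partial(r_ab, r_ac, r_bc):
    """First-order partial correlation of a and b, controlling for c."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Assumed (hypothetical) causal structure:
#   x lowers z, and both x and z raise y.
x = [rng.gauss(0, 1) for _ in range(n)]
z = [-xi + rng.gauss(0, 1) for xi in x]
y = [xi + zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

r_xy, r_xz, r_zy = pearson(x, y), pearson(x, z), pearson(z, y)
r_xy_given_z = partial(r_xy, r_xz, r_zy)  # x and y, controlling for z
r_zy_given_x = partial(r_zy, r_xz, r_xy)  # z and y, controlling for x

print(f"zero-order r(x, y) = {r_xy:.3f}")
print(f"partial r(x, y | z) = {r_xy_given_z:.3f}")
print(f"zero-order r(z, y) = {r_zy:.3f}")
print(f"partial r(z, y | x) = {r_zy_given_x:.3f}")
```

With these assumed coefficients, the direct effect of `x` on `y` is exactly offset by its indirect effect through `z`, so the population zero-order correlation is zero while the partial correlation is roughly 0.58. The association between `z` and `y` also grows once `x` is controlled, which is the mutual enhancement the note describes.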

I could give other empirical illustrations, but the foregoing should suffice. Two variables can be causally related even when their zero-order correlation is zero. Zero-order correlations can be spuriously small as well as spuriously large. This outcome is especially likely in the complex causal networks that probably underlie real-world phenomena. Hence, the three conditions for causal inference from correlational data are misspecified. They probably reduce to two: temporal priority and a non-zero correlation after controlling for all reasonable third variables.

*The original 6-variable equation was published in Simonton, D.K. (1986). Presidential personality: Biographical use of the Gough Adjective Check List. Journal of Personality and Social Psychology, 51, 149-160. An update of the entire research program will appear in Simonton, D.K. (in press). Presidential greatness and its socio-psychological significance: Individual or situation? Performance or attribution? In C. Hoyt, G. R. Goethals, & D. Forsyth (Eds.), Leadership at the crossroads: Psychology and leadership (Vol. 1). Westport, CT: Praeger.

1 comment:

Unknown said...

I really think the comments from Dean Keith Simonton and Les Hayduk are excellent. I’d like, however, to add another case where you can have no correlation but still have causality: nonlinear relationships.

Indeed, the Pearson correlation coefficient captures only the linear effects of the variables in the model. If the causal relationships are not linear, the correlation might very well be 0, but only because the right function was not specified.

Guillaume
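Guillaume's nonlinear case can be illustrated the same way. This is a small sketch under an assumed, deliberately extreme setup: `y` is fully determined by `x` through a quadratic, and `x` is spread symmetrically around zero. The variable names and the grid are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


# x on a symmetric grid from -1 to 1; y depends on x only through x squared.
x = [i / 100 for i in range(-100, 101)]
y = [xi ** 2 for xi in x]

r_linear = pearson(x, y)  # linear specification
r_squared = pearson([xi ** 2 for xi in x], y)  # correct functional form

print(f"r(x, y) = {r_linear:.6f}")
print(f"r(x^2, y) = {r_squared:.6f}")
```

Here the linear correlation vanishes even though `y` depends entirely on `x`, and the dependence reappears once the squared term is specified.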