Alan & Bo's Correlation & Causality Blog: Note from Les Hayduk

Commentary continues to come in on the criteria for causality. This latest note is from Les Hayduk. He has agreed to my reprinting of these lightly edited comments, which he originally posted in full to the SEMNET discussion forum on Saturday, April 5, 2008, with the subject heading: “correlation-causality blog – improvements.” Dr. Hayduk requests that continuing discussion of his comments take place on the SEMNET forum. – Alan

I had a look at the blog Alan provided (see below) and found it easily readable, traditional, and in some ways extremely UN-helpful. I will pick up on two of the things that seem standard, but that slant people's thinking in ways that are unhelpful, and hence where I see improvements are possible. The two matters I will take on are: experiments as the supposed benchmark/gold-standard against which SEM is to be evaluated (I doubt this), and the conditions for causality (2 of the 3 traditional conditions are wrong, and the third is imprecise).

There is substantial artificiality in comparing single experiments with single SEM studies, but I skip this for the moment, though I suspect it may eventually become an important matter.

Some failings of experiments: 1) Random assignment of cases (say, people) to groups should, in the long run, leave the groups SIGNIFICANTLY different (at the conventional .05 level) on about 5 out of every 100 characteristics. (An SEM analysis of the experimental data can include potentially problematic variables, to see whether they happen to be among the 5%, if the experimenters are not too proud to combine experiments with SEM.)
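
The "5 out of every 100" point can be checked with a small simulation. The sketch below is an illustration only, under assumptions that are not in the original note: 200 hypothetical people, 100 independent standard-normal characteristics that have nothing to do with any treatment, and a large-sample z cutoff of 1.96 standing in for an exact t-test. All function names are made up for this example.

```python
import random
from math import sqrt


def mean_and_variance(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


def count_significant_traits(rng, n_people=200, n_traits=100, z_crit=1.96):
    """Randomly split people into two groups, then count the traits on which
    the group means differ 'significantly' (two-sided, roughly the .05 level)."""
    ids = list(range(n_people))
    rng.shuffle(ids)                      # random assignment to groups
    treated = set(ids[: n_people // 2])

    hits = 0
    for _ in range(n_traits):
        # A pre-existing characteristic, unrelated to group membership.
        trait = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
        group_a = [v for i, v in enumerate(trait) if i in treated]
        group_b = [v for i, v in enumerate(trait) if i not in treated]
        mean_a, var_a = mean_and_variance(group_a)
        mean_b, var_b = mean_and_variance(group_b)
        std_err = sqrt(var_a / len(group_a) + var_b / len(group_b))
        if abs(mean_a - mean_b) / std_err > z_crit:
            hits += 1
    return hits


def average_significant_traits(n_experiments=100, seed=2008):
    """Average number of 'significant' traits per simulated experiment."""
    rng = random.Random(seed)
    counts = [count_significant_traits(rng) for _ in range(n_experiments)]
    return sum(counts) / n_experiments
```

Under these assumptions, average_significant_traits() should come out close to 5, even though no characteristic differs between the groups by anything other than the luck of the assignment.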

2) Experiments minimize, but do NOT statistically control for any remaining measurement error. Such control can and should be done by SEM statistics. This is NOT a feature of the experiment, but involves the statistics that could be connected to the experiment. Notice that comparing experiments and SEM is implicitly comparing two different things: the methods, and the statistics that usually go along with the methods.

SEM should be used IN CONJUNCTION WITH experimentation, so I see Alan as (possibly unknowingly) working against the helpful combining of SEM with experimentation.

The mechanisms of action WITHIN a single study are usually NOT well investigated by experiments, but can be investigated much better in a single SEM (via inclusion of indicators of the appropriate/anticipated intervening causal structures).

Model testing is LESS well done in experiments than in SEM. SEM has an overall model test, and experiments do not usually have a comparable test (if ANOVA, or regression, or mean-differences are used as the statistical procedures). These procedures provide parameter tests parallel to those in SEM, but they have no parallel to the OVERALL MODEL TEST in SEM. Often experimenters are not even aware that they do not have an overall model test comparable to SEM's overall model test.

Enough on this for now, so I will move to the criterion for causality. Here is a quote from Alan's blog:

…contemporary SEM practitioners would probably be more comfortable with suggestions of causation if the data were collected longitudinally (more specifically, with a panel design, in which the same respondents are tracked over time). Of the three major criteria for demonstrating causality, longitudinal studies are clearly capable of demonstrating correlation and time-ordering; provided that the most plausible “third variable” candidates are measured and controlled for, the approximation to causality should be good…

Time sequence is NOT required for causation. Causes can go “both ways simultaneously” – there are such things as reciprocal causes (e.g. Rigdon, 1995, Multivariate Behavioral Research 30(3): 359-383) and variables can even cause themselves (for example, see chapter 3 of my 1996 book, or Hayduk, 1994, Journal of Nonverbal Behavior 18:245-260).
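
To make "both ways simultaneously" concrete, here is a minimal sketch of a two-variable reciprocal system, y1 = b12*y2 + e1 and y2 = b21*y1 + e2, solved jointly with no time lag. The coefficients 0.4 and 0.3 and the function name are hypothetical choices for illustration; they are not taken from Rigdon or from the original note.

```python
def solve_reciprocal(e1, e2, b12=0.4, b21=0.3):
    """Solve y1 = b12*y2 + e1 and y2 = b21*y1 + e2 at the same instant.

    e1 and e2 are the outside (error) inputs; b12 is the effect of y2 on y1
    and b21 is the effect of y1 on y2. Neither variable comes first in time.
    """
    det = 1.0 - b12 * b21
    if det == 0.0:
        raise ValueError("b12 * b21 == 1: the system has no unique solution")
    y1 = (e1 + b12 * e2) / det
    y2 = (e2 + b21 * e1) / det
    return y1, y2


y1, y2 = solve_reciprocal(e1=1.0, e2=-2.0)
# Both structural equations hold simultaneously for the returned values.
print(abs(y1 - (0.4 * y2 + 1.0)) < 1e-12, abs(y2 - (0.3 * y1 - 2.0)) < 1e-12)
```

Each variable appears on the causal side of the other's equation, and the pair of values is determined jointly rather than in sequence.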

Correlation is NOT required. With suppressor effects, a variable can cause another variable while other parts of the causal system counteract that causal contribution to the covariance, so that the covariance between the two variables is zero. (See Duncan, 1975, Introduction to Structural Equation Models, page 29 [the equation for ρ₂₃] and realize that one term in the equation can be positive and the other of equal magnitude yet negative.)
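
A simulated example shows how a real causal effect can coexist with a zero correlation. The sketch below uses a hypothetical three-variable model, not Duncan's own: a common cause z raises x and lowers y, while x has a direct effect b = 0.5 on y. The coefficient on z is chosen so that the two covariance contributions cancel exactly. All names and values are assumptions made for illustration.

```python
import random


def cov(u, v):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def corr(u, v):
    """Pearson correlation."""
    return cov(u, v) / (cov(u, u) * cov(v, v)) ** 0.5


def simulate(n=50_000, b=0.5, seed=1975):
    """x = z + e_x and y = b*x + c*z + e_y, with c set so that cov(x, y) = 0."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [zi + rng.gauss(0.0, 1.0) for zi in z]       # var(x) = 2, cov(x, z) = 1
    c = -2.0 * b                                     # b*var(x) + c*cov(x, z) = 0
    y = [b * xi + c * zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    return x, y, z


def slope_of_x_controlling_z(x, y, z):
    """OLS coefficient on x when y is regressed on both x and z."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)


x, y, z = simulate()
print("corr(x, y):", round(corr(x, y), 3))
print("effect of x on y, controlling z:", round(slope_of_x_controlling_z(x, y, z), 3))
```

In this setup the correlation between x and y should be near zero, while the coefficient on x should be near the true effect of 0.5 once z is in the model. The cancellation depends on the chosen coefficients matching exactly; it is meant to show that the case is possible, not that it is typical.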

[Moderator’s note: See also Dean Keith Simonton’s discussion of suppressor variables and causal inference, in the posting immediately below.]

“Third variable” control should refer to MANY variable control – there can be many common causes, and many correlated causes that influence the two variables, and even reciprocal effects where the jargon of “third variables” is not quite correct. The full causal structure should be attended to, including misplacement of causally downstream variables to upstream locations. The issue here is the full proper causal specification of the model, not something connected to just third variables.

I notice you mentioned Judea Pearl's work. [Interested readers are encouraged to] have a look at the SEMNET archive for the comments Judea Pearl provided to SEMNET some years back [and] discussion of some of Pearl's work in SEM 2003, 10(2):289-311, which was designed to help SEM people understand one part of Pearl's book that directly connects to SEM.

Monday, April 7, 2008
