Path Analysis

Path analysis, developed by Sewall Wright, decomposes correlations among observed variables into direct, indirect, and spurious components using systems of structural equations.

r_ij = Σ(direct paths) + Σ(indirect paths) + Σ(spurious components)

Path analysis, introduced by the geneticist Sewall Wright (1921), was one of the earliest methods for analyzing causal relationships among observed variables. It uses a system of simultaneous regression equations, depicted graphically in a path diagram, to decompose the observed correlations between variables into direct effects, indirect (mediated) effects, and spurious (non-causal) components. Path analysis laid the conceptual and mathematical groundwork for modern structural equation modeling.

Path Diagrams and Tracing Rules

Wright's Tracing Rules

r_ij = Σ (products of path coefficients along all legitimate paths)

Legitimate path: trace backward along arrows, then forward (never forward and then
backward), crossing at most one double-headed arrow (correlation) between exogenous variables.

Direct effect: Y ← X, path coefficient = β_YX
Indirect effect through M: β_YM × β_MX
Total effect = direct + Σ(indirect effects)

In a path diagram, single-headed arrows represent hypothesized direct causal effects, each quantified by a path coefficient (a standardized regression coefficient or, in the unstandardized case, a structural coefficient). Double-headed arrows represent unanalyzed correlations between exogenous variables. With standardized coefficients, Wright's tracing rules decompose the correlation between any two variables into the sum of products of coefficients along all legitimate connecting paths. A legitimate path may first go backward along arrows and then forward, but never forward and then backward; it may include at most one double-headed arrow; and it may not pass through the same variable twice. The switch from backward to forward tracing therefore happens at most once, either at a common cause or across a correlation between exogenous variables.
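As a rough check of the tracing rules, the sketch below compares hand-traced correlations with the correlations implied by the same model in matrix form. The model (correlated exogenous X1 and X2, both affecting M, with M and X1 affecting Y) and all coefficient values are hypothetical and chosen only for illustration; variables are assumed to be standardized.

```python
import numpy as np

# Hypothetical standardized path coefficients (illustrative values only).
r12 = 0.3          # correlation between exogenous X1 and X2
a1, a2 = 0.5, 0.2  # X1 -> M, X2 -> M
b, c = 0.4, 0.3    # M -> Y, X1 -> Y

# Correlations written out by hand with the tracing rules.
traced = {
    ("X1", "M"): a1 + a2 * r12,
    ("X2", "M"): a2 + a1 * r12,
    ("X1", "Y"): c + b * a1 + b * a2 * r12,
    ("X2", "Y"): c * r12 + b * a2 + b * a1 * r12,
    ("M", "Y"): b + a1 * c + a2 * r12 * c,
}

# Same model in matrix form, variable order [X1, X2, M, Y].
# B[i, j] holds the path coefficient from variable j to variable i.
B = np.zeros((4, 4))
B[2, 0], B[2, 1] = a1, a2
B[3, 2], B[3, 0] = b, c

# Residual variances chosen so that M and Y have unit variance.
var_eM = 1 - (a1**2 + a2**2 + 2 * a1 * a2 * r12)
var_eY = 1 - (b**2 + c**2 + 2 * b * c * traced[("X1", "M")])

# Covariance matrix of the exogenous variables and the residuals.
Psi = np.array([
    [1.0, r12, 0.0, 0.0],
    [r12, 1.0, 0.0, 0.0],
    [0.0, 0.0, var_eM, 0.0],
    [0.0, 0.0, 0.0, var_eY],
])

# Model-implied covariance: (I - B)^-1 Psi (I - B)^-T.
T = np.linalg.inv(np.eye(4) - B)
Sigma = T @ Psi @ T.T

names = ["X1", "X2", "M", "Y"]
for (u, v), r in traced.items():
    implied = Sigma[names.index(u), names.index(v)]
    print(f"r({u},{v}): traced = {r:.3f}, model-implied = {implied:.3f}")
```

Each traced sum should agree with the corresponding entry of Sigma. For example, r(M, Y) collects the direct path b, the path back through the common cause X1 (a1 × c), and the path back to X2, across the correlation r12, and forward from X1 (a2 × r12 × c).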

Decomposition of Effects

The total effect of X on Y is the sum of the direct effect (the path coefficient from X to Y) and all indirect effects (products of path coefficients along each chain of mediating variables). With standardized coefficients, subtracting the total effect from the bivariate correlation r_XY leaves the spurious component: the part of the association due to common causes rather than to the causal influence of X on Y. This decomposition is the essence of path analysis and provides the mathematical basis for mediation analysis.
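The decomposition can be sketched numerically. The example below assumes a hypothetical standardized recursive model in which Z is a common cause of X and Y, and X affects Y both directly and through M; the coefficient values and the helper `implied_correlations` are illustrative, not taken from any particular study or library. It relies on the standard result that, for a recursive model with coefficient matrix B, total effects equal (I − B)⁻¹ − I.

```python
import numpy as np

names = ["Z", "X", "M", "Y"]   # listed in causal order
p_zx, p_zy = 0.5, 0.2          # Z -> X, Z -> Y (Z is a common cause)
a, b, c = 0.6, 0.4, 0.1        # X -> M, M -> Y, X -> Y

# B[i, j] holds the direct effect of variable j on variable i.
B = np.zeros((4, 4))
B[1, 0] = p_zx
B[2, 1] = a
B[3, 0], B[3, 1], B[3, 2] = p_zy, c, b

# Total effects sum the products along all directed paths: B + B^2 + B^3 + ...
total = np.linalg.inv(np.eye(4) - B) - np.eye(4)
indirect = total - B


def implied_correlations(B):
    """Correlations implied by a standardized recursive model.

    Assumes variables are in causal order (B strictly lower triangular),
    residuals are mutually uncorrelated, and any exogenous variables are
    uncorrelated with each other.
    """
    n = B.shape[0]
    S = np.eye(n)
    for i in range(n):
        for j in range(i):
            # Covariance of variable i with an earlier variable j.
            S[i, j] = S[j, i] = B[i, :i] @ S[:i, j]
    return S


S = implied_correlations(B)
x, y = names.index("X"), names.index("Y")

direct_xy = B[y, x]
indirect_xy = indirect[y, x]
total_xy = total[y, x]
r_xy = S[y, x]
spurious_xy = r_xy - total_xy

print(f"direct = {direct_xy:.2f}, indirect = {indirect_xy:.2f}, total = {total_xy:.2f}")
print(f"r_XY = {r_xy:.2f}, spurious = {spurious_xy:.2f}")
```

With these values the spurious component equals p_zx × p_zy, the single tracing path from X back to the common cause Z and forward to Y.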

Mediation and the Indirect Effect

In the classic mediation model X → M → Y, the indirect effect is the product of the path from X to M (a) and the path from M to Y (b): indirect = a × b. Baron and Kenny (1986) popularized a four-step procedure for testing mediation, but modern practice favors testing the indirect effect directly, using bootstrapping (Preacher & Hayes, 2008) or the joint significance test. The Sobel test provides a normal-theory approximation, z = ab / √(b²s²_a + a²s²_b), where s_a and s_b are the standard errors of a and b. It assumes that the sampling distribution of the product ab is normal; because that distribution is typically skewed, bootstrap confidence intervals are preferable.
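A minimal sketch of both approaches on simulated data follows. The data-generating values, the sample size, the number of resamples, and the helper names `ols` and `paths` are assumptions made for illustration; the bootstrap shown is a plain percentile interval, not the bias-corrected variants that some software reports.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data from a hypothetical mediation model (values are illustrative).
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)            # true a = 0.5
y = 0.4 * m + 0.2 * x + rng.normal(size=n)  # true b = 0.4, direct effect 0.2


def ols(design, response):
    """Ordinary least squares: coefficients and their standard errors."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    sigma2 = resid @ resid / (len(response) - design.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(design.T @ design)))
    return coef, se


def paths(x, m, y):
    """Estimate a (X -> M) and b (M -> Y, controlling for X) with standard errors."""
    ones = np.ones_like(x)
    coef_m, se_m = ols(np.column_stack([ones, x]), m)
    coef_y, se_y = ols(np.column_stack([ones, x, m]), y)
    return coef_m[1], coef_y[2], se_m[1], se_y[2]


a_hat, b_hat, s_a, s_b = paths(x, m, y)
ab_hat = a_hat * b_hat

# Sobel test: normal-theory z statistic for the product a*b.
sobel_z = ab_hat / np.sqrt(b_hat**2 * s_a**2 + a_hat**2 * s_b**2)

# Percentile bootstrap: resample cases with replacement and re-estimate a*b.
boot = np.empty(2000)
for k in range(boot.size):
    idx = rng.integers(0, n, size=n)
    a_k, b_k, _, _ = paths(x[idx], m[idx], y[idx])
    boot[k] = a_k * b_k
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"indirect effect = {ab_hat:.3f}, Sobel z = {sobel_z:.2f}")
print(f"95% percentile bootstrap CI: [{ci_low:.3f}, {ci_high:.3f}]")
```

The bootstrap interval makes no normality assumption about ab, so it can be asymmetric around the point estimate, which is the practical reason it is preferred over the Sobel z.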

Assumptions and Limitations

Path analysis assumes that (1) the causal structure is correctly specified — omitted common causes produce biased estimates; (2) relationships are linear; (3) residuals are uncorrelated with predictors and with each other (unless explicitly modeled); and (4) all variables are measured without error. The last assumption is particularly problematic in psychology, where measurement error attenuates path coefficients and biases indirect effects. This limitation motivated the development of latent variable structural equation models, which incorporate measurement error explicitly.
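The attenuation caused by measurement error can be illustrated with a small simulation. The sketch below assumes the classical measurement error model (observed score = true score + independent random error) for a single predictor and uses an unstandardized slope; the effect size, reliability, and sample size are hypothetical. Under these assumptions the expected slope for the error-laden predictor is the true slope multiplied by the reliability.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 100_000
beta = 0.5          # true effect of X on Y (illustrative)
reliability = 0.7   # share of observed-score variance that is true-score variance

x_true = rng.normal(size=n)
y = beta * x_true + rng.normal(scale=0.5, size=n)

# Add independent error so that Var(true) / Var(observed) equals the reliability.
error_sd = np.sqrt((1 - reliability) / reliability)
x_obs = x_true + rng.normal(scale=error_sd, size=n)


def slope(predictor, outcome):
    """Simple regression slope: cov(predictor, outcome) / var(predictor)."""
    return np.cov(predictor, outcome)[0, 1] / np.var(predictor, ddof=1)


slope_true = slope(x_true, y)
slope_obs = slope(x_obs, y)

print(f"slope with error-free X:   {slope_true:.3f}")
print(f"slope with error-laden X:  {slope_obs:.3f}")
print(f"expected attenuated slope: {beta * reliability:.3f}")
```

In models with several predictors or mediators the bias is not always toward zero, which is one reason explicit measurement models are preferred over after-the-fact corrections.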

Despite its limitations, path analysis remains valuable as a conceptual tool for thinking about causal processes and as a building block for more complex models. The transition from path analysis of observed variables to structural equation modeling with latent variables represents one of the most important methodological developments in quantitative psychology, preserving Wright's fundamental insight — that correlations can be decomposed into meaningful causal components — while addressing the pervasive problem of measurement error.

References

  1. Wright, S. (1921). Correlation and causation. Journal of Agricultural Research, 20, 557–585.
  2. Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research. Journal of Personality and Social Psychology, 51(6), 1173–1182. doi:10.1037/0022-3514.51.6.1173
  3. Preacher, K. J., & Hayes, A. F. (2008). Asymptotic and resampling strategies for assessing and comparing indirect effects in multiple mediator models. Behavior Research Methods, 40(3), 879–891. doi:10.3758/BRM.40.3.879
  4. Bollen, K. A. (1989). Structural equations with latent variables. Wiley. doi:10.1002/9781118619179