Determining the correct number of factors to extract is one of the most consequential decisions in factor analysis. Overextraction yields spurious factors composed of sampling error; underextraction forces meaningful variance into the error term or collapses distinct dimensions. Parallel analysis (PA), proposed by Horn (1965), addresses this problem by generating random datasets with the same number of variables and observations as the real data, computing their eigenvalues, and retaining only those factors whose observed eigenvalues exceed the corresponding random eigenvalues.
The Method
1. Compute the eigenvalues of the observed correlation matrix
2. Generate B random datasets (e.g., B = 1000) with n rows and p columns from N(0,1)
3. Compute the eigenvalues of each random dataset's correlation matrix
4. For each component j, compute the mean or 95th percentile of the random eigenvalues
5. Retain factor j if λ_observed,j > λ_random,j, proceeding sequentially from j = 1 and stopping at the first failure
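The steps above can be sketched in NumPy. This is a minimal illustration, not a reference implementation; the function name and defaults are arbitrary:

```python
import numpy as np

def parallel_analysis(data, n_sims=1000, percentile=95, seed=0):
    """Horn's parallel analysis on an (n x p) data matrix.

    Compares observed correlation-matrix eigenvalues to the chosen
    percentile of eigenvalues from N(0,1) data of the same shape.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    # Step 1: eigenvalues of the observed correlation matrix, descending
    obs_eig = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    # Steps 2-3: eigenvalues of random-data correlation matrices
    rand_eig = np.empty((n_sims, p))
    for b in range(n_sims):
        sim = rng.standard_normal((n, p))
        rand_eig[b] = np.sort(np.linalg.eigvalsh(np.corrcoef(sim, rowvar=False)))[::-1]
    # Step 4: percentile of the random eigenvalues, position by position
    threshold = np.percentile(rand_eig, percentile, axis=0)
    # Step 5: retain factors sequentially until observed <= random
    n_retain = 0
    for obs, thr in zip(obs_eig, threshold):
        if obs > thr:
            n_retain += 1
        else:
            break
    return n_retain, obs_eig, threshold
```

The sequential stopping rule matters: once an observed eigenvalue falls below its random counterpart, later chance exceedances are ignored.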
The logic is straightforward: if observed eigenvalues represent genuine common factors, they should exceed what random noise alone would produce. In the population, uncorrelated variables yield a correlation matrix whose eigenvalues are all exactly 1.0, but in finite samples the eigenvalues of random data are dispersed around 1.0 by sampling variability. The largest random eigenvalue therefore always exceeds 1.0, which is why Kaiser's eigenvalue-greater-than-one rule tends to overextract: it does not account for the sampling distribution of eigenvalues under the null hypothesis of no common factors.
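A quick NumPy demonstration of this dispersion (the dimensions here are arbitrary):

```python
import numpy as np

# Eigenvalues of the correlation matrix of pure noise: their mean is
# exactly 1.0 (the trace of a p x p correlation matrix is p), but they
# are dispersed, so the largest always exceeds 1.0.
rng = np.random.default_rng(1)
n, p = 200, 10
noise = rng.standard_normal((n, p))
eig = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
print(eig.round(2))
# Several eigenvalues exceed 1.0 despite the data being pure noise,
# which is exactly where Kaiser's rule overextracts.
```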
Variants and Refinements
Several variants of parallel analysis have been developed. The original method uses the mean of the random eigenvalues as the comparison point, but the 95th percentile provides a more conservative and generally more accurate criterion. Parallel analysis based on minimum rank factor analysis (PA-MRFA) uses the eigenvalues of the reduced correlation matrix (with communality estimates on the diagonal), making it more appropriate for common factor analysis than the PCA-based original. Revised parallel analysis by Green and colleagues adjusts for the bias that arises when later observed eigenvalues are compared against eigenvalues simulated under a no-factor null.
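The reduced correlation matrix used by the common-factor variants can be sketched as follows. Here squared multiple correlations (SMCs) serve as communality estimates; this is a simplification, since full PA-MRFA estimates communalities via minimum rank factor analysis itself:

```python
import numpy as np

def reduced_corr(data):
    """Correlation matrix with SMC communality estimates on the diagonal.

    Simplified illustration: full PA-MRFA estimates communalities via
    minimum rank factor analysis rather than SMCs.
    """
    R = np.corrcoef(data, rowvar=False)
    # SMC of variable j = 1 - 1 / (j-th diagonal element of R^-1)
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)
    return R_reduced
```

Eigenvalues of this reduced matrix, rather than of the full correlation matrix, are then compared against their random-data counterparts.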
Simulation studies consistently show that parallel analysis outperforms other factor retention methods. Zwick and Velicer (1986) found PA correct in 92% of conditions, compared to 22% for Kaiser's rule. The scree test depends on subjective visual judgment and shows poor interrater reliability. Velicer's MAP test and the Bayesian Information Criterion (BIC) perform well in some conditions but not as consistently as PA. Combining parallel analysis with examination of the scree plot and theoretical considerations is the recommended best practice.
Implementation Considerations
Modern implementations of parallel analysis use Monte Carlo simulation rather than analytical approximations, typically generating 500–1,000 random datasets. The method is computationally fast for typical psychological datasets (p < 100, n < 1,000) and is available in all major statistical packages. Permutation-based parallel analysis randomly permutes each column of the observed data independently rather than generating new random data, preserving the marginal distributions while destroying the correlation structure; this can be advantageous when the data are not normally distributed.
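The permutation variant can be sketched in NumPy as follows (`permutation_thresholds` is a hypothetical helper name, and the defaults are illustrative):

```python
import numpy as np

def permutation_thresholds(data, n_perms=500, percentile=95, seed=0):
    """Null eigenvalue thresholds via independent column permutations.

    Shuffling each column separately preserves every variable's marginal
    distribution while destroying the correlation structure between them.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    eig = np.empty((n_perms, p))
    for b in range(n_perms):
        perm = np.column_stack(
            [rng.permutation(data[:, j]) for j in range(p)]
        )
        eig[b] = np.sort(np.linalg.eigvalsh(np.corrcoef(perm, rowvar=False)))[::-1]
    return np.percentile(eig, percentile, axis=0)
```

Observed eigenvalues are then compared against these thresholds exactly as in standard parallel analysis; only the null-generating mechanism differs.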
Despite its strengths, parallel analysis has limitations. It performs less well when factors are weak (low loadings), when factors are highly correlated (leading to a dominant first eigenvalue), or when the number of variables per factor is small. With very large samples, PA may suggest retaining trivial factors that account for statistically significant but practically negligible variance. Nevertheless, parallel analysis remains the single best method for the number-of-factors decision, and its routine use represents one of the clearest improvements in factor-analytic practice over the past half-century.