Principal components analysis (PCA) is a data-reduction technique that transforms a set of p correlated variables into a set of p uncorrelated linear composites — called principal components — ordered by the amount of variance they explain. The first principal component accounts for the maximum possible variance, the second accounts for the maximum remaining variance subject to being orthogonal to the first, and so on. PCA is among the most widely used multivariate methods in psychology, though its distinction from factor analysis is frequently misunderstood.
Mathematical Foundations
The eigendecomposition at the heart of PCA is

R = VΛV′

where R is the p × p correlation (or covariance) matrix, V is the matrix of eigenvectors (component loadings), and Λ is the diagonal matrix of eigenvalues λ₁ ≥ λ₂ ≥ … ≥ λ_p. Component scores are formed as C_j = V′_j z, where z is the vector of standardized variables.
Each eigenvalue λ_j represents the variance explained by the j-th component. In a correlation matrix, the total variance is p (the number of variables), so the proportion of variance explained by component j is λ_j / p. The eigenvectors (columns of V) give the weights for forming the component scores as linear combinations of the original variables. These weights are chosen to maximize variance subject to orthogonality constraints.
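The eigendecomposition, the variance proportions λ_j / p, and the component scores can be sketched in a few lines of numpy. The data here are hypothetical (randomly generated with one induced correlation); any standardized data matrix works the same way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 observations of p = 4 variables,
# with a correlation induced between the first two.
X = rng.standard_normal((200, 4))
X[:, 1] += 0.8 * X[:, 0]

# Standardize, then form the p x p correlation matrix R
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

# Eigendecomposition R = V Lambda V'; eigh returns ascending order,
# so re-sort eigenvalues (and eigenvectors) in descending order.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
lam, V = eigvals[order], eigvecs[:, order]

# Proportion of variance explained by component j: lambda_j / p
prop = lam / R.shape[0]

# Component scores: linear combinations of the standardized variables
scores = Z @ V

print(prop.round(3))
```

Because V is orthogonal, the resulting component scores are mutually uncorrelated, and the variance proportions sum to 1.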
PCA vs. Factor Analysis
Despite frequent conflation, PCA and factor analysis differ in fundamental ways. PCA analyzes total variance — the diagonal of the correlation matrix contains 1s (or raw variances) — and seeks components that reproduce the full correlation matrix. Factor analysis analyzes common variance — the diagonal contains communality estimates less than 1 — and seeks latent factors that account only for the shared variance among variables. PCA is a descriptive, data-reduction technique; factor analysis is an explanatory, latent-variable model.
PCA: R = VΛV′ (exact when all p components are retained)
FA: R = ΛΛ′ + Ψ (here Λ denotes the factor loading matrix and Ψ the diagonal matrix of unique variances)
FA analyzes common variance = Σ h²_i (the sum of the communalities)
When communalities are high (above 0.70) and the number of variables is large, PCA and factor analysis yield similar results. Differences become pronounced when communalities are moderate to low, when there are few variables per factor, or when the goal is to estimate latent constructs rather than simply reduce dimensionality. Researchers whose goal is to identify underlying constructs should prefer factor analysis; those seeking efficient data reduction may prefer PCA. In practice, many published studies using "factor analysis" actually performed PCA — a confusion that methodologists have long sought to correct.
Determining the Number of Components
The number of components to retain is a critical decision. Kaiser's eigenvalue-greater-than-one rule is widely used but often retains too many components. The scree test (Cattell, 1966) plots eigenvalues and looks for an "elbow." Parallel analysis (Horn, 1965) compares observed eigenvalues to those from random data of the same dimensions, retaining only components with eigenvalues exceeding the random baseline. Parallel analysis is generally regarded as the most accurate method and is the recommended default.
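Horn's parallel analysis is straightforward to sketch: simulate random normal data of the same n × p dimensions, average the eigenvalues across simulations, and retain components whose observed eigenvalues exceed that baseline. The function name and the simulated one-component dataset below are illustrative, not from the source.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, seed=0):
    """Retain components whose observed eigenvalues exceed the mean
    eigenvalues from random normal data of the same dimensions."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    sims = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        sims[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    baseline = sims.mean(axis=0)  # the 95th percentile is a common, stricter variant
    return int(np.sum(obs > baseline)), obs, baseline

# Hypothetical data with one strong common component across 6 variables
rng = np.random.default_rng(1)
g = rng.standard_normal((300, 1))
X = 0.7 * g @ np.ones((1, 6)) + rng.standard_normal((300, 6))

k, obs, base = parallel_analysis(X)
print(k)
```

For this one-factor dataset, only the first observed eigenvalue clears the random baseline, so a single component is retained; Kaiser's rule applied to the same data could retain more.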
PCA is applied throughout psychology: in personality research (the Big Five emerged partly from PCA of trait-rating data), in neuroimaging (PCA reduces high-dimensional voxel data), in psychophysics (principal components of spectral sensitivity functions), and in test construction (examining the dimensionality of item sets). Its computational simplicity, lack of distributional assumptions, and deterministic solution make it an enduring tool, provided its limitations — particularly the distinction from factor analysis — are clearly understood.