The shifted Wald distribution, also known as the shifted inverse Gaussian, arises naturally from a simple model of evidence accumulation. If a decision process accumulates noisy evidence at a constant drift rate toward a single absorbing boundary, the time to reach that boundary follows an inverse Gaussian distribution. Adding a shift parameter accounts for non-decision time components such as stimulus encoding and motor execution.
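The accumulation process described above can be sketched as a small discrete-time simulation. This is an illustrative approximation, not a reference implementation: the function and argument names (`simulate_first_passage`, `drift`, `threshold`, `shift`) are hypothetical, the noise is assumed to have a unit diffusion coefficient, and the fixed time step `dt` introduces a small upward bias in the simulated times.

```python
import random


def simulate_first_passage(drift, threshold, shift, dt=0.001, rng=random):
    """Simulate one trial of a single-boundary accumulator and return its RT.

    Evidence starts at 0 and is updated in steps of size dt until it reaches
    `threshold`. Assumes a unit diffusion coefficient (noise sd = sqrt(dt)).
    """
    evidence = 0.0
    elapsed = 0.0
    noise_sd = dt ** 0.5
    while evidence < threshold:
        evidence += drift * dt + noise_sd * rng.gauss(0.0, 1.0)
        elapsed += dt
    # The shift adds non-decision time (encoding, motor execution).
    return shift + elapsed


def simulate_rts(n_trials, drift, threshold, shift, seed=0):
    """Simulate n_trials response times with a seeded generator."""
    rng = random.Random(seed)
    return [
        simulate_first_passage(drift, threshold, shift, rng=rng)
        for _ in range(n_trials)
    ]


if __name__ == "__main__":
    rts = simulate_rts(1000, drift=2.0, threshold=1.0, shift=0.3)
    mean_rt = sum(rts) / len(rts)
    # The sample mean should land close to shift + threshold / drift.
    print(f"simulated mean RT: {mean_rt:.3f}")
```

With enough trials, a histogram of the simulated times shows the right-skewed shape typical of the inverse Gaussian, and no simulated time can fall below the shift.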
Mathematical Foundation
The shifted Wald distribution has three parameters with clear psychological interpretations:
α = threshold (amount of evidence required to reach the boundary)
γ = drift rate (rate of evidence accumulation)
θ = shift (non-decision time)
The probability density function is defined for t > θ, where α > 0 and γ > 0. The mean of the distribution is θ + α/γ, and the variance is α/γ³. Because the distribution is derived from a specific stochastic process (Brownian motion with drift), every parameter has a direct connection to an underlying cognitive mechanism.
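As a concrete illustration, the density and the two moments quoted above can be written out in a few lines. The sketch below assumes the common parameterization in which the diffusion coefficient is fixed at 1, so that the density is f(t) = α / sqrt(2π(t − θ)³) · exp(−(α − γ(t − θ))² / (2(t − θ))) for t > θ. The function names are hypothetical, and the midpoint-rule integrator is included only as a rough numerical sanity check.

```python
import math


def shifted_wald_pdf(t, alpha, gamma, theta):
    """Density of the shifted Wald at time t (unit diffusion coefficient assumed)."""
    if t <= theta:
        return 0.0  # no response can occur before the shift
    x = t - theta
    return (alpha / math.sqrt(2.0 * math.pi * x ** 3)
            * math.exp(-((alpha - gamma * x) ** 2) / (2.0 * x)))


def shifted_wald_mean(alpha, gamma, theta):
    """Mean: theta + alpha / gamma."""
    return theta + alpha / gamma


def shifted_wald_variance(alpha, gamma):
    """Variance: alpha / gamma**3 (the shift does not affect it)."""
    return alpha / gamma ** 3


def midpoint_integral(f, lo, hi, n):
    """Approximate the integral of f over [lo, hi] with n midpoint slices."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    alpha, gamma, theta = 1.0, 2.0, 0.3
    area = midpoint_integral(
        lambda t: shifted_wald_pdf(t, alpha, gamma, theta),
        theta, theta + 10.0, 20000,
    )
    # The area under the density should be very close to 1.
    print(f"area under pdf: {area:.4f}")
    print(f"mean: {shifted_wald_mean(alpha, gamma, theta):.3f}")
    print(f"variance: {shifted_wald_variance(alpha, gamma):.3f}")
```

Integrating t · f(t) and (t − mean)² · f(t) with the same helper offers a quick way to confirm the closed-form mean and variance for any chosen parameter values.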
Relationship to the Drift Diffusion Model
The shifted Wald describes the first-passage time of a one-boundary diffusion process. The full drift diffusion model (DDM) with two absorbing boundaries produces a defective distribution at each boundary, that is, a density that integrates to the probability of terminating at that boundary rather than to 1. The Wald distribution emerges as a limiting case when accuracy is very high and essentially all responses terminate at the correct boundary. This connection makes the shifted Wald a principled simplification for tasks where error rates are negligible.
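To give a sense of when error rates become negligible, the probability that a two-boundary diffusion terminates at the lower (error) boundary can be computed directly. The sketch below uses the standard absorption probability for a Wiener process with drift and unit diffusion coefficient. The names `drift`, `separation`, and `start` are illustrative labels for the usual DDM quantities, not notation taken from this text.

```python
import math


def lower_boundary_probability(drift, separation, start=None):
    """Probability of absorption at the lower boundary (0) before the upper one.

    Assumes a Wiener process with unit diffusion coefficient, boundaries at
    0 and `separation`, and starting point `start` (default: the midpoint).
    """
    if start is None:
        start = separation / 2.0
    if drift == 0.0:
        # Driftless case: the probability is linear in the starting point.
        return 1.0 - start / separation
    numerator = math.exp(-2.0 * drift * start) - math.exp(-2.0 * drift * separation)
    denominator = 1.0 - math.exp(-2.0 * drift * separation)
    return numerator / denominator


if __name__ == "__main__":
    for drift in (0.5, 1.0, 2.0, 3.0, 4.0):
        p_error = lower_boundary_probability(drift, separation=2.0)
        print(f"drift={drift:.1f}  P(lower boundary)={p_error:.5f}")
```

With an unbiased starting point the expression reduces to 1 / (1 + exp(drift · separation)), so the error probability falls off exponentially as drift or boundary separation grows. In that regime the lower boundary is almost never reached, and the distance from the starting point to the upper boundary plays the role of α in the one-boundary model.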
While both the shifted Wald and the ex-Gaussian (the convolution of a Gaussian and an exponential distribution) fit RT data well, the shifted Wald has a theoretical advantage: its parameters are directly interpretable in terms of an evidence accumulation process. The ex-Gaussian's μ, σ, and τ parameters, by contrast, lack straightforward cognitive interpretations. However, the ex-Gaussian is computationally simpler and remains popular for purely descriptive purposes.
Fitting and Applications
The shifted Wald can be fit via maximum likelihood. When the shift is known, closed-form expressions exist for the MLEs of the remaining two parameters, α and γ. In practice, all three parameters are typically estimated simultaneously using numerical optimization. The distribution has been applied to simple detection tasks, go/no-go paradigms, and single-response lexical decision experiments: any context where a single accumulation process terminates at one boundary.
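For the known-shift case, the closed-form estimates can be sketched as follows. The code relies on the standard inverse Gaussian MLEs (mean μ and shape λ) and the mapping μ = α/γ, λ = α², which assumes a unit diffusion coefficient. The function names are hypothetical, and the sampler (the transformation method of Michael, Schucany, and Haas) is included only to generate illustrative data. Estimating the shift as well would require numerical optimization, which this sketch does not attempt.

```python
import math
import random


def fit_wald_known_shift(rts, shift):
    """Closed-form MLEs of (alpha, gamma) for a shifted Wald with known shift."""
    x = [t - shift for t in rts]
    if min(x) <= 0.0:
        raise ValueError("every RT must exceed the shift")
    n = len(x)
    mean_x = sum(x) / n                                   # MLE of mu = alpha / gamma
    inv_lambda = sum(1.0 / xi - 1.0 / mean_x for xi in x) / n  # MLE of 1 / lambda
    alpha_hat = math.sqrt(1.0 / inv_lambda)               # lambda = alpha ** 2
    gamma_hat = alpha_hat / mean_x
    return alpha_hat, gamma_hat


def sample_shifted_wald(n, alpha, gamma, theta, seed=0):
    """Draw n shifted Wald variates via the Michael-Schucany-Haas method."""
    rng = random.Random(seed)
    mu, lam = alpha / gamma, alpha ** 2
    samples = []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0) ** 2
        x = (mu + mu * mu * y / (2.0 * lam)
             - mu / (2.0 * lam) * math.sqrt(4.0 * mu * lam * y + mu * mu * y * y))
        # Keep the smaller root with probability mu / (mu + x), else use the other.
        if rng.random() > mu / (mu + x):
            x = mu * mu / x
        samples.append(theta + x)
    return samples


if __name__ == "__main__":
    rts = sample_shifted_wald(5000, alpha=1.0, gamma=2.0, theta=0.3)
    alpha_hat, gamma_hat = fit_wald_known_shift(rts, shift=0.3)
    # The estimates should be close to the generating values.
    print(f"alpha_hat={alpha_hat:.3f}  gamma_hat={gamma_hat:.3f}")
```

Because the estimates depend on the assumed shift only through the shifted times, the same function could serve as the inner step of a search over candidate shift values below the fastest observed RT.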
Anders, Alario, and Van Maanen (2016) provided a comprehensive comparison showing that the shifted Wald outperforms the ex-Gaussian in accounting for the shape of RT distributions in simple tasks, particularly capturing the heavy right tail that is characteristic of reaction time data.