Joshua Tenenbaum's Bayesian model of concept learning (1999, 2001) addresses a fundamental question: how do people learn a new concept from just a few positive examples? The model performs Bayesian inference over a hypothesis space of possible concepts, with the "size principle" (the likelihood of the data falls off as a power of the hypothesis's size) explaining why people favor specific hypotheses that tightly fit the observed examples.
The Size Principle
Likelihood (strong sampling): P(X|h) = (1/|h|)ⁿ if X ⊆ h, and 0 otherwise
|h| = size of hypothesis h (the number of items in its extension)
n = number of examples in X
Smaller hypotheses receive exponentially higher likelihood as n grows
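The strong-sampling likelihood is simple enough to compute directly. The sketch below is a minimal illustration, assuming hypotheses are sets of integers drawn from 1–100 (the range used in Tenenbaum's "number game"); the particular hypotheses and example set are illustrative choices, not taken from the original experiments.

```python
def likelihood(examples, hypothesis):
    """P(X|h) under strong sampling: (1/|h|)^n if every example is in h, else 0."""
    if not all(x in hypothesis for x in examples):
        return 0.0
    return (1.0 / len(hypothesis)) ** len(examples)

# Illustrative hypotheses over the integers 1-100 (an assumed range).
powers_of_two = {2 ** k for k in range(1, 7)}   # {2, 4, 8, 16, 32, 64}, |h| = 6
even_numbers = set(range(2, 101, 2))            # |h| = 50
under_100 = set(range(1, 100))                  # |h| = 99

X = [16, 8, 2, 64]
for name, h in [("powers of 2", powers_of_two),
                ("even numbers", even_numbers),
                ("numbers < 100", under_100)]:
    print(f"{name:14s} P(X|h) = {likelihood(X, h):.3g}")
```

With four examples the gap is already dramatic: (1/6)⁴ is roughly 4,800 times larger than (1/50)⁴, so each additional example multiplies the advantage of the smaller hypothesis.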
The size principle creates a natural Occam's razor: hypotheses that are consistent with the data but overly broad (large |h|) receive low likelihood. If someone tells you a concept includes {16, 8, 2, 64}, the hypothesis "powers of 2" (small set) is strongly favored over "even numbers" (larger set) or "all numbers less than 100" (even larger). This preference for the most specific consistent hypothesis emerges naturally from the probabilistic framework without any explicit complexity penalty.
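To see the Occam's-razor effect numerically, one can combine the size-principle likelihood with a prior and normalize via Bayes' rule. The sketch below assumes a toy three-hypothesis space over the integers 1–100 with a uniform prior; in Tenenbaum's actual model the hypothesis space is much larger and the prior is not uniform.

```python
def size_likelihood(examples, h):
    """Strong-sampling likelihood: (1/|h|)^n if all examples lie in h, else 0."""
    if not all(x in h for x in examples):
        return 0.0
    return (1.0 / len(h)) ** len(examples)

# A toy hypothesis space (an assumption for this sketch).
hypotheses = {
    "powers of 2": {2 ** k for k in range(1, 7)},
    "even numbers": set(range(2, 101, 2)),
    "numbers < 100": set(range(1, 100)),
}

X = [16, 8, 2, 64]
prior = 1.0 / len(hypotheses)  # uniform prior over the three hypotheses
unnorm = {name: prior * size_likelihood(X, h) for name, h in hypotheses.items()}
Z = sum(unnorm.values())       # normalizing constant P(X)
posterior = {name: p / Z for name, p in unnorm.items()}

for name, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} P(h|X) = {p:.4f}")
```

Even with an entirely flat prior, the posterior concentrates almost all its mass on "powers of 2": the preference for the most specific consistent hypothesis comes from the likelihood alone.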
Significance
Bayesian concept learning demonstrated that human generalization patterns, which had seemed to require complex, domain-specific inductive biases, could be explained by a general-purpose Bayesian mechanism applied to structured hypothesis spaces. This work launched a productive program of "Bayesian cognitive science" whose methods have since been applied to causal reasoning, language acquisition, and intuitive physics.