Bias–Variance & Generalization Explorer

Inspired by Raschka, STAT 479: Model Evaluation 1 – Overfitting and Underfitting
This app links three ideas from the notes:
\[ \mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2}_{\text{Bias}^2} + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}} + \underbrace{\sigma^2}_{\text{Noise (irreducible)}} \]
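The decomposition can be checked numerically by refitting the same model class on many independently drawn training sets and measuring the three terms at a single query point. This is a minimal sketch, assuming an illustrative sine true function, Gaussian noise, and a cubic polynomial model (all hypothetical choices, not fixed by the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Assumed true regression function (illustrative choice).
    return np.sin(2 * np.pi * x)

sigma = 0.3                      # noise standard deviation
x_train = np.linspace(0, 1, 20)  # fixed training inputs
x0 = 0.5                         # query point at which we decompose the error
degree = 3                       # polynomial degree d

# Fit the same model class on many independently drawn training sets
# and record each fit's prediction at x0.
preds = []
for _ in range(2000):
    y_train = f(x_train) + rng.normal(0, sigma, x_train.size)
    coefs = np.polyfit(x_train, y_train, degree)
    preds.append(np.polyval(coefs, x0))
preds = np.array(preds)

bias_sq = (f(x0) - preds.mean()) ** 2        # (f(x0) - E[f_hat(x0)])^2
variance = preds.var()                       # E[(f_hat(x0) - E[f_hat(x0)])^2]
noise = sigma ** 2                           # irreducible error
expected_err = bias_sq + variance + noise

# Sanity check: the sum should match the empirical expected squared error
# against fresh noisy observations y0 = f(x0) + eps.
y0 = f(x0) + rng.normal(0, sigma, preds.size)
empirical_err = np.mean((y0 - preds) ** 2)
```

With a symmetric model class and enough replications, `empirical_err` lands close to `bias_sq + variance + noise`, which is exactly the identity above evaluated at one point.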
Underfitting (High Bias) vs. Overfitting (High Variance)
Model capacity controls the size of the hypothesis space: too little capacity underfits (high bias), too much capacity overfits (high variance).
Visualization Mode (Left Plot)
Single fit: One training set, one model.
Many fits: Several models, each trained on an independently drawn training set. The spread of the fits illustrates variance; the deviation of their average from \(f\) illustrates bias.

1. Data, True Function, and Fitted Models

We assume a true regression function \(f(x)\). Each training set is drawn as \(y = f(x) + \epsilon\) with noise \(\epsilon\). We fit a polynomial of degree \(d\) and compare the fitted model(s) to the true function.
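The data-generating and fitting steps above can be sketched in a few lines. This is a minimal illustration, assuming a sine true function and NumPy's polynomial least-squares fit (`np.polyfit`); the degrees shown are arbitrary examples of low, moderate, and high capacity:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Assumed true regression function (illustrative choice).
    return np.sin(2 * np.pi * x)

sigma = 0.3
x = np.sort(rng.uniform(0, 1, 30))
y = f(x) + rng.normal(0, sigma, x.size)   # one training set: y = f(x) + eps

# Fit polynomials of increasing degree d to the same training set.
train_mse = {}
for d in (1, 3, 9):
    coefs = np.polyfit(x, y, d)           # least-squares fit of degree d
    y_hat = np.polyval(coefs, x)
    train_mse[d] = np.mean((y - y_hat) ** 2)
    print(f"degree {d}: train MSE = {train_mse[d]:.3f}")
```

Because the polynomial bases are nested, the training MSE can only decrease as \(d\) grows; on its own it says nothing about generalization.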
Single Fit: One Training Set
[Plot: true function, training points, and model fit(s)]
Training vs. Test Error (Conceptual)
[Plot: train error and test (generalization) error vs. model capacity]
Left: underfitting (high bias); right: overfitting (high variance).
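The U-shaped test-error curve can be reproduced concretely by sweeping the polynomial degree and scoring each fit on a held-out set. A minimal sketch, again assuming a hypothetical sine true function and Gaussian noise:

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    # Assumed true regression function (illustrative choice).
    return np.sin(2 * np.pi * x)

sigma = 0.3
x_tr = np.sort(rng.uniform(0, 1, 30))
y_tr = f(x_tr) + rng.normal(0, sigma, x_tr.size)
x_te = np.sort(rng.uniform(0, 1, 200))          # held-out test set
y_te = f(x_te) + rng.normal(0, sigma, x_te.size)

train_err, test_err = {}, {}
for d in range(1, 10):
    coefs = np.polyfit(x_tr, y_tr, d)
    train_err[d] = np.mean((y_tr - np.polyval(coefs, x_tr)) ** 2)
    test_err[d] = np.mean((y_te - np.polyval(coefs, x_te)) ** 2)

# Training error keeps falling with capacity; test error bottoms out
# at an intermediate degree and rises again as the model overfits.
best = min(test_err, key=test_err.get)
```

Plotting `train_err` and `test_err` against `d` recovers the conceptual picture: the low-degree end is the underfitting regime, the high-degree end the overfitting regime.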
Bias–Variance Decomposition (Conceptual)
[Plot: bias², variance, and total error (including noise) vs. model capacity]