Spring 2026 • Northeastern University

Regularization Explorer

Visualize how L1 (Lasso) and L2 (Ridge) regularization control the bias-variance tradeoff. Select a synthetic dataset, choose a regularization type, and watch how coefficients shrink and prediction errors change as λ increases.

Concepts from Regression Part 2 — CS 6140


Visualizations

Coefficients vs λ • Training vs Test Error • Fit at Selected λ


Understanding Regularization

Why does test error increase at very low λ?

When λ ≈ 0, there is almost no regularization. The model fits the training data very closely — including noise. This overfitting means the model generalizes poorly to new data, causing high test error.
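A minimal sketch of this effect, assuming scikit-learn and an illustrative synthetic dataset (the degree-12 polynomial basis, noise level, and λ grid are choices made here, not the app's actual settings): with almost no penalty, a flexible model drives training error down while test error climbs.

```python
# Sketch: train vs. test error as lambda varies (illustrative data, not the app's).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(60, 1))
y = np.sin(3 * x).ravel() + rng.normal(scale=0.3, size=60)   # noisy target

X = PolynomialFeatures(degree=12).fit_transform(x)           # flexible basis that can overfit
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for lam in [1e-8, 1e-4, 1e-2, 1.0, 100.0]:
    model = Ridge(alpha=lam).fit(X_tr, y_tr)                 # sklearn names lambda "alpha"
    tr = mean_squared_error(y_tr, model.predict(X_tr))
    te = mean_squared_error(y_te, model.predict(X_te))
    print(f"lambda={lam:8.0e}  train MSE={tr:.3f}  test MSE={te:.3f}")
```

At λ ≈ 1e-8 the training MSE is lowest but the test MSE is worst; a moderate λ gives the best test error.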

Why do Lasso coefficients go to exactly zero?

The L1 penalty constrains the coefficients to a diamond-shaped region whose corners lie on the coordinate axes. The loss contours typically first touch this region at a corner, so the optimal solution sets one or more coefficients to exactly zero. This makes Lasso a built-in feature selection method: it automatically discards irrelevant features.
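A minimal sketch of this behavior, assuming scikit-learn (the 10-feature synthetic data and λ = 0.5 are illustrative): Lasso zeroes out the irrelevant coefficients, while Ridge only shrinks them.

```python
# Sketch: Lasso's exact zeros vs. Ridge's shrinkage (synthetic data, illustrative lambda).
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0])  # only 3 of 10 features matter
y = X @ true_coef + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

print("Lasso coefficients:", np.round(lasso.coef_, 2))       # irrelevant ones are exactly 0
print("Ridge coefficients:", np.round(ridge.coef_, 2))       # small but nonzero
print("Features selected by Lasso:", np.flatnonzero(lasso.coef_))
```

Ridge's L2 constraint region is a ball with no corners, which is why its solution almost never lands exactly on an axis.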

How do you choose λ?

In practice, use k-fold cross-validation: split the training data into folds, fit the model at each candidate λ on all but one fold, evaluate on the held-out fold, and choose the λ that minimizes the average validation error. The optimal λ balances bias (underfitting) against variance (overfitting).
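A minimal sketch of that procedure, assuming scikit-learn's LassoCV (the λ grid, fold count, and synthetic data are illustrative choices, not the app's settings):

```python
# Sketch: choosing lambda by 5-fold cross-validation over a logarithmic grid.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 8))                        # 8 features, 2 of them relevant
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=150)

lambdas = np.logspace(-3, 1, 50)                     # candidate lambda values
cv_model = LassoCV(alphas=lambdas, cv=5).fit(X, y)   # 5-fold CV at every lambda

mean_mse = cv_model.mse_path_.mean(axis=1)           # average validation MSE per lambda
best = int(np.argmin(mean_mse))
print(f"best lambda: {cv_model.alphas_[best]:.4f} (cv_model.alpha_ = {cv_model.alpha_:.4f})")
print(f"mean validation MSE at best lambda: {mean_mse[best]:.3f}")
```

LassoCV refits the model on the full training set at the winning λ, so `cv_model.coef_` is ready to use for prediction afterward.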