Regression: Model Comparison
Compare how different feature engineering choices affect a linear regression model fitted to the classic Advertising dataset. Toggle options below and watch the coefficients and fit metrics update instantly.
Dataset from Chapter 3 of An Introduction to Statistical Learning (James et al.)
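The fit behind this demo is ordinary least squares. A minimal sketch of the same model, using simulated data in place of the actual Advertising CSV (the coefficients and noise level below are illustrative assumptions, not the book's fitted values):

```python
import numpy as np

# Synthetic stand-in for the Advertising data: 200 markets with TV, Radio,
# and Newspaper budgets. The true coefficients here are made up for the demo.
rng = np.random.default_rng(0)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
newspaper = rng.uniform(0, 100, n)
sales = 3.0 + 0.045 * tv + 0.19 * radio + rng.normal(0, 1.5, n)

# Design matrix with an intercept column, fit by ordinary least squares.
X = np.column_stack([np.ones(n), tv, radio, newspaper])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)
fitted = X @ beta
residuals = sales - fitted

# R^2: the fraction of variance in sales the model explains.
r2 = 1 - residuals.var() / sales.var()
print("coefficients:", beta, "R^2:", r2)
```

Swapping in the real dataset only changes how `tv`, `radio`, `newspaper`, and `sales` are loaded; the fit itself is identical.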
Controls
Model Configuration
Model Summary
Residual Analysis
Coefficients
Interpreting the Results
How to read the residual plot
- Random scatter around zero: The model captures the relationship well
- Curved pattern: A nonlinear relationship exists — try polynomial features
- Funnel shape: Heteroscedasticity — variance changes with fitted values
- Large outliers: Individual points the model struggles to predict
Try toggling the interaction term — notice how the residual pattern changes when TV × Radio synergy is captured by the model.
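The effect of the interaction toggle can be sketched numerically: on hypothetical data with genuine TV x Radio synergy baked in, residuals from the interaction-free model still correlate with TV x Radio, while residuals from the full model do not (the synergy strength below is an assumed value, not an estimate from the real data):

```python
import numpy as np

# Simulated markets where TV and Radio genuinely reinforce each other.
rng = np.random.default_rng(1)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
sales = 3.0 + 0.02 * tv + 0.05 * radio + 0.001 * tv * radio + rng.normal(0, 1.0, n)

def residuals(X, y):
    # OLS residuals for a given design matrix.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

ones = np.ones(n)
res_plain = residuals(np.column_stack([ones, tv, radio]), sales)
res_inter = residuals(np.column_stack([ones, tv, radio, tv * radio]), sales)

# Leftover structure: without the interaction term, residuals track TV x Radio;
# with it, that correlation vanishes (OLS residuals are orthogonal to columns).
corr_plain = np.corrcoef(res_plain, tv * radio)[0, 1]
corr_inter = np.corrcoef(res_inter, tv * radio)[0, 1]
print("without interaction:", corr_plain, "with interaction:", corr_inter)
```

That leftover correlation is exactly the "pattern" the residual plot makes visible.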
What do the coefficients mean?
- Coefficients represent the expected change in sales for a one-unit change in the feature, holding the other features constant
- When features are normalized, coefficients reflect relative importance
- The intercept captures baseline sales when all features are zero
- Interaction terms capture synergy — does TV advertising work better in markets with high radio spend?
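The "one-unit change" reading follows directly from the linearity of the model: moving one feature by one unit while the others stay fixed shifts the prediction by exactly that feature's coefficient. A small sketch on simulated data (the coefficient values are assumptions, not the actual fit):

```python
import numpy as np

# Toy fit; the true coefficients 0.05 and 0.2 are illustrative only.
rng = np.random.default_rng(2)
tv = rng.uniform(0, 300, 200)
radio = rng.uniform(0, 50, 200)
sales = 3.0 + 0.05 * tv + 0.2 * radio + rng.normal(0, 0.5, 200)

X = np.column_stack([np.ones_like(tv), tv, radio])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)

# Two hypothetical markets differing only by one unit of TV spend.
x0 = np.array([1.0, 100.0, 20.0])
x1 = np.array([1.0, 101.0, 20.0])  # TV + 1, Radio unchanged
delta = x1 @ beta - x0 @ beta
print("prediction shift:", delta, "TV coefficient:", beta[1])
```

The shift equals `beta[1]` by construction, since `(x1 - x0) @ beta` zeroes out every term except the TV one.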
Why normalize features?
Without normalization, coefficient magnitudes depend on feature scales. TV spend ranges from roughly 0-300 while Radio ranges from roughly 0-50, so even when both channels matter similarly, the raw TV coefficient appears much smaller. Normalization puts all features on the same scale, making coefficients directly comparable.
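A sketch of that scale effect, again on simulated data chosen (as an assumption) so both channels contribute roughly equal variance to sales:

```python
import numpy as np

# TV spans ~0-300, Radio ~0-50; effect sizes are picked so each channel
# contributes a similar amount of variance to sales.
rng = np.random.default_rng(3)
n = 200
tv = rng.uniform(0, 300, n)
radio = rng.uniform(0, 50, n)
sales = 3.0 + 0.03 * tv + 0.18 * radio + rng.normal(0, 1.0, n)

def slopes(X, y):
    # OLS slopes with the intercept fitted but dropped from the return value.
    b, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(y)), X]), y, rcond=None)
    return b[1:]

raw = slopes(np.column_stack([tv, radio]), sales)

# Z-score each feature; a coefficient now means "change in sales per one
# standard deviation of spend", so the two channels become comparable.
z = lambda v: (v - v.mean()) / v.std()
std = slopes(np.column_stack([z(tv), z(radio)]), sales)
print("raw:", raw, "standardized:", std)
```

The raw TV slope looks several times smaller than Radio's purely because TV's scale is larger; after standardization the two coefficients come out close to each other.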