https://medium.com/data-science/bias-and-variance-...

Created: October 2, 2025

Answered using GPT-5 Thinking via Chat01

Thought for 1m 19s

Answer

Here’s a deep, no-fluff summary of the Medium article you shared.

What the article sets out to do

  • Give an intuitive and mathematical refresher on bias, variance, and irreducible error (as components of prediction error).
  • Show how to empirically decompose error into bias² and variance for real models using bootstrapping and a hands-on Python workflow (via MLxtend’s bias_variance_decomp).

Key concepts (short & precise)

  • Bias: systematic error—the gap between average model predictions and the true function. High bias → underfitting.
  • Variance: sensitivity to the training sample—predictions swing a lot across data resamples. High variance → overfitting.
  • Trade-off: Increasing model complexity usually lowers bias but raises variance; the “sweet spot” is found empirically, not analytically.
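The trade-off follows from the standard squared-loss decomposition (textbook form, not quoted from the article), which separates the three components named above:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Raising capacity moves error from the first term into the second; the noise term is a floor no model can beat.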

How the decomposition is estimated

  • Procedure: Repeatedly bootstrap the training set, fit the same estimator each time, and evaluate all models on a fixed test set.
  • Computed quantities per the article’s pseudocode:
    • Average expected loss (MSE)
    • Average bias (squared gap between ensemble mean prediction and ground truth)
    • Average variance (dispersion of individual predictions around the ensemble mean)
      with the relation expected loss ≈ bias² + variance (noise omitted on the fixed test set).
  • Implementation is built around MLxtend’s routine with a clear loop: resample → fit → predict → aggregate.
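The resample → fit → predict → aggregate loop can be sketched without MLxtend. The helper below is a minimal illustration (the function name, synthetic data, and all parameters are my own, not the article’s code): it bootstraps the training set, refits the same estimator each round, and splits test MSE into bias² and variance on a fixed test set.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

def bias_variance_decomp_mse(estimator, X_train, y_train, X_test, y_test,
                             num_rounds=50, seed=0):
    """Bootstrap estimate of expected loss, bias^2 and variance on a fixed test set."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = np.empty((num_rounds, len(X_test)))
    for i in range(num_rounds):
        idx = rng.integers(0, n, size=n)            # bootstrap resample (with replacement)
        estimator.fit(X_train[idx], y_train[idx])   # refit the same estimator
        preds[i] = estimator.predict(X_test)        # evaluate on the fixed test set
    avg_loss = np.mean((preds - y_test) ** 2)       # average expected loss (MSE)
    mean_pred = preds.mean(axis=0)                  # ensemble mean prediction
    avg_bias = np.mean((mean_pred - y_test) ** 2)   # bias^2: gap of mean prediction vs truth
    avg_var = np.mean(preds.var(axis=0))            # variance: spread around the mean
    return avg_loss, avg_bias, avg_var

# Synthetic regression data (placeholder for the article's Boston Housing set)
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=400)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

loss, bias, var = bias_variance_decomp_mse(DecisionTreeRegressor(), X_tr, y_tr, X_te, y_te)
```

Because the test labels are held fixed, their noise is absorbed into the bias term, so `loss` equals `bias + var` up to floating-point error, matching the relation stated above.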

Empirical examples & takeaways

Dataset: Boston Housing (regression).

  1. Decision Tree (single)
    • Avg expected loss: 32.419; bias: 14.197; variance: 18.222.
  2. Bagging of the same tree
    • Avg expected loss: 18.693; bias: 15.292; variance: 3.402.
    • Insight: Bagging slashes variance (models agree more) while slightly raising bias; overall error improves.
  3. Neural network (Keras)
    • Baseline: loss 25.470; bias 19.927; variance 5.543.
    • Higher complexity (more neurons): loss 23.458; bias 17.608; variance 5.850. Lower bias and slightly higher variance yield a net reduction in total error.
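The bagging result is easy to reproduce with scikit-learn alone. This is a sketch under assumptions: synthetic data stands in for Boston Housing, and `decompose` is an illustrative helper, not the article’s code.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor

def decompose(est, X_tr, y_tr, X_te, y_te, rounds=40, seed=0):
    """Return (bias^2, variance) for an estimator via bootstrap refits."""
    rng = np.random.default_rng(seed)
    preds = np.empty((rounds, len(X_te)))
    for i in range(rounds):
        idx = rng.integers(0, len(X_tr), size=len(X_tr))   # bootstrap resample
        preds[i] = est.fit(X_tr[idx], y_tr[idx]).predict(X_te)
    mean_pred = preds.mean(axis=0)
    return np.mean((mean_pred - y_te) ** 2), np.mean(preds.var(axis=0))

# Synthetic noisy regression problem
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.4, size=500)
X_tr, X_te, y_tr, y_te = X[:350], X[350:], y[:350], y[350:]

b_tree, v_tree = decompose(DecisionTreeRegressor(random_state=0),
                           X_tr, y_tr, X_te, y_te)
b_bag, v_bag = decompose(BaggingRegressor(DecisionTreeRegressor(), n_estimators=50,
                                          random_state=0),
                         X_tr, y_tr, X_te, y_te)
# Bagging should show markedly lower variance than the single tree
```

On data like this the bagged ensemble’s variance drops sharply while its bias stays in the same range, mirroring the article’s tree-vs-bagging numbers.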

Practical guidance distilled

  • Don’t chase only bias minimization—balance both components.
  • To reduce bias: increase model capacity (e.g., boosting, richer features/feature engineering).
  • To reduce variance: use bagging/ensembles or regularization.
  • In practice, sweep model complexities and pick the lowest total error; the theoretical “sweet spot” isn’t closed-form.
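Sweeping complexities and picking the lowest total error can be as simple as the loop below (a hypothetical setup: `max_depth` as the complexity knob, synthetic data in place of a real dataset).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Synthetic data; swap in your own train/test split in practice
rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.3, size=600)
X_tr, X_te, y_tr, y_te = X[:400], X[400:], y[:400], y[400:]

errors = {}
for depth in range(1, 15):                       # complexity sweep
    model = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    errors[depth] = mean_squared_error(y_te, model.predict(X_te))

best_depth = min(errors, key=errors.get)         # empirical "sweet spot"
```

Shallow depths underfit (bias-dominated), deep ones overfit (variance-dominated); the minimum of the measured test error is the empirical sweet spot the article describes.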

Why this matters

  • The decomposition turns a vague “over/underfit” diagnosis into measurable levers: you see whether you’re dominated by bias or by variance, then choose the corresponding fix (capacity vs. stability). The article’s experiments concretely show bagging → variance↓ and capacity↑ → bias↓ (variance↑), exactly what the theory predicts.

If you’d like, I can recreate the bias–variance plots and numbers from the article with scikit-learn/MLxtend on your machine and hand you a runnable notebook.
