Ensemble Methods

intermediate

Bagging, Boosting & Stacking

“The wisdom of diverse crowds — combining imperfect models into something stronger than any individual”

A visual explanation of the three ensemble paradigms: how bagging reduces variance (Random Forest), how boosting reduces bias (AdaBoost.R2, SAMME), and how stacking combines predictions via a meta-learner.

45 min

11 diagrams

6 Concepts Covered

Prerequisites

→ Decision Trees

→ Gradient Boosting

Concepts Covered

Variance Reduction, Bias Reduction, AdaBoost, SAMME, Meta-learner, OOF Predictions


∑ Key Formulas

Bias-Variance of Ensemble

Ensemble variance: reducing correlation ρ is the key gain
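The formula for this card did not survive on the page. A standard form is sketched below, assuming K base models f_k that each have prediction variance σ² and share a common pairwise correlation ρ. The symbols σ², f_k, and the simple-average setup are assumptions; only ρ and K appear in the text.

```latex
\operatorname{Var}\!\left(\frac{1}{K}\sum_{k=1}^{K} f_k(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{K}\,\sigma^{2}
```

The second term shrinks toward zero as K grows, so the variance floor is ρσ². Adding more models cannot go below that floor; only lowering ρ can.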

AdaBoost Weight

Higher weight for more accurate weak learners
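The formula itself is missing here. In the usual formulation, with ε_t the weighted error rate of weak learner t, the learner weight is commonly written as below. The symbols α_t, ε_t, and C are assumptions and are not taken from the page; the second line is the SAMME multi-class variant for C classes.

```latex
\begin{align*}
\alpha_t &= \tfrac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}
  && \text{(binary AdaBoost)} \\
\alpha_t &= \ln\frac{1-\varepsilon_t}{\varepsilon_t} + \ln(C-1)
  && \text{(SAMME, $C$ classes)}
\end{align*}
```

In both cases α_t grows as ε_t shrinks. It stays positive as long as the learner beats random guessing: ε_t < 1/2 in the binary case, ε_t < 1 − 1/C for SAMME.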

Stacking Meta-Input

Out-of-fold predictions from base models feed the meta-learner
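The formula for this card is also missing. One way to write the out-of-fold construction, as a sketch, uses the meta-features matrix Ñ from the stacking algorithm on this page. The fold-assignment function κ and the notation for held-out models are assumptions introduced here for illustration.

```latex
\tilde{N}_{ik} = \hat{f}_k^{\,(-\kappa(i))}(x_i),
\qquad i = 1,\dots,n,\quad k = 1,\dots,K
```

Here κ(i) is the fold that contains sample i, and f̂_k^(−j) denotes base model k trained on every fold except j. Each entry is therefore a prediction from a model that never saw that sample during training.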

🎯

Why Ensembles Win Kaggle

motivation

Winning Kaggle solutions very often rely on ensembling. Every model makes its own characteristic errors: some samples are hard for tree-based models, others for neural networks. When the models are diverse, combining their predictions lets many of those errors cancel out. The ensemble typically beats its best single member, often by 1-3% AUC, which is an enormous margin in competition settings.

💡

The Three Ensemble Paradigms

intuition

Bagging (Bootstrap AGGregating): train K models on K bootstrap samples of the data (drawn with replacement), then average or vote. Reduces variance. Random Forest is bagging applied to decision trees, with a random subset of features considered at each split.

Boosting: train K models sequentially, each one focusing on the errors of the ensemble built so far, then combine them with weights. Reduces bias. XGBoost is boosting.

Stacking: train K diverse base models, then use their predictions as features for a meta-learner that learns how best to combine them.
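The two ingredients of bagging, sampling with replacement and aggregating votes, can be sketched with the standard library alone. This is a minimal illustration, not a full bagging implementation; the helper names `bootstrap_indices` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def majority_vote(labels):
    """Return the most common label among the base models' votes."""
    return Counter(labels).most_common(1)[0][0]


rng = random.Random(0)
n = 1000
sample = bootstrap_indices(n, rng)

# A bootstrap sample repeats some rows and omits others; the share of
# distinct rows tends toward 1 - 1/e (about 0.632) as n grows.
unique_fraction = len(set(sample)) / n
print(f"Distinct rows in one bootstrap sample: {unique_fraction:.3f}")

# Aggregation step for classification: each base model casts one vote.
print("Majority vote of [1, 0, 1, 1, 0]:", majority_vote([1, 0, 1, 1, 0]))
```

Because each model sees a different resample, the models disagree on some inputs, and that disagreement is what averaging or voting smooths out.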

Key insight: ensembles work because models are diverse. A bagging ensemble of identical models has exactly the same performance as one model. Diversity = decorrelation = variance reduction.
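A small simulation can make the diversity claim concrete. The sketch below assumes each model's prediction error is Gaussian with variance σ² and that every pair of models has the same correlation ρ; the function names and the shared-plus-private noise construction are illustrative assumptions.

```python
import random
import statistics


def ensemble_variance(rho, k, sigma=1.0, n_trials=20000, seed=0):
    """Empirical variance of the mean of k predictors with pairwise correlation rho."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # error component common to all models
        preds = [
            # Each prediction has variance sigma^2; any two share covariance rho * sigma^2.
            sigma * (rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0.0, 1.0))
            for _ in range(k)
        ]
        means.append(sum(preds) / k)
    return statistics.pvariance(means)


def theoretical_variance(rho, k, sigma=1.0):
    """Variance of the average: rho * sigma^2 + (1 - rho) * sigma^2 / k."""
    return rho * sigma ** 2 + (1 - rho) * sigma ** 2 / k


for rho in (0.0, 0.5, 1.0):
    emp = ensemble_variance(rho, k=10)
    theo = theoretical_variance(rho, k=10)
    print(f"rho={rho:.1f}  empirical={emp:.3f}  theoretical={theo:.3f}")
```

With ρ = 1 the ten models are effectively identical and averaging gains nothing, while with ρ = 0 the variance drops by a factor of ten.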

⚙️

Stacking with Out-of-Fold Predictions

algorithm

1. Define K diverse base models (LightGBM, XGBoost, CatBoost, Neural Net, etc.)

2. For each base model, run 5-fold cross-validation

3. Collect each model's out-of-fold (OOF) predictions; they form one column of the meta-features matrix

4. Stack the K columns to form the meta-features matrix Ñ ∈ ℝ^{n×K}

5. Train the meta-learner (Logistic Regression or LightGBM) on Ñ with target y

6. For test: average each base model's test predictions across its folds, then feed the averages to the meta-learner

</>

Production Stacking Implementation

code

python

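import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

def stack_oof(models, X_train, y_train, X_test, n_folds=5):
    """Stack binary classifiers using out-of-fold (OOF) predictions.

    Returns the meta-learner's test probabilities and its coefficients
    (one weight per base model). Models must be scikit-learn compatible.
    """
    # Positional indexing below needs NumPy arrays, not DataFrames.
    X_train, y_train, X_test = np.asarray(X_train), np.asarray(y_train), np.asarray(X_test)
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=42)
    oof_preds = np.zeros((len(X_train), len(models)))
    test_preds = np.zeros((len(X_test), len(models)))

    for m_idx, model in enumerate(models):
        fold_test_preds = np.zeros((len(X_test), n_folds))
        for f_idx, (tr, val) in enumerate(skf.split(X_train, y_train)):
            fold_model = clone(model)  # fresh, unfitted copy for every fold
            fold_model.fit(X_train[tr], y_train[tr])
            # Each training row is scored by a model that never saw it.
            oof_preds[val, m_idx] = fold_model.predict_proba(X_train[val])[:, 1]
            fold_test_preds[:, f_idx] = fold_model.predict_proba(X_test)[:, 1]
        # Average this base model's test predictions over its fold models.
        test_preds[:, m_idx] = fold_test_preds.mean(axis=1)
        print(f"Model {m_idx} OOF AUC: {roc_auc_score(y_train, oof_preds[:, m_idx]):.4f}")

    # Meta-learner is trained on OOF predictions only, which avoids target leakage.
    meta = LogisticRegression(C=0.1)
    meta.fit(oof_preds, y_train)
    final_preds = meta.predict_proba(test_preds)[:, 1]
    return final_preds, meta.coef_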
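For comparison, scikit-learn ships a built-in `StackingClassifier` that follows a similar out-of-fold scheme. The sketch below runs it on synthetic data; the dataset, the choice of base models, and all hyperparameters are illustrative assumptions. One difference from the function above: `StackingClassifier` refits each base model on the full training set for test-time predictions instead of averaging the fold models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary classification problem (illustrative only).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=15)),
    ],
    final_estimator=LogisticRegression(C=0.1),  # meta-learner
    cv=5,                                       # folds used to build OOF meta-features
    stack_method="predict_proba",
)
stack.fit(X_tr, y_tr)

test_proba = stack.predict_proba(X_te)[:, 1]
stack_auc = roc_auc_score(y_te, test_proba)
print(f"Stacked test AUC: {stack_auc:.4f}")
```

The hand-written version remains useful when base models are not scikit-learn compatible or when the OOF matrix itself needs to be saved and reused across experiments.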
