Time Series Forecasting
“When the order of observations matters — learning from the past to predict the future”
Trend-seasonal-residual decomposition, lag features, rolling statistics, Fourier seasonality, TimeSeriesSplit cross-validation, ARIMA intuition, and gradient boosting for tabular forecasting — with animated decomposition and 3-step forecast.
Key Formulas
Decomposition
Additive: trend + seasonal + residual. Multiplicative: T × S × R when amplitudes scale with trend.
AR(p) Model
Autoregression: current value is a linear combination of p past values
ACF
Autocorrelation Function — how correlated is the series with its k-step lag?
MAPE
Mean Absolute Percentage Error — scale-free forecasting metric
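The four entries above name their formulas without writing them out. In standard textbook notation they can be written as follows. The symbols $T_t$, $S_t$, $R_t$, $c$, $\phi_i$, $\varepsilon_t$, $\rho_k$ and $\hat{y}_t$ are assumptions chosen here, not taken from the original page.

```latex
\begin{align*}
\text{Additive:} \quad & y_t = T_t + S_t + R_t \\
\text{Multiplicative:} \quad & y_t = T_t \times S_t \times R_t \\
\text{AR}(p): \quad & y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t \\
\text{ACF:} \quad & \rho_k = \frac{\operatorname{Cov}(y_t,\, y_{t-k})}{\operatorname{Var}(y_t)} \\
\text{MAPE:} \quad & \mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
\end{align*}
```

Two caveats: the ACF expression assumes a stationary series (constant mean and variance), and MAPE is undefined whenever an actual value $y_t$ is zero.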
Time Series Are Everywhere
Stock prices, electricity demand, server CPU load, website traffic, COVID cases, weather, sales: all are time series. The fundamental difference from standard ML is that observations are ordered and correlated. Using tomorrow's data to predict yesterday violates causality. A standard train/test split (random shuffle) contaminates your evaluation because the training set then contains observations recorded after the test points, so the model has already seen the future it is being scored on. Time series require temporal cross-validation and temporal feature engineering.
Prophet (Meta) and ARIMA are widely used forecasting baselines. But gradient boosting with careful lag features and TimeSeriesSplit cross-validation is often competitive with both, and frequently beats them, once the problem is framed as tabular regression.
Decomposition: Separating Signal from Noise
Most real-world time series have three components: Trend (the long-run direction, such as sales increasing over years), Seasonality (repeating patterns, such as higher sales in December and lower in January), and Residuals (what remains after trend and seasonality are removed). Additive decomposition works when the seasonal amplitude is constant; multiplicative decomposition works when it grows with the trend. STL (Seasonal-Trend decomposition using LOESS) is a robust alternative to the classical moving-average method: it lets the seasonal pattern change over time and has a robust fitting mode that downweights outliers. STL itself models a single seasonal period; its extension MSTL handles several.
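To make the additive case concrete, here is a minimal sketch of classical moving-average decomposition in NumPy. This is the simple textbook method, not STL. The function name `additive_decompose` and the synthetic series are illustrative assumptions.

```python
import numpy as np

def additive_decompose(y, period):
    """Classical additive decomposition: y = trend + seasonal + residual."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Trend: centered moving average over one full seasonal cycle.
    # For an even period, half-weights at both ends keep the window centered.
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)          # edges stay NaN (window incomplete)
    trend[half:n - half] = np.convolve(y, weights, mode="valid")
    # Seasonal: mean detrended value at each position in the cycle,
    # re-centered so the pattern sums to zero over one period.
    detrended = y - trend
    cycle = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    cycle -= cycle.mean()
    seasonal = cycle[np.arange(n) % period]
    residual = y - trend - seasonal
    return trend, seasonal, residual

# Synthetic daily series (assumed for illustration): linear trend + weekly sine + noise.
rng = np.random.default_rng(42)
t = np.arange(365)
true_seasonal = 20 * np.sin(2 * np.pi * t / 7)
y = np.linspace(100, 200, 365) + true_seasonal + rng.normal(0, 5, 365)

trend, seasonal, residual = additive_decompose(y, period=7)
err = np.max(np.abs(seasonal - true_seasonal))
print(f"largest error in recovered weekly pattern: {err:.2f}")
```

The trend is undefined for the first and last few points because the centered window is incomplete there. For a multiplicative series, one common option is to take logs first and decompose the logged values additively.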
Creating Features from Time
Time series can be treated as supervised ML by creating lag features and rolling statistics. Lag features: y_{t-1}, y_{t-2}, ..., y_{t-p} capture autocorrelation. Rolling statistics: rolling_mean(window=7), rolling_std, rolling_max capture recent trend and volatility. Calendar features: hour_of_day, day_of_week, month, is_holiday capture seasonality. Fourier features: sin(2πt/period), cos(2πt/period) encode smooth seasonal patterns. Once these features are created, any ML model (XGBoost, LightGBM) can be applied.
Create lag features: df['lag_1'] = df['y'].shift(1)
Rolling statistics: df['roll_mean_7'] = df['y'].shift(1).rolling(7).mean() (shift first so the window excludes the current value)
Calendar features: df['dayofweek'] = df.index.dayofweek
Fourier seasonality: sin/cos pairs for each seasonal period
Always use TimeSeriesSplit — never shuffle time series for CV
Gap between train/validation: add gap= to avoid leakage from autocorrelation
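The feature recipes in this list can be checked on a toy series where the arithmetic is easy to follow by hand. The sketch below assumes a 14-day series whose value equals the day counter; the column names are illustrative.

```python
import numpy as np
import pandas as pd

# Toy daily series (hypothetical values: y is simply 0, 1, 2, ..., 13).
idx = pd.date_range("2024-01-01", periods=14, freq="D")
df = pd.DataFrame({"y": np.arange(14, dtype=float)}, index=idx)

# Lag feature: yesterday's value.
df["lag_1"] = df["y"].shift(1)

# Rolling mean over the previous 3 days; shift(1) keeps today's y out of it.
df["roll_mean_3"] = df["y"].shift(1).rolling(3).mean()

# Same window without the shift: it includes today's y, which leaks the target.
df["roll_mean_3_leaky"] = df["y"].rolling(3).mean()

# Calendar feature (Monday=0 ... Sunday=6).
df["dayofweek"] = df.index.dayofweek

# Fourier pair for a weekly period, built on a continuous day counter.
t = np.arange(len(df))
df["sin_week_1"] = np.sin(2 * np.pi * t / 7)
df["cos_week_1"] = np.cos(2 * np.pi * t / 7)

# Row for 2024-01-05, where y = 4:
# lag_1 = 3, roll_mean_3 = mean(1, 2, 3) = 2, leaky version = mean(2, 3, 4) = 3.
print(df[["y", "lag_1", "roll_mean_3", "roll_mean_3_leaky"]].iloc[4])
```

The leaky column tracks the target more closely only because the target is inside its own window, which is exactly what you cannot have at prediction time.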
TimeSeriesSplit: Correct Cross-Validation
Fold 1: Train=[t₁…t₃₀₀], Val=[t₃₀₁…t₄₀₀]
Fold 2: Train=[t₁…t₄₀₀], Val=[t₄₀₁…t₅₀₀]
Fold 3: Train=[t₁…t₅₀₀], Val=[t₅₀₁…t₆₀₀]
Training window always ends before validation — no future leakage
Option: gap=k between train end and val start (avoids autocorrelation leakage)
Option: max_train_size=N for rolling window (only last N points in train)
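The fold layout above can be reproduced with scikit-learn's `TimeSeriesSplit`. This sketch assumes 600 observations and `test_size=100`, which is what yields 300-, 400- and 500-point training windows. Indices printed here are 0-based, so t₁ corresponds to index 0.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(600).reshape(-1, 1)  # 600 time-ordered observations

# Expanding window: each fold trains on everything before its validation block.
tscv = TimeSeriesSplit(n_splits=3, test_size=100)
folds = list(tscv.split(X))
for i, (tr, va) in enumerate(folds, start=1):
    print(f"Fold {i}: train=[{tr[0]}..{tr[-1]}], val=[{va[0]}..{va[-1]}]")

# gap=7 drops the 7 points just before each validation block;
# max_train_size=200 keeps only the most recent 200 training points (rolling window).
rolling = TimeSeriesSplit(n_splits=3, test_size=100, gap=7, max_train_size=200)
rolling_folds = list(rolling.split(X))
for i, (tr, va) in enumerate(rolling_folds, start=1):
    print(f"Rolling fold {i}: train=[{tr[0]}..{tr[-1]}], val=[{va[0]}..{va[-1]}]")
```

Without `test_size`, scikit-learn picks the validation length as `n_samples // (n_splits + 1)`, so the windows would be 150 points each on this data rather than the layout shown above.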
Forecasting with sklearn + LightGBM
```python
import pandas as pd
import numpy as np
import lightgbm as lgb
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_error

# ── Sample daily time-series DataFrame ────────────────────────────────
dates = pd.date_range('2022-01-01', periods=365, freq='D')
np.random.seed(42)
trend = np.linspace(100, 200, 365)
seasonal = 20 * np.sin(2 * np.pi * np.arange(365) / 7)  # weekly pattern
noise = np.random.randn(365) * 5
df = pd.DataFrame({'sales': trend + seasonal + noise}, index=dates)

def create_features(df, target_col, lags, rolling_windows):
    """Create lag and rolling features for supervised time series forecasting."""
    df = df.copy()
    for lag in lags:
        df[f'lag_{lag}'] = df[target_col].shift(lag)
    for w in rolling_windows:
        # shift(1) first so each window only sees values before the current row
        df[f'roll_mean_{w}'] = df[target_col].shift(1).rolling(w).mean()
        df[f'roll_std_{w}'] = df[target_col].shift(1).rolling(w).std()
    # Calendar features
    df['dayofweek'] = df.index.dayofweek
    df['month'] = df.index.month
    df['is_weekend'] = (df['dayofweek'] >= 5).astype(int)
    # Fourier seasonality (weekly period = 7 days, harmonics k = 1 and 2).
    # Days since a fixed origin keep the 7-day cycle continuous across years;
    # dayofyear resets every January and would break the phase.
    t = np.asarray((df.index - pd.Timestamp('1970-01-01')).days)
    for k in range(1, 3):
        df[f'sin_week_{k}'] = np.sin(2 * np.pi * k * t / 7)
        df[f'cos_week_{k}'] = np.cos(2 * np.pi * k * t / 7)
    return df.dropna()

df_feat = create_features(df, 'sales',
                          lags=[1, 2, 3, 7, 14, 28],
                          rolling_windows=[7, 14, 28])
X = df_feat.drop('sales', axis=1)
y = df_feat['sales']

# ── TimeSeriesSplit cross-validation ──────────────────────────────────
tscv = TimeSeriesSplit(n_splits=5, gap=7)  # 7-day gap reduces autocorrelation leakage
maes = []
for train_idx, val_idx in tscv.split(X):
    X_tr, X_val = X.iloc[train_idx], X.iloc[val_idx]
    y_tr, y_val = y.iloc[train_idx], y.iloc[val_idx]
    model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05,
                              num_leaves=31, min_child_samples=20)
    # Note: early stopping on the same fold that is scored makes the MAE slightly optimistic
    model.fit(X_tr, y_tr,
              eval_set=[(X_val, y_val)],
              callbacks=[lgb.early_stopping(50, verbose=False)])
    maes.append(mean_absolute_error(y_val, model.predict(X_val)))

print(f"CV MAE: {np.mean(maes):.2f} ± {np.std(maes):.2f}")
```
Time Series Pitfalls
Using a random train/test split on time series data is the #1 mistake: your model trains on data from after the test period, which produces wildly optimistic evaluation. Always use TimeSeriesSplit or a single temporal split where train comes before test. Second: not adding a gap between the train and validation windows. Because of autocorrelation, the last training point and the first validation point are highly correlated, so validation looks easier than forecasting at a realistic horizon. Third: feature leakage. A rolling mean of y computed without shifting includes the current value of y, the very target being predicted, in that row's features. Always shift(1) before rolling.
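The first pitfall is easy to quantify. The sketch below assumes 1,000 time-ordered points and an 80/20 split, and measures how much of the training set lies after the earliest test point under each strategy. The helper name is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                      # time-ordered observations t = 0..999
t = np.arange(n)

# Random 80/20 split: shuffle the time indices, then cut.
perm = rng.permutation(n)
train_rand, test_rand = perm[:800], perm[800:]

# Temporal 80/20 split: the first 80% trains, the last 20% tests.
train_time, test_time = t[:800], t[800:]

def share_of_train_after_first_test(train, test):
    """Fraction of training points that lie after the earliest test point."""
    return float(np.mean(train > test.min()))

leak_rand = share_of_train_after_first_test(train_rand, test_rand)
leak_time = share_of_train_after_first_test(train_time, test_time)
print(f"random split:   {leak_rand:.0%} of training points come after the first test point")
print(f"temporal split: {leak_time:.0%}")
```

With a shuffled split, almost every training point is "from the future" relative to some test point, while the temporal split has none.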
For production forecasting, retrain your model as new data arrives (online learning or periodic retraining). A model that was accurate 6 months ago may no longer be, because the distribution of the time series changes over time (drift).
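Periodic retraining can be sketched as a walk-forward loop. The example below is a minimal illustration under stated assumptions: a small AR(p) model fit by least squares stands in for whatever forecaster you use, and the synthetic series changes its dynamics halfway through to imitate drift. The function names `fit_ar`, `predict_next` and `walk_forward` are hypothetical.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares fit of y_t = c + sum_i phi_i * y_{t-i}; returns [c, phi_1..phi_p]."""
    y = np.asarray(y, dtype=float)
    # The row for target y[t] holds [1, y[t-1], ..., y[t-p]].
    rows = [np.r_[1.0, y[t - p:t][::-1]] for t in range(p, len(y))]
    coef, *_ = np.linalg.lstsq(np.array(rows), y[p:], rcond=None)
    return coef

def predict_next(y, coef):
    """One-step-ahead forecast from the last p observations."""
    p = len(coef) - 1
    return float(coef[0] + coef[1:] @ np.asarray(y[-p:], dtype=float)[::-1])

def walk_forward(y, p, start, retrain_every):
    """Forecast y[start:], refitting on all data seen so far every `retrain_every` steps."""
    preds, coef = [], None
    for t in range(start, len(y)):
        if coef is None or (t - start) % retrain_every == 0:
            coef = fit_ar(y[:t], p)          # uses only observations before t
        preds.append(predict_next(y[:t], coef))
    return np.array(preds)

# Synthetic AR(1) series whose coefficient flips sign at t = 300 (assumed drift).
rng = np.random.default_rng(1)
y = np.zeros(600)
for t in range(1, 600):
    phi = 0.8 if t < 300 else -0.8
    y[t] = phi * y[t - 1] + rng.normal()

frozen = walk_forward(y, p=1, start=300, retrain_every=10**9)  # fit once, never refit
weekly = walk_forward(y, p=1, start=300, retrain_every=7)      # refit every 7 steps
mae_frozen = np.mean(np.abs(y[300:] - frozen))
mae_weekly = np.mean(np.abs(y[300:] - weekly))
print(f"MAE frozen model:    {mae_frozen:.2f}")
print(f"MAE weekly retrain:  {mae_weekly:.2f}")
```

This sketch refits on the full history each time. When drift is strong, a sliding window that discards old observations can adapt faster, at the cost of less training data.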