Food Delivery Time Prediction

A 16-model regression benchmark. Linear Regression, surprisingly, achieves the best result (RMSE = 8.76 min, R² = 0.829), ahead of tuned XGBoost (RMSE = 9.19 min). Distance and traffic are the dominant predictors. Interaction features (distance × traffic) let linear models capture non-linear effects.

Best RMSE (Linear Regression): 8.76 min
Best R²: 0.829
Tuned XGBoost RMSE: 9.19 min
Models benchmarked: 16

Dataset

1,000 food delivery orders — 9 features

Approach

Interaction feature engineering → 16-model benchmark → RandomizedSearchCV HPO
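
The hyperparameter search step can be sketched as follows. This is a minimal illustration, not the project's actual configuration: it uses scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost so that it runs with scikit-learn alone, and the data, search ranges, and iteration count are all assumptions.

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (NOT the project's dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 4))
y = 20 + 30 * X[:, 0] + 15 * X[:, 0] * X[:, 1] + rng.normal(0, 2, 300)

# Assumed search space; the write-up does not list the ranges that were used.
param_distributions = {
    "n_estimators": randint(50, 200),      # integers in [50, 199]
    "max_depth": randint(2, 6),            # integers in [2, 5]
    "learning_rate": uniform(0.01, 0.2),   # floats in [0.01, 0.21]
    "subsample": uniform(0.6, 0.4),        # floats in [0.6, 1.0]
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),  # stand-in for XGBoost
    param_distributions=param_distributions,
    n_iter=8,                                   # number of sampled configurations
    scoring="neg_root_mean_squared_error",      # scikit-learn maximizes, hence the sign
    cv=3,
    random_state=0,
)
search.fit(X, y)

best_rmse = -search.best_score_  # flip the sign back to a positive RMSE
print(search.best_params_, round(best_rmse, 2))
```

Random search samples a fixed number of configurations rather than enumerating a grid, which keeps the cost predictable when several hyperparameters are tuned at once.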

Tech Stack
Python, XGBoost, LightGBM, CatBoost, Scikit-learn
Keywords
Regression, XGBoost, LightGBM, Feature Engineering, Food Delivery, RMSE
Visualizations: 6 charts
Deep Dive

Comprehensive regression benchmark for food delivery time prediction.

Dataset

  • 1,000 orders: distance, weather, traffic, time of day, vehicle type, preparation time, courier experience
  • 30 missing values (3%) → mode/median imputation
  • Target: Delivery_Time_min
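
A minimal sketch of mode/median imputation as described in the list above, assuming pandas. Apart from Delivery_Time_min, the column names and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical column names; only Delivery_Time_min is named in the write-up.
df = pd.DataFrame({
    "Distance_km": [2.5, 7.1, 4.0, 10.3],
    "Courier_Experience_yrs": [1.0, np.nan, 3.0, 5.0],
    "Weather": ["Clear", "Rainy", None, "Clear"],
    "Traffic_Level": ["Low", "High", "High", None],
    "Delivery_Time_min": [22, 48, 31, 60],
})

def impute(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill numeric gaps with the column median and categorical gaps with the mode."""
    out = frame.copy()
    for col in out.columns:
        if out[col].isna().sum() == 0:
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            # mode() can return several values on ties; take the first one
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out

clean = impute(df)
print(clean)
```

In a train/test workflow, the medians and modes would normally be computed on the training split only and then reused on the test split.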

Feature Engineering

  • Ordinal: traffic (Low→High=0,1,2), time of day (Morning→Night=0,1,2,3)
  • One-hot: weather (5 conditions), vehicle (4 types)
  • Interaction: distance×traffic_encoded, courier_experience×distance
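
The three encoding steps above might look like this in pandas. This is a sketch under assumptions: the raw column names and the intermediate category labels ("Medium", "Afternoon", "Evening") are hypothetical, since the write-up only gives the integer codes and the two interaction terms.

```python
import pandas as pd

# Assumed category labels; the write-up only states Low→High = 0,1,2 and
# Morning→Night = 0,1,2,3.
TRAFFIC_ORDER = {"Low": 0, "Medium": 1, "High": 2}
TIME_ORDER = {"Morning": 0, "Afternoon": 1, "Evening": 2, "Night": 3}

def engineer(frame: pd.DataFrame) -> pd.DataFrame:
    """Apply ordinal encoding, one-hot encoding, and the two interaction terms."""
    out = frame.copy()
    # Ordinal features: preserve the natural ordering as integers
    out["traffic_encoded"] = out["Traffic_Level"].map(TRAFFIC_ORDER)
    out["time_encoded"] = out["Time_of_Day"].map(TIME_ORDER)
    # Nominal features: one indicator column per category
    out = pd.get_dummies(out, columns=["Weather", "Vehicle_Type"], dtype=int)
    # Interaction features: products of the two underlying columns
    out["distance_x_traffic"] = out["Distance_km"] * out["traffic_encoded"]
    out["experience_x_distance"] = out["Courier_Experience_yrs"] * out["Distance_km"]
    return out.drop(columns=["Traffic_Level", "Time_of_Day"])

# Hypothetical example rows
orders = pd.DataFrame({
    "Distance_km": [2.0, 7.0, 4.0],
    "Traffic_Level": ["Low", "High", "Medium"],
    "Time_of_Day": ["Morning", "Evening", "Night"],
    "Weather": ["Clear", "Rainy", "Clear"],
    "Vehicle_Type": ["Bike", "Scooter", "Car"],
    "Courier_Experience_yrs": [1.0, 2.0, 3.0],
})
features = engineer(orders)
print(features.columns.tolist())
```

Note that with Low encoded as 0, the distance × traffic term is zero for low-traffic orders, so it acts as an extra per-kilometre cost that only applies at medium and high traffic.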

16-Model Results

Model                 RMSE (min)   R²
Linear Regression     8.76         0.829
Ridge / Lasso         8.76         0.829
SVR (RBF)             9.12         0.816
Random Forest         9.09         0.817
LightGBM              9.09         0.817
CatBoost              9.16         0.813
XGBoost               9.20         0.811
XGBoost (tuned)       9.19         0.812
Decision Tree         12.93        0.600
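
A benchmark of this kind can be sketched as a loop over models that records RMSE and R² on a held-out split. The example below is illustrative only: it uses synthetic data rather than the project's dataset, covers four scikit-learn models instead of all 16, and the split ratio and model settings are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the project's dataset): delivery time grows with
# distance, and more steeply when traffic is heavier.
rng = np.random.default_rng(0)
n = 1000
distance = rng.uniform(0.5, 20.0, n)
traffic = rng.integers(0, 3, n)
prep = rng.uniform(5, 30, n)
y = 10 + 2.0 * distance + 0.5 * distance * traffic + prep + rng.normal(0, 8, n)
X = np.column_stack([distance, traffic, prep, distance * traffic])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))  # same units as the target
    results[name] = (rmse, float(r2_score(y_test, pred)))

# Report models from lowest to highest RMSE
for name, (rmse, r2) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name:<20} RMSE={rmse:6.2f}  R2={r2:.3f}")
```

Evaluating every model on the same held-out split is what makes the RMSE and R² columns directly comparable across rows.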

Surprising Finding

Linear Regression wins on this dataset. With the interaction features in place, it captures most of the variance (R² = 0.829). The engineered terms already encode the main non-linearities, which leaves little for tree models to discover beyond what is explicitly modeled.
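
To see why product terms let a linear model express a non-linear effect, consider a simplified model with only distance d, encoded traffic t, and courier experience e (the coefficient symbols are illustrative notation, not fitted values from the project):

```latex
\hat{y} = \beta_0 + \beta_d\, d + \beta_t\, t + \beta_e\, e
        + \beta_{dt}\,(d \cdot t) + \beta_{ed}\,(e \cdot d)

\frac{\partial \hat{y}}{\partial d} = \beta_d + \beta_{dt}\, t + \beta_{ed}\, e
```

The model stays linear in its coefficients, so ordinary least squares still applies, yet the marginal cost of an extra kilometre now depends on traffic and courier experience. That is the kind of dependence a tree model would otherwise have to learn through splits.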

Top SHAP Features (XGBoost)

  1. Distance (km) — direct physical constraint
  2. Traffic level — multiplier on distance
  3. Preparation time — delays accumulate
  4. Courier experience — navigation efficiency