Food Delivery Time Prediction
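A 16-model regression benchmark for predicting food delivery time. Linear Regression comes out on top (RMSE = 8.76 min, R² = 0.829), ahead of tuned XGBoost (RMSE = 9.19 min). Distance and traffic are the dominant predictors, and interaction features such as distance×traffic let the linear models capture effects that a purely additive model would miss.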
Best RMSE (Linear Reg): 8.76 min
Best R²: 0.829
Tuned XGBoost RMSE: 9.19 min
Models benchmarked: 16
Dataset
1,000 food delivery orders — 9 features
Approach
Interaction feature engineering → 16-model benchmark → RandomizedSearchCV HPO
Tech Stack
Python, XGBoost, LightGBM, CatBoost, scikit-learn
Keywords
Regression, XGBoost, LightGBM, Feature Engineering, Food Delivery, RMSE
Visualizations: 6 charts
Deep Dive
Comprehensive regression benchmark for food delivery time prediction.
Dataset
- ▸1,000 orders: distance, weather, traffic, time of day, vehicle type, preparation time, courier experience
- ▸30 missing values (3%) → mode/median imputation
- ▸Target: Delivery_Time_min
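The imputation step above can be sketched roughly as follows. This is a minimal illustration, assuming the median is used for numeric columns and the mode for categorical ones. Only `Delivery_Time_min` is named in the write-up, so the column names and the tiny example frame are hypothetical.

```python
import numpy as np
import pandas as pd


def impute_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Fill numeric gaps with the column median, categorical gaps with the mode."""
    out = df.copy()
    for col in out.columns:
        if not out[col].isna().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            # mode() can return several values on ties; take the first
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


# Hypothetical example rows (assumed column names, not the project's data)
orders = pd.DataFrame({
    "Distance_km": [2.0, 5.5, 9.1, 4.0],
    "Weather": ["Clear", None, "Rainy", "Clear"],
    "Courier_Experience_yrs": [1.0, np.nan, 3.0, 7.0],
})
clean = impute_missing(orders)
```

In a train/test workflow, the medians and modes would normally be computed on the training split only and then reused on the test split, so that no information leaks from held-out rows.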
Feature Engineering
- ▸Ordinal: traffic (Low→High=0,1,2), time of day (Morning→Night=0,1,2,3)
- ▸One-hot: weather (5 conditions), vehicle (4 types)
- ▸Interaction: distance×traffic_encoded, courier_experience×distance
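One way the three encoding steps above might look in pandas is sketched below. The input column names, the intermediate category labels ("Medium", "Afternoon", "Evening"), and the example rows are assumptions made for illustration. The write-up only gives the endpoints of each ordinal scale.

```python
import pandas as pd

# Ordinal mappings; the middle labels are assumed, only the endpoints are documented
TRAFFIC_ORDER = {"Low": 0, "Medium": 1, "High": 2}
TIME_ORDER = {"Morning": 0, "Afternoon": 1, "Evening": 2, "Night": 3}


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Ordinal encoding for features with a natural order
    out["traffic_encoded"] = out["Traffic_Level"].map(TRAFFIC_ORDER)
    out["time_encoded"] = out["Time_of_Day"].map(TIME_ORDER)
    # One-hot encoding for unordered categories
    out = pd.get_dummies(out, columns=["Weather", "Vehicle_Type"], dtype=int)
    # Interaction terms
    out["distance_x_traffic"] = out["Distance_km"] * out["traffic_encoded"]
    out["experience_x_distance"] = out["Courier_Experience_yrs"] * out["Distance_km"]
    return out.drop(columns=["Traffic_Level", "Time_of_Day"])


# Hypothetical example rows
orders = pd.DataFrame({
    "Distance_km": [2.0, 10.0],
    "Weather": ["Clear", "Rainy"],
    "Traffic_Level": ["Low", "High"],
    "Time_of_Day": ["Morning", "Night"],
    "Vehicle_Type": ["Bike", "Car"],
    "Courier_Experience_yrs": [1.0, 4.0],
})
features = engineer_features(orders)
```

Note that with Low encoded as 0, the distance×traffic term is zero for every low-traffic order, so the plain distance coefficient carries the whole distance effect in that case.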
16-Model Results
| Model | RMSE (min) | R² |
|---|---|---|
| Linear Regression | 8.76 | 0.829 |
| Ridge / Lasso | 8.76 | 0.829 |
| SVR (RBF) | 9.12 | 0.816 |
| Random Forest | 9.09 | 0.817 |
| LightGBM | 9.09 | 0.817 |
| CatBoost | 9.16 | 0.813 |
| XGBoost | 9.20 | 0.811 |
| XGBoost (tuned) | 9.19 | 0.812 |
| Decision Tree | 12.93 | 0.600 |
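A benchmark of this shape can be sketched with scikit-learn alone. The example below is an illustration under stated assumptions, not the project's pipeline: it uses synthetic stand-in data, a reduced set of models, and scikit-learn's `GradientBoostingRegressor` in place of XGBoost for the `RandomizedSearchCV` step. `XGBRegressor` follows the same estimator interface and could be swapped in. The search space is hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (NOT the project's dataset)
rng = np.random.default_rng(0)
n = 400
distance = rng.uniform(0.5, 20.0, n)
traffic = rng.integers(0, 3, n)
prep = rng.uniform(5.0, 30.0, n)
X = np.column_stack([distance, traffic, prep, distance * traffic])
y = 10 + 3 * distance + prep + 0.5 * distance * traffic + rng.normal(0, 5, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


def evaluate(model):
    """Fit on the training split; return (RMSE, R²) on the held-out split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    return rmse, float(r2_score(y_test, pred))


models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}
results = {name: evaluate(model) for name, model in models.items()}

# Randomized hyperparameter search over a small, hypothetical space
search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [2, 3, 4],
        "learning_rate": [0.03, 0.1, 0.3],
    },
    n_iter=5,
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
results["Gradient Boosting (tuned)"] = evaluate(search)

# Print a leaderboard sorted by RMSE (lower is better)
for name, (rmse, r2) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name:28s} RMSE={rmse:6.2f}  R2={r2:.3f}")
```

Scoring every model on the same held-out split keeps the comparison fair. The tuned model is selected by cross-validation on the training split only, so its test score is not used during the search.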
Surprising Finding: Linear Regression wins on this dataset. With the interaction features in place, it captures most of the variance (R² = 0.829). The engineered terms already encode the main non-linear effect, distance scaled by traffic, which leaves little additional structure for the tree models to discover.
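To see why an interaction term helps a linear model, consider a simplified version with only distance $d$ and encoded traffic $t$. The symbols and the reduced feature set are illustrative and do not describe the fitted model:

$$
\hat{y} = \beta_0 + \beta_1 d + \beta_2 t + \beta_3\,(d \cdot t)
\quad\Longrightarrow\quad
\frac{\partial \hat{y}}{\partial d} = \beta_1 + \beta_3 t
$$

The model stays linear in its coefficients, but the minutes added per kilometre now depend on the traffic level. A purely additive model forces that slope to be the same in all traffic conditions.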
Top SHAP Features (XGBoost)
- ▸Distance (km) — direct physical constraint
- ▸Traffic level — multiplier on distance
- ▸Preparation time — delays accumulate
- ▸Courier experience — navigation efficiency
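The ranking above comes from SHAP on the XGBoost model. The idea behind such a ranking, ordering features by mean absolute SHAP value, can be illustrated without the `shap` package on a linear model. There, under a feature-independence assumption, the SHAP value of feature j for a row has the closed form w_j · (x_j − mean(x_j)). The sketch below uses that stand-in technique on synthetic data with hypothetical feature names. It does not reproduce the project's XGBoost analysis.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (NOT the project's dataset); names are hypothetical
rng = np.random.default_rng(0)
n = 400
names = ["distance", "traffic", "prep_time", "experience"]
X = np.column_stack([
    rng.uniform(0.5, 20.0, n),   # distance in km
    rng.integers(0, 3, n),       # ordinal traffic level
    rng.uniform(5.0, 30.0, n),   # preparation time in minutes
    rng.uniform(0.0, 9.0, n),    # courier experience in years
])
y = 10 + 3 * X[:, 0] + 2 * X[:, 1] + X[:, 2] - 0.5 * X[:, 3] + rng.normal(0, 5, n)

model = LinearRegression().fit(X, y)

# Linear SHAP: contribution of each feature relative to its average value
shap_values = model.coef_ * (X - X.mean(axis=0))

# Global importance = mean absolute contribution per feature
mean_abs = np.abs(shap_values).mean(axis=0)
ranking = [names[i] for i in np.argsort(mean_abs)[::-1]]
print(ranking)
```

Each row's contributions sum to the difference between that row's prediction and the average prediction. This is the additivity property that SHAP values for tree models also satisfy.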