Achieving AUC 0.9648 on IEEE-CIS Fraud Detection with LightGBM Stacking
A complete walkthrough of building a stacking ensemble that achieved AUC 0.9648 on the IEEE-CIS fraud dataset — feature engineering, model selection, and meta-learner design.
Deep-dive articles on machine learning, AI engineering, and production ML systems
The 15 feature engineering techniques I use in every Kaggle tabular competition — from target encoding to frequency encoding, lag features, and interaction terms.
How I built a U-Net pipeline for skin lesion segmentation on ISIC 2018 — augmentation strategies, loss functions, and post-processing that pushed Dice from 0.72 to 0.796.
How I deployed Stable Diffusion with ControlNet at Ofoto — architecture decisions, API design, prompt engineering, and handling 100+ concurrent requests.
Everything I've learned fine-tuning BERT across 10+ NLP projects — tokenization, learning rate schedules, layer freezing, and deployment with ONNX.
Architecture and code for a production RAG system — chunking strategies, embedding models, hybrid search, reranking, and hallucination mitigation.
How I built a production WhatsApp AI agent for a Moroccan e-commerce business — architecture, conversation memory, product catalog Q&A, and order tracking.
A practical, benchmark-driven comparison of XGBoost and LightGBM across speed, accuracy, and memory — with concrete recommendations for tabular ML in production.
How CatBoost handles categorical features without data leakage using ordered target encoding — and why this gives it an edge on datasets with many categoricals.
After 20+ imbalanced classification projects — fraud, medical, churn — here is what actually moves the needle: SMOTE, class weights, threshold tuning, and cost-sensitive learning.
How to use Optuna for hyperparameter optimization beyond random search — pruning, multi-objective optimization, and persistent study databases.
A practical guide to SHAP values — global importance, local explanations, waterfall plots, and how to turn model explanations into business insights.
K-Fold, Stratified K-Fold, GroupKFold, TimeSeriesSplit: a practical guide to choosing the right CV strategy based on your data structure.
End-to-end guide to training YOLOv8 on a custom dataset — annotation, training, evaluation, and deploying as a FastAPI endpoint with ONNX export.
A step-by-step implementation of the original "Attention Is All You Need" architecture: multi-head attention, positional encoding, and the encoder-decoder stack.
Mixed precision, gradient checkpointing, DataLoader tuning, torch.compile, and 6 more tricks with measured speedups on real experiments.
A practical guide to Arabic NLP — the best models, preprocessing challenges, dialect handling, and deploying Arabic text classification in production.
Chain-of-thought, few-shot, system prompts, JSON mode, and 5 more patterns with real examples from production LLM applications.
Lead qualification, document processing, social media automation, customer support, and inventory monitoring — real workflows with real ROI.
Orchestrator-worker, peer-to-peer, and hierarchical multi-agent architectures — when to use each, communication patterns, and failure recovery.
How to build a complete MLOps pipeline — data versioning with DVC, experiment tracking with MLflow, model registry, automated retraining, and deployment gates.
Data drift vs concept drift — detection methods, monitoring dashboards with Evidently AI, and automated alerting strategies for production ML systems.
From model pickle to production FastAPI — async inference, input validation with Pydantic, rate limiting, health checks, and Docker deployment.
The exact workflow I follow in every Kaggle competition — EDA, baseline, feature engineering sprints, ensemble building, and the final push before deadline.
When classical time series methods work and when ML wins — feature engineering for time series, backtesting frameworks, and handling seasonality in production.
A complete from-scratch DQN implementation in PyTorch — environment, replay buffer, epsilon-greedy exploration, and the training loop that actually converges.
End-to-end face recognition system — face detection, alignment, embedding extraction with ArcFace, and sub-millisecond search with Faiss.
How to fine-tune EfficientNet for custom image classification — unfreezing schedules, augmentation, label smoothing, and getting the most out of small datasets.
GAN training tricks that prevent mode collapse and training instability — spectral normalization, progressive growing, gradient penalty, and architecture lessons.
From 10 minutes to 30 seconds: downcasting dtypes, vectorization, Dask fallback, and avoiding the most common Pandas performance traps.
How to use PostgreSQL effectively as a feature store — materialized views for aggregations, partitioning for time series, and indexing strategies for ML queries.
How NEAT (NeuroEvolution of Augmenting Topologies) evolves both the weights and topology of neural networks: speciation, crossover, innovation numbers, and implementing it for game AI.
A clear explanation of Monte Carlo Tree Search (MCTS), covering selection, expansion, simulation, and backpropagation, with a Python implementation for 2048 and game tree visualization.
Using genetic algorithms for feature selection, hyperparameter tuning, and scheduling — encoding strategies, selection methods, and convergence analysis.
Setting up Ollama for production use — model selection, API integration, performance tuning, and running Llama 3.1 on-premise for data privacy.
A practical benchmark of the top vector databases — indexing speed, query latency, filtering, scalability, and when to use each for RAG applications.
Best practices for containerizing ML code — multi-stage builds, GPU support, model caching, and the Dockerfile patterns that cut image sizes by 70%.
Why you should wrap everything in an sklearn Pipeline — preventing data leakage, proper cross-validation, easy serialization, and custom transformers.
Benchmarking OpenAI, Cohere, E5, BGE, and Jina embeddings on retrieval tasks — MTEB scores, cost, latency, and multilingual support for Arabic and French.
Using autoencoders for unsupervised anomaly detection — reconstruction error thresholding, LSTM autoencoders for time series, and production deployment.
GPU utilization, bottleneck diagnosis, DataLoader optimization, and CUDA memory management — practical techniques for training 2x faster without new hardware.
Building a production sentiment classifier for Arabic customer reviews — dataset curation, preprocessing challenges, model comparison, and deploying with FastAPI.
Mixup, CutMix, AugMix, synthetic data with GANs, and test-time augmentation — what to use when your dataset is tiny and performance is critical.
A structured approach to ML system design interviews — problem framing, data strategy, modeling choices, serving infrastructure, and monitoring.
Tokenization, normalization, stemming vs lemmatization, subword encoding — and when BERT's tokenizer is better than all of them combined.
Matrix factorization, implicit feedback, and neural collaborative filtering — practical implementation and evaluation with RecSys metrics.
Deploying OpenAI Whisper for multilingual transcription — model selection, performance optimizations, and fine-tuning for Moroccan Darija.
INT8 quantization, structured pruning, and distillation: how to shrink model size by 90% while keeping 95% of the original accuracy for edge deployment.
I build custom ML models, AI agents, computer vision, and automation — from idea to production.