Fine-Tuning BERT for Production NLP: A Battle-Tested Guide
Everything I've learned fine-tuning BERT across 10+ NLP projects — tokenization, learning rate schedules, layer freezing, and deployment with ONNX.
Deep-dive articles on machine learning, AI engineering, and production ML systems
A practical guide to Arabic NLP — the best models, preprocessing challenges, dialect handling, and deploying Arabic text classification in production.
Benchmarking OpenAI, Cohere, E5, BGE, and Jina embeddings on retrieval tasks — MTEB scores, cost, latency, and multilingual support for Arabic and French.
Building a production sentiment classifier for Arabic customer reviews — dataset curation, preprocessing challenges, model comparison, and deploying with FastAPI.
Tokenization, normalization, stemming vs. lemmatization, and subword encoding, plus when BERT's tokenizer is better than all of them combined.
Deploying OpenAI Whisper for multilingual transcription — model selection, performance optimizations, and fine-tuning for Moroccan Darija.
I build custom ML models, AI agents, computer vision systems, and automation pipelines, from idea to production.