
Naoufal Mahfoudi
Data Scientist / ML Engineer
Currently ML Engineer at MAYFAIR VILLAGE, improving LLM-based systems for chemistry R&D. I specialize in work at the interface of advanced data science and business strategy, with expertise in LLMs, deep learning, and digital transformation.
Technical Expertise
A comprehensive overview of my technical skills, current proficiency levels, and continuous learning objectives in the field of AI and data science.
Python & Data Science Stack
Production‑grade Python workflows from research prototypes to cloud deployment, spanning classical ML, deep learning and big‑data processing.
Expertise in the full Python data‑science ecosystem: Pandas & PySpark for data engineering, Matplotlib / Seaborn / Plotly for insight‑rich visualisation, and Power BI for business analytics dashboards. Built sub‑metre Wi‑Fi indoor‑positioning and computer‑vision systems (PhD & Post‑doc) using PyTorch and Keras. On AWS, processed millions of streaming logs to deliver a churn‑prediction model with 85 % recall (Docker, GitHub Actions). Published 10 peer‑reviewed papers and mentored 7 graduate interns.
Finish Microsoft Power BI Data Analyst certification, deepen causal‑inference feature engineering for customer‑lifecycle modelling, and explore multimodal sensor‑fusion techniques for localisation.
LLMs & Generative AI
Secure, enterprise‑ready LLM applications with a focus on scientific discovery and R&D acceleration.
Machine Learning Engineer Intern at Mayfair Village, contributing to the enhancement of CHEMYLANE’s “Deep Report” feature using an agentic LLM framework built with LangChain, LangGraph, and the OpenAI API. The mission focuses on improving knowledge synthesis for chemistry R&D. Currently implementing robust guardrail layers—including prompt filtering, input detection, and contextual control—to ensure secure and reliable outputs. Comfortable working with Hugging Face Transformers for model fine-tuning and developing retrieval-augmented generation (RAG) prototypes.
Domain‑specific LLM fine‑tuning, advanced multi‑agent orchestration patterns, and retrieval‑augmented generation over enterprise knowledge bases.
API Development & DevOps
Reliable, scalable APIs for ML model serving and industrial integrations, backed by modern DevOps pipelines.
Production‑grade backend development with FastAPI, Pydantic and async programming, containerised via Docker and deployed on Linux servers. CI/CD with GitHub Actions for automated testing & release. Currently working on ML micro‑services that power CHEMYLANE Deep Report. Solid Git flow and code‑quality practices.
Designing micro‑service meshes with API Gateways, OAuth2/OpenID Connect hardening, and Kubernetes‑based scaling for high‑throughput scientific workloads.
Cloud & MLOps
From experimentation to reproducible, cloud‑native deployment with robust observability and governance.
End‑to‑end ML lifecycle management on AWS: data ingestion with PySpark on EMR, experiment tracking with Weights & Biases, pipeline orchestration in Prefect, and CI/CD through GitHub Actions. Delivered a production churn‑prediction service (85 % recall). Currently preparing for the Azure AI & Data Fundamentals certifications to broaden multi‑cloud proficiency.
Kubernetes for autoscaling, Terraform/Pulumi IaC, advanced Azure ML services, and real‑time model‑drift monitoring.