Profile

About

AI/ML engineering with strong production systems thinking and research depth.

I'm Sardhendu, an ML/AI engineer with over a decade of experience building high-performance ML platforms across computer vision, NLP, multimodal systems, and large language models. My work sits at the intersection of distributed training, GPU-optimized inference, observability, data-centric AI, active learning, and agentic architectures.

Over time I've learned that the hardest problems in AI are rarely about the model itself — they're about the systems around the model.

I've fine-tuned large-scale models including Gemma, LLaMA, and Qwen, and built the MLOps infrastructure around them — governed pipelines and data workflows that compress model iteration cycles from weeks to hours. On the inference side, I've designed distributed serving systems optimized for throughput and cost, delivering 2–3× performance gains and cutting infrastructure costs by roughly half through better batching strategies, memory management, and distributed execution. Across all of it, my focus stays on performance, scalability, economic efficiency, and rapid prototyping.

My approach blends research depth with production pragmatism. I've published work at ICCV and AAMAS, contributed to uncertainty-aware methods based on log-probability features, and designed RAG and vector-search systems that support enterprise-grade deployments.

Today, I'm particularly interested in agentic AI, LLM evaluation frameworks, distributed training systems, and the infrastructure required to make foundation models dependable in production environments.

If it touches GPUs, LLMs, distributed systems, production AI architecture, or data-centric model-training workflows, I'm at home.

I care about systems that are:

  • Scalable
  • Observable
  • Cost-aware
  • Reliable under load
  • Designed for real-world constraints