Lead AI / ML Engineer @ ZetaGlobal

Building ML/AI systems that move from research to production and scale.

Published Research

AI vs. Human Moderators

LLM Performance Predictors

I help teams turn AI ideas into reliable, high-impact products with measurable outcomes.

Distributed training for SLMs, LLMs and LVLMs
High-throughput, cost-efficient inference at scale
Data-centric AI pipelines with production-grade observability
Production-ready agentic AI and evaluation workflows

Flagship Open Source Product

harneXa/nexa-gauge

A graph-based evaluation toolkit for LLM and RAG systems with repeatable quality checks, upfront cost visibility, and clean per-case outputs for analysis.

Graph-native evaluation flow (scan -> claims -> metrics -> eval)
Cost visibility before runtime with estimate-first execution
Cache-aware runs to avoid duplicate spend and recomputation
Coverage across relevance, grounding, redteam, GEval, and reference scoring
Production-friendly CLI for run, estimate, and cache management
Scales with control across utility and metric nodes
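
As an illustration of the estimate-first, cache-aware ideas listed above, here is a minimal sketch in plain Python. It is not the nexa-gauge API: the stage names are borrowed from the flow above, while `ASSUMED_COST`, `cache_key`, `estimate`, and `run` are hypothetical names and the cost figures are made up for the example.

```python
import hashlib
import json

# Stage names borrowed from the flow described above; the real node set
# and API of the toolkit may differ (assumption).
STAGES = ["scan", "claims", "metrics", "eval"]

# Assumed per-stage cost in arbitrary units, purely illustrative.
ASSUMED_COST = {"scan": 0.0, "claims": 2.0, "metrics": 3.0, "eval": 1.0}


def cache_key(stage, case):
    """Build a stable key from the stage name and the case content."""
    payload = json.dumps({"stage": stage, "case": case}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def estimate(cases, cache):
    """Sum the assumed cost of every (stage, case) pair not yet cached."""
    return sum(
        ASSUMED_COST[stage]
        for case in cases
        for stage in STAGES
        if cache_key(stage, case) not in cache
    )


def run(cases, cache):
    """Run stages in order, reusing cached results; return per-case outputs."""
    results = []
    for case in cases:
        outputs = {}
        for stage in STAGES:
            key = cache_key(stage, case)
            if key not in cache:
                # Placeholder for real work (LLM calls, scoring, and so on).
                cache[key] = f"{stage}:{case['id']}"
            outputs[stage] = cache[key]
        results.append(outputs)
    return results


cases = [{"id": "case-1", "text": "..."}, {"id": "case-2", "text": "..."}]
cache = {}
cost_before = estimate(cases, cache)  # full cost, nothing is cached yet
results = run(cases, cache)
cost_after = estimate(cases, cache)   # no remaining cost, all stages cached
```

Estimating against the same cache that the run uses is what keeps the upfront figure honest: a second estimate after a completed run reports no remaining spend.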

BYOM · Ollama support in progress

Next Product

harneXa/nexa-prism

Coming Soon

10+

Years in ML & AI Research and Engineering

2–3x

Inference Performance Gains

~50%

ML Infra Cost Reduction

Published

ICCV and AAMAS

Weeks → Hours

Governed pipelines and data workflows that accelerate model iteration

6–10%

Use-case specific model gains through architecture tradeoffs and training pipeline design

Capabilities

What I Work With

Weighted from public GitHub activity with recency, stars, and topic signals.

Languages & Core

Python

Deep Learning

Transfer Learning

C++

LLMs

Modeling Stack

Ray

Triton

PyTorch

Voxel51

RAG

CUDA

TensorFlow

HuggingFace

Keras

vLLM

Infra & Tools

Docker

WandB

Langfuse

AWS

DataDog

Kubernetes

Airflow

Spark

Core Research

Computer Vision

Agentic AI

LLM Evaluation

Small Language Models

Autonomous Driving

Detection

Reinforcement Learning

Sensor Fusion

OpenAI Gym

Projects

Personal Work

A glimpse into how I spend my personal time building, exploring, and learning.

harneXa/nexa-gauge

Released

A graph-based evaluation toolkit for LLM and RAG systems with repeatable quality checks, upfront cost visibility, caching for reuse, and clean per-case outputs for analysis. Metric support: grounding, relevance, redteam, GEval, and reference-based scoring.

GitHub

Self-Driving Vehicle

Accomplished

Perception and control modules for autonomous vehicles built on the Udacity SDC curriculum. Covers lane detection, traffic sign classification, behavioral cloning, LIDAR/RADAR sensor fusion via Extended Kalman Filter, jerk-minimizing path planning, and a PID controller for steering and throttle.
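
For readers unfamiliar with the last item, a PID controller can be sketched in a few lines. This is a generic textbook version, not the code from the repository: the `PID` class name, the gains, and the sign convention (output opposes the error, as is common for cross-track-error steering) are assumptions made for the example.

```python
class PID:
    """Minimal PID controller; the output opposes the error (assumed convention)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0      # running integral of the error
        self.prev_error = None   # no derivative is available on the first step

    def update(self, error, dt):
        """Return the control signal for the current error and time step dt."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return -(self.kp * error + self.ki * self.integral + self.kd * derivative)


# Illustrative gains only; real gains have to be tuned per vehicle and speed.
pid = PID(kp=0.2, ki=0.1, kd=0.05)
first = pid.update(error=1.0, dt=0.5)
second = pid.update(error=0.5, dt=0.5)
```

The same class can drive steering (error is the cross-track error) or throttle (error is the speed difference), each with its own gains.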

GitHub

Deep Reinforcement Learning

Accomplished

Core Deep RL algorithms implemented across Unity ML-Agents environments. Covers DQN and Double-DQN for discrete action spaces, REINFORCE for Atari Pong, DDPG for continuous robotic arm control, and Multi-Agent DDPG for a collaborative/competitive multi-agent setting.
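
The difference between DQN and Double-DQN comes down to how the bootstrap target is computed. The sketch below uses plain Python lists in place of network outputs; the function names and the numbers are illustrative assumptions, not code from the repository.

```python
def dqn_target(reward, done, gamma, q_target_next):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double-DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


# Made-up Q-values for the next state, one entry per discrete action.
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [0.5, 1.0, 4.0]

y_dqn = dqn_target(1.0, False, 0.9, q_target_next)
y_double = double_dqn_target(1.0, False, 0.9, q_online_next, q_target_next)
```

Here the DQN target follows the largest target-network value, while the Double-DQN target uses the action preferred by the online network, which is the mechanism that reduces overestimation.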

GitHub

Explore All Projects

Writing

Latest Posts

Recent articles on practical AI engineering and production model workflows.

Read All Posts

Latest Posts

Multimodal llama-nemotron-embed-vl-1b-v2 — (Part 3): Multi-Engine BLS Router on Triton

Multimodal llama-nemotron-embed-vl-1b-v2 — (Part 2): Single Fused TensorRT Engine on Triton

Multimodal llama-nemotron-embed-vl-1b-v2 — (Part 1): The Model and Inference Strategies

GPU Utilization & Profiling — (Part 4): Model Architecture and GEMM Shape