Rakesh Kariya

Software Developer

Applied AI engineer at Votal.ai — building agentic systems, deploying models on GPU infrastructure (cloud and on-prem), and developing LLM security tooling: AI red teaming, guardrails, and fine-tuning.

Electronics engineer by training. Previously built a logistics platform from scratch at Shipyaari and an EMS for 5G vRAN at Tidalwave, and wore the PM hat at Thinkly.

I like to build things.

Things I built

vllm-serving-lab

A hands-on benchmarking lab that quantifies the real performance impact of key LLM-serving optimizations — continuous batching, paged KV cache, prefix caching, and FP8/INT8 quantization. Drives a vLLM server through each configuration with an async load harness and produces concrete before/after comparisons of p95 latency and throughput (up to 2.64× speedup on Llama-3.1-8B).
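For a rough idea of what an async load harness of this kind measures, here is a minimal sketch, not the lab's actual code. It caps in-flight requests with a semaphore, records per-request latency, and reports p95 latency and throughput. The names `run_load`, `percentile`, and `fake_send` are hypothetical, and `fake_send` only sleeps; a real run would replace it with an HTTP call to the server under test.

```python
import asyncio
import math
import time
from typing import Awaitable, Callable, List, Tuple


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


async def run_load(
    send: Callable[[int], Awaitable[None]],
    num_requests: int,
    concurrency: int,
) -> Tuple[List[float], float]:
    """Issue num_requests calls with at most `concurrency` in flight.

    Returns (per-request latencies in seconds, throughput in requests/second).
    """
    sem = asyncio.Semaphore(concurrency)
    latencies: List[float] = []

    async def one(i: int) -> None:
        async with sem:
            start = time.perf_counter()
            await send(i)
            latencies.append(time.perf_counter() - start)

    wall_start = time.perf_counter()
    await asyncio.gather(*(one(i) for i in range(num_requests)))
    wall = time.perf_counter() - wall_start
    return latencies, num_requests / wall


async def fake_send(i: int) -> None:
    # Hypothetical stand-in for a request to the server being benchmarked:
    # it just sleeps for 1 to 5 ms depending on the request index.
    await asyncio.sleep(0.001 * (1 + i % 5))


if __name__ == "__main__":
    lat, rps = asyncio.run(run_load(fake_send, num_requests=50, concurrency=8))
    print(f"p95 latency: {percentile(lat, 95) * 1000:.2f} ms, throughput: {rps:.1f} req/s")
```

Running the same harness against two server configurations and dividing the baseline p95 (or the optimized throughput) by its counterpart gives the kind of before/after ratio quoted above.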

claude-skill-voiceprint

A Claude Code skill that rewrites emails using three composite writing archetypes — Decisive Operator, Structured Proposer, and Warm Connector. Built on top of Anthropic's Claude API; style exemplars are derived from the MTSlive/musk-v-altman-exhibits dataset on Hugging Face (CC-BY-4.0). Paste a draft; Claude picks the right archetype automatically or you name one. Deployed as a drop-in skill — two curl commands, no config.