Projects

VoiceAgentBench

Created the first multilingual benchmark (6,000+ spoken queries) to evaluate speech-based agents on tool use, multi-turn planning, and adversarial safety across realistic agentic tasks.

GitHub · Hugging Face

Self-Correcting Reinforcement Learning for Physics

Developed a two-stage reinforcement learning framework for physics reasoning: a base policy generates an initial solution, receives step-level error feedback from a verifier model, and produces a revised solution. The policy is trained via KL-regularized reward maximization.

GitHub

Dynamic Multi-Agent RAG System

Built an agentic RAG system for long legal and financial documents with interleaved retrieval and reasoning, enabling fault-tolerant tool execution over multi-hop queries.

GitHub · Blog

LLM Agent for Black-Box Tool Access

Proposed a black-box tool planning framework that uses program synthesis to generate executable tool-call sequences without actual access to the underlying APIs, and built an end-to-end agent pipeline integrating retrieval with structured plan generation.

GitHub · arXiv

HyPost: Joint ASR Post-Processing Framework

Developed a joint ASR post-processing model for generative error correction and inverse text normalization using Llama-2-7B and Mixture of LoRAs.

GitHub