Projects
VoiceAgentBench
Created the first multilingual benchmark (6000+ spoken queries) to evaluate speech-based agents on tool use, multi-turn planning, and adversarial safety across realistic agentic tasks.
Self-Correcting Reinforcement Learning for Physics
Developed a two-stage reinforcement learning framework for physics reasoning: a base policy generates an initial solution, receives step-level error feedback from a verifier model, and produces a revised solution, with training via KL-regularized reward maximization.
Dynamic Multi-Agent RAG System
Built an agentic RAG system for long legal and financial documents with interleaved retrieval and reasoning, enabling fault-tolerant tool execution over multi-hop queries.
LLM Agent for Black-Box Tool Access
Proposed a black-box tool planning framework that uses program synthesis to generate executable tool-call sequences without actual API access, and built an end-to-end agent pipeline integrating retrieval with structured plan generation.
HyPost: Joint ASR Post-Processing Framework
Developed a joint ASR post-processing model for generative error correction and inverse text normalization using Llama-2-7B and Mixture of LoRAs.