Education
Ph.D., Electrical Engineering
2022 – 2026 (Expected) · University of Southern California · Los Angeles, CA
GPA 4.0 / 4.0 · Awarded the USC Graduate School Fellowship · Reviewer for ICML, ICLR, NeurIPS, AAAI, and CDC.
M.S., Computer Science & M.S., Electrical Engineering
2023 – 2025 · University of Southern California · Los Angeles, CA
GPA 4.0 / 4.0.
B.Tech., Mathematics & Computing
2015 – 2019 · Indian Institute of Technology Guwahati · Guwahati, India
GPA 8.80 / 10.0 · Institute Merit-cum-Means Scholarship for four consecutive years.
Professional Experience
Student Researcher
May 2026 – Present · Google · Sunnyvale, CA
- Research on long-horizon agents.
Applied Scientist Intern
May 2025 – Aug 2025 · Amazon · San Diego, CA
Built an end-to-end, RL-based agentic LLM system for critical fraud detection.
- Designed a reward model and a novel Monte-Carlo Tree Search (MCTS) planner, guided by the reward model, to select top-k specialized LLM agents and generate reasoning that enriches context for multi-agent LLM debate.
- Implemented an RL-based debate moderator that determines optimal debate termination based on query complexity.
- Designed retrieval-based planning at test time from pre-computed trees, cutting latency from ~10 minutes to ~20 seconds.
- Improved AUC-ROC from 0.729 to 0.798 on a real-world dataset.
Research Assistant
Jan 2022 – Present · University of Southern California · with Prof. Rahul Jain & Prof. Ashutosh Nayyar
- On-policy self-distillation for LLMs: proved limitations of existing OPSD methods in guaranteeing monotonic policy improvement and correct credit assignment, and developed a distributional DAgger algorithm with monotonic-improvement and regret guarantees, achieving superior performance on science, coding, and math benchmarks.
- Robust offline imitation learning & foundation models: developed robust offline methods under transition uncertainty — an offline IL algorithm and a foundation model that uses data from a single environment to generalize to diverse, unseen DeepMind Control Suite environments without online interaction.
- Strictly offline imitation learning: developed a normalizing-flow-based IL algorithm that learns optimal policies solely from expert demonstrations, reaching state-of-the-art performance on MuJoCo without environment interaction or auxiliary data.
Research Engineer
Jun 2019 – Dec 2021 · Samsung R&D Institute · 6G Lab · Bangalore, India
- Radio resource scheduler: applied reinforcement learning to radio-resource scheduling under Quality-of-Service and fairness constraints.
- Autoencoder communication: studied autoencoder-based communication over AWGN and Rayleigh-fading channels, and devised an RL-based autoencoder achieving detection accuracy comparable to traditional methods with fewer computations.
Teaching
Reinforcement Learning & Large Language Models
EE599 · Aug – Dec 2025 · Teaching Assistant · University of Southern California
Held office hours to help students with their coursework and projects, and graded project proposals and reports.
Stochastic Systems & Reinforcement Learning
EE556 · Aug – Dec 2023 · Teaching Assistant · University of Southern California
Created homework assignments and solutions, taught discussion sessions, graded exams, and held office hours to help students with coursework and projects.