About

I am a Ph.D. candidate in Electrical & Computer Engineering at the University of Southern California, where I completed master's degrees in both Computer Science and Electrical Engineering in May 2025. My research, advised by Prof. Rahul Jain and Prof. Ashutosh Nayyar, centers on reinforcement learning (RL), with a particular focus on offline and robust imitation learning, behavior foundation models, and LLM post-training.

In May 2026, I joined Google as a Student Researcher. In summer 2025, I was an Applied Scientist intern at Amazon, working on reinforcement learning for agentic AI systems. Earlier, as a Research Engineer at Samsung Research, I applied deep RL to network resource problems in the 6G Lab. As an undergraduate, I worked with Prof. Sriparna Bandopadhyay at IIT Guwahati on data augmentation with r-cyclic matrices.

Before that, I spent summer 2018 with Prof. Richard James at the University of Minnesota modeling light-induced phase transitions, and summer 2017 with Prof. Frank Chung-Hoon Rhee at Hanyang University, estimating the fuzzifier parameter for alpha-planes of general type-2 fuzzy sets.

Reinforcement Learning · Imitation Learning · Behavior Foundation Models · Agentic AI · Large Language Models · Post-training
Selected Research
DistIL: distributional DAgger with future-aware credit assignment
NEW ICML 2026 · RLxF Workshop · arXiv

Reinforcement Learning from Rich Feedback with Distributional DAgger

Rishabh Agrawal, Jacob Fein-Ashley, Paria Rashidinejad

Abstract

The dominant RL-from-verifiable-rewards recipe rewards each response with a single correctness bit, yet many settings provide far richer feedback — execution traces, tool outputs, expert corrections, self-evaluations. We study how to use such feedback through DistIL, a distributional variant of DAgger that optimizes a forward cross-entropy objective. Unlike reverse-KL or Jensen–Shannon self-distillation, DistIL guarantees monotonic policy improvement and sublinear regret, performs future-aware credit assignment, and improves Pass@N across scientific reasoning, coding, and hard mathematics.

RBFM: robust task inference under dynamics shift
NEW arXiv · 2026

When Dynamics Shift, Robust Task Inference Wins: Offline Imitation Learning with Behavior Foundation Models Revisited

Rishabh Agrawal, Rahul Jain, Ashutosh Nayyar

Abstract

Behavior Foundation Models (BFMs) enable scalable imitation learning but assume fixed dynamics, leaving them brittle to real-world shifts in friction, actuation, or sensor noise. We recast BFM task inference as a robust minimax problem and introduce RBFM-Light and RBFM-Heavy — two variants that add robustness only at inference, with no change to pretraining and using offline data from a single nominal environment. Both substantially outperform standard BFM and robust offline IL baselines under dynamics shifts.

BE-DROIL method overview
ORAL NeurIPS 2025 E-SARS · L4DC 2026

Balance Equation-based Distributionally Robust Offline Imitation Learning

Rishabh Agrawal, Yusuf Alvi, Rahul Jain, Ashutosh Nayyar

Abstract

Standard imitation learning implicitly assumes the environment stays fixed between training and deployment — an assumption that rarely holds. We learn robust policies from expert demonstrations alone by solving a distributionally robust optimization over an uncertainty set of transition models, and show the worst-case objective can be rewritten entirely in terms of the nominal data distribution, enabling tractable offline learning with stronger robustness under shifted dynamics.

AAAI 2025

Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning

Rishabh Agrawal, Nathan Dahlin, Rahul Jain, Ashutosh Nayyar

Abstract

We study imitation in a strictly offline setting — no environment interaction, no auxiliary data, no transition model. Our method uses the Markov balance equation with a conditional density estimation framework, employing conditional normalizing flows for dynamics, and consistently outperforms many state-of-the-art IL algorithms across Classic Control and MuJoCo.

News
  1. Jun 2026
  2. Jun 2026
  3. May 2026
    Started as a Student Researcher at Google.
  4. May 2026
  5. Jan 2026
    Attended the AAAI 2026 Doctoral Consortium in Singapore to present my thesis research.
  6. Jan 2026
    BE-DROIL accepted at L4DC 2026.
  7. Nov 2025
    BE-DROIL accepted at the NeurIPS 2025 E-SARS Workshop for an oral presentation.
  8. Nov 2025
    Passed my Ph.D. Qualifying Exam — officially a Ph.D. Candidate.
  9. Nov 2025
    Selected for the AAAI 2026 Doctoral Consortium in Singapore.
  10. Aug 2025
    Wrapped up my Applied Scientist internship at Amazon (RL for agentic AI systems).
  11. Jun 2025
    Presented CKIL at L4DC 2025 in Ann Arbor, Michigan.
  12. May 2025
    Started as an Applied Scientist Intern at Amazon.
  13. May 2025
    Awarded two M.S. degrees — Computer Science and Electrical Engineering — at USC.
  14. Apr 2025
    Gave a talk on offline imitation learning at the 45th SoCal Control Workshop, UC San Diego.
  15. Feb 2025
  16. Feb 2025
    Presented MBIL at AAAI 2025 in Philadelphia.
  17. Dec 2024
    Markov Balance Satisfaction accepted at AAAI 2025.
  18. Dec 2024
    Presented Policy Optimization for Strictly Batch IL at NeurIPS 2024, Vancouver.
  19. Nov 2024
    Received the Outstanding Poster Award at USC's 14th Annual Research Festival.
  20. Sep 2024
    Policy Optimization for Strictly Batch IL accepted at OPT-ML, NeurIPS 2024.
  21. Jan 2024
    Awarded the Graduate School Fellowship by USC.
  22. Aug 2023
    Began serving as Teaching Assistant for EE556: Stochastic Systems & Reinforcement Learning.
  23. Aug 2023
    CKIL preprint released on arXiv.
  24. Dec 2022
    Patent on radio-resource scheduling granted.
  25. Jan 2022
    Started my Ph.D. at USC with a broad focus on reinforcement learning.
  26. Aug 2020
  27. Sep 2019
    Presented CoPASample at LOD 2019 in Siena, Italy.
  28. Jun 2019
    Joined Samsung Research as a Research Engineer in the 6G Lab.
  29. May 2019
    Graduated from IIT Guwahati with a B.Tech in Mathematics & Computing.
  30. Mar 2018
  31. May 2018
    Summer research at the University of Minnesota, Twin Cities.
  32. May 2017
    Summer research at Hanyang University, South Korea.
  33. Jul 2015
    Began undergraduate studies at IIT Guwahati (Mathematics, CS & Financial Engineering).
Contact

I'm always glad to talk research and open to new collaborations. The quickest way to reach me is by email; feel free to say hello.

Email
Office
335 Hughes Aircraft Electrical Engineering Center
3740 McClintock Ave, Los Angeles, CA 90089