Yingru Li
Reinforcement Learning
Information Bandwidth in Reinforcement Learning
An information-theoretic analysis showing that scalar advantage formulations learn ≤ log₂(B) bits per episode, while per-timestep advantages preserve full reward entropy.
Yingru LI
Oct 1, 2025
16 min read
Research, Theory
When Speed Kills Stability: Demystifying RL Collapse from the Training-Inference Mismatch
The relentless push for faster inference creates a dangerous training-inference mismatch that silently kills RL with LLMs. We reveal the vicious cycle—particularly acute in reasoning and agentic RL—and show that sequence-level importance sampling is the principled solution.
Jiacai Liu, Yingru LI, Yuqian Fu, Jiawei Wang, Qian Liu, Yu Shen
Sep 17, 2025
1 min read
Research, Theory
HyperAgent - A Simple, Efficient, Scalable and Provable RL Framework
Practically and provably efficient RL under resource constraints!
Mar 23, 2024 1:30 PM
Rice University
Yingru LI
Slides
Video
HyperAgent - A Simple, Efficient and Scalable RL Framework for Complex Environments
Practically and provably efficient RL under resource constraints!
Jan 13, 2024 1:20 PM
Daoyuan Building
Yingru LI
Slides
News
Towards AGI for Humanity through Efficient Reinforcement Learning
Addressing the efficiency challenge in RL via the HyperFQI algorithm
Oct 21, 2023 2:30 PM
Teaching B Building
Yingru LI
Slides
HyperDQN - Randomized Exploration for Deep Reinforcement Learning
Dec 14, 2021 12:00 AM
NeurIPS 2021
Yingru LI
Slides
Video