Yingru Li
Information Bandwidth in Reinforcement Learning
An information-theoretic analysis explaining why policy gradient learns 1 bit per episode and why LoRA works for RL fine-tuning.
Yingru LI
Oct 1, 2025
16 min read
Research, Theory