Yingru Li
policy-gradients
Divergence-augmented policy optimization
Stabilizing policy optimization when off-policy data are reused, addressing the data efficiency issue in RL for real-world problems.
Qing Wang*, Yingru Li* (equal), Jiechao Xiong, Tong Zhang