agent

Proactive Agents for Multi-turn Hospital Outpatient Referral under Uncertainty

Yingru Li, Xuheng Shen, Xiaoxiao Liu, Gehan Hu, Benyou Wang

Uncertainty-Aware Search: Mitigating Test-Time Search Scaling Flaws in LLMs

Yingru Li, Fei Yu*, Benyou Wang

Scalable Exploration via Ensemble++

Yingru Li, Jiawei Xu, Baoxiang Wang, Zhi-Quan Luo

Adaptive Foundation Models for Online Decisions: HyperAgent with Fast Incremental Uncertainty Estimation

We prove HyperAgent closes a theoretical gap in scalable exploration. Further, GPT-HyperAgent addresses risk and efficiency challenges in human-AI interplay for automated content moderation with human feedback.

Yingru Li, Jiawei Xu, Zhi-Quan Luo

Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent

Addresses data and computation efficiency challenges in real-world deployments of RL agents. It achieves significant efficiency gains on deep RL benchmarks as well as theoretical milestones.

Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo

Prior-dependent analysis of posterior sampling reinforcement learning with function approximation

Has implications for how integrating prior knowledge enhances the efficiency of RL agents without extensive online exploration.

Yingru Li, Zhi-Quan Luo

Optimistic Thompson Sampling for No-Regret Learning in Unknown Games

Game-theoretic decision-making in multi-agent systems. I developed an optimistic Thompson sampling (TS) type algorithm that significantly reduces experimental costs in applications such as traffic management and radar communications.

Yingru Li, Liangqi Liu, Wenqiang Pu, Hao Liang, Zhi-Quan Luo

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning

TL;DR: We design a practical randomized exploration method to address the sample efficiency issue in online reinforcement learning.

Ziniu Li, Yingru Li* (corresponding), Yushun Zhang, Tong Zhang, Zhi-Quan Luo

Divergence-augmented policy optimization

Stabilizes policy optimization when off-policy data are reused, addressing the data efficiency issue in RL for real-world problems.

Qing Wang*, Yingru Li* (equal), Jiechao Xiong, Tong Zhang