Yingru LI

Ph.D. Candidate

The Chinese University of Hong Kong

About me

I am completing my PhD degree in Computer Science at The Chinese University of Hong Kong (CUHK) under the supervision of Prof. Zhi-Quan (Tom) Luo, with expected graduation in March 2025. Previously, I received my M.S. & B.E. degrees with honors from Huazhong University of Science & Technology, conducted research at Cornell University with Prof. John E. Hopcroft, and gained industry experience at Microsoft Research and Tencent AI & Robotics X.

Research Vision

I develop intelligent agents that reliably interact with complex, uncertain, human-in-the-loop environments. Through advances in uncertainty quantification, reinforcement learning (RL), and LLM reasoning & planning, I bridge foundational theory with scalable algorithms for trustworthy decision-making in critical domains under data scarcity.

Recent Honors

Presidential PhD Fellowship
SRIBD PhD Fellowship (Gold Class)
Tencent AI PhD Fellowship
Best Paper Award, 2024 Daoyuan Academic Forum
Best Student Paper Award, 2024 IEEE SAM

🎯 I am currently on the job market. View my CV/resumé. 📢 For latest updates, follow me on Twitter/X.

Research Highlights

My research is dedicated to developing trustworthy AI agents that operate reliably in complex, uncertain, and dynamic environments involving human interaction. By advancing fundamental theory in uncertainty quantification, exploration strategies, and LLM reasoning and decision-making, I design scalable algorithms that enhance the trustworthiness of AI agents. This work bridges foundational theory with practical applications across reinforcement learning (RL) and large language models (LLMs), contributing to both theoretical advancements and impactful real-world solutions.

My contributions have led to significant advancements in reinforcement learning, language model reasoning, and human-AI interaction. Recognized at premier venues such as ICML, NeurIPS, ICLR, AISTATS, ISMP, and INFORMS, my work has also received honors like the 2024 Daoyuan Forum Best Paper and the 2024 IEEE SAM Best Student Paper Award.

Click to view the short research statement

Key Contributions

1. Advancements in Uncertainty Representation and Exploration

To enhance the reliability and robustness of AI agents, I advanced methods for representing and handling uncertainty:

Ensemble Sampling Theory:
- Developed a rigorous analysis establishing ensemble sampling as a scalable approximation of Thompson sampling.
- Addressed critical challenges in exploration efficiency and scalability in reinforcement learning.
- Impact on Trustworthiness: Improves agents’ ability to make reliable decisions under uncertainty, enhancing their robustness.
Neural Ensemble++ Architecture:
- Designed to overcome challenges in ensemble scalability and mitigate coupling in shared-layer architectures.
- Enhanced uncertainty quantification, facilitating more effective exploration in deep RL models.
- Impact on Trustworthiness: Provides accurate uncertainty estimates, leading to more trustworthy decision-making.

2. Data-Efficient Reinforcement Learning and Multi-Agent Strategic Learning

Efficient learning under data scarcity is crucial for trustworthy AI agents operating in real-world environments:

HyperAgent Framework:
- Created HyperAgent to quantify and resolve epistemic uncertainty in optimal value estimation ($Q^\star$) based on Ensemble++.
- Achieved 10x improvement in data and computational efficiency on deep RL benchmarks such as Atari.
- Impact on Trustworthiness: Ensures consistent performance even with limited data, enhancing reliability.
Memoire Framework and Divergence-Augmented Policy Optimization:
- Developed a distributed deep RL system utilizing prioritized sampling, achieving 10x throughput.
- Introduced methods to stabilize policy optimization with off-policy data reuse.
- Impact on Trustworthiness: Enhances stability and robustness in learning, crucial for dependable AI agents.
Frameworks for Unknown Repeated Games:
- Integrated structure-aware modeling with no-regret learning.
- Enabled efficient collaboration and competition in domains such as traffic routing and radar sensing.
- Impact on Trustworthiness: Facilitates predictable and fair agent behavior in multi-agent settings.

3. Enhancing Language-Based Reasoning, Decision-Making, and Human-AI Interaction

Transparency and explainability are key aspects of trustworthiness, addressed through advancements in language-based reasoning:

Uncertainty-Guided Search Strategies in LLMs:
- Applied Ensemble++ & Thompson sampling to improve complex multi-step reasoning in LLMs.
- Achieved significant advancements in solving intricate mathematical problems.
- Impact on Trustworthiness: Improves explainability and confidence in AI reasoning processes.
Hospital Referral Agent:
- Implemented a conversational agent serving 16 hospitals nationwide.
- Designed a multi-agent role-play framework to generate synthetic data, addressing privacy concerns.
- Impact on Trustworthiness: Enhances reliability and efficiency in critical medical decision-making.
Human-AI Collaboration Frameworks:
- Explored methods to enhance the interplay between humans and AI systems.
- Effectively handled uncertainty and prioritized human oversight where necessary.
- Impact on Trustworthiness: Promotes ethical operation and respect for human judgment, essential for user trust.

This research collectively enhances the trustworthiness of AI agents by addressing fundamental challenges and providing scalable, reliable solutions applicable to real-world scenarios.

Yingru Li, Xuheng Shen, Xiaoxiao Liu, Gehan Hu, Benyou Wang

December, 2024

Proactive Agents for Multi-turn Hospital Outpatient Referral under Uncertainty

Yingru Li, Fei Yu*, Benyou Wang

November, 2024

Uncertainty-Aware Search: Mitigating Test-Time Search Scaling Flaws in LLMs

Yingru Li, Jiawei Xu, Baoxiang Wang, Zhi-Quan Luo

October, 2024

Scalable Exploration via Ensemble++

Yingru Li, Jiawei Xu, Zhi-Quan Luo

July, 2024 Preprint. Presentation at ICML 2024 Workshops: (1) “Aligning Reinforcement Learning Experimentalists and Theorists”; (2) “Automated Reinforcement Learning: Exploring Meta-Learning, AutoML, and LLMs”

Adaptive Foundation Models for Online Decisions: HyperAgent with Fast Incremental Uncertainty Estimation

We prove HyperAgent closes a theoretical gap in scalable exploration. Further, GPT-HyperAgent addresses risk and efficiency challenges in human-Al interplay for automated content moderation with human feedback.

Yingru Li, Jiawei Xu, Lei Han, Zhiquan Luo

July, 2024 International Conference on Machine Learning (ICML)

Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent

Addressing data and computation efficiency challenges in real-world deployments of RL Agents. It achieves significant efficiency gains in deep RL benchmarks as well as theoretical milestones.

Yingru Li, Zhi-Quan Luo

May, 2024 International Conference on Artificial Intelligence and Statistics (AISTATS)

Prior-dependent analysis of posterior sampling reinforcement learning with function approximation

Has implications on how the integration of prior knowledge enhances the efficiency of RL agents without extensive online exploration.

Yingru Li, Liangqi Liu, Wenqiang Pu, Hao Liang, Zhi-Quan Luo

February, 2024 Preprint. Presentation at ICML 2023 Workshop “The Many Facets of Preference-Based Learning”

Optimistic Thompson Sampling for No-Regret Learning in Unknown Games

Game-theoretic decision-making in multi-agent systems. I developed optimistic TS type algorithm that significantly reduce experimental costs in applications such as traffic management and radar communications.

Yingru Li

February, 2024 Preprint. Presentation at ICML 2024 Workshop “High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning”

Probability Tools for Sequential Random Projection

First probabilistic framework for sequential random projection, an approach rooted in the challenges of sequential decision-making under uncertainty; A non-trivial martingale extension of Johnson-Lindenstrauss (JL) to sequentially adaptive data processes.

Yingru Li

February, 2024 Preprint. Presentation at ICML 2024 Workshop “High-dimensional Learning Dynamics 2024: The Emergence of Structure and Reasoning”

Simple, unified analysis of Johnson-Lindenstrauss with applications

TL;DR: We simplify and unify various constructions of the Johnson-Lindenstrauss (JL) lemma, including spherical and sub-Gaussian models, and provide the first rigorous proof for spherical construction’s effectiveness. Our work extends the Hanson-Wright inequality and solidifies the JL lemma’s theoretical foundation, enhancing its practical applications in computational algorithms.

Ziniu Li, Yingru Li* (corresponding), Yushun Zhang, Tong Zhang, Zhi-Quan Luo

April, 2022 International Conference on Learning Representations (ICLR)

HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning

TL;DR: We design a practical randomized exploration method to address the sample efficiency issue in online reinforcement learning.

Qing Wang*, Yingru Li* (equal), Jiechao Xiong, Tong Zhang

December, 2019 Advances in Neural Information Processing Systems (NeurIPS)

Divergence-augmented policy optimization

Stabilizing policy optimization when off-policy data are reused, addressing the data efficiency issue in RL for real-world problems.