Yingru LI

Ph.D. Candidate

The Chinese University of Hong Kong

About me

I am a final-year Ph.D. candidate at The Chinese University of Hong Kong (CUHK), Shenzhen, China, advised by Zhi-Quan (Tom) Luo. My doctoral research is generously supported by the SRIBD Ph.D. Fellowship, the Presidential Ph.D. Fellowship, and the Tencent AI Ph.D. Fellowship. Previously, I received my bachelor's degree in Computer Science (ACM Honors Program) from Huazhong University of Science and Technology, and I was a visiting research student at Cornell University, working with John E. Hopcroft.

I initiated and organized the reinforcement learning seminar from 2019 to 2023.

I am actively seeking postdoctoral and research positions! See my resumé.

Latest Updates (Follow me on X/Twitter!)

✈️July 2024: HyperAgent was accepted at ICML 2024. I will present it in Vienna in person.

💻July 2024: I will deliver an invited talk at the International Symposium on Mathematical Programming (ISMP) in Montréal. ISMP is the leading triennial conference on mathematical optimization.

💻May 2024: I presented our paper at AISTATS in Valencia, Spain. It offers the first prior-dependent analysis of PSRL under function approximation, which helps explain how integrating prior knowledge, such as historical data or pre-trained models (LLMs), improves the efficiency of RL agents.

💻May 2024: I remotely presented HyperAgent at the ICLR Workshop on Bridging the Gap Between Practice and Theory in Deep Learning in Vienna, Austria. HyperAgent represents a significant step toward aligning theoretical foundations with practical deep RL applications.

💻March 2024: I gave two talks at the INFORMS Optimization Society (IOS) Conference at Rice University: (1) “HyperAgent: A simple, efficient, scalable and provable RL framework for complex environments” and (2) “A Tutorial on Thompson Sampling and Ensemble Sampling”.

🎉January 2024: Our work on HyperAgent received the Best Paper Award at the Third Doctoral and Postdoctoral Daoyuan Academic Forum.

✈️December 2023: NeurIPS, New Orleans. My research addresses efficiency challenges in reinforcement learning (RL), encompassing both the theory of high-dimensional probability and practical applications in deep RL [1]. I developed a novel random projection tool for high-dimensional, sequentially dependent data: a non-trivial martingale extension of the Johnson–Lindenstrauss lemma [2].

Research Highlights

I work on algorithms and theory for interactive agents that efficiently learn and continuously adapt to complex environments. To this end, I use and develop fundamental tools from probability, optimization, game theory, and information theory. See the full publication list in my resumé.
  • I designed “HyperAgent”, which quantifies epistemic uncertainty and provides scalable solutions for continual alignment and decision-making with foundation models; it has been applied to online content moderation with human feedback.
  • I am working on hybrid cloud-device solutions for customer, healthcare, and business operations, leveraging powerful cloud computing services while adding the necessary algorithmic modules on end devices for reliable and safe decision-making.

Selected Publications
(2024). Optimistic Thompson Sampling for No-Regret Learning in Unknown Games.

(2024). Prior-dependent analysis of posterior sampling reinforcement learning with function approximation. The 27th International Conference on Artificial Intelligence and Statistics (AISTATS).

(2024). Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent. The 41st International Conference on Machine Learning (ICML).

(2024). Radar Anti-jamming Strategy Learning via Domain-knowledge Enhanced Online Convex Optimization. 2024 IEEE 13th Sensor Array and Multichannel Signal Processing Workshop (SAM).

(2024). Uncertainty-aware Multi-turn Language Agents for Medical Decision-making.
(2022). HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning. International Conference on Learning Representations (ICLR).

(2019). Divergence-augmented policy optimization. Advances in Neural Information Processing Systems.

(2018). Hidden community detection in social networks. Information Sciences.


You only live once.