Publications

2025

Policy Optimization under Imperfect Human Interactions with Agent-Gated Shared Autonomy

Zhenghai Xue, Bo An, and Shuicheng Yan

In The Thirteenth International Conference on Learning Representations, 2025

HTML

AURO: Reinforcement Learning for Adaptive User Retention Optimization in Recommender Systems

Zhenghai Xue, Qingpeng Cai, Tianyou Zuo, Bin Yang, Lantao Hu, Peng Jiang, Kun Gai, and Bo An

In Proceedings of the ACM Web Conference (Oral), 2025

arXiv

AgentStudio: A Toolkit for Building General Virtual Agents

Longtao Zheng, Zhiyuan Huang, Zhenghai Xue, Xinrun Wang, Bo An, and Shuicheng Yan

In The Thirteenth International Conference on Learning Representations, 2025

arXiv HTML Code

2024

Modeling User Retention through Generative Flow Networks

Ziru Liu, Shuchang Liu, Bin Yang, Zhenghai Xue, Qingpeng Cai, Xiangyu Zhao, Zijian Zhang, Lantao Hu, Han Li, and Peng Jiang

In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

arXiv Code

S2AC: Energy-Based Reinforcement Learning with Stein Soft Actor Critic

Safa Messaoud, Billel Mokeddem, Zhenghai Xue, Linsey Pang, Bo An, Haipeng Chen, and Sanjay Chawla

In The Twelfth International Conference on Learning Representations, 2024

arXiv Code

2023

Two-Stage Constrained Actor-Critic for Short Video Recommendation

Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, et al.

In Proceedings of the ACM Web Conference 2023, 2023

arXiv Code

PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-Term User Engagement

Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, and Bo An

In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

arXiv

A Large Language Model Enhanced Conversational Recommender System

Yue Feng, Shuchang Liu, Zhenghai Xue, Qingpeng Cai, Lantao Hu, Peng Jiang, Kun Gai, and Fei Sun

arXiv preprint arXiv:2308.06212, 2023

arXiv

State Regularized Policy Optimization on Data with Dynamics Shift

Zhenghai Xue, Qingpeng Cai, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, and Bo An

Advances in Neural Information Processing Systems, 2023

arXiv HTML Code

Guarded Policy Optimization with Imperfect Online Demonstrations

Zhenghai Xue, Zhenghao Peng, Quanyi Li, Zhihan Liu, and Bolei Zhou

In The Eleventh International Conference on Learning Representations (Spotlight), 2023

arXiv HTML Code

2022

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

Quanyi Li, Zhenghao Peng, Lan Feng, Qihang Zhang, Zhenghai Xue, and Bolei Zhou

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022

arXiv HTML Video

2021

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

Xu-Hui Liu*, Zhenghai Xue*, Jingcheng Pang, Shengyi Jiang, Feng Xu, and Yang Yu

Advances in Neural Information Processing Systems, 2021

arXiv Code Video