  1. Two-Stage Constrained Actor-Critic for Short Video Recommendation
    Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, and others
    In Proceedings of the ACM Web Conference 2023, 2023
  2. PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-Term User Engagement
    Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, and Bo An
    In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023
  3. Guarded Policy Optimization with Imperfect Online Demonstrations
    Zhenghai Xue, Zhenghao Peng, Quanyi Li, Zhihan Liu, and Bolei Zhou
    In The Eleventh International Conference on Learning Representations, 2023
  4. AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement
    Zhenghai Xue, Qingpeng Cai, Tianyou Zuo, Bin Yang, Lantao Hu, Peng Jiang, Kun Gai, and Bo An
    arXiv preprint arXiv:2310.03984, 2023
  5. A Large Language Model Enhanced Conversational Recommender System
    Yue Feng, Shuchang Liu, Zhenghai Xue, Qingpeng Cai, Lantao Hu, Peng Jiang, Kun Gai, and Fei Sun
    arXiv preprint arXiv:2308.06212, 2023
  6. State Regularized Policy Optimization on Data with Dynamics Shift
    Zhenghai Xue, Qingpeng Cai, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, and Bo An
    Advances in Neural Information Processing Systems, 2023


  1. MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning
    Quanyi Li, Zhenghao Peng, Lan Feng, Qihang Zhang, Zhenghai Xue, and Bolei Zhou
    IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022


  1. Regret Minimization Experience Replay in Off-Policy Reinforcement Learning
    Xu-Hui Liu*, Zhenghai Xue*, Jingcheng Pang, Shengyi Jiang, Feng Xu, and Yang Yu
    Advances in Neural Information Processing Systems, 2021