Lü Shuai, 吕帅

Current Master's Students


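Li Songlin (李松霖), male, born December 2000, from Changchun, Jilin Province.

【Academic Papers】 1 paper published in domestic and international journals and conferences; 8 papers under review.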

  1. Li Songlin, Wu Hao, Chen Huangyang, Zhou Wenbo*, Li Jingyao*. Anchor-based perturbation-driven exploration for offline-to-online reinforcement learning. 2025. (Submitted)
  2. Xiao Wei, Li Songlin, An Daolong, Wu Hao, Zhang Xiaodan, Lü Shuai*. Corrected critic and adaptive constraint for offline-to-online reinforcement learning. 2025. (Submitted)
  3. Li Songlin, Xiao Wei, Wu Hao, Zhang Xiaodan, An Daolong, Lü Shuai*. State proficiency-based adaptive fine-tuning for offline-to-online reinforcement learning. 2025. (Submitted)
  4. Wu Hao, Li Songlin, Xiao Wei, Zhong Taihong, Lü Shuai*. Offline-to-online reinforcement learning with triple-intensity policy constraints. 2025. (Submitted)
  5. An Daolong, Shen Chun, Li Songlin, Xiao Wei, Lü Shuai*, Zhou Wenbo*. Result constraint behavior clone for offline reinforcement learning. 2025. (Submitted)
  6. Lin Dajun, Li Songlin, Lü Shuai*, Zhou Wenbo*, Zhong Taihong, An Daolong. WCPC-TD3: Weighted contrastive policy constraint for offline reinforcement learning. 2025. (Submitted)
  7. Zhou Ruikai, Li Songlin, Lü Shuai*. From simple to complex: Mitigating the impact of critic accuracy fluctuations by multi-agent reinforcement learning. 2025. (Submitted)
  8. Zhou Ruikai, Zhong Taihong, Li Songlin, Lü Shuai*. A Kullback-Leibler divergence perspective on policy gradient methods in reinforcement learning. 2025. (Submitted)
  9. Shu Man, Lü Shuai*, Gong Xiaoyu, An Daolong, Li Songlin. Episodic memory-double actor-critic twin delayed deep deterministic policy gradient. Neural Networks, 2025, 187: 107286. (CAS Zone 2 TOP journal, CCF-recommended Class B journal, SCI, current IF: 6.3)

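【Honors and Awards】

【Contact】


Yuan Jianhui (袁健会), male, born June 1999, from Changchun, Jilin Province.

【Academic Papers】 1 paper published in domestic and international journals and conferences; 3 papers under review.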

  1. Tan Lei, Guo Dong, Fang Wensi, Li Guixiang, Yuan Jianhui, Zhang Xiaodan, Lü Shuai*. Divide and correct: Alternating normalization and prototype alignment for continual test-time adaptation. 2025. (Submitted)
  2. Yuan Jianhui, Zhang Xinyu, Li Guixiang, Tan Lei, Li Jingyao*, Zhou Wenbo*. GRACE: Enhancing source-free universal domain adaptation via gradient-aware contrastive learning and entropy-aware alignment. 2025. (Submitted)
  3. Lü Shuai, Yuan Jianhui, Zhang Xinyu, Zhang Shaojie, Fang Wensi, Li Jingyao*. Pre-trained initialization and memory-enhanced correction for source-free universal domain adaptation. 2025. (Submitted)
  4. Li Zhuang, Yuan Jianhui, Li Guixiang, Wang Hao, Li Xingcan, Li Dan, Wang Xinhua*. RSI-YOLO: Object detection method for remote sensing images based on improved YOLO. Sensors, 2023, 23: 6414. (CAS Zone 2 journal, SCI, IF: 3.4)

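【Honors and Awards】

【Contact】


Xiao Wei (肖威), male, born November 2001, from Heze, Shandong Province.

【Academic Papers】 0 papers published in domestic and international journals and conferences; 6 papers under review.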

  1. Xiao Wei, Zhang Tao, Chen Huangyang, Li Jingyao*, Zhou Wenbo*. Q-bounded and adaptive Q-value constraints for offline-to-online reinforcement learning. 2025. (Submitted)
  2. Xiao Wei, Li Songlin, An Daolong, Wu Hao, Zhang Xiaodan, Lü Shuai*. Corrected critic and adaptive constraint for offline-to-online reinforcement learning. 2025. (Submitted)
  3. Li Songlin, Xiao Wei, Wu Hao, Zhang Xiaodan, An Daolong, Lü Shuai*. State proficiency-based adaptive fine-tuning for offline-to-online reinforcement learning. 2025. (Submitted)
  4. Wu Hao, Li Songlin, Xiao Wei, Zhong Taihong, Lü Shuai*. Offline-to-online reinforcement learning with triple-intensity policy constraints. 2025. (Submitted)
  5. An Daolong, Shen Chun, Li Songlin, Xiao Wei, Lü Shuai*, Zhou Wenbo*. Result constraint behavior clone for offline reinforcement learning. 2025. (Submitted)
  6. Zhu Wenbo, Xiao Wei, Lü Shuai*. Soft-penalty guided exploration in reinforcement learning. 2025. (Submitted)

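【Honors and Awards】

【Contact】


Li Guixiang (李贵祥), male, born April 2003, from Liaocheng, Shandong Province.

【Academic Papers】 2 papers published in domestic and international journals and conferences; 2 papers under review.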

  1. Tan Lei, Guo Dong, Fang Wensi, Li Guixiang, Yuan Jianhui, Zhang Xiaodan, Lü Shuai*. Divide and correct: Alternating normalization and prototype alignment for continual test-time adaptation. 2025. (Submitted)
  2. Yuan Jianhui, Zhang Xinyu, Li Guixiang, Tan Lei, Li Jingyao*, Zhou Wenbo*. GRACE: Enhancing source-free universal domain adaptation via gradient-aware contrastive learning and entropy-aware alignment. 2025. (Submitted)
  3. Li Zhuang, Li Guixiang, Song Xiangyang, Wang Xinhua*. EVD-YOLO: An efficient and dynamic framework for multi-scale target detection of underwater organisms. Journal of Ocean University of China, 2025. (CAS Zone 2 journal, SCI, current IF: 1.2)
  4. Li Zhuang, Yuan Jianhui, Li Guixiang, Wang Hao, Li Xingcan, Li Dan, Wang Xinhua*. RSI-YOLO: Object detection method for remote sensing images based on improved YOLO. Sensors, 2023, 23: 6414. (CAS Zone 2 journal, SCI, IF: 3.4)

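【Honors and Awards】

【Contact】


Wu Hao (吴昊), male, born February 2002, from Ergun, Inner Mongolia Autonomous Region.

【Academic Papers】 0 papers published in domestic and international journals and conferences; 7 papers under review.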

  1. Liu Xuejie, Zhang Shunhao, Wu Hao, Hou Zhibin. Non-parametric behavior policy density estimation for offline reinforcement learning. 2025. (Submitted)
  2. Wu Hao, Zhang Shunhao, Chen Huangyang, Zhang Tao, Zhou Wenbo*, Li Jingyao*. UDPBC: Uncertainty-guided dual-perspective behavior cloning for offline-to-online reinforcement learning. 2025. (Submitted)
  3. Li Songlin, Wu Hao, Chen Huangyang, Zhou Wenbo*, Li Jingyao*. Anchor-based perturbation-driven exploration for offline-to-online reinforcement learning. 2025. (Submitted)
  4. Xiao Wei, Li Songlin, An Daolong, Wu Hao, Zhang Xiaodan, Lü Shuai*. Corrected critic and adaptive constraint for offline-to-online reinforcement learning. 2025. (Submitted)
  5. Li Songlin, Xiao Wei, Wu Hao, Zhang Xiaodan, An Daolong, Lü Shuai*. State proficiency-based adaptive fine-tuning for offline-to-online reinforcement learning. 2025. (Submitted)
  6. Wu Hao, Li Songlin, Xiao Wei, Zhong Taihong, Lü Shuai*. Offline-to-online reinforcement learning with triple-intensity policy constraints. 2025. (Submitted)
  7. Zhu Sheng, Wu Hao, Shen Chun, Zhu Wenbo, Han Shuai, Lü Shuai*. Actor-critic of multi-agent collaboration on single-agent task. 2025. (Submitted)

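【Honors and Awards】

【Contact】


Sun Genghao (孙耕浩), male, born July 2001, from Dezhou, Shandong Province.

【Academic Papers】 0 papers published in domestic and international journals and conferences; 1 paper under review.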

  1. Chen Huangyang, Chen Juan, Zhang Tao, Sun Genghao, Lü Shuai*. Reward shaping based on trajectory quality for offline and hybrid reinforcement learning. 2025. (Submitted)

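【Honors and Awards】

【Contact】


Zhang Xiaodan (章晓丹), female, born January 2002, from Weihai, Shandong Province.

【Academic Papers】 1 paper published in domestic and international journals and conferences; 4 papers under review.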

  1. Tan Lei, Guo Dong, Fang Wensi, Li Guixiang, Yuan Jianhui, Zhang Xiaodan, Lü Shuai*. Divide and correct: Alternating normalization and prototype alignment for continual test-time adaptation. 2025. (Submitted)
  2. Zhang Xiaodan, Fang Wensi, Tan Lei, Lü Shuai*. AutoVote: Adaptive learning rate modulation for continual test-time adaptation via sensitivity voting. 2025. (Submitted)
  3. Xiao Wei, Li Songlin, An Daolong, Wu Hao, Zhang Xiaodan, Lü Shuai*. Corrected critic and adaptive constraint for offline-to-online reinforcement learning. 2025. (Submitted)
  4. Li Songlin, Xiao Wei, Wu Hao, Zhang Xiaodan, An Daolong, Lü Shuai*. State proficiency-based adaptive fine-tuning for offline-to-online reinforcement learning. 2025. (Submitted)
  5. Xiong Xi, Shen Chun, Wu Junhong, Lü Shuai*, Zhang Xiaodan. Combined data augmentation framework for generalizing deep reinforcement learning from pixels. Expert Systems with Applications, 2025, 264: 125810. (CAS Zone 1 TOP journal, CCF-recommended Class C journal, SCI, current IF: 7.5)

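【Honors and Awards】

【Contact】


Chen Huangyang (陈黄洋), male, born August 2002, from Zhangzhou, Fujian Province.

【Academic Papers】 0 papers published in domestic and international journals and conferences; 4 papers under review.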

  1. Wu Hao, Zhang Shunhao, Chen Huangyang, Zhang Tao, Zhou Wenbo*, Li Jingyao*. UDPBC: Uncertainty-guided dual-perspective behavior cloning for offline-to-online reinforcement learning. 2025. (Submitted)
  2. Xiao Wei, Zhang Tao, Chen Huangyang, Li Jingyao*, Zhou Wenbo*. Q-bounded and adaptive Q-value constraints for offline-to-online reinforcement learning. 2025. (Submitted)
  3. Li Songlin, Wu Hao, Chen Huangyang, Zhou Wenbo*, Li Jingyao*. Anchor-based perturbation-driven exploration for offline-to-online reinforcement learning. 2025. (Submitted)
  4. Chen Huangyang, Chen Juan, Zhang Tao, Sun Genghao, Lü Shuai*. Reward shaping based on trajectory quality for offline and hybrid reinforcement learning. 2025. (Submitted)

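【Honors and Awards】

【Contact】


Zhang Tao (张涛), male, born October 2002, from Puyang, Henan Province.

【Academic Papers】 0 papers published in domestic and international journals and conferences; 3 papers under review.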

  1. Wu Hao, Zhang Shunhao, Chen Huangyang, Zhang Tao, Zhou Wenbo*, Li Jingyao*. UDPBC: Uncertainty-guided dual-perspective behavior cloning for offline-to-online reinforcement learning. 2025. (Submitted)
  2. Xiao Wei, Zhang Tao, Chen Huangyang, Li Jingyao*, Zhou Wenbo*. Q-bounded and adaptive Q-value constraints for offline-to-online reinforcement learning. 2025. (Submitted)
  3. Chen Huangyang, Chen Juan, Zhang Tao, Sun Genghao, Lü Shuai*. Reward shaping based on trajectory quality for offline and hybrid reinforcement learning. 2025. (Submitted)

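【Honors and Awards】

【Contact】


Tan Lei (檀磊), male, born November 2000, from Anqing, Anhui Province.

【Academic Papers and Invention Patents】 1 paper published in domestic and international journals and conferences; 3 papers under review; 1 invention patent application filed (currently under substantive examination).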

  1. Tan Lei, Guo Dong, Fang Wensi, Li Guixiang, Yuan Jianhui, Zhang Xiaodan, Lü Shuai*. Divide and correct: Alternating normalization and prototype alignment for continual test-time adaptation. 2025. (Submitted)
  2. Zhang Xiaodan, Fang Wensi, Tan Lei, Lü Shuai*. AutoVote: Adaptive learning rate modulation for continual test-time adaptation via sensitivity voting. 2025. (Submitted)
  3. Yuan Jianhui, Zhang Xinyu, Li Guixiang, Tan Lei, Li Jingyao*, Zhou Wenbo*. GRACE: Enhancing source-free universal domain adaptation via gradient-aware contrastive learning and entropy-aware alignment. 2025. (Submitted)
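  4. Ma Huimin*, Tan Lei, Zhang Jinghui, Zhang Pengfei, Ning Xiaomei, Liu Haiqiu, Gao Yanwei. A review of co-phasing error detection for synthetic aperture imaging systems based on deep learning. Chinese Journal of Quantum Electronics (量子电子学报), 2022, 39(6): 927-941. (The first author is the supervisor)
  5. Tan Lei, Ma Huimin, Wang Xiaoshen, Dai Mingyu, Dai Tenghui, Jiao Jun, Liu Qian, Gu Lichuan. Atmospheric turbulence image restoration method and system based on multi-scale generative adversarial networks. (Application No.: CN2023 1 1725750.0, filing date: 2023.12.14, currently under substantive examination)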

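【Honors and Awards】

【Contact】


Hou Zhibin (侯志斌), male, born August 1999, from Heze, Shandong Province.

【Academic Papers】 0 papers published in domestic and international journals and conferences; 1 paper under review.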

  1. Liu Xuejie, Zhang Shunhao, Wu Hao, Hou Zhibin. Non-parametric behavior policy density estimation for offline reinforcement learning. 2025. (Submitted)

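【Honors and Awards】

【Contact】


Zhang Shunhao (张顺浩), male, born March 2002, from Jinan, Shandong Province.

【Academic Papers】 0 papers published in domestic and international journals and conferences; 2 papers under review.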

  1. Liu Xuejie, Zhang Shunhao, Wu Hao, Hou Zhibin. Non-parametric behavior policy density estimation for offline reinforcement learning. 2025. (Submitted)
  2. Wu Hao, Zhang Shunhao, Chen Huangyang, Zhang Tao, Zhou Wenbo*, Li Jingyao*. UDPBC: Uncertainty-guided dual-perspective behavior cloning for offline-to-online reinforcement learning. 2025. (Submitted)

【Honors and Awards】

【Contact】


Gong Jincheng (巩锦程), male, born September 2002, from Zibo, Shandong Province.

【Honors and Awards】

【Contact】


Zhen Dejie (甄德杰), male, born May 2003, from Xingtai, Hebei Province.

【Honors and Awards】

【Contact】


Zhong Jinyun (钟金运), male, born June 2003, from Ruijin, Jiangxi Province.

【Honors and Awards】

【Contact】


Chang Yu (常钰), female, born January 2003, from Dalian, Liaoning Province.

【Honors and Awards】

【Contact】


Jiang Wenkang (姜文康), male, born July 2003, from Dezhou, Shandong Province.

【Honors and Awards】

【Contact】


Huang Huimin (黄会敏), male, born March 2004, from Rizhao, Shandong Province.

【Honors and Awards】

【Contact】


Qiu Tian (邱天), female, born July 2004, from Qiqihar, Heilongjiang Province.

【Honors and Awards】

【Contact】


Cui Yongquan (崔永权), male, born September 2004, from Pingdingshan, Henan Province.

【Honors and Awards】

【Contact】