Yang Sheng

I am a Research Engineer at Z.ai, working on post-training methods for Vision-Language Models (VLMs), with a focus on reward modeling, reinforcement learning, and multimodal reasoning. I am a core contributor to GLM-4.1V, GLM-4.5V, GLM-4.6V, GLM-5V-Turbo, and GLM-OCR. My research interests include scalable VLM post-training, model alignment, and robust multimodal understanding. I received my Master’s degree from Tsinghua University and my Bachelor’s degree from Huazhong University of Science and Technology (HUST).

selected publications

  1. GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
    V Team, Wenyi Hong, Xiaotao Gu, and 15 more authors
    arXiv preprint arXiv:2604.26752, 2026
  2. GLM-OCR Technical Report
    Shuaiqi Duan, Yadong Xue, Weihan Wang, and 20 more authors
    arXiv preprint arXiv:2603.10910, 2026
  3. GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
    Wenyi Hong, Wenmeng Yu, Xiaotao Gu, and 8 more authors
    arXiv preprint arXiv:2507.01006, 2025
  4. Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers
    Sheng Yang, Jiawang Bai, Kuofeng Gao, and 3 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
  5. Backdoor Defense via Suppressing Model Shortcuts
    Sheng Yang, Yiming Li, Yong Jiang, and 1 more author
    In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023