Xiao Yang

Principal Research Scientist at Canva

Email / X / LinkedIn / Google Scholar

About

I am a Principal Research Scientist at Canva working on generative AI, specifically efficient image and video generation.

Before that, I was a Senior Staff Manager at ByteDance, leading a team of 10+ people on AIGC foundation model development and applications. Earlier, I worked briefly at Meta on video advertisement products.

I received my Ph.D. from the Department of Computer Science at UNC Chapel Hill, advised by Dr. Marc Niethammer, and my bachelor's degree in Electrical Engineering from Tsinghua University.

My past work spans various aspects of generative models, including:

Efficient models: Helios, FSVideo, SDXL-Lightning, AnimateDiff-Lightning
Foundation models: Seedream, zero-terminal SNR, MVDream
Applications: MagicPose, IP-Prompter, and 20+ high-profile product launches (such as this one and this one).

News

Hiring! I am hiring people in China who are interested in efficient generative models (distillation, quantization, efficient model architectures) and post-training of video generative models (SFT, RL). Please feel free to contact me if you are interested!

Selected Publications

Helios: Real Real-Time Long Video Generation Model. arXiv, 2026.

A 14B real-time long video generation model capable of generating video at 19.5 FPS on a single H100.

Project Paper Code
FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space. arXiv, 2026.

A 28B video generation model achieving high-quality 720P video generation while being 42.3× faster than WAN 14B video models.

Project Paper
IP-Prompter: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting. SIGGRAPH, 2025.

Training-free personalization method achieving state-of-the-art character identity preservation, style consistency, and text alignment.

Project Paper Code
SDXL-Lightning: Progressive Adversarial Diffusion Distillation. arXiv, 2024.

Industry-standard few-step SDXL acceleration method.

Paper Model
AnimateDiff-Lightning: Cross-Model Diffusion Distillation. arXiv, 2024.

SOTA few-step video generation for AnimateDiff; >60 million downloads on Hugging Face.

Paper Model
MVDream: Multi-view Diffusion for 3D Generation. ICLR, 2024.

First method to solve the Janus problem in 3D generation.

Project Paper Code
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion. ICML, 2024.

High-performance pose and expression retargeting.

Project Paper Code
Common Diffusion Noise Schedules and Sample Steps are Flawed. WACV, 2024.

Zero-terminal SNR: a widely used noise schedule fix for diffusion models in image, video, and 3D generation.

Paper
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation. ECCV, 2024.

Training-free image personalization method via multimodal LLM prompting.

Project Paper Code
PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters. CVPR, 2023.

GAN-based single-view-to-3D reconstruction method for anime characters with strong stylization handling capability.

Paper Code
Shifted Diffusion for Text-to-image Generation. CVPR, 2023.

SOTA open-source text-to-image generation model in the DALL-E 2 era.

Paper Code
SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing. CVPR, 2022.

StyleGAN-based model for compositional image generation and editing.

Project Paper Code

Last updated: 2026/03.