Zhizheng Wu | Speech AI Research

Associate Professor • CUHK-Shenzhen

Zhizheng Wu is an Associate Professor at The Chinese University of Hong Kong, Shenzhen, a Jointly Appointed Professor at Shenzhen Loop Area Institute, Director of the Artificial Intelligence and Robotics Taught Post-Graduate Program, and Deputy Director of the Shenzhen Key Laboratory of Cross-Modal Cognitive Computing.

Research Focus

Speech generation, codec modeling, deepfake detection, and open-source foundation systems.

Open Science

Projects including Merlin, Amphion, and Emilia have been adopted by more than 1,000 organizations.

Recognition

National-level Young Talent, Stanford Top 2% Scientist, multiple best paper awards.

Biography

Academic profile

Professor Wu has been selected as a National-level Young Talent and has been listed in Stanford University's "World's Top 2% Scientists" every year since 2021. He has received multiple Best Paper Awards and has held research and technical leadership roles at Meta, Apple, JD.com, the University of Edinburgh, and Microsoft Research Asia.

He initiated several influential open-source efforts, including Merlin, Amphion, and Emilia. Amphion has repeatedly appeared at the top of GitHub Trending, and Emilia became one of the most liked audio datasets on HuggingFace. His work spans speech synthesis, voice conversion, speech restoration, speaker security, and speech generation at scale.

Latest Updates

Selected news

2025-11-08

Best Poster Award at the 2025 Symposium on Voiceprint Processing Research and Applications (声纹处理研究与应用学术研讨会)

Yuancheng Wang received the Best Poster Award.

2025-11-08

GenSR-Pref accepted to AAAI 2026

GenSR-Pref has been accepted to AAAI 2026. Congratulations to Junan Zhang.

2025-11-06

Invited talk at Huawei Media Tech Summit

Research progress in speech tokenizers and codecs, presented at the Huawei 2025 Media Technology Summit (华为2025媒体技术峰会).

2025-11-01

Invited talk at RTE Conference

Research progress in speech processing technologies, presented at the RTE Conference.

Publications

Selected research output

Full list on Google Scholar

MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer

Yuancheng Wang, et al., Zhizheng Wu • ICLR 2025

A novel zero-shot TTS model utilizing masked generative codec transformers.

Metis: A Foundation Speech Generation Model with Masked Generative Pre-training

Yuancheng Wang, et al., Zhizheng Wu • NeurIPS 2025

Foundational model for speech generation leveraging masked generative pre-training techniques.

AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement

Junan Zhang, et al., Zhizheng Wu • IEEE/ACM TASLP 2025

A unified generative approach for voice enhancement using prompt guidance.

Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation

Haorui He, et al., Zhizheng Wu • IEEE/ACM TASLP 2025

High-quality training data from under-explored in-the-wild sources that capture spontaneous human speech in real-world contexts.

ASVspoof: the Automatic Speaker Verification Spoofing and Countermeasures Challenge

Zhizheng Wu, et al. • IEEE Journal of Selected Topics in Signal Processing, 2017

A benchmark challenge paper defining the standards for spoofing detection in speaker verification.

Professional Path

Work history

Aug 2022 - Present

Associate Professor

The Chinese University of Hong Kong, Shenzhen

2020 - Present

Founding member / Advisor

Sanas

Apr 2019 - Jul 2022

Tech Lead / Research Scientist

Meta Platforms Inc, USA

Feb 2018 - Apr 2019

Engineering Director / Research Scientist

JD.COM Silicon Valley Research Center, USA

May 2016 - Feb 2018

Research Scientist

Apple Inc, USA

May 2014 - May 2016

Research Fellow

University of Edinburgh, UK

Nov 2007 - Jun 2009

Research Intern

Microsoft Research Asia

Honors

Awards & recognition

2025: Huawei Spark Award
2025: Best Paper Finalist, APSIPA ASC
2025: Spotlight Presentation, ICLR
2024: Best Paper Finalist, IEEE SLT
2021 - Present: World Top 2% Scientist, Stanford University
2016: Best Student Paper Award, INTERSPEECH
2015: Top 1 in Blizzard Challenge

Services

Academic leadership

Associate Editor, IEEE Signal Processing Letters
General Chair, IEEE Spoken Language Technology Workshop 2024
Area Chair, INTERSPEECH 2024
Associate Editor, IEEE/ACM TASLP
Committee Member, IEEE Speech and Language Technical Committee

Speaking

Invited talks

Sep 2024

Prospects of AI Voice and Audio Generation Applications

Sep 2024

Zero-Shot Text-to-Speech Synthesis

Aug 2023

Recent Advances in Voice Spoofing Detection

Oct 2022

Detecting manipulated and synthetic audio

Team Culture

Open science, ambitious execution, and useful AI.

Values: be a leader, not a follower; be bold and fight for excellence; seek expertise and experience. The group actively collaborates with academic and industry partners to turn AI systems into practical impact.

Research Group

PhD Student

Aug 2022 - Present

Alumni

Past members

XXX

Alumni (XXX)

Jun 2020 - Sep 2020