Zyphra Releases ZONOS2 Voice Clone AI With Japanese Support
ZONOS2 treats Japanese as a top-tier language from launch, a rare priority for a Western-developed voice AI, and its open release could accelerate Japanese-language voice cloning applications.
Reporting from 1 source: GIGAZINE.
Zyphra announced ZONOS2, a real-time voice synthesis AI capable of cloning a person's voice from reference audio. The model supports Japanese as a Tier 1 language alongside English and Chinese, and is released as an open model under the Apache License 2.0. Demo audio includes a cloned voice of President Donald Trump speaking about Evangelion characters.
On June 12, 2026, Zyphra announced ZONOS2, a voice synthesis AI that clones a person's voice from a short audio sample and reads arbitrary text aloud in real time. The model uses a mixture-of-experts architecture with 8 billion total parameters, of which 900 million are active at a time, and achieves four times the real-time throughput of its predecessor, Zonos-v0.1. Training data grew from 600,000 hours to over 2 million hours, which Zyphra says improves robustness against noise and atypical speech patterns.
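To put the reported figures in proportion, the short calculation below works out the share of parameters that are active and the growth in training data. It uses only the numbers quoted above; the variable names are illustrative, and because the training set is described as "over" 2 million hours, the growth factor is a lower bound.

```python
# Figures as reported in the article; variable names are illustrative only.
total_params = 8_000_000_000    # 8 billion total parameters
active_params = 900_000_000     # 900 million active parameters

# Share of the model's parameters that are active at a time.
active_fraction = active_params / total_params

hours_before = 600_000          # training hours cited for the predecessor
hours_after = 2_000_000         # "over 2 million hours", so a lower bound

# Minimum growth factor in training data.
data_growth = hours_after / hours_before

print(f"Active share of parameters: {active_fraction:.2%}")
print(f"Training data growth: at least {data_growth:.1f}x")
```

In other words, roughly one parameter in nine is in use at a time, and the training set is at least about 3.3 times larger than before.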
Japanese is positioned as a Tier 1 language alongside English and Chinese, with performance gains attributed to treating text input as raw UTF-8 data. Zyphra released demo audio featuring a cloned voice of President Donald Trump speaking about Evangelion's Shinji Ikari and Gendo Ikari, and a cloned voice of former President Barack Obama speaking about a Gundam development plan. The model is available on Hugging Face under the Apache License 2.0 and on Zyphra Cloud, which runs on AMD AI chips.
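As a rough illustration of what "raw UTF-8 data" means for Japanese text, the sketch below turns a string into a sequence of byte values and back. It is not Zyphra's preprocessing code, which the article does not describe; the function names `utf8_byte_ids` and `decode_byte_ids` are hypothetical, and the example only shows the general idea of byte-level input.

```python
def utf8_byte_ids(text: str) -> list[int]:
    """Return the text as a list of integers in 0..255, one per UTF-8 byte."""
    return list(text.encode("utf-8"))


def decode_byte_ids(ids: list[int]) -> str:
    """Invert utf8_byte_ids by reassembling the bytes and decoding them."""
    return bytes(ids).decode("utf-8")


# Hypothetical sample input: five hiragana characters.
sample = "こんにちは"
ids = utf8_byte_ids(sample)

print(len(sample), "characters ->", len(ids), "bytes")
print(ids[:3])                      # the three bytes that encode the first character
print(decode_byte_ids(ids) == sample)
```

Each hiragana character occupies three bytes in UTF-8, so the five-character sample becomes 15 byte values. A byte-level input of this kind needs no language-specific vocabulary: any script maps to the same 256 possible values, at the cost of longer sequences for languages such as Japanese.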
Synthesized by Yomimono from the 1 cited source below, including Japanese-language reporting where cited, then editorially reviewed before publishing.