ElevenLabs Releases Dubbing v2 AI Model for Multilingual Audio
Dubbing v2 shifts AI dubbing from text-script-based translation to performance-based generation, which could reduce the cost and complexity of professional multilingual localization for creators and companies.
Reporting from 1 source: GIGAZINE.
ElevenLabs has released Dubbing v2, an AI dubbing model that preserves the original speaker's emotion, tone, tempo, and speaking style in more than 90 languages. The model generates audio directly from the speaker's performance rather than from a text script, aiming to produce natural-sounding dubbing that syncs with the original content.
ElevenLabs, an AI company co-founded by former Google and Palantir employees, has launched Dubbing v2, a model that reproduces the original speaker's emotions, tone, tempo, and speaking style in more than 90 languages. Conventional AI dubbing systems work from text scripts and often lose natural speech patterns; Dubbing v2 instead generates audio directly from the speaker's performance, capturing intonation and rhythm. The company says this approach solves long-standing problems in AI dubbing and makes translated audio sound as if the person were actually speaking the target language. A demonstration video shows YouTuber MrBeast's speech translated into multiple languages, with the character and tempo of his voice closely matched. Dubbing v2 is available on ElevenCreative and ElevenProductions, with API access planned for later. One user noted the price: $22 per month for 9 minutes of dubbing.
Synthesized by Yomimono from the 1 cited source below, including Japanese-language reporting where cited, then editorially reviewed before publishing.