Ігор Вишневський AI Eng
16 December 2025, 14:25
Gemini can now translate speech live, directly into your headphones, in more than 70 languages. How it works and which products offer the technology
The updated version of Gemini 2.5 Flash Native Audio is capable of real-time speech translation right in your headphones.
According to the Google blog, the technology is available in Google AI Studio and Vertex AI, and is also being introduced in Gemini Live and Search Live.
“For two-way conversations, Gemini’s live speech translation feature translates between two languages in real time, automatically switching the output language depending on who’s speaking. For example, if you’re speaking English and want to chat with a Hindi speaker, you’ll hear the English translation in real time in your headphones, while your phone will translate the Hindi after you finish speaking,” the Google blog says, explaining how the technology works.
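The switching behavior described in the quote can be sketched as a small routing rule. This is only an illustration of our reading of the quote, not Google's implementation: the `Route` type, the `route_utterance` function, and the output labels `"headphones"` and `"phone_speaker"` are all hypothetical names, and language detection is assumed to have already happened.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    target_language: str  # language the utterance is translated into
    output: str           # hypothetical label: "headphones" or "phone_speaker"


def route_utterance(spoken_language: str, my_language: str, partner_language: str) -> Route:
    """Pick the translation direction and output device for one utterance.

    Assumption: the partner's speech is translated into the user's language
    and played in the headphones; the user's speech is translated into the
    partner's language and played by the phone.
    """
    if spoken_language == partner_language:
        return Route(target_language=my_language, output="headphones")
    if spoken_language == my_language:
        return Route(target_language=partner_language, output="phone_speaker")
    raise ValueError(f"{spoken_language!r} is not one of the two session languages")


# The example from the quote: an English speaker talking with a Hindi speaker.
print(route_utterance("hi", my_language="en", partner_language="hi"))
print(route_utterance("en", my_language="en", partner_language="hi"))
```

The point of the sketch is that the user never picks a direction by hand: the language of each utterance alone decides where the translation goes.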
At the same time, the technology preserves the intonation, tempo, and pitch of the speaker’s voice.
The beta version is rolling out in the Google Translate app starting today.
“Starting today, you can try out a new beta version of the Google Translate app for real-time translation on your headphones by connecting them to your device and tapping ‘Real-time translation.’ This feature is available for all Android devices in the US, Mexico, and India, with support for iOS and other regions coming soon,” the company said.
The company also reports a significant improvement in the quality of multi-turn conversations: Gemini 2.5 Flash Native Audio retrieves context from earlier turns of a conversation more effectively, which makes the exchange more coherent.
The technology can translate speech across more than 70 languages and 2,000 language pairs.
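For a sense of scale, the two figures can be compared with a little arithmetic. The sketch below assumes exactly 70 languages (the article says more than 70) and counts pairs two ways; how Google counts a "language pair" is not stated, so this is only a rough sanity check.

```python
from math import comb

languages = 70  # assumption: the article only says "over 70"

# English/Hindi counted once, regardless of direction.
unordered_pairs = comb(languages, 2)

# English->Hindi and Hindi->English counted separately.
directional_pairs = languages * (languages - 1)

print(unordered_pairs)    # 2415
print(directional_pairs)  # 4830
```

Both counts are above 2,000, which suggests the quoted figure is either rounded or does not cover every possible combination; the article does not say which.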
It can also understand several languages within a single session, which helps you follow multilingual conversations without changing any language settings.
This relies on automatic language detection, so the user doesn’t even need to know which language is being spoken to start translating.
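To make the idea of automatic detection concrete, here is a toy sketch that guesses the writing script of each utterance in a mixed session. It is a deliberately simple stand-in, not how Gemini detects language: it works on text rather than audio, and a script is not a language (English and Spanish both use Latin letters). The function names `dominant_script` and `needs_translation` are hypothetical.

```python
import unicodedata
from collections import Counter


def dominant_script(text: str) -> str:
    """Return the most common Unicode script prefix among the letters in text."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            # Unicode character names begin with the script,
            # e.g. "DEVANAGARI LETTER NA" or "LATIN SMALL LETTER A".
            counts[unicodedata.name(ch, "UNKNOWN").split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


def needs_translation(text: str, listener_script: str) -> bool:
    """True when the utterance is in a script other than the listener's."""
    return dominant_script(text) not in (listener_script, "UNKNOWN")


# One session with three languages; nothing is configured per utterance.
session = ["Hello there", "नमस्ते", "Привіт"]
for utterance in session:
    print(utterance, dominant_script(utterance), needs_translation(utterance, "LATIN"))
```

As in the feature described above, the listener sets nothing in advance: each utterance is classified on arrival, and only the ones that differ from the listener's own language are passed on for translation.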