Oleksandr Kuzmenko
28 February 2025, 15:23
AI startup ElevenLabs launches Scribe model that converts voice to text and supports Ukrainian language with "excellent accuracy"
ElevenLabs, an AI startup valued at $3.3 billion whose product was used to dub President Volodymyr Zelenskyy’s interview with US podcaster Lex Fridman, has launched Scribe, a new standalone speech-to-text model. Ukrainian is among the languages it transcribes with the lowest error rates.
As TechCrunch reports, ElevenLabs' Scribe model supports over 99 languages at launch. The company classifies more than 25 of them as having "excellent accuracy", meaning a word error rate below 5%. This list includes English, Ukrainian, French, German, Hindi, Indonesian, Japanese, Polish, Portuguese, Spanish, Vietnamese, and others.
The remaining languages are grouped into three categories by word error rate:
high accuracy: from 5% to 10% of words transcribed incorrectly;
good accuracy: from 10% to 20% of words transcribed incorrectly;
moderate accuracy: from 25% to 50% of words transcribed incorrectly.
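For readers unfamiliar with the metric behind these categories, word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The sketch below illustrates that standard definition only. It is not ElevenLabs' evaluation code, and the function name `word_error_rate` and the plain whitespace tokenization are assumptions made for illustration; real benchmarks usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    # Assumption: words are separated by whitespace; no text normalization is applied.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from transcript
                curr[j - 1] + 1,     # insertion: extra word in transcript
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER: {wer:.1%}")
```

Under this definition, a transcript falls into the "excellent accuracy" category described above when the function returns a value below 0.05, that is, fewer than one word error per 20 reference words.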
The company said the model outperformed Google Gemini 2.0 Flash and Whisper Large V3 on the FLEURS and Common Voice benchmarks across multiple languages.
ElevenLabs developed a speech-to-text component for its AI conversational agent platform, which was released last year, but this is the first time the company has released a separate speech recognition model.
"We want to better understand what you’re saying in a conversation. We’re working to move beyond just generating content and into understanding and transcribing speech. Many people say that converting speech to text is a solved problem. But for many languages, it’s very bad. We believe we can build better speech recognition models because we have internal teams that annotate the data and give us quick feedback," said CEO Mati Staniszewski.
Screenshot from the Ukrainian language page in Scribe
The model also features speaker diarization to tell the user who is speaking, word-level timestamps for accurate captioning, and automatic tagging of audio events such as audience laughter. The startup also lets customers transcribe video content directly in its studio to produce subtitles or captions.
Currently, Scribe only works with pre-recorded audio, which means it is not yet effective for transcribing live meetings or taking voice notes. The company says it will soon release a low-latency, real-time version of the model.
Scribe costs $0.40 per hour of transcribed audio. While that price is competitive, some of its competitors offer lower prices for audio transcription with some feature differentiation, TechCrunch notes.
In 2023, ElevenLabs, which is building a universal AI dubbing tool, added support for more than 20 languages. Among them were Ukrainian, Polish, Hindi, Portuguese, Spanish, Japanese, and Arabic.
In late January 2025, ElevenLabs raised $180 million in a new funding round and tripled its valuation to $3.3 billion. The Series C funding round was co-led by Andreessen Horowitz and Iconiq Growth with additional new investors NEA, World Innovation Lab, Valor, Endeavor Catalyst Fund, and Lunate.