Олександр Кузьменко Startup 12 December 2024, 18:26

The startup Twelve Labs is working on AI that can analyze video and find specific moments in it at the user's request. Nvidia, Samsung and Intel are already interested in it

The startup Twelve Labs is building artificial intelligence models that can understand video content as well as they understand text. This lets users search for specific moments in a video, generate summaries of it, or ask questions like «When did the person in the red shirt enter the restaurant?».


Twelve Labs co-founder Jay Lee believes that AI models which understand video as well as text can unlock «powerful new applications», TechCrunch reports. The startup has already secured backing from Nvidia, Samsung and Intel.

«Video is the fastest-growing and most data-intensive medium, but most organizations are not going to dedicate the human resources to review all of their video archives. Even if you try to set the tags manually, it will not solve the problem. Finding a specific moment or angle in a video can be like finding a needle in a haystack,» Lee told TechCrunch. That’s why Twelve Labs trains the model to match text to what’s happening in the video, including actions, objects, and background sounds.

Models like Google’s Gemini can search footage, while Microsoft and Amazon, among others, offer video analytics services to detect objects in clips. But Lee says Twelve Labs' products stand out for their customization capabilities, which allow customers to tailor the models to their own data.

«Companies like OpenAI and Google are investing heavily in general-purpose multimodal models, but these models are not optimized for video. Our difference is that we’ve been focused on video from the beginning… We believe that video deserves to be our main focus; it’s not an add-on,» explained the co-founder of Twelve Labs.

Developers can build applications on top of Twelve Labs models for video search and other tasks. The company’s technology can power processes such as ad insertion, content moderation, and automatic generation of highlights from clips.

One of Twelve Labs' models, Marengo, can search not only video but also images and audio, and can take a specific audio recording, image or video clip as a reference to guide the search.

In addition, the company offers the Embed API for creating multimodal embeddings for video, text, images, and audio files. Embeddings are mathematical representations that capture the meaning of data points and the relationships between them, which makes them useful for applications such as anomaly detection.
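To make the idea of embeddings more concrete, here is a minimal sketch in Python. It does not call the Twelve Labs Embed API. The clip names, the three-dimensional vectors and the threshold are all made up for illustration, and cosine similarity is just one common way to compare embeddings, not necessarily the one Twelve Labs uses. The sketch shows the two uses mentioned above: ranking clips against a query and flagging a clip that does not resemble the rest.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for four video clips (invented numbers;
# real embeddings have hundreds or thousands of dimensions).
clip_embeddings = {
    "kitchen_scene": [0.9, 0.1, 0.0],
    "restaurant_entrance": [0.8, 0.3, 0.1],
    "dining_room": [0.7, 0.2, 0.1],
    "car_chase": [0.0, 0.2, 0.9],
}

def search(query_embedding, embeddings):
    """Return clip names ranked by similarity to the query, best first."""
    return sorted(
        embeddings,
        key=lambda name: cosine_similarity(query_embedding, embeddings[name]),
        reverse=True,
    )

def find_anomalies(embeddings, threshold=0.6):
    """Flag clips whose similarity to the average embedding is below threshold."""
    dim = len(next(iter(embeddings.values())))
    centroid = [
        sum(vec[i] for vec in embeddings.values()) / len(embeddings)
        for i in range(dim)
    ]
    return [
        name
        for name, vec in embeddings.items()
        if cosine_similarity(vec, centroid) < threshold
    ]

# Hypothetical embedding of a text query such as
# "person enters the restaurant" (also invented).
query = [0.85, 0.3, 0.05]
ranked = search(query, clip_embeddings)
outliers = find_anomalies(clip_embeddings)
```

With these toy numbers, the clip closest to the query ranks first, and the one clip that points in a very different direction from the others is flagged as an anomaly. In a real system the vectors would come from an embedding model rather than being typed in by hand.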

Currently, Twelve Labs' two main partners are Databricks and Snowflake, which use its tools in their products. Databricks has developed an integration that allows customers to call Twelve Labs' embedding service from existing data pipelines. Snowflake creates connectors for Twelve Labs models in Cortex AI, its fully managed artificial intelligence service.

«We currently have more than 30,000 developers using our platform, ranging from individuals who are experimenting to large enterprises who are integrating our technology into their workflows,» Lee said.

According to him, over the next few years, the company plans to expand into new and related areas, such as the automotive industry and security.

As a reminder, well-known YouTuber Marques Brownlee recently received early access to OpenAI's Sora video generator and shared his first impressions.


Text: Олександр Кузьменко Source: TechCrunch Tags: ai, twelve labs
