Олександр Кузьменко, AI, 26 March 2025, 14:05

OpenAI has trained the GPT-4o model to generate images better than DALL-E 3. This update will be available to all users soon.

OpenAI CEO Sam Altman has unveiled a major update to ChatGPT’s image generation capabilities. Now, the AI chatbot can use OpenAI’s GPT-4o model to create and modify images and photos. What are the advantages and disadvantages of image generation in GPT-4o?

As TechCrunch reports, the GPT-4o model has long been at the heart of the AI chatbot platform, but until now, the model could only generate and edit text, not images.

When generating images, GPT-4o «thinks» a little longer than the DALL-E 3 model it effectively replaces, but in return it can create more accurate and detailed images, OpenAI says. GPT-4o can also edit existing images, including images of people or animals, transforming them or filling in («inpainting») details such as foreground and background objects.

One advantage of GPT-4o is that it depicts characters and objects consistently and can carry them across different versions of an image as the user's prompts change. OpenAI also emphasizes how accurately GPT-4o renders text in the images it generates.

«Because GPT-4o now has built-in image generation, you can refine images using natural conversation. GPT-4o can draw on images and text in the context of the chat, ensuring consistency across the board. For example, if you’re creating a character for a video game, the character’s appearance remains consistent across multiple iterations as you refine and experiment,» OpenAI says.

The company notes that GPT-4o can analyze and learn from user-uploaded images, integrating their details into its context when creating new images. The OpenAI blog demonstrated this with a photo of a cat, to which a user added details and a video game interface using GPT-4o.

OpenAI also highlights GPT-4o’s ability to generate realistic images.

At the same time, the company listed the known shortcomings of image generation in GPT-4o:

GPT-4o can sometimes crop longer images, such as posters, especially at the bottom.
Like other AI models, GPT-4o can «hallucinate» when generating images (for example, when creating a world map), especially with short prompts that give minimal detail.
When creating images that rely on its knowledge base, GPT-4o may have difficulty accurately representing more than 10 to 20 different concepts at once, such as the complete periodic table.
The model sometimes has problems rendering languages that do not use the Latin alphabet. In such cases, characters may be inaccurate or «hallucinated», especially when the request is complex.
GPT-4o has trouble correcting errors in generated text: other parts of the image may change as well.
The model has difficulty rendering detailed information at very small sizes and has problems plotting graphs.

«Our model is not perfect. We are currently aware of numerous limitations that we will try to address by improving the model after the initial launch,» OpenAI says.

Altman said native GPT-4o image generation is already available in ChatGPT and in Sora, OpenAI’s AI video creation product, for subscribers to the $200-per-month Pro plan. OpenAI said the feature will soon be available to ChatGPT Plus subscribers and free users as well.

To power the new image generation feature, OpenAI said it trained GPT-4o on «publicly available data» as well as proprietary data from its partnerships with companies like Shutterstock. Content creators who don’t want OpenAI to use their images can submit an opt-out request via a dedicated form.
