OpenAI has introduced a new image generation model called ChatGPT Images 2.0, an update to its tool for creating images from user prompts. The updated model can generate multiple images from a single request and can also generate text in various languages, including Chinese and Hindi. It is rolling out globally to both ChatGPT and Codex users, with a premium version accessible to subscribers.
The introduction of a new image model often sparks increased interest in AI tools, especially on social media where trends can quickly gain traction. For instance, Google’s earlier release of its Nano Banana model saw users create and share hyperrealistic images online. Similarly, ChatGPT Images previously garnered attention for generating AI caricatures.
What’s Different?
ChatGPT Images 2.0 uses enhanced reasoning capabilities that allow it to browse the internet for recent data and generate multiple images simultaneously. The model also produces more granular, detailed images. For example, it successfully created an infographic with an accurate weather forecast for San Francisco, including recognizable landmarks like the Ferry Building and the Transamerica Pyramid.
Another significant improvement is the model’s ability to adjust the aspect ratio of images, so outputs can range from wide to tall formats depending on user preference.
First Impressions
Initial testing of Images 2.0 showed notable improvements, particularly in text rendering. Past iterations often produced images with jumbled text, but the current version was noticeably more accurate. This advancement mirrors Google’s focus on improving text output in its image models.
However, results in other languages were less consistent. When asked to create a collage featuring Timothée Chalamet aimed at a Chinese fan base, the model produced images and text that mixed realistic and nonsensical elements. ChatGPT admitted that the generated text was not entirely accurate and that some of it was gibberish disguised as coherent language.
While the English output performed impressively, the efficacy of the model in other languages remains uncertain. Nevertheless, given OpenAI’s progress with English outputs, it’s plausible that future updates could enhance multilingual capabilities as more global user data is incorporated.
Overall, ChatGPT Images 2.0 marks a significant step forward in image generation technology, particularly for English text, but it also highlights the ongoing work needed to improve multilingual capabilities in AI-generated content.