The AI Daily Brief: Artificial Intelligence News and Analysis - 4 Reasons to Use GPT Image 1.5 Over Nano Banana Pro
Episode Date: December 18, 2025
OpenAI has released GPT Image 1.5 inside ChatGPT, and early reactions suggest the gap with Nano Banana Pro has meaningfully narrowed. This episode walks through first impressions from head-to-head tests, benchmark reactions, and creator feedback, then digs into four specific areas where GPT Image 1.5 may be the better choice right now, from hyper-precise instruction following and complex multi-constraint prompts to alternative infographic aesthetics and a consumer-first interface designed for discovery and play. The takeaway is not that one model clearly wins, but that creators suddenly have real choice at the high end of image generation, which is a big shift from just a few weeks ago.
Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Rovo - Unleash the potential of your team with AI-powered Search, Chat and Agents - https://rovo.com/
Zenflow by Zencoder - Turn raw speed into reliable, production-grade output at https://zenflow.free/
LandfallIP - AI to Navigate the Patent Process - https://landfallip.com/
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? sponsors@aidailybrief.ai
Transcript
Today on the AI Daily Brief, OpenAI has released a new image generation model.
We are gathering all of the first responses as well as talking about four areas where I think you may prefer it, even over Nano Banana Pro.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right, friends, quick announcements before we dive in.
First of all, thank you to today's sponsors, KPMG, Blitzy, Rovo, and Robots & Pencils.
To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts.
And of course, to learn about sponsoring the show, send us a note at sponsors@aidailybrief.ai.
A little teaser here.
As you know, we've got the early readout results of the AI ROI benchmarking survey coming.
And just in general, if you're interested in that sort of data and research, might I point
you to AIDBIntel.com?
We're going to have a lot more interesting things in these domains coming next year, and you can
sign up to get notified as we share more of that information.
Now, one more note before we dive in, as is usually the case with a new model release.
This episode got long and consumed the entirety of the space that we have.
At this point, we're getting due for an extended headline, so that will be coming soon,
but for now, enjoy this look at ChatGPT Images.
Another day, another new model.
Look, this competition between the big labs may be stressful for the people working there,
but for us consumers, it means nothing but more choice.
Today we are talking about OpenAI's latest image generation model
and the new house they put it in, which they are calling ChatGPT Images.
Now, overall, this is one that I kind of
expected. You might remember in the December prediction episode, even before I think Sam Altman had
declared Code Red, or at least before we knew about it, my best guess for a response to Gemini
3 and Nano Banana Pro was an OpenAI image model. It had just been a really long time since we
got an update on that. It was clearly an area where they were pretty far behind, and it seemed like
based on the fact that it had been so long since we got an update, and knowing the speed at which
OpenAI delivers, they had to be pretty close, one would think, to being able to release a new
model. Now, I didn't expect a full GPT-5.2 release, and that's obviously the first output of Code
Red, but yesterday, on Tuesday, OpenAI dropped their new ChatGPT Images. As benefits, they point to
stronger instruction following, precise editing, detail preservation, and a big speed boost as compared
to before. So let's talk a little bit more about what OpenAI points to as the benefits here.
A lot of this is about feature parity with Nano Banana Pro. Remember, the real value of Nano Banana
Pro was not just that it was an improvement in terms of raw generation capability.
It was about the controls that the user had over it.
Whereas in the past, to get exactly what you wanted out of a generation, you'd just
have to kind of prompt it over and over and over again and pick the one that was closest.
Nano Banana Pro allowed for more precise edits.
That capability has now come to ChatGPT Images as well.
They write, the model adheres to your intent more reliably, down to the small details,
changing only what you ask for while keeping elements like lighting, composition, and people's
appearance consistent across inputs, outputs, and subsequent edits.
Interestingly, they point to some pretty consumer-centric use cases for that,
which is a theme we'll come back to throughout this episode.
They continue, this unlocks results that match your intent,
more believable clothing and hairstyle try-ons,
alongside stylistic filters and conceptual transformations
that retain the essence of the original image.
Another capability they point to is adding, subtracting, combining, blending, and transposing.
For example, taking a set of inputs and turning it into a single composition.
Another capability that they're really hammering is what they call creative transformations,
basically taking one image and turning it into a different style preset, a movie poster,
or turning someone into an 80s fitness instructor, taking someone's photo and turning it into an ornament,
etc, etc.
Once again, and as we'll come back to, I actually think the fact that they're highlighting this
says a lot about who they are intending this product for.
Other benefits they point to include better instruction following,
up to and including much more precise prompting,
and they also point to much better text rendering.
Now, this was obviously one of the biggest changes that we got with Nano Banana:
in addition to just being able to have text,
with Nano Banana and then Nano Banana Pro,
you could get a ton of high-fidelity text,
opening up new possibilities for things like infographics.
One final interesting thing from their announcement post
is that while in most areas the model improved,
they actually did find some regressions as well.
For example, they write,
the ability to generate some specific art styles
has regressed from the previous version.
The example they give is draw me like I'm in a dark fantasy anime, with the new version completely
100% not being that at all.
There are other limitations as well.
For example, when there's a picture with a lot of different faces in it, keeping all those
faces consistent between generations can be difficult.
Overall, they claim a big improvement, but still a lot more opportunity ahead.
So what were people's first impressions?
I think my sense is that people were kind of prepared to be somewhat underwhelmed.
I'm not exactly sure what the reason for that is.
Maybe it's a concern that because this was part of that Code Red, this and basically any other model that they might release would be a rush job.
But for a lot of people, even though they were prepared to be underwhelmed, they were, I would put it, kind of whelmed.
Justine Moore from a16z writes, in early tests, this is a big step up in maintaining consistency of characters and objects from uploaded images.
In other words, your face still looks like you.
It may be a real competitor to Nano Banana Pro.
Simon Smith from Klick Health wrote,
I wasn't expecting OpenAI's new image generator to be comparable to Nano Banana Pro, so I ran it head-to-head on prompts I tried with NBP.
Surprisingly, it did as well or better.
But it has a different personality, at least via ChatGPT.
Less whimsical, more professional.
So here are a couple of the examples he gave.
Research when prominent people, especially the leaders of big AI labs and forecasters think we'll get AGI.
Then illustrate this on a timeline and put the faces of the people on the timeline on the years when they think we'll have AGI.
Give this a fun kind of cartoony but not too silly feel.
Now, a couple things.
First, I think this is a good test to see how well integrated
with the rest of the model image generation is.
In other words, this requires not just image generation,
but also reasoning and research.
And the second thing that this brings up
is that inherently the challenge with all of this episode,
and by the way, this is a good one to watch if you're just listening,
is that to some extent quality is going to be subjective.
Although in this case, I certainly see why he prefers the ChatGPT Images version
as opposed to Nano Banana's. He tried creating a cell cutout diagram, which again is a little bit
in the eye of the beholder, but certainly holds its own, alongside a skeleton anatomy chart,
and a prompt that said, search up today's top headlines and then give them to me in the style
of an old newspaper. Now, the two models in this case took the prompt in very different directions,
and I actually prefer, aesthetically, Nano Banana Pro's, but overall, Simon says,
I was prepared to be disappointed and I'm not. That's saying something because Nano Banana
Pro is amazing. I need more time to play around with the new image generator, but my first impressions
are positive. He then came back and said, slides, however, may be a weakness of GPT Image 1.5,
before very quickly returning and saying, okay, I take it back. GPT Image 1.5 can do gorgeous slides.
You just need to prompt it. I gave it the same template in the above example, but used GPT
5.2 thinking instead of instant and a broader prompt. He did point out, however, that there are
real limitations to the aspect ratios that you can get with GPT Image, which has always been an
issue for ChatGPT images. Still, all of this added up to Simon actually thinking that
GPT Image 1.5 has beaten Nano Banana Pro on his personal scorecard. And it wasn't just Simon. LMArena
tweets, Image Arena shakeup. OpenAI's GPT Image 1.5 is number one in text to image.
ChatGPT Image latest is number one on image edit. GPT Image 1.5 holds a commanding 20
point lead on text-to-image while maintaining a narrow three-point edge over Nano Banana Pro
on Image Edit. Now, they do say that these scores are preliminary and we'll see where they settle,
but still I think this would surprise a lot of people. Artificial Analysis found something similar.
They wrote, on both text-to-image and image editing, GPT Image 1.5 again surpassed Nano Banana
Pro on their tests. They gave a couple of different text-to-image generation examples, a couple of
editing examples like changing a car's color, and inserting a family of ducks crossing a railroad,
ultimately again ranking at number one.
Now, there are a million examples out there
if you want to go see direct head-to-heads
on ChatGPT versus Nano Banana Pro.
And my strong suspicion is that if you don't have a particular horse in the race
or a set of biases that you're bringing in to start,
you're likely to find some where you prefer ChatGPT
and some where you prefer Nano Banana Pro.
For myself outside of just exploring a bunch of things that I thought were interesting,
I ran a couple of tests.
For instruction following with multiple constraints,
I asked for one person standing and pointing at a screen, two people are seated.
The screen shows abstract charts with no readable text.
The room is modern and minimalist.
The color palette is black, white, and light gray only, no windows, no plants, no logos.
In that case, both Nano Banana Pro and GPT Images were able to do it equally competently.
On a test of photorealism,
I asked for a photorealistic image of a hand holding a clear glass coffee mug filled halfway
with black coffee.
The hand has to have all five fingers and have them all visible.
The glass has to show realistic reflections and refraction.
The coffee surface needs to be flat and level, natural indoor lighting and a neutral background.
Again, in both cases, the models were pretty equally competent.
Getting into more stylistic and aesthetic territory, I asked for a 1950s retro-futurist style illustration,
with flat, bold shapes, a limited color palette of teal, cream, and muted orange,
clean lines, and an optimistic mid-century modern aesthetic.
Once again, they were both competent, and ultimately the preference here is going to be in
the eye of the beholder.
One of the challenges that this shows is that a single stylistic prompt can mean
different things. These are both examples of 1950s retro-futurism, but one is a little more
Jetsons and the other is a little more abstract. When we created a character and then put them in a
different setting, both models had no problem keeping consistent from one to the next. And of course,
on YouTube thumbnails, a very common use case for me, frankly they were both pretty garbo,
although I know for a fact that I could improve that with different prompting. As you can
probably tell across my tests, then, what I found was pretty meaningful parity, not necessarily
a clear or huge improvement over Nano Banana Pro, but clearly a huge improvement from where
OpenAI's image generation model was before this. However, it's not hard to find people who feel
the opposite if you go check out Twitter slash X. There were many people who were just kind of
generically underwhelmed. AI News by Smol AI said shipping anything is hard so we rarely call
out misses and OpenAI rarely misses, but this was clearly a miss. OpenAI Image 1.5 claims to
beat Nano Banana Pro number one across all arenas, but completely fails vibe checks.
The Ejiko did a test and found that character face accuracy was kind of lacking.
Brand designer Darius Krova gave a base input image as well as a product package
and asked both models to make the girl in the input image hold the bottle
and said, while it's better than before, ChatGPT didn't get the scale and changed the product
and the light. And if I ask it to make some edits, it reworks the whole image.
We'll keep testing, but for now it's one to zero for Google.
David Shapiro provided a bunch of images of himself and asked both models to create a YouTube
thumbnail, which in this case, undeniably, Nano Banana smashed compared to ChatGPT.
Some people were even quite flabbergasted with the Arena and Artificial Analysis results.
I Am Emily 2050 re-shared Artificial Analysis's post and said, what a joke.
I'm not going into the conspiracy side, but this is really not looking good for Artificial
Analysis.
When someone said, how, that can't be right, Emily responds, OpenAI gamed the benchmarks or paid
them to say so, which, holding aside the substance of that argument, I think reflects people's
skepticism. The X comments on both the Artificial Analysis post and the LMArena post also show
just tons of skepticism.
All right, let's talk about the signal versus the noise in enterprise AI.
The challenge right now isn't just about what's possible, it's about what's practical.
That's the entire focus of the You Can with AI podcast I host for KPMG.
Season one cut through the hype to focus on deployment and responsible scaling.
Season two goes a level deeper. We're bringing together panels of AI builders, clients, and KPMG leaders
to debate the strategic questions that will define what's next for AI in the enterprise.
Six episodes packed with frameworks you can actually use.
Find You Can with AI wherever you get your podcasts.
Subscribe now so you don't miss the new season.
This episode is brought to you by Blitzy,
the Enterprise Autonomous Software Development Platform with infinite code context.
Blitzy uses thousands of specialized AI agents that think for hours
to understand enterprise-scale code bases with millions of lines of code.
Enterprise engineering leaders start every development sprint with the Blitzy platform,
bringing in their development requirements.
The Blitzy platform provides a plan, then generates and pre-compiles code for each task.
Blitzy delivers 80% plus of the development work autonomously,
while providing a guide for the final 20% of human development work required to complete the sprint.
Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy
as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native
SDLC into their org.
Visit Blitzy.com and press Get a Demo to learn how Blitzy transforms your SDLC from
AI-assisted to AI-native.
Meet Rovo, your AI-powered teammate.
Rovo unleashes the potential of your team with AI-powered search, chat, and agents,
or build your own agent with Studio.
Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and secure platform,
so it's always working in the context of your work.
Connect Rovo to your favorite SaaS app so no knowledge gets left behind.
Rovo runs on the teamwork graph, Atlassian's intelligence layer that unifies data across
all of your apps and delivers personalized AI insights from day one.
Rovo is already built into Jira, Confluence, and Jira Service Management standard, premium, and
enterprise subscriptions.
Know the feeling when AI turns from tool to teammate?
If you Rovo, you know.
Discover Rovo, your new AI teammate powered by Atlassian.
Get started at Rovo.com, that's R-O-V, as in victory, O, dot com.
AI changes fast.
You need a partner built for the long game.
Robots & Pencils works side by side with organizations to turn
AI ambition into real human impact. As an AWS certified partner, they modernize infrastructure,
design cloud native systems, and apply AI to create business value. And their partnerships don't
end at launch. As AI changes, Robots & Pencils stays by your side, so you keep pace. The difference
is close partnership that builds value and compounds over time. Plus, with delivery centers
across the U.S., Canada, Europe, and Latin America, clients get local expertise and global scale.
For AI that delivers progress, not promises, visit Robots & Pencils.
So what to make of all of this?
I think Peter Gostev from LMArena is directionally correct when he writes,
My anecdotal impression of GPT 1.5 versus Nano Banana Pro is that they are pretty neck-and-neck
overall.
I find GPT a lot easier to prompt.
With Nano Banana, you often had to iterate several times before getting a good result,
while with GPT, you typically get what you ask for.
But I think Nano Banana has slightly nicer taste, e.g. for infographics, slides, Google has the
advantage.
I found GPT style quite heavy. The important point, and the part I'm calling directionally correct,
is the pretty neck-and-neck overall.
Jimmy Apples had an even simpler version of the same statement, big upgrade over the previous
model.
It's not as smart as Banana, but it's going to be subjective on what you like on style versus
style.
Personally, it really hits the image in my head I have for this prompt.
Just use what you prefer.
I'll be using both.
And that is exactly what my overall conclusion is.
Before this, Nano Banana was undeniably and very clearly better than
anything OpenAI had going on with image generation. Now, it is not so clearly better,
at least not in all cases. What that means practically is that for really high-quality image
generation, on Tuesday morning you had one option, and now in a lot of cases you're going to have two.
Now, one interesting point that swyx made is that we may also be seeing the limits of how far we
can go in image generation with current methods. He writes,
I think today's Image 1.5 launch illustrates one of the reasons why people are betting so hard on
explicit world models. For the next level in realism, we're going to have to teach the models to
see the world as we live it, not through occasional snapshots. He pointed to a post on r/ChatGPT
that said, the new image gen is nuts. Someone responded, however, yes, but also the details are a little
off. Why is one leg bare and the other covered by pants? What kind of car has a vanity table
behind the front seat? Where is the passenger seat? Maybe it's covered by her, but a lot of the
background and context still seems off. Still, the people at least look human and not like plastic anymore.
So as we round out, let's ask, is there anything that I think ImageGen does distinctly better
than Nano Banana right now? And while my answer is no, there's no one use case where I thought,
just in every test that I tried, ImageGen crushed Nano Banana Pro or anything like that,
there are four areas right now, with a fifth potential bonus area in the future, where I think
ImageGen may be a desirable alternative to what Nano Banana can do.
First up, let's talk infographics. One of the incredible things about Nano Banana
Pro when it was released is that all of a sudden this new capability of making infographics from
text came online. I'm sure that you have seen a ton of these floating around the internet,
and indeed, that ubiquity and commonality of style is exactly why I think in some
cases you might want to use ChatGPT Images instead of Nano Banana to make your infographics,
for the simple reason that they don't look like a Nano Banana infographic, which already has a
particular flavor and style that people can spot from a mile away.
I dumped in a recent episode transcript to get an infographic based on it, and both models were
able to do this, although they each had their own quirks.
As it often does, Nano Banana's first iteration gave a bunch of citation references, even though
those are completely useless and wasted space on a visual infographic like this, whereas
ChatGPT Images just had a few little mistakes here and there.
For example, in the three biggest barriers to agentic AI section, it only has two barriers.
There were also some random spelling mistakes like bigger being spelled BIG,
GER. Now, perhaps the better approach than using ChatGPT Images is just to try to prompt your
way out of the standard look of Nano Banana Pro, but my point here is that you at least now have a
competent visual alternative. I might add to this use case, things that need really high text
fidelity. That was one of the things that OpenAI called out in their announcement post,
and I did some tests around that as well. I asked for an over-the-shoulder shot of Abraham Lincoln
sitting at his desk writing the Gettysburg Address, make the entire address readable,
although in this case I found both models able to do it.
So once again, we're back into stylistic preference area.
A second area where I think, genuinely,
ChatGPT Images right now might have an edge
is around hyper-precise instructions and complexity.
I took this six-by-six grid idea and really ratcheted up the complexity.
I said, make a six-columns-by-six-rows grid of Lovecraftian artifacts and entities
where each cell contains exactly one distinct illustration
centered within its square and not overlapping grid lines.
Overall style is 1920s pulp illustration meets occult manuscript.
Inked line work, muted sepia and sea green tones, subtle paper grain, no modern elements,
no text anywhere in the image.
And then just to add another layer, I actually precisely gave it everything I wanted in all 36 squares.
It did just a phenomenal job.
There wasn't a single square that didn't have a strong, competent version of exactly what I asked for.
Nano Banana Pro's version of this was an absolute mess.
Instead of a 6x6 grid, I got an 8x5, it didn't follow the overall instructions as well,
and tons of the individual squares were just out of nowhere.
Now, of course, this is just one test, but I noticed a couple others also preferring
ChatGPT Images for some of these hyper-precise or complex instructions as well.
Ethan Mollick writes, I tried something fun that worked better with ChatGPT image generator
1.5 than Nano Banana Pro.
Point and click adventure game me, you are the parser, make images as the output and take in
commands, make the world super interesting, keep track of inventory state, et cetera. So you can see it
basically creates a screenshot from a video game, and then Ethan prompts it to go to the next shot
in the game. Look at the laser. Cover the laser with map and inventory. Run through the portal.
ChatGPT did a really good job with this. Nano Banana Pro did not. In its first attempt,
the second image was completely different than the first scene, and then it just completely
bowed out, and in the second attempt it sort of did it, but with a much, much harder time.
Then, of course, there was Peter Gostev again, who tweets, I know people like Nano Banana,
but I have some important needs that it just cannot meet. His prompt was create a square
image of a hand with six fingers, a wall clock showing 8:22, a glass of red wine full to the top.
Nano Banana Pro had a normal hand, a clock at 7:58, and a wine glass that was mostly but not entirely
full, whereas the new image gen model had a completely full wine glass, 8:22 on the clock,
and seven juicy weird fingers. A third area where I think you might prefer or at least want to
test ChatGPT Images as opposed to Nano Banana Pro is for aesthetically focused and higher
taste prompts. Flower shops are a couple of examples where I think that the GPT Images version
is just a big step up visually from the Nano Banana version. Here's another example with a logo.
And Aziz AI found something similar.
The prompt that he tried was create a clean look website in Apple style for Nike in a 4:5 aspect ratio.
He said,
The winner was GPT in aesthetics of UI and understanding the prompt.
Now, I will say very clearly here that the point that I am trying to make is not, especially in this case,
that I think that ChatGPT Images will always be better.
It's that because these models are both at the high end now,
when you are trying to find something that matches your vibes and reaches the levels of the high taste that you're
going for, you now have a couple of options. Images is, in some cases, going to be better and in some
cases going to be worse. But again, that means you've gone from one option to two options, basically
overnight. The fourth thing that I want to mention in terms of an area where ChatGPT Images
excels as compared to Nano Banana is the actual interface for using it. And I think this reveals
quite a bit about how they're imagining usage of this tool. Certainly myself, and I'd be willing to bet
many of you are coming at this conversation from a standpoint of a business or power user.
You want these fine-grained editing controls. You're imagining how you can use this for your
solopreneur business. But I think OpenAI is imagining that a lot of the usage of this is in fact
just going to be people messing around and having fun. Whereas with Gemini, there's absolutely no
difference when you're creating an image other than you say create image. In the ChatGPT web app now,
there's a whole different section with slightly changed visuals and a whole lot more options.
In addition to your standard text prompt field, you also have a row of styles underneath that you can
try on an image, sketch, holiday portrait, dramatic, plushy, baseball bobblehead, etc.
Then below that, they also have a panel of ideas to just discover something new, like creating
a holiday card.
What would I look like as a K-pop star?
Me as the Girl with the Pearl Earring.
And you get the sense from this that they want to solve the blank slate problem and get
people messing around with this not for a business purpose but just for fun.
I'm sure it's not lost on them that one of their big
moments of user growth, if not their biggest moment of user growth ever, and certainly in
2025, was when we got the Ghiblification trend where everyone turned everything into a Studio
Ghibli image. These sorts of interface options are very clearly aimed at the average user,
who isn't thinking about business outcomes and ROI, but is just there to have some fun.
Given how much of ChatGPT's usage is regular everyday people, I can see why they're making
that bet. So that is four areas where I think you might want to try ChatGPT Images,
either instead of or at least in addition to Nano Banana Pro.
The bonus, however, and fifth future area,
is of course when you want to make Mickey or Moana or a Disney character.
Now right now, ChatGPT Images is much more locked down, in my tests at least,
than is Nano Banana.
I gave the prompt Sam Altman water skiing behind a boat driven by Andy Jassy.
This obviously relating to the news that OpenAI might be doing a deal with Amazon.
From Gemini, I got this cool Ralph Steadman-looking image.
From ChatGPT, I got this.
The image generation request did not follow our content policy.
Of course, we just learned that OpenAI and Disney had done a deal, a deal that will explicitly
bring Disney's characters into Sora.
If that extends into image generation, it could be a big deal as well.
Simon Smith again writes,
If OpenAI and Disney surprise everyone by allowing character generation with the launch of images
V2, pretty sure it will spark a ton of ChatGPT use over the holidays.
Parents alone will burn up GPUs inserting characters into holiday messages for their
kids. Now, one thing Simon references there is V2. Remember that this is version 1.5, and people are
expecting a lot more in the relatively near future from an even better image generation model.
OpenAI staffers are indeed suggesting that this is just the start and that we are in for more
image generation updates in the future, which, as I said right at the beginning, is nothing but good
news for us consumers. So, friends, that is my first look at ImageGen 1.5. Hope this was useful.
Certainly, if you haven't yet, get in there and start creating.
Christmas continues to come early as we get more and more AI toys.
For now, that is going to do it for today's AI Daily Brief.
Appreciate you listening or watching, as always.
And until next time, peace.
