The AI Daily Brief: Artificial Intelligence News and Analysis - 10 AI Projects to Learn Gemini 3 Nano Banana and Opus 4.5

Starting point is 00:00:00 Today on the AI Daily Brief, 10 AI projects through which you can learn all of these amazing new models that have dropped on us over the last couple of weeks. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Blitzy, Rovo, and Robots and Pencils. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts. And to learn about sponsoring the show or pretty much anything else about the show, you can check out aidailybrief.ai. In some cases, like on the sponsorship, there will also be emails you can point to.

Starting point is 00:00:39 In any case, again, it is aidailybrief.ai. And now, my friends, let's get practical. Welcome back to the AI Daily Brief. If you are in America, right now you are probably experiencing the hangover, either literal or the turkey hangover, of a big Thanksgiving or Friendsgiving. For a very short moment, I considered not having episodes, as this is a weekend for friends and family and touching grass and hanging out and all that good holiday stuff. But what I decided to do instead was get a little bit more fun and practical all at the same

Starting point is 00:01:14 time. We have been on an absolute tear of incredible new models. In the last two weeks, we've gotten GPT-5.1, followed by 5.1 Codex Pro and 5.1 Pro, Gemini 3, Nano Banana 2, Opus 4.5, and even Grok 4.1. And as I said the other day, the biggest takeaway from all of this is that there are just a whole bunch of things that you can do now that you either couldn't do at all before or you really couldn't do well. So what we're going to do today is provide a little bit of weekend homework for those of you who are taking some of this off time to go dig into all these new tools and toys. So we're going to talk about 10 AI projects you can do to learn these new models and better understand their capabilities. Now first up, this actually isn't a new

Starting point is 00:01:57 model, but if you haven't done it yet, I'm about to speed your life up significantly. One of the most embarrassing parts of the modern computing experience, and certainly the macOS and iOS experience, is how bad the voice-to-text is. If you've ever tried to speak into your iPhone, you know you spend basically as much time fixing all the errors as you would have just writing it in the first place. Wispr Flow, W-I-S-P-R-F-L-O-W dot A-I, fixes that pretty significantly. You can set this up on your phone or on your computer. So, for example, when I am doing anything on the desktop that I record all these podcasts on, pretty much at this point, instead of typing, I'm pressing Control and Option for it to start listening to the microphone and just dictating things. I'm talking at something like 140

Starting point is 00:02:42 words a minute, and it even does a good job of cleaning things up when I'm slightly rambly and repeat thoughts. So I highly suggest, as you dig into all these projects, that you download Wispr Flow, try out dictation, and start integrating speech as opposed to just typing. My guess is that you will very quickly find there are certain types of tasks that you will just not want to type for anymore. So technically this is one, but it's kind of just a bonus. Go install Wispr Flow. Next up, moving to these actual new models that were released, we're going to start with Nano Banana. Now, Nano Banana isn't just great because of the photorealism of its image generation or even its ability to listen to instructions. It's great because it opens up this whole new set

Starting point is 00:03:21 of visual modalities that just weren't possible before. If you have been anywhere on social media since Gemini 3 released, you've probably seen some new infographic that would have been totally impossible before. This one, for example, is from Eric's son, who says one-shot infographic for Acquired FM's three-and-a-half-hour Trader Joe's episode. Basically, Nano Banana was able to take a podcast, summarize it, and then turn it into an infographic. And what's important here is that there are actually two very different and very important things going on. The first is, of course, that Nano Banana can handle text in a way that is completely different than anything we've ever had. No other model before could even come close to this level of information density.

Starting point is 00:04:01 It just was not possible at all. But secondly, because it's integrated with Gemini 3, it's got built-in reasoning. So my strong assumption is that Eric probably didn't even have to say, first, summarize this and then make an infographic. Because it is integrated natively with Gemini 3's reasoning, it was just able to figure that all out. I played around with this at the end of last week as well, turning each of the first four episodes of the week into infographics.

Starting point is 00:04:24 I even then animated one with Veo 3.1 to take it to another level. So what should you do? My suggestion, to keep it really simple, would be to take some work report, either a project summary or maybe a new proposal, drop it into Gemini 3 or NotebookLM, which we'll talk about in just a few minutes, and ask it to produce an infographic on that basis. But I will say that while the first couple of weeks are just people being impressed that this is a capability that AI has now, I do think very quickly you're going to have a little bit of a slop sense when it comes to some of these infographics, and there will be a whole bunch of human taste involved in nudging the model in directions that make the visual presentation more than just the sort of default Nano Banana setting, and that don't try to compress everything, but instead really get at the information that is most impactful.

Starting point is 00:05:12 Basically, as with anything, I think that there very quickly will be a huge difference between general Nano Banana infographics and the really good ones. And I think basically as people figure out better strategies, like this one here, which immediately feels really different, that's where a lot of the opportunity lies. Still, don't be afraid to just try things out to start. You can always go back and edit later.

Starting point is 00:05:32 Relatedly, I think that you should try out the combination of Gemini 3 and Nano Banana for data visualization. We are working on a new product behind the scenes called AI maturity maps. And while I don't want to get too much into exactly what that is yet, part of the goal is to create a very quick visual benchmark that organizations can use to see how they stack up relative to others when it comes to AI and agent adoption. I've spent the last few days digging deep with Gemini 3 with Nano Banana integrated,

Starting point is 00:05:59 to move back and forth between the reasoning and exploration piece that Gemini 3 comes with and the visualization that Nano Banana makes available. And even more than the infographics, this is where it feels to me like you really see the ultimate power of how these things come together into a whole that's greater than the sum of the parts. So, a couple ideas for you at home to do data visualization, assuming that you don't have a product that you're trying to design. One kind of advanced one that I was thinking about

Starting point is 00:06:24 that seems like it could be really interesting is to try to create a visualization that compares how you wanted to spend time in a week to how you actually did. So the simplest version of this idea that I could think of was, at the beginning of a week, to write down your major goals, and then maybe even some of your minor goals.

Starting point is 00:06:40 I was thinking from a professional perspective, but there's no reason it has to be. It could be personal as well. Then at the end of the week, give Gemini 3 slash Nano Banana access to your calendar, which you can either do by connecting it directly or, if you want to go analog, by just taking a screenshot of it, and ask it to visualize the difference between what your goals were

Starting point is 00:06:58 and what you actually spent time on. Now, obviously, this isn't perfect because a calendar doesn't get at all of the things that you spend time on, so if you can give it other sources of information and context, all the better. But like I said, it's just one idea to think about how to experiment with the new data visualization capabilities of these models. An even simpler version came from Zara Zhang, who used Nano Banana inside of NotebookLM to turn a resume into a slide deck.

Starting point is 00:07:21 One of the specific requests was to ask it to visualize competencies in Venn diagrams. She said this is a great way to understand your personal positioning. And I actually think that in the context of a resume, this Venn diagram idea is a pretty cool idea for data visualization. Once again, as you can see, this takes advantage of not just Nano Banana's image generation capabilities, but also the integrated reasoning capabilities of Gemini 3. Next up, this one is going to seem so silly and basic, but is, I think, incredibly valuable. Just go try to edit an image with Nano Banana, as compared to other image generation tools.

Starting point is 00:07:54 You can do this in a couple different ways. You can take an image that you already have and ask to swap something out or change some feature, or you can generate an image with this specifically in mind. One of the things that made Nano Banana 1 really powerful was the fact that you could be so much more precise in your editing, which opened up all sorts of commercial and business use cases that weren't possible before. That capability has extended to another level in Nano Banana 2, and just because it's really simple doesn't mean you should be ignoring it.

Starting point is 00:08:20 In fact, you should make sure that you've got mastery of that one before you do anything else. If you listened to yesterday's episode about 10 holiday-themed kids' AI activities, I've got a couple examples of where I wanted it to change certain aspects of some image that it generated for me while keeping the overall setup. So, for example, for the gratitude podcast idea, this is the image that it produced. And while it's great, I wanted it to be more Thanksgiving-themed. I also wanted it to be less photorealistic and more cartoony. Now, I did not give this some super-sophisticated prompt. I said, make it cartoony and Thanksgiving-themed.

Starting point is 00:08:50 And this is what I got back. Now, maybe in this case, I would have been fine having a totally new generation, but I liked the setup of the first one. And this was able to change the style in terms of the illustration and the Thanksgiving theming without losing what I liked about the image in the first place. Another example: for designing a superhero card for a pet waiting to be adopted, I loved the setup, but wanted it to be holiday-themed because that was the theme of the episode, and it turned it into this.

Starting point is 00:09:14 Tasteful little tree, lights over here, garland on the mantle, snowing outside. I'm telling you, go try to edit an image. Once you see what is possible, I would bet that it will find its way into your workflow much more frequently. Next up, I'm kind of cheating here because I'm going to do a bundle of new NotebookLM features. Now, once again, obviously a lot of this was playing second fiddle from an announcement standpoint to the big model releases, but NotebookLM is a totally different environment in which to use all these new capabilities.

Starting point is 00:09:40 The Studio section of NotebookLM has recently added to its existing tools: video overviews, which generate an explainer video; infographics, thanks to the new Nano Banana; and slide decks, which also take advantage of these new image generation capabilities. So I've got this notebook loaded up with 20 past Superintelligent audits and was able to, for example, create this infographic. And honestly, although it has the look and feel of a Nano Banana infographic, when you dig into this, the quality of information is really, really high, meaning that NotebookLM was able to take these 22 sources, go through the intermediate steps of doing a generalized analysis on all of them, including finding average readiness scores,

Starting point is 00:10:18 and then turn it into this visualization. Let's do a short slide deck focused on the overall patterns seen in the reports. I don't want it to mention specific companies. I just want it to provide high-level insights. So we'll click that. So while we're waiting for this to generate, let's talk about what you could do if you don't have an existing notebook yet that you can use. Try taking a topic that you're really interested in,

Starting point is 00:10:38 even if just on a personal level, and go add a bunch of sources just from the general web. This will be especially valuable if you know the topic so you can suss out whether the AI is doing a good job of summing things up or if it's just being lazy. And once you have all those sources, start to play around with what it can do. Try making an infographic, do a slide deck, do a video overview, just to get a sense for how that compares to using the general Gemini 3 interface. All right, let's talk about the signal versus the noise in Enterprise AI.

Starting point is 00:11:09 The challenge right now isn't just about what's possible, it's about what's practical. That's the entire focus of the You Can With AI podcast I host for KPMG. Season one cut through the hype to focus on deployment and responsible scale. Season two goes a level deeper. We're bringing together panels of AI builders, clients, and KPMG leaders to debate the strategic questions that will define what's next for AI in the enterprise. Six episodes packed with frameworks you can actually use. Find You Can With AI wherever you get your podcasts. Subscribe now so you don't miss the new season. This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with Infinite Code Context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code.

Starting point is 00:11:53 Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously, while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their

Starting point is 00:12:20 Visit blitzy.com and press get a demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. Meet Rovo, your AI-powered teammate. Rovo unleashes the potential of your team with AI-powered search, chat, and agents, or build your own agent with Studio. Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and secure platform, so it's always working in the context of your work. Connect Rovo to your favorite SaaS apps so no knowledge gets left behind. Rovo runs on the teamwork graph, Atlassian's intelligence layer that unifies data across all of your apps and delivers personalized AI insights from day one. Rovo is already built into

Starting point is 00:13:01 Jira, Confluence, and Jira Service Management Standard, Premium, and Enterprise subscriptions. Know the feeling when AI turns from tool to teammate? If you Rovo, you know. Discover Rovo, your new AI teammate powered by Atlassian. Get started at R-O-V, as in victory, O, dot com. AI changes fast. You need a partner built for the long game. Robots and Pencils works side by side with organizations to turn AI ambition into real human impact. As an AWS-certified partner, they modernize infrastructure, design cloud-native systems,

Starting point is 00:13:32 and apply AI to create business value. And their partnerships don't end at launch. As AI changes, Robots and Pencils stays by your side, so you keep pace. The difference is close partnership that builds value and compounds over time. Plus, with delivery centers across the U.S., Canada, Europe, and Latin America, clients get local expertise and global scale. For AI that delivers progress, not promises, visit robotsandpencils.com slash AI Daily Brief.

Starting point is 00:13:58 And here we have after about five minutes, the AI readiness playbook. It goes through the common starting point, the agent pilot stage, central challenges and the pattern of contradictions, people paradox, technology paradox, opportunity paradox, etc. Now, I haven't had enough time to dig in

Starting point is 00:14:20 to be super fine-grained about just how good a job it did summing up, but I can already tell you that, just based on things that I'm seeing here, a lot of these themes are some of the most recurring themes that we see over and over again. Backlog of high-value use cases, for example, as compared to undocumented workflows and no execution pipeline. You can also see visually that this is just a step up from some of the previous AI generations. So yeah, NotebookLM, although not a model release, is definitely deserving of a good second look if you haven't used it for a while. Now, one more that I won't belabor, just to really sum up Gemini 3. And by the way, you can tell from this language that I did not use Gemini 3 for this image.

Starting point is 00:14:56 This was from Ideogram, which is great for fast generation. Excellent when I have to do a million of these slides. Ultimately, the thing that makes Gemini 3 so powerful is the integrated reasoning with native multimodality. And so my suggestion for really exploring that full capability is to pick a side project of yours, which could be a business, a sports team, or some hobby community, and design an entire integrated brand system, meaning logos, descriptions, website copy, merch even, just as a way to see how good this integrated capability can really be. As we shift out of Gemini 3 a little bit, that's the thing that I think is most powerful,

Starting point is 00:15:31 this reasoning integrated with native multimodality. So design this whole set of assets to see how that all comes together. Now, next up, let's talk about 5.1, and I'm including 5.1 Pro in there. As much as I am loving Gemini 3 for lots and lots of use cases, and think it opens up use cases that were never possible before, when it comes to my core LLM use case, which is basically thinking out loud and thinking through strategy,

Starting point is 00:15:55 5.1 is my favorite model in a very long time. Now, in ChatGPT right now, you have access to 5.1 Auto, where it decides how long to think, which actually has four gradations underneath it (light, standard, extended, and heavy), and you have 5.1 Pro. I shift pretty frequently between these modes.

Starting point is 00:16:12 So, for example, if we're just doing some quick planning or low-stakes exploration, I will sometimes leave it on Auto. My default, though, is probably thinking at the standard level. Then, depending on how good the results are around a particular challenge that I'm thinking through, I will sometimes upgrade that to extended or even heavy. The times that I use Pro are when I really want the model to capture, sum up, and synthesize a whole long conversation that we've been having.

Starting point is 00:16:38 So for example, using the same design of that product that I was mentioning before, I had gone back and forth probably 50 or 100 times. And I wanted to turn that into an actionable plan, as well as memos that I could share with my team to catch them up on where I was. That's something where, even though it takes an extended period of time, anywhere from two to five to ten minutes, I want the best that 5.1 can do, and so I turned on Pro. So coming back to how you can test this, we are coming up on the new year. Right now is a great time to start planning.

Starting point is 00:17:05 It's almost like you get a 13th month of 2026 if you start now, and it's a great way to test out the business acumen and strategy capabilities of 5.1 or your chosen reasoning model. You can do this with Gemini 3 as well, or Grok 4.1. Give the model any amount of context, which could be background documents, existing performance documents, analytics, whatever context makes sense. It could just be a ramble where you're using your Wispr Flow to just talk at the thing for 10 minutes to give it way more information than you would otherwise be able to type. And then from there,

Starting point is 00:17:33 after you've given it context, provide it your goals. Now, I would encourage you to not have everything figured out. Part of what makes AI so valuable as a strategic collaborator is that it can pick up and meet you wherever you are. So if you have one goal that you know for sure, but then a bunch of others that you're trying to prioritize, just communicate that. So you give it any amount of context, you provide it your goals. And like I said, if you're using 5.1, I would start on that general standard thinking setting. As you start to come to some hard choice in there, not just overall what could be, but perhaps an area where you have to prioritize one thing or another, where there actually is a fork in the literal or metaphorical road that you have to pick, I have found that 5.1 is much

Starting point is 00:18:12 better than previous ChatGPT models at just making a decision and making an argument for it. I'm finding myself having to force it to make decisions far less frequently, but if you do, just force it to make a choice and give you its reasoning. Lastly, once you've gotten pretty deep into this process and you're ready to try to summarize, switch over to the Pro mode and have it put together an executable plan. Whether or not you end up sticking with it, this should give you a pretty good read on what these models, 5.1 or another, can do. Right now, this is my most valuable use of 5.1 and 5.1 Pro, and I honestly think this should be

Starting point is 00:18:44 a default thing that more people are doing just as a matter of course. Now, here's where we move into the realm of the builder. All of these new models are not just benefiting the core applications through which you access them, but also all the other applications that take advantage of those models. So for example, in the vibe coding realm, Lovable, Replit, all these platforms now have access to some combination of Gemini 3, Opus 4.5, etc. And I think in general it's a pretty good idea to try to keep yourself generally updated on just how much you can do with vibe coding, even if, or perhaps especially if, you are not technical.

Starting point is 00:19:17 As you'll see from the rest of this, I'm mostly speaking to non-developers with all of these vibe coding ideas, as there is a whole different set of activities that developers should be doing to test these new models. But for the non-technical vibe coders that couldn't speak in code until 2025, take that executable plan that you produced with 5.1 or Grok 4.1 or Gemini 3 and turn it into a web app. So what do I mean by that? Well, on one hand, it could just be a personal accountability website, where it turns into a timeline and structures your work and to-dos in a way that is visual and interactable. Maybe, however, it also has the ability

Starting point is 00:19:50 to check in with you in a recurring way. Maybe it sends you an email or a push notification once a week to check in to see how things are going, in a way that has all the context about what you're trying to accomplish. Maybe it has the ability to upload files so you can give it more context without having to explain everything verbally.

Starting point is 00:20:05 Mostly what I want you to see here is that if you haven't touched the vibe coding tools for a while, you can now, with any of the majors, for example, Replit or Lovable, without leaving the experience, build a website that is actually published that you and others can actually interact with, and that's the end-to-end experience that I want you to try. So we are beyond the realm of just prototyping here.

Starting point is 00:20:25 I want you to actually go create a personal accountability and support app for whatever your big 2026 strategy is, and get it all the way to the point of publishing. If you don't want other people to see it, vibe code in password protection. And if you want to get a little bit more advanced, before we got all these amazing models, Google AI Studio also got a really interesting upgrade. They have made it much easier to vibe code Gen AI apps, specifically to integrate all of the AI tooling that is available in the Gemini API directly in a vibe coding experience. So, for example, if you wanted to have Nano Banana in that web app auto-generate an infographic

Starting point is 00:21:03 that describes your progress each week, you could do that. You can also integrate a chatbot, animate images, but the one that I would suggest playing around with if you are on this web app idea is try adding a conversational voice element. Basically, try adding a voice agent to your app that interviews you each week. So instead of it just sending an email that you respond to, have it start a conversation where you can talk and ramble at it and it can actually interact with you

Starting point is 00:21:27 and see how much more powerful it is at extracting context that helps it refine your plans going forward. I've been playing around with this a ton. I think it's incredibly valuable, and I think that the non-technical vibe coders out there are barely scratching the surface of what they can do when all of these Gen AI features become available as part of their vibe coding platforms. And then once you've added that voice agent, go see how much better design in vibe coding can be than it was just a little while ago.

Starting point is 00:21:54 Replit has recently released its design mode that focuses entirely on the visual prototype and just absolutely blows out of the water some of the old sloppy-feeling purple interfaces and standard templates that you would see across vibe-coded apps. To test it, I was playing around with a different visualization of a Superintelligent website, and it did a great job of looking very different and bold and not totally vibe-coded AI-ish, while also doing a good job on some of the copy elements as well. Now, my understanding is that Replit's design mode is powered by a lot of these new models, and you should be able to try it out for free even if you can't go all that far on the free plan. So again, between this set of vibe coding goals, by turning your 2026 strategy into a web app,

Starting point is 00:22:32 I'm suggesting you see how far you can go now with vibe coding to actually create an end-to-end experience without having to go interact with GitHub or anything else like that. I'm suggesting you play around with Google AI Studio to see how you can integrate Gen AI features directly into your vibe-coded apps, and then I'm suggesting you use Replit's design mode to see the new visual capabilities that the vibe coding platforms have.

Starting point is 00:22:52 Now, as a bonus, and I'm cheating a little bit here as we wrap up, I prepared all this and then Opus 4.5 dropped, and all the people who are using it seem to be emphatic that it is pretty much the best coding model they've ever used. So if you are a little bit more advanced, and you're using, for example, Claude Code, you can try to do all those things that I just suggested using Lovable or Replit or Google AI Studio for,

Starting point is 00:23:12 but directly with Claude Opus 4.5. I also asked Claude to come up with some ideas for applications that people could vibe code that on the one hand were actually useful, but on the other did show what new capabilities Opus 4.5 had that would have been hard with previous vibe coding platforms. If you listened to my Opus episode, you'll have heard that what it's about

Starting point is 00:23:31 and where it really seems to be improved is not getting lost in the sauce of deep coding tasks, which in many cases won't be all that relevant for the average non-technical vibe coder, but still Opus came back with a few ideas. For example, building a content repurposing hub. Drop in a podcast transcript or video script and it generates social posts, LinkedIn article, newsletter version, tweet thread, key quotes for graphics. I should also note that Opus 4.5 claims to be meaningfully better at complex spreadsheet tasks as well. And so if you are someone who in your work deals a lot with Excel, I would go check out Claude for Excel as well.

Starting point is 00:24:04 So yes, if you are keeping track, there have been so many models that even 10 AI projects probably can't encapsulate all the new things you can do. But hopefully that gives you some ideas of where you can dive in over this long holiday weekend. I hope you have a ton of fun with it. Let me know what you build. Appreciate you listening or watching as always. And until next time, peace.
