The AI Daily Brief: Artificial Intelligence News and Analysis - 5 Prompting Tricks to Make Your AI Less Average

Starting point is 00:00:00 Today on the AI Daily Brief, how to make your LLM not average. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, Superintelligent, Robots and Pencils, Notion, and Blitzy. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts. And to learn about sponsorship opportunities, shoot us a note at sponsors at aidailybrief.ai.

Starting point is 00:00:35 All right. So today's episode is something I've been thinking about for a while. In my head, I've always called it AI's tyranny of the average. And the simple notion here is that because AI has been trained across the entire corpus of everything that humans have output, almost by definition, it is optimized around average conventional wisdom. Sometimes that's fine.

Starting point is 00:00:58 What that does is that it ensures that the output of an LLM has a fairly high floor. If it produces passable content, passable writing, passable imagery, based on how you prompted it, that's good, right? At least it gets you in the zone. The problem is that increasingly, when it comes to production use cases and using AI for things that really matter, average isn't good enough. We want more than average. We want unique. We want distinct. We want really high quality. And for this, we have to turn to some prompting strategies, five of which I'm going to share today, that I have found help me in the ways that I use LLMs to make them excel ahead of that average output.

Starting point is 00:01:36 Now, one of the reasons that I thought this would be a good fit for the weekend big think slash long reads episode is that technology writer Alex Kantrowitz actually dropped a little quick essay on his blog, bigtechnology.com, this week that's pretty much about this. He called it AI's sameness problem, and it's short, so we'll read it quickly, and this will be me reading, not AI. For better or worse, you guys have sent the message clearly that as good as AI voice technology is, you prefer me reading it, and that's fine. But we'll read Alex's essay, and then we'll talk about these five techniques that I have found to work for overcoming the problem of AI's

Starting point is 00:02:08 averageness. Again, his essay is called AI's sameness problem, and it reads, OpenAI's video generation app Sora sits atop the App Store charts, but I anticipate it'll fall off soon. Creating Sora videos is a genuine but momentary thrill. You can put yourself and your friends in hilarious, scary, or fantastical scenarios, and add Jake Paul or Mark Cuban where appropriate. Editor's note, or where highly inappropriate, which is part of the fun. Back to Alex, he writes. But after a while, all Sora videos start to look and feel the same. The novelty wears off and the draw to open the app fades. Sora's sameness problem isn't isolated. It's present in almost all AI generated content. Generative AI tends to produce the average of averages,

Starting point is 00:02:51 seeking to minimize the delta between its output and the mean of human generated work. So AI images, video, and text often exhibit a uniformity that can only be broken with deliberate prompting, and even then not reliably. Editor's note again, that is what we are going to try to do, to make it reliable to break that. Coming back to Alex's essay, he continues, To have a shot at long-term relevance, this sameness issue must be broken. It's why Instagram co-founder and current Anthropic Chief Product Officer Mike Krieger didn't appear to think Sora is the successor to the app he created when I asked him about it last week.

Starting point is 00:03:24 To have a shot at replacing modern-day social media, he said, the content must feel, quote, varied over time, and not just sort of like, yeah, okay, I've kind of seen it before. It's really interesting, but I've seen it before. AI-generated images suffered from the sameness problem as well. There's a quality to these images that makes it possible to spot most from a distance. It's as if the same artist responds to every prompt, even though the models have ingested all the world's artwork.

Starting point is 00:03:49 Some prompting can generate a unique AI image, especially when you ask the model to follow a certain artist's style. But as the prompt becomes popular, the sameness problem reappears. This was the case with the Studio Ghibli moment that OpenAI's 4o model kicked off. After some initial novelty, everything eventually became Studio Ghibli. And then the excitement faded and nobody Ghiblifies their images anymore. AI's sameness problem is perhaps most apparent in writing. Forget the em dash, it seems like most business communication reads exactly the same these days,

Starting point is 00:04:15 since much of it was written via prompt. My inbox now has more PR pitches than ever, and they all seem like they were written by the same agency. It's not that the public relations industry standardized its pitch format. AI's done it for them. I don't want to minimize how impressive this technology is. The Sora videos are a breakthrough, demonstrating AI has some basic understanding of physics in a way that's surprised even the most advanced researchers.

Starting point is 00:04:37 AI images are useful and I often rely on them to illustrate this newsletter. AI text generation, at least within ChatGPT, is incredibly popular and often helpful. But for AI generated content to achieve its potential, it's going to have to increase its variety. And given the technology's fundamentals, that might be a tough problem to solve. Okay, so back to NLW here.

Starting point is 00:04:56 Clearly, Alex and I agree that there is this core underlying problem. The difference, it seems, is that I have spent a lot more time battering these systems to actually get what I want out of them. So what we're going to do for the rest of the show is look at the set of techniques that I have found actually work to help me get what I want out of these systems and how to make them no longer average. The first, let's call a negative style guide. A lot of what makes things average is hackneyed approaches that can be overused words, overused analogies, overused turns of phrase. A lot of what gives us

Starting point is 00:05:33 the feeling of AI and LLMs being in patterns is these common elements that come up way more in AI writing or AI output than they do in human output of the same type. If you follow me on Twitter, you'll probably see me screech about the use of the word telemetry. I literally don't go a day without ChatGPT using the word telemetry at least two or three times in some strategic discussion or another. And I don't know that I've even once in my entire life heard an actual human being in the real world use the word telemetry. Not only does that make it distinctly feel AI, it also gives it the feel of someone who's trying to pretend they're smarter than they are by using bigger words than they need to. So in some cases, negative style guide prompting can be as simple as saying,

Starting point is 00:06:16 don't say telemetry for the love of everything holy. Another common negative style guide that I have to remind the LLM of is that I do not believe in titles that have colons, if there is any way to avoid them. I think the best titles, both on YouTube and in podcasts, are single, strong, clear statements, not things that have dashes and colons and multiple thoughts crammed together. As much as I ask LLMs to put that in their memory, it's something that I have to frequently remind them of. Negative style guide can go farther than that, though, and you can bundle it all into one prompt. So imagine a prompt, for example, follow this negative style guide, and then from there you list the things that you don't want it to do. Banned words, revolutionary, innovative,

Starting point is 00:06:58 leverage, synergy, disruption, telemetry. You could also tell it other things that you don't want it to do in terms of formatting, et cetera. In short, negative style guide is one of the simplest but most effective ways to get at least the landmines of averageness that you've identified off of the table. Next up is an approach I'll call forced divergence in choice. One of the things that makes LLM outputs, I believe, feel incredibly average, particularly with writing or any sort of structured thinking outputs, is that in general these models have hated making decisions that are firm and fixed and cut off other opportunities for fear that you as the prompter will disagree. How many times, for example, have you asked an LLM something like, should I do this or should I do that?

Starting point is 00:07:40 And instead of picking one and making an argument for it, the LLM says, well, if you value X, Y, or Z, you should do this, and if you value A, B, or C, you should do that. And it's not that that's not useful in some circumstances. It's that it's an almost pathological unwillingness to pick an option that cuts off other options. And that's not what good thinking in human existence is made of. Life is about making choices in what you do and in what you communicate and write. And so something that I very frequently do is force the model to recognize that based on what we're discussing, there are divergent paths, and it needs to pick and argue for one. A very frequent prompt that I

Starting point is 00:08:22 have is to force it to pick a single choice among many and argue vociferously why that is the best choice. Now, by the way, there are still some ways to take advantage of AI's ability to reason through lots of scenarios. For example, sometimes if I'm engaged in a strategic discussion with an LLM, I'll ask it to steelman the two or three arguments that we're discussing, in other words, make the strongest, most compelling argument it possibly can for that scenario or that option, and then after it has done that, after it has put itself in the position of having to make the best argument it possibly can for each of the different options, to then actually decide and commit to one. For whatever reason, that two-step process seems to get better

Starting point is 00:09:04 results for me, particularly around those strategic conversations. And while mostly this comes up for me in the context of those strategy conversations, I think that it's also going to apply to writing as well. One of the things that makes writing strong is the simple clarity of what it's trying to argue. You all remember, if you are of the right age, I'm sure, the five-paragraph essay: the first paragraph, which has your thesis statement, then three paragraphs to support it, and then the conclusion. Part of why that was gospel for so long is that it's making sure that the job of the essay is one clear argument. Unfortunately, AI's tendency to wander and equivocate can negatively impact its writing output as well.
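The two-step process described above, steelman every option first and only then force a commitment, can be sketched as a pair of prompt builders. This is a minimal sketch, not the exact prompts used on the show: the function names, the wording of the prompts, and the sample options are all assumptions made for illustration, and the strings would be sent as two consecutive turns in whatever chat tool you use.

```python
def steelman_prompt(options):
    """Step 1: ask for the strongest case for each option, with no verdict yet."""
    listed = "\n".join(f"{i}. {opt}" for i, opt in enumerate(options, start=1))
    return (
        "We are choosing between these options:\n"
        f"{listed}\n"
        "Steelman each one: make the strongest, most compelling argument "
        "you possibly can for it. Do not pick a winner yet."
    )


def commit_prompt(options):
    """Step 2: force a single choice and a firm argument for it."""
    return (
        f"You have now argued for all {len(options)} options. "
        "Pick exactly one and argue vociferously for why it is the best choice. "
        "Do not answer with 'it depends' or 'if you value X, do Y'."
    )


# Hypothetical options, used only to show the shape of the prompts.
options = [
    "Launch the paid tier now",
    "Wait one quarter and launch with case studies",
]
print(steelman_prompt(options))
print(commit_prompt(options))
```

Keeping the two steps as separate turns mirrors the point made above: the model argues every side before it is asked to cut off the other options.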

Starting point is 00:09:41 Today's episode is brought to you by my company, Superintelligent. You've got 100 what-if ideas, but which one becomes an agent? Superintelligent maps every AI use case across your company and helps you create an agent plan that you can actually execute. We match opportunities to your tech stack, your data profile, and your team. No more guesswork, just a clear path from pilot to production. If you want agents that deliver business outcomes, start with planning. Go to besuper.ai.

Starting point is 00:10:07 And sign up for a demo. AI changes fast. You need a partner built for the long game. Robots and Pencils works side by side with organizations to turn AI ambition into real human impact. As an AWS certified partner, they modernize infrastructure, design cloud native systems, and apply AI to create business value. And their partnerships don't end at launch.

Starting point is 00:10:28 As AI changes, Robots and Pencils stays by your side so you keep pace. The difference is close partnership that builds value and compounds over time. Plus, with delivery centers across the U.S., Canada, Europe, and Latin America, clients get local expertise and global scale. For AI that delivers progress, not promises, visit robotsandpencils.com slash AI Daily Brief. Chatbots are great, but they can only take you so far. I've recently been testing Notion's new AI agents, and they are a very different type of experience. These are agents that actually complete entire workflows for you in your stock. And best of all, they work in a channel that you already know and love because they are purpose-built

Starting point is 00:11:07 Notion super users. Notion's new AI agents completely expand the range of what Notion can do. They can now build documents from your entire company's knowledge base, organize scattered information into organized reports, basically do tasks that used to take days and get them complete in minutes. These agents don't just help with work, they finish it. Getting started with building on Notion is easier than ever. Notion agents are now your very own super user to help you onboard in minutes. Your AI teammates are ready to work. Try Notion AI for free at the link in our show notes.

Starting point is 00:11:36 This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding co-pilot of choice to bring an AI-native SDLC into their org.

Starting point is 00:12:19 Blitzy is providing a limited time, 30-day free proof of concept for qualifying enterprises. The team will provide a 5x velocity increase on a real development project in your org. Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. That's BLITZY.com. Another approach to breaking out of the averageness and sameness of standard AI outputs we'll call the cliche burn down. The idea here is to make the model expose and then replace the template it wants to use. So, for example, if you were writing some analytical essay, you might ask it to list the 10 most common cliches in terms of analogies or turns of phrase

Starting point is 00:12:59 that you might find in an essay like that. From there, you can ask it what would be better ways to communicate the same idea without falling into the trap of those cliches and then embed that logic in whatever output it has. Basically, the thing to recognize here is that there is a lot of value in getting the AI to identify the patterns that it's building off of so that it can then more conscientiously avoid those patterns. This is a surprisingly effective technique. If you did nothing else on this list, except at the end of each first pass output, say, what are the most common cliches this fell prey to, and how could you change it to avoid them, your output would instantly be significantly better than the generic LLM output. Closely related to this is the idea of self-critique.
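Before moving on to self-critique, the cliche burn down just described can be laid out as an ordered list of prompts: expose the template, ask for replacements, write the draft, then run the follow-up question on the first pass. This is a rough sketch under stated assumptions. The function names and most of the prompt wording are invented for illustration; only the final follow-up question is taken from the episode.

```python
def cliche_inventory_prompt(artifact_type, count=10):
    """Ask the model to expose the template it would normally reach for."""
    return (
        f"List the {count} most common cliches, in terms of analogies or "
        f"turns of phrase, that show up in a typical {artifact_type}."
    )


def cliche_replacement_prompt():
    """Ask for better ways to communicate the same ideas."""
    return (
        "For each cliche you listed, suggest a better way to communicate "
        "the same idea without falling into that cliche."
    )


def cliche_follow_up_prompt():
    """The single follow-up question to run on any first-pass output."""
    return (
        "What are the most common cliches this fell prey to, "
        "and how could you change it to avoid them?"
    )


def burn_down_conversation(artifact_type, count=10):
    """Return the prompts in the order they would be sent, one per turn."""
    return [
        cliche_inventory_prompt(artifact_type, count),
        cliche_replacement_prompt(),
        f"Now write the {artifact_type}, avoiding every cliche on your list.",
        cliche_follow_up_prompt(),
    ]


for turn in burn_down_conversation("analytical essay"):
    print(turn)
```

If the full sequence feels like too much ceremony, the last prompt on its own is the minimal version mentioned above.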

Starting point is 00:13:41 People, I think, are much too comfortable just using the first pass of whatever AI does. The whole idea, though, of unbounded, unlimited, and basically cost-free intelligence is that you can rerun a prompt over and over and over again. Or for our purposes, you can run an actual process around it, where the first output is just that, the first pass that then gets built upon. This is not dissimilar from the cliche burn down I just mentioned, but is more broad. In a single prompt you could, for example, say, draft a first version of a particular artifact, maybe an essay, maybe a pitch deck, maybe a presentation, then red team it and list the top five ways it's generic. Rewrite a V2 that fixes each issue and then explain why you changed what you changed. This sort of self-critique is incredibly

Starting point is 00:14:22 valuable and you can even add an additional dimension where you give it a context lens through which to critique itself. So for example, instead of just generically saying, list the top five ways it's generic, you could say, list the top five ways it feels too generic for an undergraduate audience. Further, one additional element that you can add to that sort of self-critique is to have different models do the critiquing as well. Now, not everyone is an insane person like me with premium subscriptions to every single LLM. But even for those of you who are, for example, just using OpenAI models, there is a big breadth of different approaches that these different models represent. Something I did yesterday is I was architecting a whole new

Starting point is 00:15:03 pitch for a part of the Superintelligent business, which is coming in 2026. I had been working through the architecture of the pitch with GPT-5 Thinking, which was a combination of inline research, strategic thinking, and messaging discussion. And it was doing a pretty good job. But if you've used GPT-5 Thinking and o3, they feel fundamentally like very different models, not better or worse, just very different. o3 is much more clinical. It's much more likely to give you lists and charts and tables. There's a certain concision and precision of thought that o3 goes for that GPT-5 Thinking doesn't

Starting point is 00:15:38 have in the same way, which is not to say that o3 is better for all use cases. o3 was very hard to get certain types of writing out of because of how badly it wanted to apply that sort of concision and chart-based information presentation. But what I did as part of this process was that at some point, fairly deep into the conversation, I mean, after I had been talking back and forth with it across about 15 different outputs in a single thread, I turned that whole thread into a link, shared it, flipped over to a new chat in the same app, toggled that new chat to the o3 model instead of the 5 Thinking model, and asked it to review and basically make a set of critiques and changes and argue for what it thought we, which was me and GPT-5 Thinking together,

Starting point is 00:16:20 were missing as part of the whole conversation. And sure enough, it had a bunch of interesting insights and a bunch of additional dimensionality to it, and all I had to do to give it the relevant context was, again, just give it that other ChatGPT link. So even within the environment of ChatGPT itself, without switching between Gemini and Grok, et cetera, I was able to get more from each of these different models by having them critique and go back and forth between each other. The last technique to get your LLM to be not average is an obvious one, a tried and true method. To the extent that you have an example of an output that you think is better than average, give the LLM that example.

Starting point is 00:16:58 However, the important thing that I think to add, which many people miss, is to actually take the time to explain why that example is better, and in particular why the consensus or conventional wisdom that it flouts is wrong or at least limited. A really bright blinking example of this for me is around pitch decks. There are an infinite number of articles across the internet about the standard 10 slide pitch deck. The problem statement, the solution statement, the product and what we do, the go-to-market, the team slide, usually in a very similar order. There is nothing wrong a priori with that. It's a fine starting point, especially as you are trying to architect your story.

Starting point is 00:17:38 But in point of fact, decks that stand out very rarely follow that template. Not because there's anything wrong with it, but usually because there is something distinct about a company or project that wants to find its way to the very first slide, even if that's not the appointed place in its order, as based on the average random blogger who said that this was the way that you should do decks back in 2014 that's now become conventional wisdom in LLMs. Superintelligent right now is growing 41% month over month when it comes to revenue. You better believe I'm not waiting until business slide six or whatever to show that. That is going on slide

Starting point is 00:18:12 number one. I am finding a way to get it there right up front. And this to me is a quintessential example of the LLM not doing anything wrong, but where its process of aggregating the collected and conventional wisdom of people who have built decks just makes for a generic product that is almost doomed to not do what the creator needs it to do. And to continue this example, if I had just shared a different deck that I had made that had numbers up front, it could have easily interpreted that as saying, oh, you always have to have the numbers up front. But that's not at all the point that I was trying to make.

Starting point is 00:18:45 What I would say about decks, as someone who has both created and consumed an infinite number of them, is not that you want necessarily the numbers up front or any one particular number up front, it's that whatever is the most special thing about what you're trying to tell the story of needs to be as close to the front as humanly possible so as to avoid losing people's attention before they get to that special thing. That's the type of instruction that you can give an LLM that it totally groks, pun intended, but it would be very easy for it to not understand that that's why you were saying that this example of a deck was better than the conventional wisdom version of a deck without you explaining

Starting point is 00:19:19 it. So to recap, the ways to make your LLM not average: negative style guide, telling it what you don't want it to do. Forced divergence in choice, don't let it fall into its normal patterns of equivocation. Cliche burn down, make it identify the templates that are shaping it and then change them. Self-critique, ask it more broadly to be critical of itself after it has output something in order to make a second version better. Switch models, get other models involved in doing that sort of critique so you can get the best of different models for different purposes. And finally, use examples and explain why the

Starting point is 00:19:52 consensus is wrong. I think you will find that if you use these strategies, while the AI sameness problem that Alex Kantrowitz wrote about won't go away entirely, it is something that you can manage, work with, and overcome for your own purposes, and especially right now, as everyone shifts to these methods of creation, there is big leverage in using them better. Anyways, friends, that is going to do it for this weekend episode of the AI Daily Brief. Hope you're having a great weekend wherever you are. And until next time, peace.
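As a companion to the recap, the negative style guide from the first technique can be sketched as a prompt builder plus a quick check on what comes back. This is a minimal sketch, not a tool mentioned in the episode: the banned words are the ones listed on the show, while the function names, the prompt wording, and the sample draft are assumptions made for illustration.

```python
import re

# Banned words as listed in the episode.
BANNED_WORDS = [
    "revolutionary", "innovative", "leverage",
    "synergy", "disruption", "telemetry",
]


def negative_style_guide_prompt(banned_words=BANNED_WORDS, extra_rules=()):
    """Bundle the banned words and any formatting rules into one prompt."""
    lines = [
        "Follow this negative style guide.",
        "Banned words: " + ", ".join(banned_words) + ".",
    ]
    lines += [f"Do not {rule}." for rule in extra_rules]
    return "\n".join(lines)


def find_banned_words(text, banned_words=BANNED_WORDS):
    """Return banned words that appear in text as whole words, ignoring case."""
    return [
        word for word in banned_words
        if re.search(rf"\b{re.escape(word)}\b", text, flags=re.IGNORECASE)
    ]


# Hypothetical draft, used only to exercise the check.
draft = "Our telemetry shows we can leverage this launch."
print(negative_style_guide_prompt(extra_rules=["use colons or dashes in titles"]))
print(find_banned_words(draft))  # ['leverage', 'telemetry']
```

The check matches whole words only, so a variant like "leveraged" slips through unless you add it to the list. That is a deliberate simplification here, and since the model has to be reminded of these rules frequently, a check like this is just a fast way to see when it is time to remind it again.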
