The AI Daily Brief: Artificial Intelligence News and Analysis - OpenAI's 6 Strategies For Better ChatGPT Prompting

Episode Date: June 7, 2023

OpenAI has published a new guide with 6 recommendations for better ChatGPT prompting. The recommendations come in response to numerous complaints on Twitter that ChatGPT has gotten worse over the last... few weeks. The AI Breakdown helps you understand the most important news and discussions in AI.  Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Discussion (0)
Starting point is 00:00:00 Today on the AI breakdown, in response to claims that chat GPT has gotten dumber, OpenAI releases a guide with six strategies for better chat GPT prompting. Before that on the brief, Instagram is working on a chatbot and Mark Andresen on why AI will save the world. The AI breakdown is a daily podcast and video about the most important news and discussions in AI. Like, subscribe, and share, and go to breakdown.network for more information. Welcome back to the AI breakdown brief, all the AI headline news you need in five minutes or less. If you want this via email, check out the link in the description to get the AI breakdown newsletter. Now, recently, we have had a lot of discourse around big tech and AI. In particular, with Apple's big WWDC event this week, people were wondering if Apple was going to jump into the generative AI fray.
Starting point is 00:00:48 Now, it didn't in a big way. Obviously, its emphasis was on its new Vision Pro, which is its entrance into the mixed reality, AR, VR, type of space. But there was still a lot of speculation around the AI strategies that may be lurking in the future. And in fact, in an interview on Good Morning America the day after the event, Tim Cook took the time to discuss chat GPT. He said that when it comes to Apple itself, they tend to try to integrate AI in rather than leading with it as the marketed feature, but that he uses chat GPT personally and that the company is taking a close look on its applications and how it might be used in the future. Now, the social media networks have been a little bit more aggressive about integrating AI into their experiences. In April, Snapchat integrated an AI chatbot into its application, and it was immediately the subject of questions.
Starting point is 00:01:30 and concerns. Now, over time, some of that has moderated, but it still hasn't been, let's call it, a breakaway hit feature. However, known tech leaker Alessandro Paluzi says that Instagram is now working on bringing AI bots into their chats for a, quote, more fun and engaging experience. Paluze shared a screenshot, allegedly from a forthcoming Instagram experience that says chat with an AI. Bring AI to your chats for a more fun and engaging experience. Answer questions, gives advice. Choose from different personalities, chat with 30 AI personalities, and find which one you like best. Inspire your creativity. Find the best ways to express yourself and get help writing messages. Now, that's not very much to go on, but that hasn't stopped people from
Starting point is 00:02:09 speculating wildly about what an Instagram bot might do for the app's experience. Moving over into the policy realm, the U.S. Senate is getting more serious about trying to understand AI. Senate Majority Leader Chuck Schumer has announced that over the next few weeks there will be three hearings on the topic. In a letter to his colleagues, he wrote, AI is already changing our world and experts have repeatedly told us that it will have a profound impact on everything from our national security to our classrooms to our workforce, including potentially significant job displacement. Given that, the first hearing will be a general overview of AI.
Starting point is 00:02:39 The second will be a focus on how to retain American leadership in the space. And the third will be a classified briefing concerning defense and intelligence issues surrounding the technology. The dates of those hearings have not yet been finalized. While the politicians might just be learning about AI, the consultants are already using it. McKinsey says that about half of its employees are already using generative A. This came from the senior partner and global leader at Quantum Black, which is McKinsey's artificial intelligence consulting arm. He said the McKinsey had created guardrails for employees using the tools, including, quote, guidelines and principles about what information the
Starting point is 00:03:11 workers could input into these services. In the world of applied business, artificial intelligence is finding its way into well everything. Yesterday, Google and Priceline announced a new partnership through which Google's AI will power a virtual travel agent. If you've seen a Twitter threat about generative AI or chat GPT or anything like that, you'll probably have seen this of travel use cases, one of the examples they give. With this partnership, price line is trying to get out ahead of that and integrate AI directly into its experience to give people a more tailored concierge type experience. Moving over to the technical side of things, one of the limitations of AI currently is its
Starting point is 00:03:44 context window. ChatGBTGBTGBT is currently has a context window of 8,000 tokens, which in English means somewhere between 4,000 and 5,000 words on average. That means that any more than that, and you have to split up what you're entering into Chatchipt, to multiple embeddings to try to help chat GPT understand. ChatGBTGPT will at some point roll out a 32,000 token context window, although it's said that one of its limitations is availability of GPUs, and about a month ago Anthropic introduced a 100,000 token context window on its Claude LLM, that corresponds to around 75,000 words, which is roughly the length of the Great Gatsby. While now a company
Starting point is 00:04:20 called Magic has introduced something called LTM1. This, they say, is a prototype of a neural network architecture that is designed for giant context windows. In fact, they say, this is an LLM that can handle 5 million prompt tokens. Five million prompt tokens, they say, is around 500,000 lines of code, which is enough to fully cover most repositories. They write, We tried to scale standard GPT context windows, but quickly got stuck. So we designed a new approach, the long-term memory network.
Starting point is 00:04:48 Training and serving LTM nets required a custom machine learning stack, from GPU kernels to how we distribute the model across a cluster. What's next? More compute. LTM nets see more context than GPTs, but LTMNets see more context than GPTs, but LTM1 has fewer parameters than today's frontier models, making it less smart. They're looking now for developers to help test the product as well as expand its performance, so I'm sure we'll be hearing more about this in the future.
Starting point is 00:05:10 Lastly, today, the man behind the Netscape browser and the software is eating the world thesis has written a new 7,000-word missive called why AI will save the world. He starts, The Era of Artificial Intelligence is here, and boy, are people freaking out. Fortunately, I am here to bring the good news. AI will not destroy the world and in fact may save it. Now, as I said, this is a 7,000 word piece. And so, of course, I'm training a voice model on myself so that I can have it read it to you on a long grade Sunday. But to briefly sum up, he goes through five risks that people have talked about with AI. Will AI kill us all? Will AI ruin our society? Will AI take all our jobs? Will AI lead to
Starting point is 00:05:48 crippling inequality? Will AI lead to people doing bad things? And tries to systematically dismantle each of those arguments. The conclusion that he comes to is that there is a different risk that he says is probably the scariest of all. He writes, AI isn't just being developed in the relatively free societies of the West. It is also being developed by the Communist Party of the People's Republic of China. China has a vastly different vision for AI than we do. They view it as a mechanism for authoritarian population control full stop. They are not even being secretive about this. They are very clear about it, and they are already pursuing their agenda. The single greatest risk of AI is that China wins global AI dominance and that we, the United States and the West, do not. Now, this is the type of piece that
Starting point is 00:06:28 is guaranteed to get a lot of discussion. So like I said, I'm working on reading a full version of it for Long Read Sunday this week. And as people write more responses to it, I'll also include those in future episodes. That's it. However, for today's AI breakdown brief, if you're enjoying, please like, subscribe and share and click that notification button. And I'll be back soon with the main AI breakdown. OpenAI releases a new guide giving six stress. for getting better results from chat GPT. Over the last couple weeks, there has been some scuttlebut about chat GPT getting noticeably worse. Jason DeMayo tweets, is it just me, or does chat GPT seem like it's getting dumber?
Starting point is 00:07:06 Maybe I'm just getting dumber. I mean, the dream writes, chat GPT 4 is dumber than it was a week ago. Why? DT deeper thrill writes, ChatGPT 4 from March was significantly smarter than whatever is being deployed now. They neutered my boy with all their fancy reinforcement learning to make it aligned and their quantized models to run faster. Starting to become unusable. They thought no one would notice. Well, us smart people noticed when something close to our level starts to sound like a normie. Don't demean us and
Starting point is 00:07:31 pretend you haven't changed anything under the hood. I can tell it's way dumber now. Shannon Sands responds to him and says, it's objectively worse on code generation. As in things it could do previously are suddenly difficult. It's ignoring supplied context and even pointing out its hallucinating doesn't help much. It'll literally output the same code unfixed or with new bugs. Francisco and San Francisco says, I've been using it to help me create an eye. iOS app since March and I've gone through three iterations of it since. No doubt this last version of my app has taken me way more prompts to get the results I got months ago. Now, Robert Scoble retweeted that post and said, has OpenAI gotten stupider for you? Arena Cronin says,
Starting point is 00:08:06 Yes, something's been done. Feels like an experiment which means something else is coming soon. Nick Dobos writes, yep, noticed a significant change in the way my prompts behaved. Unsure if model change or perhaps a new system prompt or RLHF safety pass. Ali Jules writes, yes, I've especially noticed this within the last week. I'm prompting same and I'm having to correct 90% plus of the output. I feel like I'm arguing with it and getting frustrated with the I apologize responses. I even ran a few English and language arts high school level test questions through chat GPT and it gave wrong answers to four of the five questions, frustrating and concerning. Now chat GPT definitely noticed all this chatter. Logan on their developer
Starting point is 00:08:41 relations team says, there have been a lot of threads and comments around the models in chat GPT and the API outputs getting much worse in the last few weeks. This is a huge reason why we open sourced OpenAI avals. You can write an eval and test the quality over time. No guesswork. I said it before, but I will say it again. The models in the API do not change unless we announce they have changed. For chat GPT, things are always in motion, but the most effective thing you can do to help us prevent and identify regressions is make an eval. However, OpenAI went a little bit farther. Ben's bytes wrote yesterday, man, open AI is damn sharp. I don't know if you've noticed, but Twitter has been pretty jam-packed lately with folks claiming that the quality of chat GPT
Starting point is 00:09:20 has been strangely slipping over the past few weeks. And only a few days later, outcomes a guide on prompting from OpenAI. Yeah, it's not us, it's you, is the message I'm getting here. So today, let's go over their six best practices with some applied examples. OpenAI writes, GPT best practices. This guy shared strategies and tactics for getting better results from GPTs. So on a high level, the six strategies are one, write clear instructions, two, provide reference text, three, split complex tasks into simpler
Starting point is 00:09:50 sub-tasks, four, give GBT's time to think, five, use external tools, and six, test changes systematically. So let's zoom into the right clear instructions bucket first, as this will be where a lot of people find most of their value add. The first taxic they give is pretty obvious, include details in your query to get more relevant answers. So they say a worse query is who's president, versus a better query is who was the president of Mexico in 2021, and how frequently our elections held. Perhaps a more relevant example for those of you out here who are smart enough to be watching this video is they say summarize the meeting notes is worse than summarize the meeting notes in a single paragraph, then write a markdown list of the speakers in each of their key points.
Starting point is 00:10:32 Finally, list the next steps or action items suggested by the speakers, if any. This is pretty obvious, so I don't want to spend too much time here. I just want to give you one example of something that I've found is often really helpful. And that is to specify at what level of education you want a response to come in at. So for example, let's take a recent AI research paper. It's called Recognize Anything, a strong image tagging model. And we're going to use the archive link with ChatGPT's X paper plugin to get a summary of this paper.
Starting point is 00:11:00 We're writing, can you please summarize this paper in a paragraph of less than 75 words? Okay, so now we've got this first summary. The paper Recognize Anything. A strong image tagging model introduces the Recognize Anything model, RAM, a robust model for image tagging. RAM uses large-scale image text perished for training, eliminating the need for manual annotations. The model's development involves four key steps, including automatic text semantic parsing, preliminary model training, data engine utilization for annotation, and model retraining.
Starting point is 00:11:29 Ram shows impressive zero-shot performance, surpassing clip, blip, and even fully supervised methods. And what we're going to do now is to ask to make it basically simpler, but we're going to put the request contextually. Can you please make this understandable for a early high school level audience explaining the applied relevance to the real world as well in a similar length of 75 to 100 words. What it comes back with then is the paper talks about a new computer program called RAM that's really good at understanding and labeling what's in a picture, like a cat or a car. Instead of needing people to tell it what's in lots of pictures first, which is how most of these programs learn, it uses a clever method to learn from pictures and words found together on the internet. This is useful because it can help
Starting point is 00:12:19 computers understand our world better, which could make things like search engines or photo apps more accurate and helpful. I have found over and over again that actually changing the educational level of the intended response is really, really useful for dialing in understanding, especially as you're trying to learn a new concept. Now, what about another tactic that OpenAI suggests asking the model to adopt a persona? Let's try this using the same example. Can you please imagine a conversation between podcast host Joe Rogan and a guest who is one of the researchers to explain the concept in a back and forth question and answer of no more than 250 combined words. Now, obviously, someone like Lex Friedman might have been a better example, but I wanted to
Starting point is 00:12:55 give ChatGPT a podcast host who was well known enough to have a body of information that it could pull from. So for a result, we have this dialogue. Joe Rogan says, so you've created a computer program that can identify anything in a picture, right? Researcher, yes, that's correct, Joe, we call it RAM. It's a new kind of image tagging model. Joe Rogan says image tagging like when Facebook recognizes your friend's faces?
Starting point is 00:13:17 Researcher, exactly but RAM goes beyond faces. It can recognize any common object in a picture. Joe Rogan, how does it learn to do that? Researcher, well, instead of us manually teaching it with labeled pictures, it learns from pictures and words found together on the internet. Joe Rogan, so it's like it's teaching itself? Researcher, you could say that. It's a more efficient way of training these models.
Starting point is 00:13:37 Joe Rogan, and how can this be used in the real world? Researcher, it can make things like search engines or photo apps more accurate. it. Imagine typing a word and getting exact image results or your photo app automatically organizing your pictures by what's in them. It's a big step forward and computer vision. What you're seeing here is that because we're working off of a reference text, it's not the information underlying it that's changing. It's the way that it's delivering the information in ways that we might be able to understand. And effectively what's going on here is we are helping chat GPT understand how to teach us what's in this paper, what's in this reference text. By the way, adopting a persona can be really interesting for creative examples as well. Last month, I asked ChatGPT to imagine that it was Albert Einstein. I said you've been transported with a time machine to the modern era. Please write a letter to a dear friend from back in your own time about the rise of generative artificial intelligence. I'd like you to explore what Einstein would have been excited about as well as what some of his concerns might be. Please try to mirror Einstein's manner of speaking and writing.
Starting point is 00:14:31 I'll create a shared link to this one so you can read it in full from the show notes, but the result it came back with was pretty fun. My dearest friend, I write to you from a future time unlike any we could have possibly imagined, a world that will fill your heart with awe, but perhaps also a threat of trepidation. Technology has taken leaps and bounds, flying over the hurdles of the unfathomable, landing squarely in the domain of the miraculous. Among the most astounding developments is the rise of what is known as generative artificial intelligence. One might liken this phenomenon to an infant who learns to parrot the words of his mother and father,
Starting point is 00:14:58 and then begins to string these together to create sentences of his own. Except in this case, the infant is a machine and its parents are the vast tracks of data it is fed. And the sentences of forms are so cleverly strung together, so packed with nuance and apparent understanding, that you would swear you were conversing with a fellow human. And therein lies both the promise and the potential peril. All right, but back to OpenAI's GPT best practices. Their second strategy for getting better results is providing a reference text.
Starting point is 00:15:23 Well, I think we've already been over that a little bit, given that I used a research paper and the X papers plug-in to do that. Now, one thing I will note is that this is one area where plugins do really shine. You can often find now the right plug-in to pull in information without just having to copy-paste it in the context window. A third strategy OpenAI gives is to split complex tasks into simpler sub-tasks. They write, For tasks in which lots of independent sets of instructions are needed to handle different cases,
Starting point is 00:15:50 it can be beneficial to first classify the type of query and to use that classification to determine which instructions are needed. This can be achieved by defining fixed categories and hard-coding instructions that are relevant for handling tasks in a given category. The example that they give is customer service queries. They suggest classifying each into a primary category and a secondary category. primary categories are things like billing, technical support, and account management, with the secondary categories, for example, under billing being unsubscribe or upgrade,
Starting point is 00:16:18 add a payment method, explanation for charge, dispute a charge. The idea here is that once ChatGBT is able to understand what category the input relates to, a more specific set of instructions can be provided to handle the next steps. For example, let's say that the query comes in in the form of technical support with the secondary category of troubleshooting. From there, the additional context could be things like ask them to check that all cables to and from the routers are connected. If all cables are connected and the issue persists,
Starting point is 00:16:43 ask them which router model they are using. Now you will advise them on how to restart their device and so on and so forth. Now, what are the thing to note in this split complex tasks into simpler sub-task is a lot of the issues here have to do with the context length for chat GPT. As we discussed before, chat GPT is currently limited to 8,000 tokens, which is around 4 to 5,000 words.
Starting point is 00:17:04 As larger context lengths come online, some of these tactics like summarizing long documents piecewise, and constructing a full summary recursively may no longer be necessary. The next strategy OpenAI suggests they call give GPT's time to think, although I kind of think they've mislabeled this. What they're pointing out is that chat GPT isn't really thinking, it's guessing in a sophisticated way. And so when we want a particular type of output,
Starting point is 00:17:27 we need to instruct it on almost how to guess or how to think. They write, sometimes we get better results when we explicitly instruct the model to reason from first principles before coming to a conclusion. So as a for example, I asked, do the benefits of AI outweigh the risks? Now, its first response is it started writing a very basic list of benefits versus risks. It was really just a summary of what others had said on the topic. It, in no way, reflected an actual thought process.
Starting point is 00:17:53 Now, this is useful if all I wanted out of this was a list of the benefits versus a list of the risks so that I could decide for myself. But if I wanted something from GBT that approximated a thought process, I needed to prompt it a little bit differently. So then I said, can you please now reason from first principles? on whether the benefits of AI outweigh the risks. Chat TPT responds, sure. Reasoning from first principles allows us to break down complex problems into fundamental truths and then reason up from there.
Starting point is 00:18:17 Let's apply that to the question of whether the benefits of AI outweigh the risks. The first principles it writes are that AI is just a tool. AI is powerful. AI is not conscious. AI capabilities are growing. Reasoning up from those principles, it says, considering these basic truths, we can start to form a perspective on whether the benefits of AI outweigh the risks.
Starting point is 00:18:35 One, dependence on human usage. two, risk mitigation. Three, lack of consciousness, four, future potential. Now, the conclusion is still not the same as if you ask the human to actually firmly take a stance one way or another, but at least now we have not just a list of benefits and a list of risks, but a thought process that we can cross-reference. Now, in this case, the example that they give is even more clear. The tactic they suggest is instructing the model to work out its own solution before rushing to a conclusion. It gives the example of a student who's trying to figure out the total cost of the first year of for a new solar power installation.
Starting point is 00:19:09 The first prompt they ask is just to determine if the student solution is correct or not, while the second prompt says, first work out your own solution to the problem, then compare your solution to the student solution to the student solution and evaluate if the student's solution is correct or not. Don't decide if the student's solution is correct until you have done the problem yourself. That second prompt allowed ChatChap ET to figure out that, one, the student solution was incorrect,
Starting point is 00:19:30 and two, where they made their specific error. The fifth strategy they suggest is using external tools. This is another area where plugins shine. Plugins allow us to point specific information that might be relevant to a particular query or question directly into chat GPT, which makes it less likely to hallucinate and more likely to get the information that we actually require. They also point to the example of using code execution to perform more accurate calculations or call external APIs.
Starting point is 00:19:56 The sixth strategy they suggest is test changes systematically. And this kind of gets back to Logan's points about how they're now open sourcing evaluation model so that people can get more involved in helping them improve chat GPT over time. However, for the vast majority of day-in, day-out use cases, I don't think this one is relevant as, for example, write clear instructions. So when all of a sudden done, how useful are these tips? I'm honestly not totally sure. I think content creators have done a great job of writing different tips and prompts and tricks, and even the ones that are doing it for engagement still can have some really valuable insights. I wouldn't be surprised if it would have been an even more valuable use of
Starting point is 00:20:32 Open AI's time to actually aggregate and create a repository of those free resources for people. But hey, maybe that is an opportunity for someone out there who is trying to help people figure out this new AI powered world. Anyways, guys, that is it for today's AI breakdown. Those are OpenAI's six strategies for getting better results from GPT. Let me know what you think. Let me know if any of these are actually useful for you. And of course, if you're liking this content, please like, subscribe and share it. Click the notification button so you don't miss an episode. Go listen to the podcast or subscribe to the newsletter. And until next time, Peace.

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.