The AI Daily Brief: Artificial Intelligence News and Analysis - OpenAI's 6 Strategies For Better ChatGPT Prompting
Episode Date: June 7, 2023OpenAI has published a new guide with 6 recommendations for better ChatGPT prompting. The recommendations come in response to numerous complaints on Twitter that ChatGPT has gotten worse over the last... few weeks. The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/
Transcript
Discussion (0)
Today on the AI breakdown, in response to claims that chat GPT has gotten dumber, OpenAI releases a guide with six strategies for better chat GPT prompting.
Before that on the brief, Instagram is working on a chatbot and Mark Andresen on why AI will save the world.
The AI breakdown is a daily podcast and video about the most important news and discussions in AI.
Like, subscribe, and share, and go to breakdown.network for more information.
Welcome back to the AI breakdown brief, all the AI headline news you need in five minutes or less.
If you want this via email, check out the link in the description to get the AI breakdown newsletter.
Now, recently, we have had a lot of discourse around big tech and AI.
In particular, with Apple's big WWDC event this week, people were wondering if Apple was going to jump into the generative AI fray.
Now, it didn't in a big way.
Obviously, its emphasis was on its new Vision Pro, which is its entrance into the mixed reality, AR, VR, type of space.
But there was still a lot of speculation around the AI strategies that may be lurking in the future.
And in fact, in an interview on Good Morning America the day after the event, Tim Cook took the time to discuss chat GPT.
He said that when it comes to Apple itself, they tend to try to integrate AI in rather than leading with it as the marketed feature,
but that he uses chat GPT personally and that the company is taking a close look on its applications and how it might be used in the future.
Now, the social media networks have been a little bit more aggressive about integrating AI into their experiences.
In April, Snapchat integrated an AI chatbot into its application, and it was immediately the subject of questions.
and concerns. Now, over time, some of that has moderated, but it still hasn't been, let's call it,
a breakaway hit feature. However, known tech leaker Alessandro Paluzi says that Instagram is now
working on bringing AI bots into their chats for a, quote, more fun and engaging experience.
Paluze shared a screenshot, allegedly from a forthcoming Instagram experience that says chat
with an AI. Bring AI to your chats for a more fun and engaging experience. Answer questions,
gives advice. Choose from different personalities, chat with 30 AI personalities, and find
which one you like best. Inspire your creativity. Find the best ways to express yourself and get help
writing messages. Now, that's not very much to go on, but that hasn't stopped people from
speculating wildly about what an Instagram bot might do for the app's experience. Moving over
into the policy realm, the U.S. Senate is getting more serious about trying to understand AI.
Senate Majority Leader Chuck Schumer has announced that over the next few weeks there will be
three hearings on the topic. In a letter to his colleagues, he wrote,
AI is already changing our world and experts have repeatedly told us that it will have a profound
impact on everything from our national security to our classrooms to our workforce,
including potentially significant job displacement.
Given that, the first hearing will be a general overview of AI.
The second will be a focus on how to retain American leadership in the space.
And the third will be a classified briefing concerning defense and intelligence issues surrounding the technology.
The dates of those hearings have not yet been finalized.
While the politicians might just be learning about AI, the consultants are already using it.
McKinsey says that about half of its employees are already using generative A.
This came from the senior partner and global leader at Quantum Black, which is McKinsey's
artificial intelligence consulting arm. He said the McKinsey had created guardrails for employees
using the tools, including, quote, guidelines and principles about what information the
workers could input into these services. In the world of applied business, artificial intelligence
is finding its way into well everything. Yesterday, Google and Priceline announced a new partnership
through which Google's AI will power a virtual travel agent. If you've seen a Twitter threat
about generative AI or chat GPT or anything like that, you'll probably have seen this
of travel use cases, one of the examples they give.
With this partnership, price line is trying to get out ahead of that and integrate AI directly
into its experience to give people a more tailored concierge type experience.
Moving over to the technical side of things, one of the limitations of AI currently is its
context window.
ChatGBTGBTGBT is currently has a context window of 8,000 tokens, which in English means
somewhere between 4,000 and 5,000 words on average.
That means that any more than that, and you have to split up what you're entering into Chatchipt,
to multiple embeddings to try to help chat GPT understand. ChatGBTGPT will at some point roll out a 32,000
token context window, although it's said that one of its limitations is availability of GPUs, and
about a month ago Anthropic introduced a 100,000 token context window on its Claude LLM, that corresponds
to around 75,000 words, which is roughly the length of the Great Gatsby. While now a company
called Magic has introduced something called LTM1. This, they say, is a prototype of a neural
network architecture that is designed for giant context windows.
In fact, they say, this is an LLM that can handle 5 million prompt tokens.
Five million prompt tokens, they say, is around 500,000 lines of code,
which is enough to fully cover most repositories.
They write,
We tried to scale standard GPT context windows, but quickly got stuck.
So we designed a new approach, the long-term memory network.
Training and serving LTM nets required a custom machine learning stack,
from GPU kernels to how we distribute the model across a cluster.
What's next?
More compute.
LTM nets see more context than GPTs, but LTMNets see more context than GPTs,
but LTM1 has fewer parameters than today's frontier models, making it less smart.
They're looking now for developers to help test the product as well as expand its performance,
so I'm sure we'll be hearing more about this in the future.
Lastly, today, the man behind the Netscape browser and the software is eating the world thesis
has written a new 7,000-word missive called why AI will save the world.
He starts, The Era of Artificial Intelligence is here, and boy, are people freaking out.
Fortunately, I am here to bring the good news.
AI will not destroy the world and in fact may save it. Now, as I said, this is a 7,000 word piece.
And so, of course, I'm training a voice model on myself so that I can have it read it to you on a long
grade Sunday. But to briefly sum up, he goes through five risks that people have talked about with
AI. Will AI kill us all? Will AI ruin our society? Will AI take all our jobs? Will AI lead to
crippling inequality? Will AI lead to people doing bad things? And tries to systematically dismantle
each of those arguments. The conclusion that he comes to is that there is a different risk that
he says is probably the scariest of all. He writes, AI isn't just being developed in the relatively
free societies of the West. It is also being developed by the Communist Party of the People's Republic
of China. China has a vastly different vision for AI than we do. They view it as a mechanism for authoritarian
population control full stop. They are not even being secretive about this. They are very clear about it,
and they are already pursuing their agenda. The single greatest risk of AI is that China wins global
AI dominance and that we, the United States and the West, do not. Now, this is the type of piece that
is guaranteed to get a lot of discussion. So like I said, I'm working on reading a full version of
it for Long Read Sunday this week. And as people write more responses to it, I'll also include those
in future episodes. That's it. However, for today's AI breakdown brief, if you're enjoying,
please like, subscribe and share and click that notification button. And I'll be back soon with the
main AI breakdown. OpenAI releases a new guide giving six stress.
for getting better results from chat GPT.
Over the last couple weeks, there has been some scuttlebut about chat GPT getting noticeably worse.
Jason DeMayo tweets, is it just me, or does chat GPT seem like it's getting dumber?
Maybe I'm just getting dumber.
I mean, the dream writes, chat GPT 4 is dumber than it was a week ago.
Why?
DT deeper thrill writes,
ChatGPT 4 from March was significantly smarter than whatever is being deployed now.
They neutered my boy with all their fancy reinforcement learning to make it aligned and their quantized
models to run faster. Starting to become unusable. They thought no one would notice. Well, us smart
people noticed when something close to our level starts to sound like a normie. Don't demean us and
pretend you haven't changed anything under the hood. I can tell it's way dumber now. Shannon Sands responds
to him and says, it's objectively worse on code generation. As in things it could do previously
are suddenly difficult. It's ignoring supplied context and even pointing out its hallucinating doesn't
help much. It'll literally output the same code unfixed or with new bugs. Francisco and San Francisco
says, I've been using it to help me create an eye.
iOS app since March and I've gone through three iterations of it since. No doubt this last
version of my app has taken me way more prompts to get the results I got months ago. Now, Robert
Scoble retweeted that post and said, has OpenAI gotten stupider for you? Arena Cronin says,
Yes, something's been done. Feels like an experiment which means something else is coming soon.
Nick Dobos writes, yep, noticed a significant change in the way my prompts behaved.
Unsure if model change or perhaps a new system prompt or RLHF safety pass.
Ali Jules writes, yes, I've especially noticed this within the last week. I'm prompting
same and I'm having to correct 90% plus of the output. I feel like I'm arguing with it and getting
frustrated with the I apologize responses. I even ran a few English and language arts high school
level test questions through chat GPT and it gave wrong answers to four of the five questions,
frustrating and concerning. Now chat GPT definitely noticed all this chatter. Logan on their developer
relations team says, there have been a lot of threads and comments around the models in chat GPT and
the API outputs getting much worse in the last few weeks. This is a huge reason why we open
sourced OpenAI avals. You can write an eval and test the quality over time. No guesswork.
I said it before, but I will say it again. The models in the API do not change unless we
announce they have changed. For chat GPT, things are always in motion, but the most effective
thing you can do to help us prevent and identify regressions is make an eval. However, OpenAI went
a little bit farther. Ben's bytes wrote yesterday, man, open AI is damn sharp. I don't know if you've
noticed, but Twitter has been pretty jam-packed lately with folks claiming that the quality of chat GPT
has been strangely slipping over the past few weeks.
And only a few days later, outcomes a guide on prompting from OpenAI.
Yeah, it's not us, it's you, is the message I'm getting here.
So today, let's go over their six best practices with some applied examples.
OpenAI writes, GPT best practices.
This guy shared strategies and tactics for getting better results from GPTs.
So on a high level, the six strategies are one, write clear instructions,
two, provide reference text, three, split complex tasks into simpler
sub-tasks, four, give GBT's time to think, five, use external tools, and six, test changes
systematically. So let's zoom into the right clear instructions bucket first, as this will be where
a lot of people find most of their value add. The first taxic they give is pretty obvious, include
details in your query to get more relevant answers. So they say a worse query is who's president,
versus a better query is who was the president of Mexico in 2021, and how frequently our elections
held. Perhaps a more relevant example for those of you out here who are smart enough to be watching
this video is they say summarize the meeting notes is worse than summarize the meeting notes in a
single paragraph, then write a markdown list of the speakers in each of their key points.
Finally, list the next steps or action items suggested by the speakers, if any. This is pretty
obvious, so I don't want to spend too much time here. I just want to give you one example of something
that I've found is often really helpful. And that is to specify at what level of education you want
a response to come in at.
So for example, let's take a recent AI research paper.
It's called Recognize Anything, a strong image tagging model.
And we're going to use the archive link with ChatGPT's X paper plugin to get a summary of
this paper.
We're writing, can you please summarize this paper in a paragraph of less than 75 words?
Okay, so now we've got this first summary.
The paper Recognize Anything.
A strong image tagging model introduces the Recognize Anything model, RAM, a robust model for
image tagging.
RAM uses large-scale image text perished for training, eliminating the need for manual annotations.
The model's development involves four key steps, including automatic text semantic parsing,
preliminary model training, data engine utilization for annotation, and model retraining.
Ram shows impressive zero-shot performance, surpassing clip, blip, and even fully supervised methods.
And what we're going to do now is to ask to make it basically simpler, but we're going to put the request contextually.
Can you please make this understandable for a early high school level
audience explaining the applied relevance to the real world as well in a similar length of 75 to 100 words.
What it comes back with then is the paper talks about a new computer program called RAM that's really good at
understanding and labeling what's in a picture, like a cat or a car. Instead of needing people to tell it
what's in lots of pictures first, which is how most of these programs learn, it uses a clever method to
learn from pictures and words found together on the internet. This is useful because it can help
computers understand our world better, which could make things like search engines or photo apps
more accurate and helpful. I have found over and over again that actually changing the educational
level of the intended response is really, really useful for dialing in understanding,
especially as you're trying to learn a new concept. Now, what about another tactic that OpenAI
suggests asking the model to adopt a persona? Let's try this using the same example. Can you please
imagine a conversation between podcast host Joe Rogan and a guest who is one of the researchers
to explain the concept in a back and forth question and answer of no more than 250 combined words.
Now, obviously, someone like Lex Friedman might have been a better example, but I wanted to
give ChatGPT a podcast host who was well known enough to have a body of information that it could
pull from.
So for a result, we have this dialogue.
Joe Rogan says, so you've created a computer program that can identify anything in a picture,
right?
Researcher, yes, that's correct, Joe, we call it RAM.
It's a new kind of image tagging model.
Joe Rogan says image tagging like when Facebook recognizes your friend's faces?
Researcher, exactly but RAM goes beyond faces.
It can recognize any common object in a picture.
Joe Rogan, how does it learn to do that?
Researcher, well, instead of us manually teaching it with labeled pictures,
it learns from pictures and words found together on the internet.
Joe Rogan, so it's like it's teaching itself?
Researcher, you could say that.
It's a more efficient way of training these models.
Joe Rogan, and how can this be used in the real world?
Researcher, it can make things like search engines or photo apps more accurate.
it. Imagine typing a word and getting exact image results or your photo app automatically organizing your pictures by what's in them. It's a big step forward and computer vision. What you're seeing here is that because we're working off of a reference text, it's not the information underlying it that's changing. It's the way that it's delivering the information in ways that we might be able to understand. And effectively what's going on here is we are helping chat GPT understand how to teach us what's in this paper, what's in this reference text. By the way, adopting a persona can be really interesting for creative examples as well.
Last month, I asked ChatGPT to imagine that it was Albert Einstein.
I said you've been transported with a time machine to the modern era.
Please write a letter to a dear friend from back in your own time about the rise of generative artificial intelligence.
I'd like you to explore what Einstein would have been excited about as well as what some of his concerns might be.
Please try to mirror Einstein's manner of speaking and writing.
I'll create a shared link to this one so you can read it in full from the show notes,
but the result it came back with was pretty fun.
My dearest friend, I write to you from a future time unlike any we could have possibly imagined,
a world that will fill your heart with awe, but perhaps also a threat of trepidation.
Technology has taken leaps and bounds, flying over the hurdles of the unfathomable,
landing squarely in the domain of the miraculous.
Among the most astounding developments is the rise of what is known as generative artificial intelligence.
One might liken this phenomenon to an infant who learns to parrot the words of his mother and father,
and then begins to string these together to create sentences of his own.
Except in this case, the infant is a machine and its parents are the vast tracks of data it is fed.
And the sentences of forms are so cleverly strung together,
so packed with nuance and apparent understanding,
that you would swear you were conversing with a fellow human.
And therein lies both the promise and the potential peril.
All right, but back to OpenAI's GPT best practices.
Their second strategy for getting better results is providing a reference text.
Well, I think we've already been over that a little bit,
given that I used a research paper and the X papers plug-in to do that.
Now, one thing I will note is that this is one area where plugins do really shine.
You can often find now the right plug-in to pull in information
without just having to copy-paste it in the context window.
A third strategy OpenAI gives is to split complex tasks into simpler sub-tasks.
They write,
For tasks in which lots of independent sets of instructions are needed to handle different cases,
it can be beneficial to first classify the type of query
and to use that classification to determine which instructions are needed.
This can be achieved by defining fixed categories and hard-coding instructions
that are relevant for handling tasks in a given category.
The example that they give is customer service queries.
They suggest classifying each into a primary category and a secondary category.
primary categories are things like billing, technical support, and account management,
with the secondary categories, for example, under billing being unsubscribe or upgrade,
add a payment method, explanation for charge, dispute a charge.
The idea here is that once ChatGBT is able to understand what category the input relates to,
a more specific set of instructions can be provided to handle the next steps.
For example, let's say that the query comes in in the form of technical support
with the secondary category of troubleshooting.
From there, the additional context could be things like ask them to check
that all cables to and from the routers are connected.
If all cables are connected and the issue persists,
ask them which router model they are using.
Now you will advise them on how to restart their device
and so on and so forth.
Now, what are the thing to note in this split complex tasks
into simpler sub-task is a lot of the issues here
have to do with the context length for chat GPT.
As we discussed before, chat GPT is currently limited
to 8,000 tokens, which is around 4 to 5,000 words.
As larger context lengths come online,
some of these tactics like summarizing long documents piecewise,
and constructing a full summary recursively may no longer be necessary.
The next strategy OpenAI suggests they call give GPT's time to think,
although I kind of think they've mislabeled this.
What they're pointing out is that chat GPT isn't really thinking,
it's guessing in a sophisticated way.
And so when we want a particular type of output,
we need to instruct it on almost how to guess or how to think.
They write,
sometimes we get better results when we explicitly instruct the model
to reason from first principles before coming to a conclusion.
So as a for example, I asked, do the benefits of AI outweigh the risks?
Now, its first response is it started writing a very basic list of benefits versus risks.
It was really just a summary of what others had said on the topic.
It, in no way, reflected an actual thought process.
Now, this is useful if all I wanted out of this was a list of the benefits versus a list of the risks
so that I could decide for myself.
But if I wanted something from GBT that approximated a thought process, I needed to prompt it a little bit differently.
So then I said, can you please now reason from first principles?
on whether the benefits of AI outweigh the risks.
Chat TPT responds, sure.
Reasoning from first principles allows us to break down complex problems
into fundamental truths and then reason up from there.
Let's apply that to the question of whether the benefits of AI outweigh the risks.
The first principles it writes are that AI is just a tool.
AI is powerful.
AI is not conscious.
AI capabilities are growing.
Reasoning up from those principles, it says,
considering these basic truths,
we can start to form a perspective on whether the benefits of AI outweigh the risks.
One, dependence on human usage.
two, risk mitigation. Three, lack of consciousness, four, future potential. Now, the conclusion is still
not the same as if you ask the human to actually firmly take a stance one way or another, but at least now
we have not just a list of benefits and a list of risks, but a thought process that we can cross-reference.
Now, in this case, the example that they give is even more clear. The tactic they suggest is
instructing the model to work out its own solution before rushing to a conclusion. It gives
the example of a student who's trying to figure out the total cost of the first year of
for a new solar power installation.
The first prompt they ask is just to determine if the student solution is correct or not,
while the second prompt says,
first work out your own solution to the problem,
then compare your solution to the student solution to the student solution
and evaluate if the student's solution is correct or not.
Don't decide if the student's solution is correct until you have done the problem yourself.
That second prompt allowed ChatChap ET to figure out that,
one, the student solution was incorrect,
and two, where they made their specific error.
The fifth strategy they suggest is using external tools.
This is another area where plugins shine.
Plugins allow us to point specific information that might be relevant to a particular query
or question directly into chat GPT, which makes it less likely to hallucinate and more
likely to get the information that we actually require.
They also point to the example of using code execution to perform more accurate calculations
or call external APIs.
The sixth strategy they suggest is test changes systematically.
And this kind of gets back to Logan's points about how they're now open sourcing evaluation
model so that people can get more involved in helping them improve chat GPT over time. However,
for the vast majority of day-in, day-out use cases, I don't think this one is relevant as,
for example, write clear instructions. So when all of a sudden done, how useful are these tips?
I'm honestly not totally sure. I think content creators have done a great job of writing different
tips and prompts and tricks, and even the ones that are doing it for engagement still can have some
really valuable insights. I wouldn't be surprised if it would have been an even more valuable use of
Open AI's time to actually aggregate and create a repository of those free resources for people.
But hey, maybe that is an opportunity for someone out there who is trying to help people figure out
this new AI powered world. Anyways, guys, that is it for today's AI breakdown. Those are OpenAI's
six strategies for getting better results from GPT. Let me know what you think. Let me know if any of
these are actually useful for you. And of course, if you're liking this content, please like,
subscribe and share it. Click the notification button so you don't miss an episode. Go listen to the
podcast or subscribe to the newsletter. And until next time,
Peace.
