The AI Daily Brief: Artificial Intelligence News and Analysis - Why GPT-4-32k Will Totally Transform What We Can Do With AI
Episode Date: April 24, 2023
GPT-4 as most people use it has an 8K token limit. Some already have access to a version with a 32K limit, however, and are reporting hugely different opportunities in terms of what GPT can do, based on how much more text it can take in and output at once.
Transcript
The episode of the AI Breakdown that you're about to hear was originally released on YouTube on Monday, April 24th.
In this show, we discussed GPT-4-32K and just how many new use cases the expansion of the token limits of GPT will truly open up.
An AI developer recently wrote on Twitter,
widespread GPT-4-32K access will be a bigger leap than GPT-3.5 to GPT-4 was.
Today on the AI Breakdown, we're talking about what GPT-4-32K actually is and what it's going to change when you have access to it.
Welcome back to the AI Breakdown.
So for today's show, I wanted to give a chance for people to decide or weigh in on what they wanted to hear about.
There were two topics that stood out to me as potentially interesting.
The first was about Grimes.
Grimes last night retweeted the New York Times piece about the viral AI Drake and The Weeknd track and said
that she would collaborate and give 50% to any AI music creators who used her voice.
And then today she built on it and talked about what would or wouldn't be allowed.
And there's all sorts of interesting things around the future of creation and new business models for music.
And Grimes is weird and fascinating.
So I thought, hey, maybe that would be something that people were interested in.
But then I also gave folks the choice between that and GPT-4 moving to GPT-4-32K.
And of the 150 people who responded, nearly 80% wanted to hear about GPT-4-32K,
which I think reflects how interested in this people are.
So today we're going to talk about 32K, what it is, why you should care, what it will likely mean for you.
Now, you might have seen something like this tweet recently that kind of gives you an idea of what it means to move to GPT-4-32K.
So Lee Fang here says, I'm trying to crunch some numbers on various corporate filings, experimenting with ChatGPT.
I just tried a 66-page document and it says,
the message was too long, submit something shorter. Does the premium paid version of ChatGPT
allow for longer message requests? Mike Kim responds, GPT-3 and GPT-4 have 4K and 8K token limits in the
context window. A token is approximately 70% of a word. A token is not a character. Soon they will be
rolling out GPT-4-32K, which has a 32K token limit, but that will be more expensive. So all right,
we're getting the idea here that this shift to a 32K limit increases the amount of
information that GPT can ingest and potentially changes what it can output.
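To make the rule of thumb in that reply concrete, here is a minimal sketch in Python. It assumes the quoted ratio (one token is roughly 70% of a word) and treats the 4K, 8K, and 32K limits as 4,096, 8,192, and 32,768 tokens. The 500-words-per-page figure is a guess for illustration only and does not come from the tweet, and the function names are hypothetical. A real count would come from an actual tokenizer.

```python
# Rough token estimate from a word count, using the rule of thumb quoted
# above: one token is about 70% of a word, so tokens ~= words / 0.7.
WORDS_PER_TOKEN = 0.7  # approximation from the tweet, not an exact figure

# Nominal context window sizes in tokens (4K, 8K, 32K as discussed).
CONTEXT_WINDOWS = {
    "GPT-3 (4K)": 4_096,
    "GPT-4 (8K)": 8_192,
    "GPT-4-32K": 32_768,
}


def estimate_tokens(word_count: int) -> int:
    """Estimate how many tokens a text of `word_count` words will use."""
    return round(word_count / WORDS_PER_TOKEN)


def windows_that_fit(word_count: int) -> list[str]:
    """Return the names of the context windows the text would fit into."""
    tokens = estimate_tokens(word_count)
    return [name for name, limit in CONTEXT_WINDOWS.items() if tokens <= limit]


# Assumption: a 66-page filing at roughly 500 words per page.
filing_words = 66 * 500
print(estimate_tokens(filing_words), windows_that_fit(filing_words))
```

Under those assumptions the 66-page filing comes out to roughly 47,000 tokens, so it might still be too long for a single 32K request. The larger window raises the ceiling rather than removing it.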
So to go a little bit deeper on what this actually means, I turned, of course, to ChatGPT.
I asked, when discussing an AI model, what are token limits?
ChatGPT says token limits refer to the maximum number of tokens, which are units of meaning
that the model can process or generate in a single input or output sequence.
They're the building blocks of natural language processing models, and they represent words,
subwords or characters depending on the specific tokenization method. So the limits are significant,
it says, for three reasons. One is computational constraints, so how much it takes from a
computational resource perspective to generate text and respond to questions. A second is memory
limitations, how much they can deal with at a certain time, right? If the input or output sequence,
it writes, exceeds the token limit, the model may lose context, which can affect its performance.
And that gets to the third part, response quality.
Token limits, it writes, can also influence the quality of the generated text.
If the input or output sequence is too long, the model might generate less coherent or
less accurate responses.
So if you've ever messed around with ChatGPT, if you've had a long string of conversations
with it or questions that build on other questions, at some point it might get confused
about what the original context was.
Likewise, if you try to feed it a huge block of text, it might start to produce proportionally
less coherent responses. So going down a little bit, I also asked ChatGPT, what would it change to
move from the 4,000-token limit, which is what it was with GPT-3, to a 32,000-token limit? It says it would
significantly increase the amount of text that an AI model can process and generate in a single
sequence. This expansion could lead to new uses and improvements in existing applications.
It gave six examples. One, longer text analysis, so it could ingest
entire books, entire research papers, entire contracts, without having to break them into smaller chunks.
That means that it could provide potentially more context.
That's number two, improved context retention.
With a higher token limit, the AI model can retain context over longer input sequences.
So you have, one, the ability to ingest bigger sets of text or bigger inputs,
but then two, over a longer series of questions, the AI will be able to remember or keep track of that context.
Number three, it says better summarization.
The ability to process longer text could also improve the quality of the text summarization.
If you don't have to break a book into chapters, then the AI can ingest the entire book in one pass
and presumably come back with a better summarization
because it has the entire context all at once.
That applies to enhanced translation as well.
Now it says in the realm of chatbots and conversational AI,
a higher token limit would allow for more extensive conversations,
for kind of the same reason: it enables the model to maintain context over a larger number of messages.
And finally, creative writing and content generation.
With the ability to generate longer pieces of text, the model could be used for more extensive creative writing tasks,
such as writing full-length articles, stories, or even novels.
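The "break them into smaller chunks" step mentioned in the first example can be sketched as follows. This is an illustration under stated assumptions, not how any particular tool does it: it reuses the rough 0.7 words-per-token ratio, reserves a hypothetical 500 tokens per request for the model's reply, and uses a made-up 20,000-word document. The function name `chunk_words` is also hypothetical.

```python
WORDS_PER_TOKEN = 0.7  # rough rule of thumb, not an exact figure


def chunk_words(words: list[str], token_limit: int,
                reserved_for_reply: int = 500) -> list[list[str]]:
    """Split `words` into consecutive chunks that each fit the token budget.

    `reserved_for_reply` is an assumed allowance for the model's answer,
    since the prompt and the reply share the same window.
    """
    budget_tokens = token_limit - reserved_for_reply
    if budget_tokens <= 0:
        raise ValueError("token_limit must exceed reserved_for_reply")
    # Convert the token budget back into an approximate number of words.
    words_per_chunk = max(1, int(budget_tokens * WORDS_PER_TOKEN))
    return [words[i:i + words_per_chunk]
            for i in range(0, len(words), words_per_chunk)]


# Stand-in for a 20,000-word research paper (hypothetical size).
document = ["word"] * 20_000
for limit in (4_096, 8_192, 32_768):
    print(limit, "token window ->", len(chunk_words(document, limit)), "chunk(s)")
```

With these assumed numbers, the same document needs several separate requests at 4K or 8K but only one at 32K, which is the difference between summarizing summaries and summarizing the whole text with its full context.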
Now, going back to the quote that I started with at the beginning of this episode,
McKay wrote, widespread GPT-4-32K access will be a bigger leap than GPT-3.5 to GPT-4 was.
That was the quote that I started with.
The 4x larger context window combined with the power of GPT-4 will be absolutely insane.
People will be completely blown away by the complexity of workflows and use cases it will unlock.
And indeed, you're starting to see people actually talking about this more and more.
Christian here says the GPT-4-32K context window will immediately unlock two cost-effective use cases, in my opinion.
One, turbocharged programming can fit an entire project's worth of code into context.
And two, meeting summarization without chunking.
Average hour-long meeting is 15,000 words for better info sharing and recap.
Now, Matt Shumer here went even further and wrote a whole thread this morning.
GPT-4-32K makes regular GPT-4 look like a toy.
Here are some things it can do.
One, summarize and answer questions about an entire research paper.
He says, I literally just pasted the whole paper in the prompt, no embeddings required.
It's exactly what we were talking about when we were seeing what ChatGPT
said about this. When you can paste an entire big piece in there, it has the entire context of
that piece when it comes to trying to summarize it. Likewise, very similarly, Matt says you can take
in an entire codebase plus supporting documentation and make changes and improvements. The long
context length opens up fundamentally new opportunities to make ridiculously powerful developer
tools. A more personalized use case Matt suggests is pass in dozens of full articles and get a
personalized summary of the day's news. Summing up, Matt says,
we're moving towards a world of infinite context lengths. Massive new possibilities are opening up. Get ready.
Now, one more that I wanted to share from someone who's actually thinking about this on the front lines of an application they're coding.
This is Nate Chan, and he built an application called Storytime AI a few weeks ago. I love Storytime AI.
I think it's a super cool idea, and it actually is similar to an experiment I'm doing with my four-year-old daughter.
So basically, Storytime AI is an interface for using GPT to create
bedtime stories for your kids. You can toggle a bunch of different options in terms of how old they are and how long you want the story to be.
Now, the experiment that I'm running, which I'll have a video about soon, is basically that my four-year-old daughter and I are taking a character that she creates from her head. We turn it into a visual with Midjourney, and then we use ChatGPT, in a mediated interface between her and ChatGPT, to write a story together.
So it's a very similar kind of instinct. I think a lot of parents of young kids are having that idea. But the relevant part for our conversation today is that Nate
writes, I built this iOS app mostly over the course of a week, and it includes a nice
handful of features and some simple API calls and user interfaces.
GPT-4 can write straight-line code pretty well, and Auto-GPT will eventually be able to stitch
code and project files together with agents. I'm optimistic.
GPT-4's API will soon be available with a 32K token context window.
According to OpenAI's tokenizer tool, my app's codebase is only 25,000
tokens. It feels like an opportunity is brewing where a combination of GPT-4-32K and Auto-GPT will be able
to entirely write my app and further debug and add features with the full scope of the project
in one GPT-4 API call. Pretty insane. We don't need GPT-5 right now. With multimodal input and 32K
context length not even public yet, GPT-4 is just getting started. So what Nate's talking about is the
ability to, in a single API call, input the entirety or create the entirety of the code needed
for this application in terms of building it and debugging it, which is just pretty amazing to see.
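A quick back-of-the-envelope check of that single-call idea, as a hedged sketch: it uses the 25,000-token codebase figure Nate reports, treats the 32K window as 32,768 tokens, and assumes the prompt and the model's reply share that one window. The 300 tokens of instructions is a hypothetical number, as is the function name.

```python
CONTEXT_WINDOW_8K = 8_192
CONTEXT_WINDOW_32K = 32_768


def remaining_for_output(prompt_tokens: int,
                         context_window: int = CONTEXT_WINDOW_32K) -> int:
    """Tokens left for the reply once the prompt is in the window.

    A negative result means the prompt alone does not fit.
    """
    return context_window - prompt_tokens


codebase_tokens = 25_000    # figure reported in the quote above
instruction_tokens = 300    # hypothetical: the request describing the change
prompt_tokens = codebase_tokens + instruction_tokens

print("32K window, tokens left for the reply:",
      remaining_for_output(prompt_tokens))
print("8K window, tokens left for the reply:",
      remaining_for_output(prompt_tokens, CONTEXT_WINDOW_8K))
```

Under these assumptions the whole codebase fits in the 32K window with several thousand tokens to spare, which is room for a bug fix or a new feature, though not for rewriting all 25,000 tokens in the same reply. In the 8K window the codebase would not fit at all.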
Now, what he's talking about with this idea of Auto-GPT stitching code together, other people
are talking about as well.
Philippe Scheiber here says, trying to get BabyAGI to write code combining multiple files.
Not quite there yet, but it seems possible.
This is with GPT-3.5, so 4K tokens is the context
window. We'd love to try this in GPT-4 with 32K tokens, anxiously waiting. So again, what's going on here
is that developers are seeing that just around the corner, this increase in the token limits
is going to dramatically change the nature of the applications that GPT can help code. And that, I think,
is why, coming back to McKay and his original post that got this whole thing started, it's such a
bigger deal than even the shift from GPT-3.5 to GPT-4. It's really going to open
up this entirely new set of use cases. But as if that weren't enough, there's already research
showing that we could be looking at something entirely different and an entirely different scope
in the future. This research just came out, called Scaling Transformer to 1M Tokens
and Beyond with RMT. It's about a technique called the Recurrent Memory Transformer that can retain
information across up to two million tokens. Cigar here says this could fundamentally upend the current
state of LLMOps. But remember, it is just a research paper so far. Itamar Golan writes, is this the future
of large language models with unlimited tokens? Recurrent Memory Transformer is able to retain information
across up to 2 million tokens. What? Itamar continues, during inference, the model effectively used memory
for up to 4,096 segments with a total length of 2,048,000 tokens, significantly exceeding the largest
input size reported for transformer models. Most importantly, this augmentation maintains the base
model's memory size at 3.6 gigabytes in their experiments. But here's the real juice. Itamar writes
pros and cons. Pros: same memory consumption, almost unlimited length. Cons: decreased quality,
probably the same decay patterns we observed in RNNs, and also potentially very long inference time.
To summarize, he writes, this is still not a revolution, but it may become the foundation for the next
paradigms that will achieve much longer, allegedly unlimited token limitations.
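The numbers in that thread can be checked with simple arithmetic, and the general idea of reading a long input one segment at a time can be shown with a toy loop. To be clear, the sketch below is not the Recurrent Memory Transformer: the "memory" here is just a running count and sum, a stand-in chosen only to show that a fixed-size state can be carried across segments while the amount of data in view at any one step stays constant.

```python
SEGMENTS = 4_096
TOTAL_TOKENS = 2_048_000

# 2,048,000 tokens across 4,096 segments implies 500 tokens per segment.
segment_length = TOTAL_TOKENS // SEGMENTS


def process_in_segments(tokens: list[int], segment_length: int):
    """Toy segment-level recurrence with a fixed-size carried state."""
    memory = (0, 0)  # (segments seen, running sum): size never grows
    peak_items_in_view = 0
    for start in range(0, len(tokens), segment_length):
        segment = tokens[start:start + segment_length]
        # Each step only looks at one segment plus the small memory.
        peak_items_in_view = max(peak_items_in_view,
                                 len(segment) + len(memory))
        # Stand-in update; the real model learns what to write to memory.
        memory = (memory[0] + 1, memory[1] + sum(segment))
    return memory, peak_items_in_view


tokens = [1] * TOTAL_TOKENS  # dummy input, one "token" per position
memory, peak = process_in_segments(tokens, segment_length)
print("segment length:", segment_length)
print("final memory:", memory, "peak items in view:", peak)
```

The design point the toy captures is the trade-off listed in the pros and cons: the working set per step stays small no matter how long the input is, but everything the model knows about earlier segments has to squeeze through that fixed-size memory, and the segments are processed one after another.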
I want to sum up going back to Nate Chan's assertion here.
We don't need GPT-5 right now.
With 32K context length not even public yet, GPT-4 is just getting started.
I think that when you look at the type of applications that people are imagining,
inputting entire books or research papers or legal contracts, and having the model work with that whole context
at once, the expansion of token limits really does seem like one of the
immediate-term frontiers that could totally transform how we are already using these tools.
Now, unfortunately, for now we don't know exactly when GPT-4-32K will come to everyone, but I'm starting
to see more and more people who have access to it, who are playing around with it, and who are
consequently writing threads about all the cool things you can do with it. So that says to me that it
can't be all that far on the horizon. I will certainly keep you up to date with all of the most
interesting applications when I see them. And you better believe that
I will be thinking of some to use on my own.
That's it for now, guys.
Until next time, appreciate you listening or watching.
Peace.
