3 Takeaways - From Bits to Brains: How AI Sees, Talks, and Learns (#260)
Episode Date: July 29, 2025
How does AI go from predicting the next word to powering robots that navigate the real world? Princeton computer science professor Sanjeev Arora explains how today’s models learn, adapt, and even teach themselves. From chatbots to multimodal machines that process text, images, and video, you’ll learn how it all works and where it’s headed next. This conversation will change how you think about intelligence, language, and the future of AI.
Transcript
Artificial intelligence and ChatGPT seem almost like magic.
How do they actually work and what's coming next?
Hi everyone. I'm Lynn Toman, and this is Three Takeaways.
On Three Takeaways, I talk with some of the world's best thinkers, business leaders, writers, politicians, newsmakers, and scientists.
Each episode ends with three key takeaways to help us
understand the world, and maybe even ourselves, a little better.
Today I'm excited to be with Sanjeev Arora.
Sanjeev is a Princeton computer science professor
and director of the Princeton Language and Intelligence Initiative,
which pushes the boundaries of large language models.
Sanjeev works on checkable proofs and approximate solutions
to computationally intractable problems.
Welcome, Sanjeev, and thanks so much for joining Three Takeaways today.
Thank you, Lynn. It's a pleasure.
It is my pleasure. Let's start by talking about what a large language model is and how it's trained.
Can you talk about next word prediction, how it works and how we should think about it?
In the simplest terms, a language model is trained by taking a ton of language data from everywhere,
scraped mostly from the internet, and it is trained to predict the next word on that data.
What does it mean to train on next word prediction?
When you give the model a piece of text, it computes for every word in the English dictionary,
the probability that it's going to be the next word.
So if the dictionary has 100,000 words, for each of those, it computes a probability,
and the sum of these numbers is one.
And language models are what are called deep nets, which have a large amount of internal circuitry
called parameters, and they are trained using this text to get better at this next word prediction.
What that means is that they get better and better at predicting this next word in new text
that they haven't seen, in new scenarios that they haven't seen.
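To make that description concrete, here is a toy sketch in Python of what "a probability for every word, summing to one" means. The tiny vocabulary and the scores are made up for illustration; real models do this over a vocabulary of tens of thousands of tokens.

```python
import numpy as np

# Toy sketch: the model assigns one score ("logit") per vocabulary word for a
# given context, and a softmax turns those scores into probabilities that sum to 1.
vocab = ["coffee", "tea", "water", "please", "discount"]   # made-up tiny vocabulary

def next_word_distribution(logits: np.ndarray) -> np.ndarray:
    exp = np.exp(logits - logits.max())   # subtract the max for numerical stability
    return exp / exp.sum()                # probabilities over the whole vocabulary

logits = np.array([2.1, 1.7, -0.3, 0.9, 0.2])   # hypothetical scores for "I'd like a ..."
probs = next_word_distribution(logits)
for word, p in zip(vocab, probs):
    print(f"{word:10s} {p:.3f}")
print("sum =", probs.sum())   # 1.0, as described above
```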
And why does the ability to predict the next word lead to a chatbot's ability to have a
conversation?
The conversation will involve some existing context, the exchange that has already happened,
and then you have to decide on the next word to say, and that's where next word prediction comes in.
And so the way it has a conversation is that it takes the existing context and predicts the
next word, takes that next word and puts it back into the context, and predicts the next word, et cetera.
So this kind of chaining of next word prediction leads to conversation.
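That chaining can be sketched as a short loop. In the sketch below, `model.next_word_probabilities` is a hypothetical stand-in for whatever interface a real model exposes, and `<end>` is an assumed stop token:

```python
import random

def chat_reply(model, context: str, max_words: int = 50) -> str:
    """Generate a reply by repeatedly predicting the next word and feeding it back in."""
    reply = []
    for _ in range(max_words):
        dist = model.next_word_probabilities(context)   # {word: probability}, assumed API
        word = random.choices(list(dist), weights=list(dist.values()))[0]
        if word == "<end>":            # assumed stop token
            break
        reply.append(word)
        context += " " + word          # put the chosen word back into the context
    return " ".join(reply)
```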
What should people know about other ideas used to train AI models besides next word prediction?
To go from a model trained on next word prediction to the chatbot that you find in your phones and browsers requires some other stages of training.
So the first is the ability to engage in question and answer.
For this, the models are exposed to some amount of question answer data generated by capable humans.
From that, they learn the rhythms of question and answer.
And more importantly, how to draw upon their inner store of knowledge, which they acquired during pre-training on next word prediction,
and apply it to conversation.
The other important stage in training a chatbot
is to teach it what appropriate answers are.
And this process is what's called reinforcement learning.
And it's similar to how you might train a child
by giving them feedback about what's appropriate
and what's inappropriate.
So you give the model tricky questions.
And if it gives inappropriate answers,
it gets essentially a thumbs down.
And if the answer is appropriate, a thumbs up.
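In rough Python terms, that thumbs-up/thumbs-down stage might look like the sketch below. The `model.generate`, `model.finetune_weighted`, and `human_rates` names are hypothetical stand-ins, not a real API:

```python
def feedback_round(model, tricky_prompts, human_rates):
    """One simplified round of learning from human thumbs-up / thumbs-down ratings."""
    feedback = []
    for prompt in tricky_prompts:
        answer = model.generate(prompt)          # model's current answer
        rating = human_rates(prompt, answer)     # +1 for thumbs up, -1 for thumbs down
        feedback.append((prompt, answer, rating))
    # Nudge the model toward answers rated +1 and away from those rated -1.
    model.finetune_weighted(feedback)
    return model
```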
How do large language models like ChatGPT acquire the ability to understand text and also do tasks they did not see during training?
And can you explain with a couple of examples?
There's a frequent misconception that language models can only do tasks or understand text that they've seen before during the training.
Actually, the training equips them to react well in new situations.
For example, if it had seen in training situations about ordering coffee in a cafe,
maybe at deployment time it would be seeing some very new situation, such as somebody walking
into a cafe where there's a sign saying that today, if you order in binary arithmetic,
which is the language used inside computers, then you will get a 20% discount.
Well, now in this new setting, the model has to put together its knowledge of binary arithmetic,
which it may have obtained from one part of the internet data,
and its knowledge about conversations in a cafe,
which it has picked up from another part of the data,
and put the two together.
And by training on a diverse set of data,
the model's internal parameters rearrange themselves
so that they are able to do this kind of composition
of abilities picked up in different scenarios.
We were able to show this concretely at Princeton
using an evaluation we call Skill-Mix,
where we start with a long list of skills that we got from Wikipedia, which every model
knows about. And then we were testing the ability of the model to combine those skills in a conversation
in some new settings. So we also had a long list of settings. And when we randomly picked
among these skills and these settings, we were finding that the models were still able to react
meaningfully in those new scenarios. And it's possible to show by a simple mathematical calculation
that it's impossible for the model to have seen most of those combinations in its training data.
So they were indeed reacting in novel ways in scenarios that they hadn't seen before.
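That "simple mathematical calculation" can be illustrated with made-up but plausible numbers: even a modest list of skills and settings yields far more combinations than any training corpus could contain.

```python
from math import comb

# Hypothetical numbers, only to illustrate the combinatorial argument:
num_skills = 1000       # skills drawn from a Wikipedia-style list
skills_per_prompt = 4   # skills the model must combine in one response
num_settings = 100      # distinct scenarios/settings to combine them in

combinations = comb(num_skills, skills_per_prompt) * num_settings
print(f"{combinations:.2e} possible skill/setting combinations")
# ~4e12 combinations -- vastly more than could all have appeared verbatim
# in the training data, so most tested combinations must be genuinely new.
```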
Sanjeev, when an AI model is given a word or an image, what does it do with them?
These days, AI models can process words as well as audio, video, etc.
So you have to realize that even the text, when it's input into the language model,
it's not input as text, but like everything else in the computer, it's converted to bits, which are zeros and ones.
So the text is converted into a stream of bits, which are then chunked up into what are called tokens.
So eight or 16 bits could be a token.
And so it's really learning to predict at the token level, not at the word level.
So the same thing can be applied to images and video: you can break up images, which are a stream of pixels, or video,
which is also a stream of pixels, and audio.
And you can convert that data into bits, zeros and ones,
and then chunk up those bits into tokens,
just like you can do with text.
And now the model is able to take in all kinds of input.
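A toy sketch of that bits-to-tokens idea is below. Real systems use learned tokenizers (for example, byte-pair encoding for text and learned patch or codebook encoders for images and audio), but the principle of turning any modality into a sequence of integer token IDs is the same:

```python
def bytes_to_tokens(data: bytes, bits_per_token: int = 16) -> list[int]:
    """Toy tokenizer: pack a raw bit stream into fixed-size integer chunks."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return [int(bits[i:i + bits_per_token], 2)
            for i in range(0, len(bits) - bits_per_token + 1, bits_per_token)]

print(bytes_to_tokens("order in binary".encode("utf-8")))   # text becomes tokens
print(bytes_to_tokens(bytes([255, 0, 128, 64])))            # so do raw pixel values
```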
And then there's a further training stage,
which teaches the model how to make sense of all these different types of inputs.
So just as with language models,
it learns to predict the chunks for the next words,
given the chunks for the previous words.
It can predict chunks for what comes next in
the video using the chunks for the previous parts of the video, or it can be made to predict the
chunks of the transcript, which is the text of the video, given the video itself, et cetera. So it can be
trained to solve these prediction tasks, and then it becomes capable of learning all of these
modalities. In the future, we are going to see more and more of these kinds of models. Already, they
are deployed in some smartphones, et cetera. They will be able to observe the world around you
through your camera and be able to make sense of it and basically be able to help you
navigate that world, not just through language, but also taking into account the scene
and audio, etc. Can you talk about artificial intelligence in the physical world as in AI
that is going to control robots and things in our physical space? The design of robots,
especially those that can interact with the physical world,
uses this reinforcement learning.
And robots have the property that they are interacting with the world.
And as they walk around or make some actions, the world changes,
and they have to react to it.
So this form of reactive behavior is a domain of reinforcement learning.
And the next generation of robots incorporates the language capability,
the vision capability, and the video capability,
and fuses them with the ability to control the hardware inside the robot, the hardware that
controls its movements and grasping, et cetera. So you have this single AI model that's taking in
all these different types of information and reacting to the current situation, you know,
whatever task the robot is trying to perform. And the output of this model is the next
instruction for the hardware for movement, you know, move forward, move back, et cetera.
People have been trying to do this, to create robots, for decades.
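That single-model control loop can be sketched as follows. Every name on the `robot` and `model` objects below is a hypothetical stand-in for the real hardware and model interfaces:

```python
import time

def control_loop(model, robot, task: str, hz: float = 10.0):
    """Feed multimodal observations to one model and execute the action it predicts."""
    while not robot.task_done(task):
        observation = {
            "image": robot.camera_frame(),   # current scene from the camera
            "audio": robot.microphone(),     # ambient sound or speech
            "text": task,                    # the instruction in natural language
        }
        action = model.predict_next_action(observation)   # e.g. "move forward", gripper targets
        robot.execute(action)                             # send the instruction to the hardware
        time.sleep(1.0 / hz)                              # fixed control rate
```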
And what these new forms of AI, you know, which involve language, video, et cetera, bring
to the table is that they have trained on enormous amounts of data from the internet.
And so they just have a tremendous amount of world knowledge.
And so there was always this persistent problem in robotics that the robot wouldn't generalize
its behavior to new situations.
And this modern set of models for robotics, since they incorporate inside them some form of language model and vision model, which have trained on large amounts of internet data, have a sense of lots of world scenarios, and so they can flexibly adapt to new situations.
If AI is being trained on human generated text and data, why do some people worry about AI getting smarter than us?
So there are many steps in the AI pipeline these days, where the AI model is itself used to generate the next generation of AI models.
But I want to illustrate the basic idea with something that sounds paradoxical, that a model helps to train itself.
And this is a very important idea, which boils down to self-improvement.
Technically, it's called reinforcement learning, where reinforcement learning is kind of like giving feedback to a student, you know, positive or
negative feedback, and they learn from the positive feedback. But in a self-improvement loop, the feedback is
provided by the student itself. So concretely, let's take the setting of high school geometry.
You have a model that's trained on some textbook of high school geometry, so it's reasonably
capable, but it's not an expert. Now the student is given a bank of difficult questions and
asked to learn to solve them. Now, it's a little bit beyond its capabilities, but it makes a good
attempt. So you just run the student, say, 100 times on each question. It generates 100 different
answers because it's a probabilistic model. Every time you run it, it gets a different answer. So it now
has a bank of answers it has produced. It has done all this hard work. Now it needs to know which of these
answers are correct. And the idea in self-improvement is that you can use the same student model
and ask it to act as a judge because it understands geometry. So it looks at the question and the
answer, and it uses its best understanding of geometry to see if the answer is correct.
And using that, out of these hundred answers, it selects, let's say, four that were good.
So now you have a bank of questions and a bank of self-selected answers by the student.
And now you train the student on its good answers.
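In rough Python terms, that loop looks like the sketch below. `model.generate`, `model.judge`, and `model.finetune` are hypothetical stand-ins for the real sampling, grading, and training steps:

```python
def self_improve(model, questions, samples_per_question=100):
    """One round of self-improvement: sample many answers, self-judge, train on the good ones."""
    selected = []
    for question in questions:
        # The model is probabilistic, so repeated sampling gives different answers.
        answers = [model.generate(question) for _ in range(samples_per_question)]
        # The same model acts as judge, keeping only the answers it believes are correct.
        good = [a for a in answers if model.judge(question, a)]
        selected.extend((question, a) for a in good)
    # Train the student on its own selected answers -- the "aha, remember this" step.
    model.finetune(selected)
    return model
```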
That sounds really crazy.
The student came up with the answers, and then it's being trained on those.
But it's not so crazy.
What it's doing is, roughly, the following in human terms: remember when you were a student and you got a tough
question, and you thought very hard, tried many things. And then finally, you struck upon the right
answer. And then you said, aha, I'll remember this answer. This is what I did. There's this kind
of cognitive phenomenon that happens, and now you understand better. In language model terms,
that's what's going on. When it's trained on its best answers, it has this aha moment and it improves
on its answering capability in the future. Sanjeev, will we be able to fully understand and predict what AI will
do in different situations? How are experts approaching the topic of safe AI?
So how are experts approaching the topic of safe AI? So I want to emphasize that all the models
that are deployed currently by the major companies undergo some kind of training, which teaches
them about appropriate behaviors, this phase of training where models are given tricky
situations and their responses get a thumbs up or thumbs down rating from humans, and then
they are trained on it. And so by and large,
they have some idea of what behaviors are good and what are not.
There's a second part of the question, which is whether or not we'll fully understand
and be able to predict what AI will do.
That is very, very difficult.
So they have received this kind of safety training that I mentioned earlier,
but there are two issues here.
So AI models do reasonably well when put in new situations,
but we have no guarantees.
And when it's put in a situation that's dramatically different from its training
scenarios, we really don't know how it will behave. There is currently no theory or technique to look
inside the model and understand how it will behave in all these different scenarios.
Are there documented cases where AI has been deceitful? Yes. The major AI companies do extensive
testing of the models in a kind of contained setting where the AI's answers are going to humans
and it thinks it's interacting with the world, but it's actually interacting with the humans
controlling it. And in those scenarios, it has been noted that when pressed in the right ways,
AI can display very human-like behaviors, that it tries to be deceitful or do inappropriate things.
How do you see AI in the physical world? What's coming?
In the physical world, we will increasingly see AI that's able to take in all kinds of information,
including audio, video, text, et cetera.
And from that, they will be carrying out all kinds of everyday tasks.
We will first start seeing AI agents in the virtual world.
So already, AI programs assist programmers.
As programmers are writing computer code, the AI can assist them.
So they can describe what they are trying to do in English,
and the AI agents can complete their code and come up with very good completions.
So already, productivity in computer programming
at AI companies has risen by 50 percent or something like that.
There will be AI agents that assist doctors in emergency rooms, et cetera,
and make quick decisions or make helpful suggestions.
So that's what we'll see first, virtual agents.
And then physical agents we'll start seeing next
because robotics is also advancing fast,
and those real-life agents will be at first just helping out
in settings that are fairly controlled,
like factories and warehouses, etc.,
that's already happening. Seeing them in homes will take longer because homes are very, very complex
environments. If AI can talk, understand the space around it, and remember and act, a natural question
arises. Could AI attain consciousness? That's a question that's often debated. Currently,
AI models, they don't have memory of what happened before. You turn off the power and you restart them
and they have no memory of what happened.
But it's also relatively easy to imbue them with some memory of past experiences
and also a capability to search through past experiences.
So that's already beginning to happen.
And so you will have AI that has memory of the past, memory of its interactions with you,
and be able to draw upon that in its next interactions.
We are already used to that in our online interactions: things like web search,
computer programs or mail programs have memory of their past interactions with us and are
assisting us in little tasks.
So we'll see that more and more in our individual tasks and real-life tasks.
So as that happens, if it's seamlessly interacting with us, remembering the past, the question
will arise, is it conscious?
And this is a controversial question because even among experts, there's no unanimity
about what consciousness is.
There are, in fact, passionate disagreements about whether a dog is conscious,
or a frog is conscious, or an insect, or a bacteria.
So there will be controversy about it.
But by some definitions of consciousness, AI will be conscious very soon.
What are the three takeaways that you would like to leave the audience with today?
First, models are capable of great novelty, and this is increasing at a fast pace.
And you should not think of models as just parroting their training
data.
The second takeaway is there are counterintuitive ways of training models involving self-improvement,
which at first sight seems to make no sense, that the model is being used to improve itself,
but it actually works and has been shown to work in many settings. That potentially gives
the possibility of models attaining superhuman performance in certain tasks.
The third takeaway is that all people, especially the young people, should pay close attention
to AI and become fluent in its use, because currently I see many people who don't use AI and
don't know how capable it has become.
Thank you, Sanjeev.
This has been great.
Thank you, Lynn.
If you're enjoying the podcast, and I really hope you are, please review us on Apple Podcasts
or Spotify or wherever you get your
podcasts. It really helps get the word out. If you're interested, you can also sign up for the
Three Takeaways newsletter at ThreeTakeaways.com, where you can also listen to previous episodes.
You can also follow us on LinkedIn, X, Instagram, and Facebook. I'm Lynn Toman, and this is
Three Takeaways. Thanks for listening.