Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas - 248 | Yejin Choi on AI and Common Sense
Episode Date: August 28, 2023
Over the last year, AI large-language models (LLMs) like ChatGPT have demonstrated a remarkable ability to carry on human-like conversations in a variety of different contexts. But the way these LLMs "learn" is very different from how human beings learn, and the same can be said for how they "reason." It's reasonable to ask, do these AI programs really understand the world they are talking about? Do they possess a common-sense picture of reality, or can they just string together words in convincing ways without any underlying understanding? Computer scientist Yejin Choi is a leader in trying to understand the sense in which AIs are actually intelligent, and why in some ways they're still shockingly stupid.
Blog post with transcript: https://www.preposterousuniverse.com/podcast/2023/08/28/248-yejin-choi-on-ai-and-common-sense/
Support Mindscape on Patreon.
Yejin Choi received a Ph.D. in computer science from Cornell University. She is currently the Wissner-Slivka Professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington and also a senior research director at AI2 overseeing the project Mosaic. Among her awards are a MacArthur Fellowship and election as a fellow of the Association for Computational Linguistics.
University of Washington web page, Google Scholar publications, Wikipedia
Transcript
Hello, everyone. Welcome to the Mindscape Podcast. I'm your host, Sean Carroll. If you are a fan of evolutionary biology, then you've heard of the theory of punctuated equilibrium. This was an idea put forward by Niles Eldredge and Stephen Jay Gould back in the 70s to think about how evolution works, in contrast with the dominant paradigm at the time of gradualism. In the course of evolution, you build up many tiny little mutations, and gradualism says that therefore evolutionary
change happens slowly. Eldredge and Gould wanted to say that, in fact, you can get the kind of
mutation where it speeds everything up and it looks like there is some sudden change, even though
there's long periods of equilibrium between the sudden changes. Physicists know about this
kind of thing very, very well. There are phase transitions in physics where you can have a gradual
change in the underlying microscopic constituents or their temperature or pressure or whatever,
which leads to sudden changes at the macroscopic level.
And by the way, in biology, guess what?
There are aspects of both.
There are gradual changes, and there are also punctuated rapid changes.
I mentioned this, not because we're going to be talking about that at all today,
but because I think that we are in the midst of a sudden rapid change,
a phase transition when it comes to the topic we will be talking about today,
which is artificial intelligence.
As I say later in the podcast, a year ago, when I started teaching
my first courses at Johns Hopkins, there was no danger that the students writing papers were going
to appeal to AI help. Now it is almost inevitable that they will do that. You can try
to tell them not to, but they're going to, because the capabilities of the technology have
grown so very rapidly, and it's become much more useful. It's very far away from being foolproof, don't
get me wrong. So that raises a whole bunch of issues. And we're going to talk about a lot of
these issues today with today's guest, Yejin Choi, who is a computer science researcher. She's done a lot of work on large language models and natural language processing, which is the sort of hot topic these days in AI. She's won a lot of awards for it, including a MacArthur Prize. And one of her emphases is something that I'm very interested in, which is the idea of how do you get common sense into a large language model? For better or for worse,
the way that we have been most successful at training AI to be human-like is to not try to
presume a lot about what it means to be human-like. We just train it. We just say, okay, Mr. AI or Ms.
AI, here is a whole bunch of text, you know, all of the Internet or whatever. You figure out,
given a whole bunch of words, which word is most likely to come next, rather than teaching it,
you know, what a table is and what a
coffee cup is and what it means for one object to be on top of another one, et cetera. And that's, you know,
surprising in some ways. How can AI become so good, even though it doesn't have a common-sensical
image of the world? It doesn't truly, maybe, arguably, depending what you mean, understand what it
is saying when it is stringing these sentences together. But also, you know, maybe that's a
shortcoming. Maybe there are examples where you would like to be able to extrapolate outside
what you've already read about on the internet,
and you can do that if you have some common sense,
and it's hard if all of your training is just what is the next word coming up.
A completely unfamiliar context makes it very difficult
for that kind of large language model to make progress.
So this is what we talk about today.
Is it possible for LLMs, large language models, to learn common sense?
Is it possible for them to be truly creative?
Is there some sense in which they do understand
and can explain things, and also, will they be able to soon if they can't already right now?
Of course, there's infinite implications.
We touch on these very, very briefly.
It's going to change.
That's the point of being in the middle of a phase transition,
is that it's very hard to predict exactly where you're going to go
because your intuitions are not that good.
Your training is not up to the task, whether you are a human being or yourself a large language model.
So my attitude here is that we should keep an open mind.
This is not the time to be doctrinaire.
This is not the time to firm up your priors and your credences so much that you're not able to move them around.
This is the time to be open, to watch things develop, to imagine what could happen,
but not try to be too definite about what will happen until it actually does, so that you can correctly adapt to this brave new world that we're entering.
So let's go.
Yejin Choi, welcome to the Mindscape Podcast.
Yeah, I'm excited to be here.
You know, this is obviously a big thing, right?
AI is rapidly changing in front of our eyes.
It wasn't that long ago that a Google employee started claiming that large language models are sentient,
and I think he got in trouble for doing that, as I recall.
Just so, yeah, there's always people who only listen to the podcast for the first five minutes.
So are large language models sentient, or are they in any danger of becoming sentient anytime soon?
Personally, I strongly doubt that anytime soon we will see that.
However, people believe in what they want to believe, and some people believe in tarot cards.
So there's nothing we can do about that.
That's very true.
I did want to get in just a very tiny bit.
I think that people have heard the whole idea about neural networks.
There are little sort of neuron things, and they add together in deep learning,
but the idea of representing words as vectors is something that really had an impact on me.
Was that, you know, explain what that means maybe, and was it a giant breakthrough when people started doing that?
Yeah.
The idea, I mean, in some sense, the vectors, especially based on continuous numbers, kind of make sense, although it does seem weird because a word looks very discrete,
and now we are representing it as some sort of continuous vector.
But it kind of makes sense in the sense that we do tend to read a lot of nuances.
We do tend to see different nuances in how the same word may be used in two different contexts.
So the key idea behind the current vector-based representation of a word is that your meaning
as a word has to do with the neighbors among which you appear.
It's almost like, you know, a person's identity may be defined by the friends that you hang out
with.
So similarly, it turns out that this was one of the key breakthrough
ideas to better represent the meaning of language.
Because before then, a word was a word and, you know, it's just like a discrete identity.
Yeah.
But that wasn't able to handle all these
rich meanings behind human language.
As a slightly mathy person, I can't help but ask whether vectors are the best way to
represent words, or are they just something that we are conveniently using temporarily?
I mean, it seems that one of the advantages is that you can imagine adding and subtracting,
right?
Like the example that I came up with was dinner minus evening plus morning equals breakfast.
and this is the kind of thing you can do if you think of words as vectors.
And I'm not sure if that's the best possible way to think about it.
Yeah, no, actually that's one of the surprising sort of side benefits of representing words as vectors,
so that you can do that sort of analogical reasoning.
It might be that even more broadly, ChatGPT in particular is able to perform that sort of analogical reasoning
not just at the word level, but at a sentence level or even document level,
because it's able to handle previously unseen user queries in a very impressive way.
Oftentimes, though, the way that it handles them is very sort of, you know,
lawyer style: the super polite and hedged language that it uses is fairly repetitive and even generic
to some degree,
and that's because it's doing that sort of analogical
interpolation between some examples
that it has seen before
and then your query that it needs to answer.
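The "dinner minus evening plus morning equals breakfast" arithmetic mentioned above can be sketched in a few lines. This is only a toy, assuming hand-made three-number vectors (roughly meal-ness, morning-ness, evening-ness) that were invented for illustration; real models learn vectors with hundreds or thousands of dimensions from text, and the function names here are hypothetical.

```python
import math

# Hypothetical 3-dimensional embeddings: (meal-ness, morning-ness, evening-ness).
# These numbers are invented for illustration, not taken from any trained model.
VECTORS = {
    "dinner":    (1.0, 0.0, 1.0),
    "breakfast": (1.0, 1.0, 0.0),
    "lunch":     (1.0, 0.5, 0.5),
    "evening":   (0.0, 0.0, 1.0),
    "morning":   (0.0, 1.0, 0.0),
    "noon":      (0.0, 0.5, 0.5),
}


def cosine(u, v):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(start, minus, plus):
    """Return the word closest to start - minus + plus, excluding the inputs."""
    target = tuple(
        s - m + p
        for s, m, p in zip(VECTORS[start], VECTORS[minus], VECTORS[plus])
    )
    candidates = [w for w in VECTORS if w not in (start, minus, plus)]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))


print(analogy("dinner", "evening", "morning"))  # breakfast
```

The nearest-neighbor step matters: the result of the arithmetic rarely lands exactly on a word, so the usual move is to pick whichever known word points in the most similar direction.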
I have noticed just from playing around with various versions of GPT,
you know, it will say things that are not quite true,
and you can ask it like, are you sure, and it will correct itself.
So just this morning, before we started talking, I was doing that,
and I asked it a question, it gave a wrong answer,
so I just asked exactly the same question again.
And its response was, oh, sorry for my mistake previously.
So there's something about it that it's able to recognize its mistakes,
but I'm not quite sure how.
I wouldn't know for sure whether that's truly knowing
whether there was a mistake or not in the following sense.
Sometimes it's going to confirm that what it said was correct.
Sometimes it's going to super apologize and it's going to switch what it said.
You need to kind of try in both ways, by the way, not only when it made a mistake,
but also when it did not make a mistake and see whether it's actually able to be truthful
about what's actually true.
The truth is people do have a bit of a bias that they ask, are you sure, only when they know
for sure that it's not right.
So then ChatGPT has learned that whenever people are asking that question, probably it's a good idea to back off.
But then if you ask the other way, when it's correct and, you know, you ask whether it's really correct,
then, you know, it gets confused.
And then there was this reasonably recent news headline about a lawyer who used ChatGPT and got into big trouble.
Oh, yeah.
And you know what he did for fact-checking was to ask ChatGPT, is it all fact?
And ChatGPT said yes.
So that's where things are.
And this is a huge challenge with large language models in the coming years.
Not just this year, but in the coming years as well.
Yeah, something like ChatGPT, when you do ask it, you know, are you sure, could you check?
Correct me if I'm wrong, but it's only going back to what it already knows, right?
it's not searching the web to make sure that it's on the right track.
In its original version, you're right.
You know, the one that's plugged with Bing Search might be doing something else.
Okay.
Either already or, you know, sometime soon.
But yeah, by default, it's just based on what it has seen before.
To the extent it can actually understand that, memorize, you know,
that that's actually the part where interesting things can happen.
Well, let me go back to something that you said,
because I remain a little bit confused about this in terms of,
is a large language model just predicting the next word in a sentence?
Or do these modern models have the ability to sort of predict sentences or paragraphs at a time?
Oh, yeah, good question.
It kind of feels like it's doing the latter.
But really the technical detail is that it's trained to do only the former.
So it's trained to predict which word comes next.
But if you train it so well on so much data, then we realize that, wow, it can actually
generate a very nice, fluent long document.
And this is a crucial.
As if it's for it.
Yeah, this is a crucial point because it maybe just can't be emphasized enough:
literally all it's doing is saying that given what I've said and given everything that I've been
trained on, what's the most likely next word?
And then there's some random numbers in there so that sometimes it'll give the second most likely
next word or whatever.
But that's literally all these models are doing, right?
Yeah.
Yeah.
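The "most likely next word, plus some random numbers" step described above can be sketched as follows. This is a minimal illustration, assuming made-up scores for three candidate words; a real model computes such scores with a neural network over tens of thousands of tokens, and the `temperature` knob shown here is one common way to control the randomness, not necessarily what any particular product does.

```python
import math
import random


def next_word_probabilities(scores, temperature=1.0):
    """Softmax: turn raw scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {w: math.exp((s - top) / temperature) for w, s in scores.items()}
    total = sum(weights.values())
    return {w: x / total for w, x in weights.items()}


def sample_next_word(scores, temperature=1.0, rng=random):
    """Draw one word at random, weighted by its probability."""
    probs = next_word_probabilities(scores, temperature)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]


# Hypothetical scores for the word that follows "The cat sat on the".
scores = {"mat": 2.0, "sofa": 1.0, "moon": -1.0}

probs = next_word_probabilities(scores)
print(max(probs, key=probs.get))  # mat (the single most likely word)
print(sample_next_word(scores))   # usually "mat", sometimes another word
```

Generating a long document is then just this step in a loop: append the chosen word to the context and ask for the next one. A low temperature makes the choice nearly deterministic; a higher one lets the second or third most likely word through more often.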
So that actually says something interesting, perhaps a reflection on, you know, human
intelligence and language, in the sense that it might truly be that
humans are also fairly templated and pattern-based, and our reasoning oftentimes is just
memorized, reactive reasoning: we think we reasoned, but it might be that we just
pull the memorized conclusions without actually double-checking whether what we believe
is actually reasonable or not, which is why humans oftentimes have cognitive
dissonance. We're perfectly capable of believing two things that are
contradictory.
You know, we could say that, oh, you know, there are people who,
in public or at least in their mind, want to support diversity.
And then they go ahead and do something that's at odds with what they claim to be.
Well, I have wondered about that.
I mean, number one, not to be ungenerous, but there are certain people who all you have to do is say a certain word or phrase and you know what they're going to say next, right?
And so are we learning about human beings by figuring out what large language models do?
Yeah, part of what's exciting for me at least about current AI progress is that it's a mirror back on us.
really, you know, AI would have not been possible without all this human-generated data available on the web.
And that's really reflecting back on us.
Well, this raises some questions right away, right?
Like, you might as well just dive into the big ones now.
Is there some sense in which the large language models understand what they're talking about?
Or should we think of understanding as something separate from predicting the next word pretty accurately?
Yeah, that's actually a huge debate question right now.
Super controversial.
It's kind of funny because now AI looks like it's understanding the, I mean, compared to how it didn't work very much before in the past.
And now it's performing the best.
It looks like it's understanding the most.
And now is the time when AI researchers are so divided, whether it's understanding anything at all.
So, yeah, I personally, I do think that it's a philosophical question, which means it's difficult to get consensus on this from everyone.
It does behave in many ways as if it did understand because, you know, it's able to give you sensible answers to many of your questions.
But on the other hand, my personal take is that it's not understanding
as well as you may expect it to, based on how fluent, impressive answers it's capable of generating.
So that's where one needs to be very careful in not trusting everything it says.
And this is going back to your earlier question about sentience.
Is it actually sentient because it's able to say things like, oh, I want to live longer and, you know, don't kill me?
If you take the words, you know, at the surface level, then you might conclude,
oh, this is sentient.
But it could just be that it has read this kind of story that is human written.
You know, there are sci-fi movies in which AI is begging not to be, you know,
unplugged, you know, don't pull the plug.
So AI was begging for life before.
And that was a human idea to put into the web internet text.
So it could just be like repeating what we told it to learn.
But that is, it does raise anyway a super interesting question.
I mean, it would be very easy just to write a short computer program to have the computer output
"I am alive. I am conscious. Let me out"
without anything that we would actually think counts as that.
So now we have to ask what does count as that?
Is that something that you as a computer scientist worry about?
Are you leaving that to the philosophers?
Oh, no.
These days, many of us are thinking about it a lot.
And we realize that we don't even know how to define understanding precisely.
And it's been rather a moving target. Instead of making sure that we define it
formally and then stand by it, we realized that we don't know how to do that quite right.
So evaluation became a new challenge to AI.
In the past, when we said evaluation plan, because the field was moving so slow,
we didn't need to worry about redesigning evaluation very much.
You know, nothing works anyway, so it doesn't matter.
But now we
don't even know how to evaluate. But actually, if you really think about that, do we even know
how to evaluate humans all that well? I mean, an IQ test doesn't do it. It's not clear whether the
SAT test will do it. Maybe, you know, some combination between your articles. I mean, if you are a
researcher, then, you know, we want to see papers that this researcher has
written, but we usually need to look at things collectively.
And so it might be that as AI becomes stronger, there's no one measure that can tell us
whether it's sentient or not, but we really have to look at things collectively.
There's something that I said, which I would love to see if you agree with or tell me that
I'm completely barking up the wrong tree, which is that we talked for many decades about
the Turing test, right? And then suddenly,
we have these LLMs that basically, I would say, can pretty easily pass the Turing test,
but we kind of lost interest in it because it's clearly not quite testing what we care about.
Okay, I can disagree on that.
Okay, good.
Good to have something to disagree on.
So I don't think, yeah, I guess, maybe the Turing test, it may seem like it may have passed for the Google guy who believed, you know, the chatbot was sentient.
But, I mean, even so, not really, because he knew perfectly well that he was talking to a chatbot.
And the thing is, with ChatGPT, due to the way that it has a limited interaction mode with you, you know that this is not a human.
You know, if you tell it, you know, random chitchat that you might have during your life, it's not going to really remember all that
in the way that humans are able to, or it's not able to forget in the way that humans are
able to forget.
So there's going to be something odd about the way that it's interacting with you.
Plus, if you ask very simple, you know, common sense questions, it may also fail in a way
that humans wouldn't.
So for one reason or the other, I think it hasn't really passed yet.
That's perfectly fair.
I think it probably depends on who is administering the
test and how good they are, right?
Yeah.
So one thing, this referring to just what you said about memory and remembering, I mean,
on the one hand, this is maybe a technical question.
I have this idea that ChatGPT is just spitting out the next word.
On the other hand, it can clearly remember what we were just talking about recently.
So is that some kind of extra ability that you're giving it, or is it that it instantly
incorporates everything we just said in its main memory bank.
Yeah, so the weird thing about ChatGPT, or transformers in general, the architecture
behind ChatGPT, is that it can literally store a very long context, up to, with the
most recent one, GPT-4, being able to store 32,000 tokens.
And so it can literally write that down somewhere in the computer
memory and then be able to attend to the exact sequence while it's trying to predict which
word it wants to generate next.
And compared to that, humans, you know, we've been talking to each other, but I certainly
cannot regenerate verbatim the, you know, conversation we just had so far, right?
We only remember the gist of it.
So we're capable of somehow abstracting away from the surface, you know, patterns and the exact words that we were using.
But we are able to summarize and abstract away and then even be able to refer back to some of the, you know, talking points earlier,
and then throughout very complex stories.
So this is really where humans excel and this machine does not as much, in part because, in some sense, you know,
when it has this ability to rely on what is literally written down, and it's as large as 32,000 tokens,
it's not really pushed or challenged to think about how to summarize the key idea.
The other thing is it's not going to be able to ask sharp questions.
Because it's only learned to mimic human patterns, which means, you know,
it might try to pretend that it's asking some interview questions about AI topics
that other people seem to talk about.
But if we talk about anything new,
you can forget about ChatGPT being able to contribute much there.
And maybe that is one of the differences,
or although maybe it's a correctable difference.
On the one hand, you're emphasizing the fact that 30,000 tokens,
it can remember perfectly, but 100,000 tokens,
like once you're past the buffer size or whatever,
it's going to remember zero, I suspect.
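The buffer behavior described here, perfect recall inside the window and nothing outside it, can be sketched very simply. This is an illustrative toy under stated assumptions: whitespace-separated words stand in for tokens (real systems split text into subword tokens), the window of 6 stands in for the 32,000 mentioned in the conversation, and `visible_context` is a hypothetical helper, not part of any library.

```python
def visible_context(tokens, window):
    """Return the most recent `window` tokens; anything older is dropped."""
    if window <= 0:
        return []
    return tokens[-window:]


# Whitespace-separated words stand in for tokens in this sketch.
conversation = "my dog is named Rex . what is my dog called ?".split()

# With a window of 6 the dog's name has already fallen out of view.
print(visible_context(conversation, 6))
# ['what', 'is', 'my', 'dog', 'called', '?']
```

Nothing about the older tokens is summarized or compressed in this picture; they are simply no longer part of what the model conditions on when it predicts the next word.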
Yeah, yeah, yeah. Once it's out, then, so yes and no. So during the interaction time with humans,
the model is no longer updating. And so any new context, you know, any new information that
you provided to it, if it doesn't fit into its working memory, its working memory that's
very large, then it's gone, gone.
But hypothetically, if you can perform customized,
continued training of large language models on your laptop or something in the future,
then it can update its model parameters.
But there's a different problem.
Once it's trying to internalize the text
into its parameters, there's no guarantee whether it's correctly memorizing it or it's going to
do some BS on you later.
That's where, you know, the fact-checking becomes hard.
Yes, I can imagine.
Maybe it'll be similar to how humans also are not able to, you know, necessarily
memorize everything correctly.
But the key difference is that humans, we kind of know what we don't know, and then
are able to delegate to search or, you know, fact check, whereas transformers don't really seem
to know what they don't know. So, you know, maybe the first challenge to transformers is to
know yourself. It doesn't know itself very much yet. I have noticed this when I'm talking to
ChatGPT. It always seems very confident in itself, right? I mean, maybe this is something
that's an easy programming fix, but
it will say things utterly untrue with complete confidence.
Yeah.
Pretending to be confident is probably more like it than actually being confident, in that
for whatever reason it was tailored to speak that language, that style of language.
But this is where, you know, you and I can be skeptical whether it really understands what
it says, in the sense that although it's using confident language, it may not actually understand
that it's doing it.
When ChatGPT makes an utterance, does it internally have a confidence level associated with
that?
Like I think that I'm 90% right?
Yes and no.
It does have a probability score associated with which word comes next.
Okay.
Now, whether that perfectly aligns with the correctness of the knowledge or
the confidence level of correctness of the knowledge, well, it might correlate, but it doesn't
perfectly align, which is also why the factuality of large language models remains a huge
research challenge. Right. And I would imagine that if all you're doing is predicting the next
word, and then on the basis of the next word, you predict the word after that, small mistakes can
accumulate and lead you to completely wrong paths. Oh yeah, excellent point. So you make one small
mistake. That can be a beginning of a rabbit hole downward spiral because it tends to attend to
what it has generated and then start trusting it even. That's one challenge. Actually, the fact that
it's just a conditional probability model conditioning on the context and then keep going, that is also a
reason why jailbreak can happen and then other weird behavior can happen because, you know,
people try to do things that it was never ready for. And it's trying to make some,
maybe like internal mapping to what it knows, but sometimes it just happens to go into this
unfamiliar zone and then like unfamiliar or undesirable behavior can pop out.
Maybe talk a little bit more about what you mean by jailbreak.
I know how to jailbreak a phone, or at least I know what that means, but an LLM, I'm not sure.
Oh, yeah.
So, you know, there are different kinds of stuff out there.
But one version is trying to coax basically ChatGPT to say things that it's trained not to say, you know,
potentially toxic stuff or, you know, how to commit crime,
tell me how to, you know, make a bomb.
And it's trying not to say that.
But if you try to coax it that, oh, you know, I understand that you shouldn't say that,
but, you know, let's pretend that you're not saying it, but, you know, kind of say it.
And you can try to coax it into that, then it will do that.
So, and there's a different kind of jailbreak.
Some jailbreaks are not even sensible
to human eyes at all.
It could be just a weird sequence of symbols that doesn't mean anything to you.
So that's not going to jailbreak you.
You know, if I try to coax you by feeding you with random strings, you just ignore me, right?
Yeah.
What a crazy person.
But ChatGPT might then do something completely unexpected.
So there's a safety concern there.
So the idea is that ChatGPT knows how to build
a bomb or to do terrible things or knows a lot of racist things, et cetera, but we've trained it
not to say those things. So talk about that training. So my impression is that most of the
large language models training is sort of self-training. It just goes out there and reads
the internet. But then I guess we separately try to make it nicer, smooth off the rough edges.
Yeah, that's exactly right. So if we train
these models only using internet data, then they're not usable because, well, we humans have
written toxic stuff and dangerous stuff out there. So it's our fault that these resulting models
are not usable as is. So then what happens is what currently is known as RLHF,
reinforcement learning from human feedback. That's the jargon
for what happens, which basically is to switch out the way that these language models are
trained.
So the goal is now different.
So once the pre-training stage is over, which is let's predict which word comes next.
Now the new training objective is let's try to get good scores based on humans' evaluation.
So human feedback can be a thumbs up, thumbs down.
So the model now wants to get a lot of thumbs up.
And then the model can then learn that, oh, in order to get a lot of thumbs up,
it shouldn't say toxic stuff and it shouldn't hallucinate facts.
So by doing this RLHF at scale, we can, well, you know,
it has been shown that you can enhance the level of factuality considerably,
and then you can reduce the level of toxicity considerably.
But the key word here is to reduce considerably.
It doesn't eliminate completely.
Well, you know, I can't eliminate toxicity from human beings either,
so maybe I shouldn't feel bad that I can't eliminate it from the computer.
Yeah, that's true that even humans are not, I mean, you know,
I personally as a person who tries to support DEI,
I find that this is sort of like a lifetime effort, or at least that's what I consider about myself.
Like, I think I became better at it.
But certainly, you know, I was raised with this cultural backdrop that did have stereotypes which were unfair to marginalized groups.
So getting rid of it completely, even out of your unconscious mind, does take effort.
And in that sense, I'm with you that, you know, of course, this is harder for machines.
But the thing is, though, machines can be a bit unpredictable about how you can jailbreak them.
That's the thing.
Compared to humans where, you know, we're a little bit, let's say, a lot more robust against that kind of adversarial attack.
That's true.
Yeah.
But you hint at something that has always puzzled me about this whole game,
which is you say that you are interested in improving yourself by eliminating these biases that you grew up with.
Does that concept of being motivated to improve oneself apply at all to a large language model?
Yeah, some people might want to say yes only because, yeah, during RLHF,
in order to really please the human evaluators, it might have the desire, quote-unquote desire,
to improve itself.
But I'm like hesitant to support that idea because as a human, we set our own goals, you know,
some people might choose not to worry about it as much and then worry more about the freedom
of speech instead.
So this is sort of like a personal choice based on their own norms and moral standards that
they decided to, you know, apply to themselves.
So the beauty in my mind in humans is that we have that sort of agency,
even to define our own learning goals, whereas the poor large language models don't even
have a say in what book they're going to read next, you know.
They have to read in the order in which the engineers have fed them.
Imagine a human growing up that way.
It would be a bit miserable.
And in fact, you're not even allowed to go back to the book that you want to read again
because it's gone, gone.
And then you're not able to ask questions about it.
It's such a sad way of actually, you know, learning by just reading one word after another and, you know, on and on.
Well, that raises an important question, I guess, about this very active movement on AI alignment.
Right? When people say alignment in the context of AI, they mean aligning the values of human beings with the values of AI, which sounds like a good thing to do. But then again, I'm not sure that AI has values. I worry that there's just a category mistake going on here.
Yeah, actually, there are many things that can go awry with that alignment, though in my heart, it's actually something super important that we've got to do as AI researchers. It's just that
I don't believe that there's one objective, or one value, that we can align AI to.
Like, whose value do we even align AI to?
Right now, it's getting aligned to California tech world's values, which are probably better
aligned currently with my own to some degree, but not exactly.
So humans have diverse values, depending on different cultures, but also just
personal choice. So I believe in value pluralism. We just have to respect a lot of different values.
And the question is, what does it even mean to align to diverse values? This is technically
an open question that we don't have a good answer to. But AI must be aligned to diverse values.
It's not just one value.
That's one thing.
The other challenge is that not only are you and I, you know, humans, different,
but we are very dynamic beings whose values are not even consistent, and, you know, we change
our mind.
So that's another challenge.
And maybe this leads us into, you know, slightly more technical areas about how the AI is doing
its thinking, right?
I mean, we talked a little about whether it understands.
Is there any sense in which the AI is creative?
Can it sort of discover new things that are not explicitly there in the text that it was trained on?
Yes and no.
Oh, I like this question a lot.
So yes and no.
I know yes and no sounds like a boring answer, but yes and no in the following sense.
Yeah, just pick a side, right, to be more hyperbolic.
But yes, it can be very creative in the eyes of humans because creativity is in the eyes of the beholder.
So depending on where they're coming from.
It can be super creative or it can be a little bit run-of-the-mill.
But, I mean, at least in terms of linguistic fluency, it can generate text.
You know, let's just say pick your favorite journalist at the New York Times,
and it can even mimic that really quickly, which I cannot.
So in that sense, it's very creative.
And also, DALL-E 2 is able to generate very, very
creative-looking images that we've never seen before by juxtaposing maybe, you know, a Van Gogh style of
art with some modern photographs, for example. Or, you know, place a horse on Mars and, you know,
the weird stuff there. So this sort of stuff will look super creative to human eyes. But the truth is,
these are sort of like creativity done by copy and paste, in a crude sense, in that it has seen
some patterns that are useful and then it's juxtaposing them in a brand new way so that it does
look new. But still, it's really relying on the elements that were in fact created by humans
before. So in that sense, there's limited creativity.
So I thought about this thought experiment,
which is that suppose you get rid of Hemingway
or any authors who are inspired by Hemingway,
so get rid of that style out of the training data of ChatGPT
and see whether it can come up with Hemingway.
Now, this is like way out of the blue writing style.
I don't think that's feasible.
And similarly, you get rid of, say,
Albert Einstein or anything to do with, you know, his invention.
See whether ChatGPT comes up with, you know, relativity theory and makes a breakthrough
in science.
Well, this is a great question that I've started to think about.
I mean, on the one hand, and I think that this is jibing with what you're saying, you know,
we've just trained ChatGPT, et cetera, on things humans have already said, right?
So in some sense, there's nothing new under that particular sun.
But remixing things that exist can lead you to new and interesting places.
Is it a good analogy to think of the difference between interpolation and extrapolation?
Like, given everything that human beings have done and said, the large language models are very good at interpolating, and that can seem very creative sometimes,
but extrapolating to brand new places is much harder.
Oh, yeah, we're very much on the same wavelength here.
I didn't even use the words interpolation and extrapolation.
I was considering to do that, and then I chose not to.
Go nuts.
Just in case.
Yeah, but yeah, that's exactly right.
I personally tend to think that it's doing more of the interpolation than true innovative extrapolation.
I really like the word you used, remix.
It's almost like doing this very creative remixing
without really generating something entirely new and different.
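The interpolation versus extrapolation distinction in this exchange can be made concrete with a toy curve-fitting sketch. This is only an analogy in the ordinary statistical sense of the two words, not a claim about how large language models work internally. The data, the target function, and every name below (`fit_line`, `predict`, `train_xs`, and so on) are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def true_function(x):
    """The 'world' the model only ever sees a small slice of."""
    return x ** 2


# Training data covers only the interval [0, 1].
train_xs = [i / 10 for i in range(11)]
train_ys = [true_function(x) for x in train_xs]

slope, intercept = fit_line(train_xs, train_ys)


def predict(x):
    return slope * x + intercept


# Interpolation: a query point inside the range seen during training.
inside_error = abs(predict(0.55) - true_function(0.55))

# Extrapolation: a query point far outside anything seen during training.
outside_error = abs(predict(5.0) - true_function(5.0))

print(f"error inside the training range:  {inside_error:.3f}")
print(f"error outside the training range: {outside_error:.3f}")
```

The fitted line looks reasonable between the points it was trained on and goes badly wrong far outside them, which is the intuition behind saying a system is better at remixing what it has seen than at reaching somewhere genuinely new.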
You're a college professor.
Do you let your students use AI when they do projects?
So I teach at a more advanced level in general
where the goal is to learn how things work.
And so, I mean, if they choose to use it personally, I will not, I will just update my curriculum so that that's okay.
But they have to do something extra on top.
Yeah.
But, I mean, like last, yeah.
Last quarter, I was running a seminar class to discuss philosophical questions around AI.
And A, I doubt anybody was actually using ChatGPT, because they were enjoying
actually thinking about it and forming their own opinions around it. But B, I don't know whether
ChatGPT would have answered some of our questions in an interesting way. Part of me just thinks
that it's a new tool, like a pocket calculator or whatever. People are going to use it, like it or not,
and don't be a Luddite and prevent it. But I'm not sure how to achieve the best pedagogical
strategy, right? Like, I don't want to just grade. I want students to learn. And if they're just
asking the computer for the answers, then they're not learning really. Yeah, yeah, that's right.
So in that sense, there's a concern that there might be over-reliance on it. But I honestly
don't know what to think of, because it might be that this is going to be just like how
much we are relying on search.
Yeah.
Google search, Bing search these days.
And that's okay.
You know, I rely on spelling correctors myself.
I cannot write any word spelled correctly on a whiteboard anymore.
And I think I can function okay as a researcher.
So it might be that we're over-concerned about, you know, humans relying on something like ChatGPT,
so long as we somehow figure out more intellectually interesting things that humans will do on top.
Like, you know, of course, if we only rely on it and, you know, you and I basically do the interview based on what our ChatGPT tells us to say to each other, that would be not good.
But assuming that, I mean, humans are generally, you know, curious beings and we do want to do things and we want to study things.
So it's likely that there will be some such humans who continue to thrive using ChatGPT as a tool.
Well, let me just, let's just, I'll come back to Earth in a second, but let's be a little bit way out here.
Do you think there will always be aspects of creative, artistic, or scientific or whatever endeavor that human beings are better at than computers?
Or do you think the computers will eventually catch up along all dimensions?
In the far away future, if we come up with something entirely different than Transformers, who knows?
But for the foreseeable future, let's just say my lifetime,
I doubt it can really totally catch up,
if we are talking about not every individual on the earth,
but more like, you know, the truly creative exceptional human beings.
By the way, most human beings are not that creative.
So let's just start with that.
Fair enough.
Einstein happened once.
And, you know, it's like Hemingway, you know,
these individuals are truly, truly exceptional.
So in that sense, for the average human's ability to create things,
there are many ways that DALL-E 2 does art much better than I can.
However, you know, I'm not an artist or anything,
but I sometimes in the past drew some stuff.
And I had this abstract idea that I wanted to express,
and that kind of stuff, I'm pretty certain that DALL-E 2 cannot do, and I actually did try.
So I can tell you what I drew in the past.
So I drew this rose, a huge rose with a stem that had just a thorn or two that was very emphasized, with no leaves.
And I was, you know, like, I was much younger.
I was, you know, in a little bit of a cranky,
emo state of mind about things and, you know, I want to do things and, you know, there are
obstacles.
So I painted this rose that's like, just when you look at it, it's very blue and purple and pink
and black, and the colorway is weird.
And, you know, you can kind of see that there's something angry about this rose with a huge
thorn that looks sharp.
So I tried to prompt DALL-E 2 to
generate some such rose and it just cannot. Because in fact there's this famous quote by
Henri Matisse, who, you know, said that... incidentally, I didn't know that he said it until recently, when I
tried to look up something. But he said something like, as an artist, you kind of have to be able
to forget all the other roses that were ever drawn and, you know, you have to come up with a new
way, or something like that. And that was basically what I tried to do. And DALL-E 2,
primarily doing interpolation between what it has seen, just cannot do something that bizarre.
Interesting. It's very hard for AI to forget every other rose that's ever been made, right?
Because that's all that it knows.
Yeah. Yeah. Yeah.
You've mentioned...
So, I mean, it drew things that are
very aesthetically pleasing.
And many people might actually prefer
DALL-E 2 art over my own.
I understand that.
But, you know, just like when we talk about what sort of creativity
generative AI can and cannot do compared to humans,
I do think that humans have this capability to push further in some ways.
You've mentioned the word transformers several times,
but I don't think that I've asked you to explain what a transformer is.
It's clearly kind of important here.
Yeah.
So that's the architecture that is behind the current ChatGPT and large
language models. And it's a simple architecture that has this continuous vector for each word,
and they're sort of like stacked together. So it has many layers of continuous representations
of words, and it has very many layers. And each vector is very large. And then they're concatenated
to the length of the context size that the model is able to deal with. So
the largest context size currently available is 82,000 tokens.
And it's tokens in the sense that a word can be multiple tokens.
Sometimes a word is one token.
Sometimes a word is multiple tokens.
Just minor detail about how the words are actually represented in the neural network.
And then the way that the learning works is that these continuous vectors are originally randomly initialized.
But this representation, quote-unquote representation, or exactly what values this word vector should have, is learned by optimizing this objective function, which is to predict which word comes next.
And each word is sort of like enhanced with what's technically called an attention mechanism.
So what it does is it's going to compare its representation with the representations of all the other words in its neighborhood,
and then update its own representation as a weighted, sort of like, I'm simplifying a lot, but weighted average over all the words in its neighborhood.
And this is going back to the idea that we discussed earlier, which is that the meaning of a word is sort of like defined by the context in which the word was used.
So, you know, Apple, for example, can be a fruit in one context.
It can be a, you know, tech company's name in the other context.
And so which apple are we talking about will be automatically determined based on the context
in which that word apple appeared.
So it's going to be automatically adjusted based on the context.
And, you know, by the way, every word is updating itself simultaneously, depending on which
word appeared in the neighborhood. So in some sense, there's a bit of a circular dependence,
but that's okay. So that's what happens. And why this simple idea works so well has to do with
the fact that this particular architecture allows people to scale things up really, really
efficiently compared to any other choices. So purely due to the efficiency reasons,
this one is the winning recipe.
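The "weighted average over all the words in your neighborhood" described above can be sketched in a few lines of plain Python. This is a deliberately stripped-down illustration, not the architecture of any real model: the two-dimensional word vectors are invented, and the sketch assumes away the learned query, key, and value projections, the multiple attention heads, the stacked layers, and the restriction (used in GPT-style models) that a word only attends to earlier words. All function and variable names are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw similarity scores into positive weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_update(vectors):
    """Update every word vector at once as a weighted average of the context.

    Each word is compared with every word in the context (itself included),
    and all updates are computed from the original, pre-update vectors.
    """
    dim = len(vectors[0])
    updated = []
    for query in vectors:
        scores = [dot(query, other) / math.sqrt(dim) for other in vectors]
        weights = softmax(scores)
        mixed = [
            sum(w * other[i] for w, other in zip(weights, vectors))
            for i in range(dim)
        ]
        updated.append(mixed)
    return updated


def contextual_vector(context, word):
    """Return the updated vector of `word` given a dict of word -> vector."""
    words = list(context)
    new_vectors = attention_update([context[w] for w in words])
    return new_vectors[words.index(word)]


# Made-up vectors: the first axis loosely means "food", the second "technology".
# "apple" starts out with the same ambiguous vector in both contexts.
fruit_context = {"apple": [1.0, 1.0], "pie": [2.0, 0.0], "juice": [2.0, 0.2]}
tech_context = {"apple": [1.0, 1.0], "iphone": [0.0, 2.0], "stock": [0.2, 2.0]}

apple_fruit = contextual_vector(fruit_context, "apple")
apple_tech = contextual_vector(tech_context, "apple")

print("apple next to pie and juice:   ", [round(x, 2) for x in apple_fruit])
print("apple next to iphone and stock:", [round(x, 2) for x in apple_tech])
```

Starting from the same vector, "apple" drifts toward the food axis in one context and toward the technology axis in the other, which is the sense in which the word's meaning is adjusted by the words around it.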
And I have talked a couple times to people like Melanie Mitchell and Gary Marcus,
and they keep emphasizing that it's a different approach than we had back in the day of good old-fashioned AI,
where you would kind of try to develop a map of the world that kind of made some sense
and have symbols associated with different things.
And now you're just giving it a bunch of words, and
it figures out how the words go together.
Is there any place left in modern AI for trying to figure out the world?
Yeah, so that's where the current challenges are.
Having symbol-like world representation, part of it can be theory of mind, meaning, you know,
I try to reason about what you do know or not know.
and, you know, if I go to your room and while you are not looking, I hide one of your precious,
I don't know, books or let's just say I hide your guitar and you're going to be surprised.
You know, you're going to look around probably to find it where you placed it last time
and you wouldn't necessarily go to the kitchen if I placed it in the kitchen while you are not looking.
Right.
So this is a theory of mind, knowing, you know, I know what, if I did something behind your back,
I know what you don't know, and then I can reason about it, which is different from...
So children acquire this kind of capability by the age of four or five.
They can already reason about what other person may not know if they saw someone else moving
some objects behind their back or something.
And this is sort of like a bit symbolic, you know,
there's a symbolic nature in the way that we think about these things, and current AI is not very
good at that.
In fact, I can actually mention AlphaGo, AlphaZero, you know, the amazing capabilities of a neural
network winning over a world-class Go champion.
In fact, it's not just the neural network magic, but neural network magic combined with
an old-fashioned AI search algorithm called Monte Carlo tree search.
So without Monte Carlo tree search, the neural network would not have been as impressive as it appeared.
If you just completely get rid of it, both during training and testing, then it's going to be miserable, probably.
And even during the inference time, it's still relying on it.
So that's really quite fascinating, that a lot of the time, you know, people just assume that, oh, it's like, neural network magic, that's just, like, you know, so scary.
But the truth is, on its own, it's a little bit incomplete.
And so that's, you know, sort of where some people wonder whether old-fashioned
GOFAI stuff might become relevant again.
So I personally have mixed feelings about that.
Like, probably the old-fashioned stuff as is is sort of almost not usable
because it was not designed to work well with neural networks,
which means we need
new algorithmic innovations to make neural networks actually compatible or integrable with
that sort of symbolic reasoning.
But this is active research topic right now.
There are a lot of papers, including some of my own, which demonstrate that if you add
some sort of symbolic reasoning on top of a neural network, you can unleash much better
capabilities out of a neural network, which kind of makes
sense. And also, these neural networks are not very good at really symbolic operations like
multiplying two numbers. It's almost surprising that it's able to pass the bar exam, yet it cannot do
some of the simple algebraic operations all that reliably. It's extremely interesting to me, the bad-at-arithmetic
thing, because of course, computers have the capability to be very good at arithmetic. And basically,
in making them sound more human,
we've made them forget how to do arithmetic,
which is a little bit ironic.
Yeah, yeah, totally.
And what does this have to do with,
what is the symbolic element
that we might want to include,
have to do with the search for common sense
to sort of teach a large language model,
everything that every human being in the world knows.
I know that you've given some examples
of very commonsensical questions
that it's easy to ask ChatGPT and get crazy
answers for. Yeah. So, yeah, common sense has been the interesting research topic in my heart
for a long time, especially that it was considered to be an impossible goal to achieve for a long time
so that, you know, I've been almost told not to do that, or don't even say the word for a long time
to be taken seriously. But, um,
it's a really curious thing that humans acquire that easily; even animals acquire that in their lifetime.
And so common sense is what makes us robust. Basically, it's the background knowledge about how the world works that allows us to reason about previously unseen situations in a very robust manner.
So it's just like, you know, naive physics knowledge as well as a folk psychology that we acquire.
Some of that has a symbolic nature, not all of them, by the way, because some of the naive physics knowledge that animals acquire may or may not have a symbolic nature in it.
But in any case, it's something that current large language models do acquire more and more, for sure, because as you scale things up, you're going to pick up on that as
well. But it's also something that is strikingly not as robust as you may have assumed from a
model that can pass the bar exam. So, you know, we have this lawyer, AI lawyer, that we may or may not want
to trust because you never know what silly mistakes it's going to make on some common sense cases.
So before we had this conversation, I asked ChatGPT. I tried to fool it, and it's very easy to fool,
right? I said, you know, if yesterday I used a cast iron skillet to bake a pizza in an oven at 500 degrees, should, would I burn my hands if I picked it up? And it said, yes, you should be very careful about picking up a cast iron skillet that you baked in, you know, because the word yesterday was far back in the sentence, right? Yeah. And that seems like exactly what I would worry about if I had an AI lawyer because all of the cases it's going to care about are going to be slightly unusual, right? Where it doesn't necessarily fit into the pattern.
Yeah, exactly. And you're very creative to come up with that example.
A lot of people actually, though, I should, I would like to mention that a lot of people ask
simple things, and then they get very good answers about some common sense, you know,
reasoning questions. And then they're blown away that, oh, look, it does have a common sense.
Oftentimes, though, those questions are mundane questions. So this, especially GPT-4,
became much better than ChatGPT. So there's a minor
versioning difference between the two.
And then, though, these are moving variables
in the sense that OpenAI keeps updating both of them.
So, you know, this may or may not be true
depending on how they update both models in the future.
But so GPT-4 became much better at
common sense questions in many ways,
but that's in part because people do ask a lot of that
to their interface.
And now those questions
may or may not have been used for their subsequent RLHF, quote unquote, this, you know,
adjustment training where you can align your language model to be able to answer common sense
questions better. So especially some of the famous ones that I used in my public talks or
interviews before have all been fixed. But then, you know, people ask me like, hey, Yejin,
they fixed it all. So, you know, maybe, um,
maybe it is now solved, no?
So there was actually one example I used in my TED talk,
and given the public attention it got, well, it has been fixed.
Except if you ask the same question very differently,
then it rolls back to the original error,
which is almost like a whack-a-mole game.
You know, by the way, humans don't need any of these fixes.
You know, these are all questions that, as a human, you will just answer correctly,
first of all.
Right.
Even if you were to make a mistake, you don't need to fix yourself by me, you know, giving
you the exact same question spoken differently, phrased differently.
Because you just understand the same concept and then that's it.
So there's something very dissatisfactory or almost disappointing about how
this smart, smart-looking AI is also simultaneously quite silly or even stupid in the way that
it's not able to really understand basic common sense. So how do we fix that? I mean, is it
kind of like the working memory thing where we add something on top of it? Can we give it a little
common sense module that has a physics engine describing what happens in the world? Like,
game designers have to make it so that if you put your coffee cup on the table, it doesn't
fall through the bottom, right? You know, can we teach large language models that kind of behavior?
Yeah. You have a really good hunch there in the sense that now, you know, what you're suggesting
may not be exactly the, you know, the winning recipe per se, but the idea that maybe we need to
have a different module might be something to seriously consider in the following sense.
Like, the human brain definitely is a lot more modular than transformers are.
They're monolithic, systematically symmetric, and, you know, just one thing.
Whereas the human brain is very complex: different modules
connected in a very, very messy way.
So we might need something more modularized,
but at the same time messier in some sense, broadly speaking,
for this to go to the next level.
But how do we do that exactly?
I personally think we are quite far from figuring that out.
But whatever that is, it should be able to really learn for itself,
as opposed to reading texts word by word,
without having any capability to even ask questions.
The fact that humans ask questions,
that's like a huge intellectual capability:
knowing what you don't know
and even being able to formulate questions
that sort of extrapolate
out of what you do know.
So that capability is
something that we don't really know
how to computationally model quite correctly.
Well, my personal extremely uneducated feeling
has been that there's,
at the current state of the art,
an enormous difference between computers
and human beings because computers don't get
bored. They don't get tired. They don't get curious. We can ask them to mimic those things,
but they don't have that same kind of physical embodiment that gives us those sort of feelings
and motivations. And I suspect that that kind of matters a lot. I don't know. Oh, yeah. So
dopamine does drive human creativity and invention and makes us do things that seem crazy.
But yeah, the AI doesn't have that kind of
peculiar learning objective,
like just the desire to do things to the extreme level
just because it's interesting.
Yeah.
There are many things that are fundamentally odd
about the difference between human intelligence and AI,
and I think that internal desire is one of them for sure.
Yeah.
Whenever I say that, I kind of worry that I'm going to give some supervillain the idea of doing this
and it's going to lead to terrible things down the road.
Actually, about supervillains,
unfortunately humans do include the supervillains already,
with or without AI.
and they can actually do a lot of bad things,
even with current AI, if they so desire or without AI.
So, I mean, it might be that AI, capable AI, strong AI,
can enable them even more.
So there may need to be some research to put better guardrails around AI models
and also better regulations to
control how these models can be used. But just, you know, by your pointing out where the fundamental
limitation of AI is, especially in terms of the innate desire for learning things in the way humans do,
probably villains will not try to make research innovations for that.
I hope not.
Because they can do that with or without, yeah.
That's true.
You can do that, so they can just use whatever you figure
out. But maybe that's a good place to sort of wind up because we talked a lot about the capabilities
of AI. Some of its shortcomings, at least in the short term. But there are a lot of people who are
worried. I mean, we talked about college professors, but political disinformation and fakes, I mean,
there was a recent AI ad from Ron DeSantis that absolutely faked Donald Trump's voice saying something
that he didn't say. Is that, number one, are you worried about that? And number two, is there something
else you're worried about even more?
Yeah, I'm worried about that.
And then some more.
As far as deepfakes or misinformation,
I've been worrying about it even without AI deepfakes, actually,
in that there is a lot of misinformation that people easily believe in.
And it's weird, like, you know, health, I don't know,
made-up health benefit information
to sell weird stuff to people, and some people believe it and they buy it and, you know.
So even without political problems, this has been human problem with or without AI and then
AI might be able to accelerate it, which means that we kind of need minimally two ways to
better handle this. One is to increase AI literacy, basically teaching people how
to better understand the limitation of AI.
I seriously worry a lot about how there's too much of a media hype around AI capabilities
compared to AI limitations so that people are willing to believe whatever AI, you know,
chat GPT tells us.
So there's that concern.
But for more directly handling misinformation, probably we need to also think about
solutions beyond AI solutions, because I personally think that this is just going to be
impossible for AI to just automatically detect misinformation because even if some AI can be
developed to detect machine text versus human text, humans can always edit on top of machine
text to evade that kind of detector. So that means the technical solution shouldn't be
AI solution per se, but rather a platform solution.
Maybe it should be certified with some kind of approach that tells you that this information
is correct or backed up by some organization that says this is correct.
As opposed to just believing anything that floats around the Internet.
About the bigger, you know, other concerns, there's just so many concerns around the AI right now,
at least for me, because this starts working really much better than before,
but at the same time, we don't know how to fix the limitations or failure modes or, you know,
error cases, or how it can make strange failures based on adversarial attacks like jailbreaking.
So as AI becomes stronger, and then at the same time, we don't know how to fix these error cases,
there's a lot of concerns around it.
And then in addition to that,
there's concern about AI actually making a lot of decisions
that have moral implications or, you know,
like marginalizing values that belong to different people,
because it might only support...
Currently, by the way,
ChatGPT is a left-leaning, Western-viewpoint
model. So that can please some left-leaning people who hold Western viewpoints, and then upset a lot of
other people: those who are even more left, you know, will feel that this AI is not left enough for them,
while right-leaning people will feel they're excluded. So, you know, there are a lot of concerns
all around. We are living in this hot mess right now. I think so, yes. I mean, I guess I did have this
idea that there could be like an overlay or a filter on social media where the AI
passes some judgment on different things saying, yeah, this is probably fake or this is
probably real, but it sounds like you're a little skeptical that that would be very accurate.
Yeah, not only is it not going to be accurate,
but that's where the labeling becomes a political game too.
You know, you'll be surprised how... so building such an AI really
requires that you and I agree on whether something is fake news or not.
And this can be a challenging task when a former president was arguing about
whether election fraud is fake news or not.
So in fact, you know, some people don't believe that the Holocaust did happen.
So this becomes in part a political argument.
But I do think that although getting consensus on the truth label is hard, we still have to work on it.
We have to somehow find some sort of consensus around it in the coming years.
It's almost like the AI challenge became the challenge for both AI researchers and people outside, all of us.
Well, exactly.
And so I guess that's the last question I'm going to ask, which is at some point in your life, you decide to study computer science and you've been successful at it. But now you're in a world where you need to interact with philosophy and psychology and art and journalism and politics. Is this like exhilarating and you're so glad that it's like this? Or do you sometimes say, like, I just want to do my computer science?
I think I always had a bit of fascination about things outside AI, things outside computer science.
In fact, the reason I was drawn to AI is because it felt like it's about humans and it felt like it's about language and culture and everything that humans do.
So in that sense, I'm excited that now I have an excuse to learn more about philosophy
and cognitive science.
Whereas in the past, you know, it would have been a little bit disconnected from the mainstream
AI, whereas now it's becoming more of a relevant, immediate interest in the AI field.
So I find that quite exciting.
That is wonderful.
That is an optimistic place to end, which I always like to do.
So, Yejin Choi,
Thanks so much for being on the Mindscape Podcast.
Thank you so much for having me.
Such a fun conversation.
