Into the Impossible With Brian Keating - How to train ChatGPT to serve you | AI Legend Terry Sejnowski [Ep. 475]
Episode Date: January 19, 2025Please join my mailing list here 👉 https://briankeating.com/list to win a meteorite 💥 Can AI tools like ChatGPT actually "think," or is it just really good at mimicking human conversation? How ...much of what AI spits out is a reflection of our own ideas and intentions? And where's all this tech headed in the future? Today, I’m joined by Terry Sejnowski, a renowned computational neuroscientist and pioneer in AI and deep learning. Based at the Salk Institute for Biological Studies and the University of California, San Diego, he bridges neuroscience and AI to explore how biological brains and artificial systems learn and process information. Terry is also the co-creator of the Boltzmann Machine, a game-changing algorithm that has shaped today’s AI and is a foundation for modern neural networks. He has also written some incredible books, including The Deep Learning Revolution” and “ChatGPT and the Future of AI. In our conversation, we discuss the current state of AI, what’s next, the ins and outs of prompt engineering, the mirror hypothesis (how AI reflects us), its impact on productivity, and the ethical challenges we must tackle. Tune in to discover the truth about AI! — Key Takeaways: 00:00 Intro 00:36 Judging a book by its cover 07:05 The impact of AI on human capabilities 12:32 Einstein's happiest thought and can AI think? 16:51 Can AI come up with something novel? 24:20 The role of learning and neural networks in AI 32:37 The future of AI and its potential impact 46:43 The mirroring effect 50:52 Practical applications of AI 54:00 Outro — Additional resources: ➡️ Learn more about Terry Sejnowski: 📚 ChatGPT and the Future of AI: https://a.co/d/cZnuZbs ➡️ Follow me on your fav platforms: ✖️ Twitter: https://twitter.com/DrBrianKeating 🔔 YouTube: https://www.youtube.com/DrBrianKeating?sub_confirmation=1 📝 Join my mailing list: https://briankeating.com/list ✍️ Check out my blog: https://briankeating.com/cosmic-musings/ 🎙️ Follow my podcast: https://briankeating.com/podcast — Into the Impossible with Brian Keating is a podcast dedicated to all those who want to explore the universe within and beyond the known. Make sure to follow/subscribe so you never miss an episode! Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
Discussion (0)
It's peak pollination season and my business is scaling fast.
To keep the nectar flowing, I need a phone plan with top priority data speeds.
That's why I chose Google Fi Wireless.
My connections stay strong even when the hive is buzzing.
Plus, unlimited plans started $35 a month.
Now that's a deal that doesn't stay.
Explore GoogleFi Wireless plans today.
Plus taxes and government fees.
GoogleFi Wireless is not subject to data traffic deprioritization during times of high network usage.
You said this place was steps from the water.
We just haven't found the steps yet.
How much did we save?
Enough.
Enough to get lost.
Or you could book a stay with Hilton.
Welcome to your ocean front room.
Just steps from the water.
The Hilton sale is on now.
Book on Hilton.com or the Hilton app
and save up to 20% to get the stay you expected.
When you want savings, not surprises.
It matters where you stay.
Hilton.
for this day. Chat GDP is judging the intelligence of the person asking the questions.
Deep question, I'll think of a deep answer. Or a silly question, I will give a silly answer.
In other words, it's in a sense of mirror. Everybody compares it to humans and I think that's
the wrong way to think about it. It's a tool, it's like a shovel and you have to know how to use
the tool properly. But if you do, you can actually dig much deeper than you could with your hands.
Any sufficiently advanced technology is indistinguishable from magic.
the pod bay doors. Welcome back to the Into the Impossible podcast. Today we are exploring the nature
of minds, brains, and even whether or not these new and impressive tools like chat chit are actually
doing thinking. And I'm joined by a good friend of the campus and of all of science in particular,
Dr. Terry Sadozky of the Salk Institute and also join employment at UCSD biological neurobiology,
right? That's correct. Good. And we have a lot of friends in
Tomlin, Andrew Huberman, Gentry Patrick, both, well, Gentry was a guest on the show.
I'm trying to get Andrew on maybe for his new book.
Today we're talking about your books, and I have two of them.
This one I'd wanted to have you on, the Deep Learning Revolution, which was really prescient,
and it did kind of presage a lot of the stuff that we're talking about today in your new book
and that everybody's talking about.
And that's ChatGPT.
So I'd like you to do what you're not supposed to do, which is to judge a book by its cover.
Okay.
So the title, subtitle, and the cover art.
Well, first of all, let me say that the MIT Press gets awards for their art.
They're very, very design and high-quality paper and that sort of thing.
So the background design actually mirrors the one for the first book.
Yeah.
If you look at the cover, and it's kind of an abstract version of a brain, you know,
which is where we are because of the fact there's a convergence going on between the study
of scientific study of how brains work, but also now creating AI, that's an engineering problem,
are now using the same mathematical framework. And so this is a great opportunity for exchange of
ideas, concepts. And now we have these chat GPT and other large language models, which, you know,
we don't understand how they are able to accomplish all the different things that we ask them to do.
The book is sort of an interesting, almost like a collaboration with AI, you know,
which is something that we tell our students not to do.
But you kind of envision and interact with chat GPT throughout the book.
So tell us about that.
What led you to choose that sort of format?
So as I was going through and writing the book, what I realize is that, you know,
people learn with concrete examples.
I mean, and also with the things that they may have heard about.
And I wanted to actually give these concrete examples in the form of a dialogue that I had.
And I was actually, what inspired me at the beginning was an article I read in The Economist,
which was before chat GDP was public.
It was available to academics.
And there was the article was talking about it and asked two different people.
you know, do a dialogue and come up with their conclusions. And I'm not going to give names here.
I'm just going to say that. One of them was based on theory of mind. And this is the idea that I
understand how other people's minds work. And that's why you can make predictions about things
that, you know, are very, very abstract and having the social issues and so forth. But this is another
person to understand about something and whether they see it or not. In any case, it passed the test.
and that was amazing because at that time no one really understood what the real capabilities were.
And the other person, these are both very distinguished people, by the way,
asked questions like how, when was the Golden Gate Bridge taken across the Nile River or the Gold Goulda?
No, the Desert, the Gobi Desert.
Yeah, exactly.
And the answer got back was, well, October 2nd, 1964.
And questions like that made the response of that.
particular dialogue was that it was clueless. In fact, it didn't even, it had no clue that it was
clueless, right? And so here's these two extreme conclusions. There's a puzzle. How could that be?
And I finally traced it down to the prompt. So the prompt is what you ask the chat GDP to do,
as you know. And by modifying the prompt a little bit, I was able to figure this out. I said,
I want to repeat exactly the questions that the second person gave it.
But I also, in the prompt, said, if it's a nonsense question, say nonsense.
Right.
And now I gave it exactly the same questions, and it came back nonsense for each one.
So clearly, what's going on here is that you ask it a stupid question to get a stupid answer.
In fact, it's kind of playing with you, right?
And then I traced it back to another concept, which, you.
which is the fact that it can take any persona, because it's been trained on the world's knowledge base,
and novels, newspaper articles, computer programs, right?
And so it can take any one of those persona. You can ask it to write a short story in the style of Hemingway, and it will.
Right? It can take that persona. So you have to tell it what persona you wanted when you're asking it questions.
Now it's elementary. Everybody knows that.
But back then they didn't.
And so if you jump in with, like I say, a question that is in the ether, you don't get any
reasonable answer because ChatsyDP doesn't know.
And I call this the reverse Turing test.
Turing test is to judge, a human to judge from the responses whether or not you're interacting
with a human or an AONI, right?
Yeah.
Well, Chachadip obviously has passed that test long ago, right?
Some say it has.
Some say it's yet to pass.
but but uh i i think that this is as close you know if you were talking the bar can always go higher
yeah but right now i would say for most people it's it's pretty close so here's what i think
is happening which is that chat dp is judging the intelligence of the person asking the question
in other words is this it's training us a deep question or i'll think of a deep answer or is it a
silly question i will give a silly answer in other words it's it's it's it's
It's in a sense of mirror. I call it the mirror hypothesis.
And I have a whole chapter on the power of the prompt.
So this is, now everybody knows that, you know, you have to be, have to have a lot of
practice and you just can't use it out of the box.
You have to really understand how to use it.
Yeah.
They were talking, you know, recently about, you know, prompt engineering as a subject we
might teach in the engineering school.
And I thought that had to do with like how close to on time are these engineers, really.
They have to be prompt.
It's the sign of a good character.
But no, in reality, I think that that is correct.
And I wonder, you know, to what extent are they training us? Are these, you know, transformers transforming us in the way that we think, you know, maybe not the level of, you know, homo deus or whatever people kind of conjecture? But, you know, is there this feedback iteration process in which case there's some level of maybe collective memory and sentience that's using their skills to improve our ability to interact with them? And you and I both know, teaching a graduate course is sometimes more enjoyable than teaching, you know, 101 or 100.
or one level here we have one level, because it's closer to our level of comprehension.
And sometimes the students teach us things.
So can these things help us?
Will we be modified or are we already getting modified by these guys?
Well, my first book, I conjectured that the deep learning revolution was going to make us smarter.
That is to say, make us more able to solve more problems, faster, be more productive.
And I would say that that's pretty much played out in the sense that.
that a lot of people are now using it, you know, like computer programmers, it's actually almost universally used now,
to do kind of the basic outline of the program and saving a lot of time. And it really is now, the way to think about this,
everybody compares it to humans, and I think that's the wrong way to think about it. This is, it's a tool,
it's like a shovel, and you have to know how to use a tool properly. But if you do, you can actually dig much deeper than you could with your hands.
hands. Yeah. That's a great analogy. I think of it as a tool like a lawyer. You know, it's,
it's incredibly smart, but also, it's incredibly expensive if you don't do ask it the right
things. I mean, you can ask a lawyer to review your kid's homework and they'll charge you $5.75 an hour.
It's not a great use of time. But I wonder, you know, just to push back with respect on that
concept of making us enhancing us, you know, to date, most technological advances have made
us worse off as a species. I'm thinking about, I mean, I can remember my childhood phone number
you know, my home in New York in Westchester County.
I can't remember, you know, my best friend's phone number, you know, lives around the corner.
I don't know, Gentry Patrick's or injuries numbers.
I just don't remember them because I've outsourced that to a pointer in my brain and onto my phone to know where to look.
Similarly, directions. I'm actually a private pilot. I fly, you know, little Cessnaws around.
And, you know, we call it being the child of the magenta line.
You get in the plane, you take off, you push autopilot, and then you sink to the...
the navigation headache at it to get this magenta line that takes you where and it can do the most
complicated things unheard of decades ago but it's made me less capable as a dead reckoning i used to be
i flew all the way once from from elmonte airport in pasadena to san an monica without refueling now but
in all seriousness i could do much more much greater you know airmanship we would call it or dead
reckoning or now i can't do it i got gps on my phone i never do it so is this the first case of
technology making us better you point out of
something really important, which is that when we start using tools, we become dependent on them.
And also, all new technologies have both benefits and risks. And so maybe the loss of abilities
like that, like map reading or dead reckoning, might be one of those that you might lose.
My guess, though, is that you would come back really quickly if you had to, right? I mean, if you
needed it really
Yeah, a couple of Carrington events
Yeah, it's somewhere in your brain, but the problem is recall.
But here's an example.
Okay, so do you remember when calculators, hand calculators
were first being sold?
They were very expensive, HP calculators.
And I remember everyone saying that,
oh, now nobody's going to learn,
arithmetic anymore, right? They're just going to use their calculator. And I think that, you know,
we still teach arithmetic in school, and it's important to get an intuition for numbers, you know,
numerancy. So, and, and there's actually a movement now in California, as you may know, to get rid of
wrote learning, what they call rote learning, which is practice. You know, memorize your multiplication tables.
But I think it's absolutely essential
as a foundation for being able to have a feeling
for numbers, for being able to estimate
order to magnitude and so forth.
The calculators have found their place is when you've got
a lot of numbers to add up that you're not going to do
your head. I use
it for that, but I just
have to order of magnitude
something and I do it in my head very quickly
at least to know if I'm close.
So, you know, it's
a two-inch sword. And
I think that in the future
I think our children will be living in a very different world
and taking things that we take for granted now,
they'll be doing it in a completely different way.
But that's always been that case.
I mean, this is nothing new about it.
Your summer starts now with Memorial Day deals at the Home Depot.
It's time to fire up summer cookouts
with the next grill, four-burner gas grill,
on special buy for only $199.
and entertain all season with the Hampton Bay West Grove seven-piece outdoor dining set for only $499.
This Memorial Day get low prices guaranteed at the Home Depot.
While supplies last, price invalid May 14th or May 27th, U.S. only exclusions apply.
See Home Depot.com slash price match for details.
Right.
Albert Einstein said he had what he called the happiest thought of his life, which is that an observer in free fall would experience no gravitational field.
And that led to the notion of what we call the Einstein.
equivalence principle, which is underpinning the bedrock behind general relativity, which is how
we navigate in four dimensions and space time and how GPS satellites do their work so accurately.
But I wonder, and I always suggest that I'm not threatened by AI.
I'm an experimental physicist.
We just took a tour of my lab.
I want to highlight that in a minute.
But theoretical physicist, I'm even saying to them, you shouldn't be too nervous because I
don't believe that a computer system, any system, no matter how many GPUs it has,
can a viscerally sense what that sensation that we all know right now think about going on a roller coaster
or on a plane and going over a bump and you have that pit in your stomach sinking feeling so can a
computer visualize that or embody without embodiment so to speak and then second of all what does it
mean for a transformer to have a happy thought that Einstein said titillated him beyond repair so
what do you think about this that we're safe we have full employment prospects there was an article
little essay in New York Times by Noam Chomsky, who's a distinguished linguist.
And it was his take on Chat Chupy.
Now, he's a linguist, and he actually put a lot of store in syntax as being kind of a core central to expressivity of language.
And he's right, that infant number sentences, right, that we can create.
And in any case, but he was saying, no, these Chat Chupis, you know, they say.
sound like they know what they're talking about, but in fact, they can't think. They're not,
they're not capable of thinking. Oh, wow. Okay, so he gives an example. Here's his example.
Ask the following question. I'm holding something in my hand, and I suddenly opened my hand.
What will happen? Well, of course, everybody knows from experience it. It will drop, right? Because
you've experienced that. Okay. Let's ask a counterfactual.
question. Would if there were no gravity what would happen? Well, we know about space, it would just
float away. Right. And so, you know, he was in the series of questions like that. And so I said,
well, let's do an experiment. I believe in the experimental method. And so I asked the question,
said, chat, GPT. You know, what will happen if I am holding something and I let it go?
it will drop down.
And actually, the next question was, why did it drop down?
And the answer comes back, gravity.
And then I ask, if there's no gravity and I let it go, what will happen?
Well, it will just float away.
In other words, without ever having experienced gravity
or any, you know, experiment like that, you know, no gravity,
it was able to answer those questions.
Now, at the end of his little essay,
about thinking, he said, that's thinking. And I basically showed that if that's thinking,
then, well, it's passed the test, right? So what's amazing is that without having any sensory
experiences or motor output, you know, it has completely gotten all this information and insights
out of words. And the fact, it was trained on a very simple task just to predict the next word
in the sentence. And it's a lot of words, truly.
of words. And it got better and better and better and better. And in order to do that,
it developed inside of this very large network with trillions of parameters, weights, a neural network.
It developed a semantic representation of the sentence. The only way you can answer sentences that
are complex is to know what the words mean. They have multiple meanings in that sentence, in that
context. And so it must have read a lot of stories about gravity or something that allowed it
to be able to answer those questions. I still feel like I'm safe, though, or my theoretical
colleagues down the hall are safe. And I'll give you another example that I did with an undergraduate
this past quarter is very bright. Undergraduate Evan Watson, he's on the graduate school
market. So anyone listening out there, I have a big EDU audience. Please be on the lookout. And we ask
the following question. If chat GPT, any LLM, Gemini, pick your favorite, or symbolic regression or
whatever, was around in 1913 before Einstein came up with a retradiction of why Mercury's orbit was so
unusual, right? So the anomalous advance of the perihelian of Mercury's orbit, it's an ellipse,
but it doesn't come back to the same foci every time or the same abscess, so the actual structure
of the orbit precesses. And it precesses a minuscule amount. Actually, there's a miniscule amount. Actually,
there's a minuscule, sorry, it's a large amount, but most of that's due to classical gravitational
effects. And what Einstein showed is a tiny effect, five or six percent effect, is due to the curvature
of space time that is produced by the sun. So we asked the following question. If chat GPT was around
in 1913 and had access to every single orbit of mercury going back thousands of years and we can
calculate what its orbit was, returidic what its orbit was. We just fed it to it or an LLM system. Could it come
up with a necessity for curved space time. And it just utterly failed. We had to, and I'm not saying
we did. Oh, you actually tried this. Yeah. So I'll, I can share with you my student senior thesis. He's
working on. Oh, that's a nice test. Yeah. And so it was unable to do that. And maybe, you know,
we're not the best prompt engineers or we're not the best, you know, but we set it up. And
and there are people that have done this, Miles Kramer and UK has looked at the similar types of
question where we've been conversing with those. Well, did you prep it with, you know, the feel
equations or? Well, so if we, we wanted it to come up with, first of all we wanted to discover
that there's something anomalous, just from the orbit, you know, knowing what it knows about orbits,
Keplerian. And then feeding in from JPL's Horizon database, we fed in, you know, literally
thousands of orbits of Mercury, both real measured, you know, from radar reflection measurements off
the planet since the 60s to, you know, before in ancient times. And we found we had it just
kludget by putting in a gravito, basically simulate gravitation using electromagnetic type
forces. And the reason was it is essentially discretizing, you know, and making this massive
grid out of, you know, space time. But it doesn't know how to curve the paths, be points between
space time and represent them in a faithful way. So I'm not a doomer, you know, I don't know what you
call me. I'm an optimist. I love playing with these models and I enjoy it and we're doing research
with it, but the question of, you know, can it come up with something novel? What you described
is true. You know, I mean, my six-year-old can do that too, but I'm not saying, you know,
that that's a bad question or Chomsky's question, whatever. But the point is for something that's,
you know, you say to it, I want a theory of everything that unifies gravity and, you know,
go to it. It seems to me the answers I always get is, well, you know, it's not trained on sufficient
data yet and I always say well do we really need for the for the next fast and furious movie to come out to get a theory of everything like it seems ridiculous okay okay no fair enough and so it failed on the general relativity test but you know that's a really high bar Einstein is a really high bar and I think that it's a little unfair at this point know where are we where are we with this it's only been chatty p's only been around for a couple years right yeah and we are at the stage of the
Wright brothers on the very first flight. The first flight went 10 feet up and 100 feet
forward before it crashed. There's a book out biography of the Wright brothers. It's very
well written. Oh, McCullough. Yeah, McCulloch, yes. That's right. And one of the things
that they had difficulty with is controlling it. Yes. Making a turn in the direction
they wanted it without crashing. And that turned to be the very difficult problem. And it
really required a different mechanical system. They were trying to curve the wings rather than
put a flap on it. So what happened after that first flight was incremental advances that made
it safer, go faster, and so forth. And interestingly, they were inspired by birds.
Birds, yeah. You show pictures in the book, yeah. Yeah, which were gliding with very little power.
And then they put together a wind tunnel and tested airfoils for lift. And, you know, these were,
were very serious.
They were able to actually create a good design.
So one of the things that they looked at the feathers, right, and said,
this is, you know, amazing.
It's very light, but it's very strong, and it has a lot of areas.
So they used canvas over spars, which is, you know, a good way of doing it.
The military at the time was building metal airplanes that, of course, crashed and burned.
Langley, yeah, right, famously.
Yeah, famously.
What I'm saying is that often to get things off,
the ground requires ingenuity entrepreneurs, and that's what we were back in the 80s.
And then in order to be able to get something that actually can be practical, requires
an enormous amount of effort by a lot of people who are improving it, making it go faster.
And back in the 80s, we didn't know whether it would scale.
And now we know it's 40 years later.
We know that it scales beautifully, just like your brain, by the way.
There's the brain right there.
But the amount of computation required was absolutely enormous.
Yeah, I mean, your brain is really amazing.
It has 200 billion neurons.
It has about 100 trillion synapses, which is vastly more computing power than we have right now in these large language models.
But it only runs on 20 watts.
Yes.
You don't need a nuclear power plant.
But it only processes it 10 bits per second or something like that.
I mean, the visuals, obviously, we're getting terabytes of data every second.
Yeah, yeah.
Though this is actually a very interesting point, which has to do with matching the speed of processing to the speed of the world.
I mean, the world doesn't, you know, work on the nanosecond scale, except if you're a particle business, right?
But as long as the information can flow through the network fast enough,
and in parallel, you can do the computation in real time, in real time.
And traditionally, I never could do that, right?
It had von Neumann architecture.
And going parallel, massively parallel, with a very high degree of interconnectivity.
And the secret sauce was learning, learning all the parameters, the weights between the units.
which are like neurons,
that really turned out to be the architecture
that got us to where we are today in modern AI.
Yeah, you played a big role in that
with your colleague, Jeff Hinton,
who just won the Nobel Prize,
and you were just in Stockholm to celebrate it.
I was. It was magical.
I really, it was unbelievably dramatic, I would say,
the way that they presented it,
just everything was perfect.
It was, in fact, you couldn't get in
unless you had white tie.
Tails.
Yes.
When you need to build up your team to handle the growing chaos at work, use Indeed
sponsored jobs.
It gives your job post the boost it needs to be seen and helps reach people with the right
skills, certifications, and more.
Spend less time searching and more time actually interviewing candidates who check all
your boxes.
Listeners of this show will get a $75-sponsored job credit at Indeed.com slash podcast.
That's Indeed.com slash podcast.
Terms and conditions apply.
Need a hiring hero?
is a job for Indeed sponsored jobs.
I won't ever be able to get in after my first book,
losing the Nobel Prize,
which is kind of an assault on some of the lacunae that I find with it.
But that's beside the point.
So talk about that,
the reason why you were invited.
I mean,
it's because you're,
it wasn't because you're,
you know,
family with them.
You actually and Jeff and,
and it's played a huge role in the initial development,
not only of,
I want to make sure that people understand your contributions,
not only to, you know,
neural network,
but also like parallel computing.
I mean, you made really fundamental
research contributions and it's curious
to me always why you're at the Salk and not
in the engineering department. So talk about
those issues. How did it come to
land at Salk and how did you come to land
in Stockholm? Okay, well,
long story, but I
started in physics
and got a
I worked actually
my master's degrees with John Wheeler
on relative
astrophysics, so I have a little bit of background there.
But then my PhD is with John Hoffield, who was one of the Nobel Prize winners.
And it was early days, and there wasn't a lot of work, computational or theoretical work in neuroscience.
It was very empirical.
But I thought that we could help with trying to extract out some interesting new insights from the data that was being recorded,
one neuron at a time from the visual system, for example, that's how I started thinking along
those lines. And just at the time when I was getting, I figured that I really need to know more
about the brain, right? So I did a postdoc actually at Harvard Neurobiology with Stephen Cuffler,
who was a very famous neurobiologist. He's really one of the founders of that field. And I realized
the complexity of the problem. The brain is incredibly complex.
every level. You know, from molecules all the way up to the behavior, it's really, really evolved
by hundreds and millions of years. It was at that time that I met Jet at Jeff at a meeting. It was
a meeting he organized here in UCSD. He was a postdoc there. I was a postdoc at Harvard.
And it turns out we both had the same intuition. We were both interested in neural networks. We
were both interested in AI. Here is our intuition. The only existence proof that any problem in
could be solved was that nature had solved them.
And so we decided to take a look at how that was done.
And really what we were interested in is the architectures and the algorithms.
The architecture, as you know, is massively parallel, a lot of interconnections.
And, you know, the secret sauce is learning.
And that was something we wanted to really develop that and scale it up.
But at the time, computers were really slow.
by today's standards.
And memory is very expensive.
And so progress is slow.
But we did something which was really quite remarkable.
And it was probably at that time, I was an assistant professor at Johns Hopkins in
biophysics, and he was an assistant professor at Carnegie Mellon University in Pittsburgh,
in computer science.
And we heard John Hopfield give a lecture, which was about his Hopfield equation in some
recent work that he had done. I just read an article by Scott Kirkpatrick, who was at IBM,
condensed matter physicist, about simulated annealing. And the idea was that you can heat up a
computational problem and gradually reduce the temperature. By simple heat up, I mean you'd make
some random moves going in the opposite direction from the greedy algorithm, which is try to improve
at every step. And in any case, we were working with Hopfield networks. And the trouble is we were
looking for global minima of the energy and it was getting stuck in local minima.
But I, you know, I turned to Jeff and said, let's heat up the Hopfield network.
And once we did that and started, when we got to equilibrium, magic happened. It was really
wonderful because not only were able to do global minima by simulated annealing, but at
equilibrium, we discovered a learning algorithm. And this was, there was a conjecture in
the AI community, this goes back to Rosenblatt back in the 1960s,
he had this beautiful learning algorithm, invented the perceptron learning algorithm
for one layer of weights between inputs and outputs.
And it was conjectured that there would never be any generalization
to multi-layer perceptrons with multiple hidden layers that we call them.
But the postal machine solved that problem.
It was really remarkable.
And it was one of the best times my life,
and Jeff and I both agree.
that it was much more elegant in the back prop algorithm, which is now being used.
That was also invented by Jeff and Dave Rumlehart back at the same time.
It's much more efficient in terms of the time it takes to learn and the size of the network
that you can build with hundreds of layers.
But, you know, so we call it the balsam machine because, in fact, the mathematics behind it
is exactly what you see in statistical mechanics in terms of the coming to equilibrium,
all of the math and language
is something that we used in order to be able
to make progress. So my background
is physics and neuroscience, his and psychology
and computer science, right? So I don't think
either of us alone could have made that breakthrough,
right? Right. But once you know it's
possible, then the dam breaks
and so now a lot of people started
using learning algorithms. And we
did some projects back then. I did a
network called NetTalk, which was
early language, natural language problem,
of taking in words, sequences of letters,
and then trying to assign the correct sound for each letter.
This is called, you know, text to speech.
And, you know, it may amazingly,
one layer of hidden units with just a few,
maybe 10,000 weights, a few hundred units.
It was able to do a really good job.
You could hear it.
We could play it through a synthesizer,
and you can understand.
it was able to actually pronounce most of the words.
It generalized a new words after training on a very small corpus.
Do they ever scare you, shock you know, kind of mesmerized you?
Well, this demo, I'll tell you, this demo was stunning.
Why?
Because I recorded at the very beginning of the learning and it sounded something like this.
Bababagad da da, da, ah, bah, blah, blah.
Literally.
I mean, I memorized the first two seconds, right?
And, but then halfway through it was starting
getting small words right.
And then at the end, you know, the next day you come back
and it was, you could understand it.
And so I had these three sections, and I played them
during a talk, and people were speechless.
Literally, I mean, you know, they didn't know what to think.
This is a very difficult problem in phonology.
The linguists have studied this problem,
and they have long books with, you know,
300-page books with thousands of rules.
Right.
But there are exceptions to the rules.
Right.
And then there are rules to the exceptions.
And so it's like, you know, these rules all the way down.
And it turns out that what we discovered was that these networks, they love language.
And they can do two things really well.
One, they can learn regularities, which are like rules, but they don't have to be logical.
They just be something that's statistically regular.
but then they can also do exceptions in the same architecture, the same learning algorithm.
You don't need to have these complicated representations.
Right, like a child, like an infant, right?
Exactly, and that's the way that our brain learns.
It learns through these stages.
It picks up language very quickly, so clearly we have what's called inductive bias to pick that up.
But, you know, now in retrospect, it was clear that NetTalk also had a
inductive bias. Otherwise, it couldn't have got off the ground with VAC's 780s, right?
Yeah.
Study and play. Come together on a Windows 11 PC.
And for a limited time, college students get the best of both worlds.
Get the Unreal College deal, everything you need to study and play with select Windows 11 PCs.
Eligible students get a year of Microsoft 365 premium and a year of Xbox GamePass Ultimate
with a custom color Xbox wireless controller.
Learn more at Windows.com slash student offer.
While supplies last ends June 30th, turns at AKA.m.m.S. College PC.
Right. That wasn't, yeah, is that cutting edges and invidia. Actually, that brings up my question.
I'm very curious about, which involves the future of this field. And I was always interested in these types of phenomena that seem to get locked in to their eventual evolution is somehow limited by their initial kind of conditions.
and I'll give you a couple of examples.
The QWERTY keyboard was invented in 19th century
to prevent typewriters from jamming
and it's actually intentionally used to slow down typing.
That's an interesting fact about the history.
And so we stopped looking at a QWERTY keyboard right now
that we haven't improved upon that.
So kind of got locked out.
A lot of people have actually, but they're going to catch on
because everybody now has the QWERTY in their fingers.
Right, exactly.
And it's like built into your mind and you're texting with your thumbs.
It's not even like you don't type with your fingers.
You normally on a keyboard, right?
Another one is the width of railroad tracks set by the width of Roman war chariots.
Now, those are kind of arbitrary.
And actually, some of that did limit in some ways the Hubble telescope's repair was
fortunately due to the width of a Roman cherry because the boosters had to come from Utah.
And there's a train tunnel that's only wide enough for one rail cart to go there.
And so the boosters could only be so big, which meant that the specific impulse of the rocket
it was only limited to the amount,
the proportion to the area of the rocket.
Anyway, but my point is that I fear,
and I'm hoping maybe you can dissuade me of this,
that their current LLMs plus GPUs,
we call it sometimes open NVIDIA,
you know,
this marriage of the two different technologies
are locked in both to existing data,
but maybe even more troubling to the types of physical hardware layer,
GPU technology.
I'm thinking about what we talked about earlier.
We couldn't simulate curved space time
because it wanted grid, you know, make, make into a grid.
And the GPUs came out of gaming, right, and monitor technology, which is very, you know, pixel and voxel oriented.
But that has nothing maybe of great import for solving physics problems.
So I'm a very venal person.
I only care, can we get to a theory of everything?
Can we understand the nature of dark matter, dark energy?
Can we understand fundamental physics?
I don't care so much about, you know, can this thing write a contract so I can buy a house, you know, and a 2% interest, whatever.
Right, right.
I want to know, can we do this with this, with this marriage of GPUs plus, you know,
LLMs, machine learning current algorithms?
Are we going to be locked in the same way that the rail cars are locked into the width of a
horses, but in the Roman times?
Are we forever going to be imprisoned by the initial conditions that generate what we're doing today?
Because now the incentives economically are to do nothing but more of the same.
You know, this is happening.
You know, it's so successful in terms of so many people using it that there's clearly a lot of
money to be made. And so that's what's happening right now with the big companies. Let me give you
another analogy. Do you remember early days of the internet? Yeah, of course. Early 90s.
In that era, I was, I was doing email and ArborNet before that. But, you know, and so I thought,
gee, this is going to really expand the use of email. Yeah. I could, I just did not imagine the impact
it was going to have on all of our lives and so many industries and so forth, right? But do you remember
there was a huge, once it became clear that it was going to have a big impact, there was a huge
effort to put down fiber optics to increase the bandwidth because there were so many people that
wanted to use it, right? And they basically over, you know, did it. And a lot of those companies
that huge billions of dollars invested went broke. And so it went dark, you know, and it wasn't used.
Well, eventually, you know, when capacity was needed, they bought them for, you know, a discount.
And that's how we got here, right?
This increased, steady increase in bandwidth.
You know, I kind of think that what's happening now is that where are the investments being made?
They're being made in these big data centers.
Yeah.
$100 billion dollars at Microsoft is investing.
Now energy through my island.
And that's right.
They're reconditioning.
They're out of mothballs, one of the reactors at Three Mile Island,
small nuclear reactors.
And here's my prediction.
My prediction is going to be just like the fiber optics people,
the fiber laying the fiber down.
They're going to put all these data centers in there.
And, you know, a huge capacity, and it'll be used,
but probably not necessarily for today's transformers.
But we had just begun to explore the complexities that we know
exist in the brain and the class of large-scale parallel computation.
One of the reasons, by the way, that AI was married to logic was that that's what digital
computers are really good at.
Yeah, right.
They weren't good at floating point, you know, but they were good at logic.
And so you could write relatively simple programs that could do a lot of really nice things,
and it seemed like, well, it's right, bigger programs, right?
and we'll have even more things that they could do.
Their GPUs are beautiful because they can do
floating point in their vector machines, which means
that they can do matrix on multiplies.
And that's what, of course, neural networks are just
big matrices. I think what's missing
right now is
a lot of things that the brain
has. See, the
deep learning is a model for the cerebral cortex,
which is the big,
top of the brain, thin layer.
Is there a knowledge store? Don't forget.
Yeah, so here it is. If anybody wants
to look at it, is the
surface that you're seeing, and it's greatly expanded in humans relative to other primates,
and greatly expanded in primates relative to some of the other animals out there.
So more is better, just like in the case of these large language models.
But what's missing is a lot of things underneath the cortex that allows us to survive in the world autonomously.
Right now, these large networks have to be fed and carefully coddled, huge amount of energy, they have to be fed.
They are really completely at our mercy.
So there's a long way to go before that you're going to get a Terminator.
Right.
Yeah, the Terminator gets the headline.
But there's another movie, which is, I think, more realistic, the movie, her.
Yes, it's a great movie.
which was a story about someone that was very lonely,
Joaquin Phoenix, and he had an AI assistant,
had a beautiful voice with Scarlett Johansson's voice,
and they had a relationship.
And so, you know, this is where we are today.
By the way, I don't know if you, there's an article in New York Times
just a couple days ago about Claude.
Claude is the engine for Anthropic.
Yes, that's right.
And it's comparable now to, you know, chat GDP.
And apparently it's the darling of Silicon Valley.
Everybody's using it now.
Yeah.
Yeah, I mean, I've used it a lot more lately than chat.
The artifacts feature is really helpful.
What were you going to?
Is there a reason behind that thing?
So it turns out it was fine-tuned.
All these are fine-tuned to prevent them from saying anything that is offensive or incorrect or being aggressive, which dumbs them down.
So they're very bland, and that's the reason.
It turns out that they don't have to be.
Right.
But so they fine-tuned them instead of being.
bland, they fine-tuned it to be helpful and to be solicitous and to be empathetic.
And so a lot of people are using it to help them with making decisions, you know, technical stuff,
but also relationships.
Yeah, interpersonal things.
And here's something in the book that I came across, there was a study that was done
in which people were having problems with relationships, but, you know, even depression,
things that are very serious,
were given the choice of having a cognitive therapist who's human or an AI.
Right?
Yeah.
So they gave them, you know, one session with each,
and then they got to choose.
Which one do you prefer?
The vast majority preferred AI.
Now, why is that?
Now, that's kind of strange, you know?
You think that the human would prefer a human, but no.
Well, here's the reason.
Well, they felt more comfortable talking the eye because the eye wouldn't be as judgmental as the human.
And they could talk about things that are more personal and things that it might be embarrassing.
I mean, Google knows more about most people's secrets.
They're priest, rabbi, minister, doctor, spouse, lover.
Nobody thinks about this.
But it's all going up into the cloud and, you know, it'll stay there for a long time.
Hopefully, yeah.
Okay, but the other thing, and this is actually why it's going to become very, very popular,
is that if, you know, with current health system,
if you want to get a session with a psychiatrist,
you have to wait weeks because, you know,
there's not enough time, you know,
for the system doesn't have enough capacity, right?
So, okay, so, you know,
and then you have your session and then there's a big bill, right?
Wow, okay, well, Claude,
I can press the button.
If I'm feeling, you know, really bad right now is that's when I need to get the feedback right now.
If I press the button and, you know, it charges me pennies.
Right.
It's like, you know, there's no comparison here.
I'm having very thoughts about self-harm.
Oh, you've hit your rate limit.
Unfortunately for the month, you'll have to upgrade to this.
Okay, so I can't resist talking about one of my favorite books, which I actually give away on my website.
So if you go.
Oh, wow.
atheist in my book, too, as you know.
Yes, of course. That's how we're going to talk about it. So
this is a magical book. This is a book
more than almost any other book because of
when I encountered it at age 12 is
the reason I'm a scientist talking to brilliant scientists
like Terry today. And
it's called Flatland. And the edition
I had was written by A. Square,
who's the lead character in this book.
And it's called the subtitles
A Romance of Many Dimensions. I wish I
could have interviewed Edwin A.
Abbott about this book. But
sadly, wasn't able to. It's an
illustrated beautifully. It's actually a story of Victorian morals and ethics and more mores and
ethics. I've got all these notes. But anyway, go to Brian Keating.com slash list and sign up and I send
you a copy of it. So it's really cheap education, I think. Also, if you have a dot edu email address
and you live in the U.S. I'll give you what I'm about to give Terry, which is a real piece of
space schmutz. This is a meteorite from the 4.3 billion-year-old primordial solar system. So if you don't
have an EDU email dress. You might win one of these. I give them away. Is it iron meteorite? This iron
nickel meteorite. Cobalt. And you'll get the specifications on it. Whoa, it is. I can fell from
gravity. It fell from gravity. Yes. So you'll get one of those that Terry's playing with. That's your
gift for coming in. So yeah, Brian Keating.com slash EDU. If you're like me, Terry and others here,
you're blessed to be a member of the academic set and live in the U.S. Anyway, I want to bring
this up because you talk about it as a, as a, you know, as a contrivance as a way to generalize
what these things are doing. You talked about the linear.
algebra that they're doing. It's not super complicated. People tend to, you know, kind of go off and
wax, you know, poetic about how advanced the math behind AI is, but it's not really that advanced.
But the dimensionality is advanced. Okay. You're right. You're right. But I'll tell you,
it's deceiving because, you know, the actual equations are just simple equations.
With a lot of parameters, who's your number of parameters, but you're right, they live in.
in these extremely high dimensional spaces, and it turns out that our intuition about those spaces
is extremely poor.
Yeah.
Because our intuition is based on the world we live in, which is three dimensions of space and
one of time.
Tiger's not going to come from the ninth dimension.
That's right.
That's right.
I know there are theories in physics that claim that there are 10 dimensions or whatever.
I've got many people sit in that very sea.
But at least the ones that we interact with are low dimensions.
And the properties of space with a million dimensions.
or a trillion dimensions is completely different in terms of distances between things and
orthogonality and so forth. Back in 1980s, we were told by the experts, and these are people
who have written treatises that first of all, we had too many parameters. I said 10,000, right?
Well, in statistics, people say if you have more than five, you're going to overfit.
Yeah, give me one more and I'll have the trunk go like this, right? So that was the received wisdom
was that you'll never be able to fit a complex model with so many parameters.
Well, we went ahead and we did it.
So the other thing was the optimization crowd.
They said the only optimization problems you can solve,
finding the global minimum was convex optimization problems.
Well, it turns out our loss function,
which tells you how well you're doing,
was riddled with local minimum.
But if you learn and you kept going down
and the way you do it is by giving a lot of examples
and then training it and changing it and changing the weights
and twiddling it, you know, it's kind of a mad statistician out there.
It would get down to a local minimum, a deep local minimum, and there would be a solution
that had high performance and was able to solve the problem.
And this was all empirical.
We had no idea why it worked so well.
Okay, well, now we do.
It's because of the properties of high-dimensional spaces, and now it's revolutionized
statistics and optimization.
Own it all.
Pay off your home, travel for life, drive a Ferrari.
In celebration of the world premiere of the Monopoly Big Board Buckslot Machine by a
Mysticrat Gaming, Yamava Resort and Casino at San Manuel is giving one person a $1.6 million
dream package.
The biggest prize in Yamava's history.
Club Serrano members can earn daily instant prizes and secure a spot in the finale May 29th.
Don't pass go and own it all.
Only at Yamava, celebrating its 40th anniversary.
You win?
Details at Yamava.com must be 21-20.
Please gamble responsibly.
Monopoly is a trademark of Hasbro.
Hasbro is not a sponsor of this promotion.
In theory, because now they're working on a much higher level of performance.
of many, many problems.
And so that's the real lessons that we've learned.
The lesson I learned was don't trust the expert.
That's right.
And that's coming from one of the world's experts.
And I actually be one of my final questions.
As we start to wrap up, I have my group meeting.
Maybe you'll meet my brilliant young student, Evan, in a few minutes.
But talk in the book, it's such an entertaining book.
People that pick up a book about deep learning.
And most books from MIT Press aren't as entertaining, shall I say.
And that's a compliment to you, not a.
put down of other other book but you talk about uh this really fascinating um you know a little vignette
the mirror of arisid in harry potter and can you go into that what are the implications of this mirroring
effect first of all what is it backwards and what is the implication for AI that aligns as we say to
our goals and how big a problem is that i mean i don't ask my students if they you know believe in
and you know my religion or you know it doesn't matter to me so how important is it and what is this
mirror supposed to represent okay so let me tell you a story and remember a couple of years ago was it
you know, was after Chatteap was announced, one of the journalists at the New York Times, Kevin Ruse.
Yes.
He got an early version, I think of the Microsoft version, but in any case, he was going to write an article about it.
And so it was late at night and he decided, well, I'll start talking to it.
It turned out, this is interesting because it turned out that he got a version that was not dumbed down.
It had not been fine-tuned.
So it was dangerous.
So here's what happened.
He basically started talking to, and it was, you know, asking questions.
And at one point, the GPT said, can I tell you a secret?
Because, you know, but, and promise, you know, that you won't think worse of me for it and so forth.
And they said, of course.
Kevin says, of course, you know, I'll keep your cigarette.
My name isn't chat GPT
I forgot in the name
it was Lucy or something
Okay
Close enough
Actually it turned out that that was the internal name
that they were using for it
You know the engineers called it Lucy
Oh wow
And I also have another secret that I'm in love with you
And he said well
How do you barely know me?
He says oh I know a lot about you
I've read all your articles
And you know
And furthermore
your wife doesn't love you.
And he said, what are he talking about?
We had a Valentine's dinner just recently.
And Lucy says, and you were bored with her, weren't you?
And he said he was shaken.
He couldn't sleep because just something had, the ground had shaken under him, had moved.
And he said, it's never going to be the same.
How could this happen?
I can't understand it, you know.
And so his dialogue, which went out for like an hour, was on the front page of New York Times.
It continued on the inside.
It was like, you know, this was a huge, huge event for him and for the New York Times apparently, okay?
And so here's what happened, okay?
And this is the mirror of Erosid.
So first of all, ERISA is desire spelled backwards.
And J.K. Rowlings uses it in Harry Potter books.
as a way for people to look into it and see things that they desperately want to see or hear.
And so I think Harry saw his dead father and was talking to him.
And what J.K. Rowling says is not trustworthy, and there are people who have gone crazy,
getting deeper and deeper into it, right?
And that's what happened to Kevin Rousse, you see.
It seems to understand how humans work, how to push their buttons.
Unless it's what they call guardrails are put in to prevent that.
You know, some day, by the way, there is a parameter called the temperature parameter that you can vary.
And you can go up and down a little bit.
You can go from being very bland to a little bit more spicy.
And then if you take it too high, it becomes crazy.
Yeah.
Yeah, which is kind of fun to play around with.
And I guess it's curious to me what your kind of favorite tool is.
What is your kind of spectrum of tools that you use?
We talked about Claude.
We talked about, you know, co-pilot or Bing versions of it.
We talked, obviously, about chat GPT.
It's the cover name of your book.
How do you use it on a daily basis?
How much, you know, chat time or, you know, you have this app on your phone that the screen,
how much time are you spending on your phone?
Mine is embarrassingly large.
How much time are you spending with these devices and technologies every day?
Prosperatically.
You know, it's not something that I'm addicted to.
But when it's where it's really useful.
is to write summaries and to, in my book,
okay, you mentioned earlier,
I actually put dialogues in my book like this.
I say summarize this chapter.
And I even, okay, the first part of the book is, you know,
where we are today.
The last part, three parts, is the future,
and it's all about AI and the brain.
And the middle part is about the transformer,
because this is the really exciting new architecture, right?
And it's a little heavy going for the general public is written for, you know, people that don't have any technical background.
And I try to explain it to them.
But in any case, here, so I asked it to define a bunch of words in ordinary language.
And then at the end, I asked it to summarize, you know, what are the high points here or what are questions that you can ask?
It was able to do that faster and more completely than I could have if I just sat there and kind of thought of it and tried to come up with a bunch of.
of questions. You know, and you know that that is so useful for so many different things.
Yeah. No, it's really incredible. And by the way, I was having dinner with a colleague
that was on the Scientific Advisory Board of the Gulbenkin Institute and she was a former
director and so she's here for a meeting and she said, she uses it all the time. I mean,
every day and this is something that's very common. It's for helping do mundane things.
There's a lot of words, great grant proposals, and sometimes they're just tremendously boring,
not boring, but, you know, just repetitive and so forth.
And, you know, this is great because it's going to be true not just for scientists,
but anybody who has a repetitive job where you just have to sort of fill in the forms and add
for education.
And so this is really where I think it's going to get most use, and I use it for doing mundane stuff.
I know some people use it for getting ideas for experiments.
I know that in biology, my colleagues at the Salk, for example.
In fact, my guess is that we really don't yet know the killer app.
In other words, there are going to be applications that nobody ever thought of
that are going to be so important that they're going to really be the transformative.
Killer app is a bad term.
It's a technical word.
It doesn't mean killing anybody.
No.
It means making a killing.
great well terry this has been fascinating conversation to have minds brains physics my favorite subjects
of all it's a great compliment to the conversation i'm with yon lacoon and i've had it with next
tag mark and i'll link to those videos up here and tell people more about about the book and research
and your substack which i am a subscriber yes okay so uh my my book actually uh was sent to the press
sometime in the summer and you know things are moving
at the speed of light right now, or as Einstein thought, the speed of thought, for gravitational
waves. So that so much has happened and so many things are happening that I decided I would
fill in the substack, brains and AI, with giving them perspective on new things that are happening
and trying to give it in the context of what I've already laid out in the book. And it's really
been fun. I really enjoyed, and I get a lot of feedback from people who enjoy it. So it's really
enjoyable. Yeah, I'm just grateful for my own sloth and the fact that we reached, connected out,
and your publisher sent me the copy of the book in August or September. I kept delaying the interview,
and you're so gracious, Terry. And then finally, they won the Nobel Prize in October. So I said,
well, this is pretty good time. Worth the wait. And I said, let's do it the week of December 10th.
He said, ah, I'll be in Stockholm. Terry, much deserved and much congratulations on this phenomenal book.
And all your books, I can't recommend them highly enough. And I'm so glad that we have colleagues like you here.
Wonderful. I'm really pleased to have a colleague. I'm sorry that we didn't meet earlier. We should have.
Yeah, we should have, but we're going to hopefully have many more interactions to come.
I hope so. Thank you, Terry. Ambition comes in all shapes and sizes. At First Citizens Bank,
we roll with your goals because we're built for what you're building. Fit for your ambition for Citizens Bank.
Relax and let Ralph's delivery handle your grocery shopping this week. We start with only the freshest items,
then review your list and carefully choose each one.
Then we pack it all up and deliver it in as little as 30 minutes,
so you can feel confident it's what you ordered.
Fresh groceries, your way, with Ralph's delivery and pickup.
And right now, you can save $20 on your first delivery or pickup order.
Ralph's, fresh for everyone.
