Dwarkesh Podcast - Michael Nielsen – How science actually progresses
Episode Date: April 7, 2026

Really enjoyed chatting with Michael Nielsen about how we recognize scientific progress. It's especially relevant for closing the RL verification loop for scientific discovery. But it's also a surprisingly mysterious and elusive question when you look at the history of human science.

We approach this question through stories like Einstein (who claimed that he hadn't even heard of the famous Michelson-Morley experiment which is supposed to have motivated special relativity until after he had come up with the theory), Darwin (why did it take till 1859 to lay out an idea whose essence every farmer since antiquity must have observed?), Prout (how do you recognize that isotopes exist if you cannot chemically separate them?), and many others.

The verification loop on scientific ideas is often extremely long and weirdly hostile. Ancient Athenians dismissed Aristarchus's heliocentrism in the 3rd century BC because it would imply that the stars should shift in the sky as the Earth orbits the sun. The first successful measurement of stellar parallax was in 1838. That's a 2,000-year verification loop.

But clearly human science is able to make progress faster than raw experimental falsification/verification would imply, and in cases where experiments are very ambiguous. How?

Michael has some very deep and provocative hypotheses about the nature of progress. One I found especially thought-provoking is that aliens will likely have a VERY different science + tech stack than us. Which contradicts the common-sense picture of a linear tech tree that I was assuming. And has some interesting implications about how future civilizations might trade and cooperate with each other.

Watch on Youtube; read the transcript.

Sponsors

* Labelbox researchers built a new safety benchmark. Why? Well, current safety benchmarks claim that attacks on top models are successful only a few percent of the time, but the prompts in those benchmarks don't reflect how real bad actors actually write. You can read Labelbox's research here: https://labelbox.com/blog/the-ai-safety-illusion-why-current-safety-datasets-fool-us-on-model-safety/. If this could be useful for your work, reach out at labelbox.com/dwarkesh

* Mercury has an MCP that lets you give an LLM access to your full transaction history, including things like attached receipts and internal notes. I just used it to categorize my 2025 transactions, and it worked shockingly well. Modern functionality like this is exactly why I use Mercury. Learn more at mercury.com

* Jane Street's ML engineers presented some of their GPU optimization workflows at GTC, showing how they use CUDA graphs, streams, and custom kernels to shave real time off their training runs. You can watch the full talk here: https://www.nvidia.com/en-us/on-demand/session/gtc26-s82065/. And they open-sourced all the relevant code: https://github.com/janestreet/gtc2026/. If this kind of stuff excites you, Jane Street is hiring. Learn more at janestreet.com/dwarkesh

Timestamps

(00:00:00) – How scientific progress outpaces its verification loops
(00:17:51) – Newton was the last of the magicians
(00:23:26) – Why wasn't natural selection obvious much earlier?
(00:29:52) – Could gradient descent have discovered general relativity?
(00:50:54) – Why aliens will have a different tech stack than us
(01:15:26) – Are there infinitely many deep scientific principles left to discover?
(01:26:25) – What drew Michael to quantum computing so early?
(01:35:29) – Does science need a new way to assign credit?
(01:43:57) – Prolificness versus depth
(01:49:17) – What it takes to actually internalize what you learn

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe
Transcript
Today I'm speaking with Michael Nielsen.
You've done many things.
You're one of the pioneers of quantum computing, wrote the main textbook in the field, and helped lead the open science movement.
You wrote a book about deep learning that Chris Olah and Greg Brockman credit with getting them into the field.
More recently, you're a research fellow at the Astera Institute and writing a book about religion, science, and technology.
I'm going to ask you about none of those things.
The conversation I want to have today is how do we recognize scientific progress?
And it's especially relevant for AI because people are trying to close the RL verification loop on scientific discovery.
And what does it mean to close that loop?
But in preparing for this interview, I've realized that it's a more mysterious and elusive force, even in the history of human science, than I understood.
And I think a good place to start will be Michelson-Morley and how special relativity was discovered.
It's different from the story that you kind of get off of YouTube videos.
Anyways, I'll frame it that way and then we'll go from there.
Okay, yeah, so Michelson-Morley is one of the famous results often presented as this experiment that was done in the 1880s and that helped Einstein, you know, come up with the special theory of relativity a little bit later.
So sort of changing the way we think about space and time and our fundamental conception of those things.
And there's kind of a big gap, I think, between the way Michelson and Morley and other
people at the time thought about the experiment and certainly the way in which Einstein thought
or did not think about the experiment. In actual fact, he stated later in his life he wasn't even
sure whether he was aware of the paper at the time. There's a lot of evidence that he probably
was aware of the paper at the time, but it actually wasn't dispositive for his thinking at all.
Something else completely was going on. So what Michelson and Morley
thought they were doing was they thought they were testing different theories of what was called the ether.
So as you go back to the 1600s, Robert Boyle introduced the idea of the ether. And basically,
the idea of the ether is, you know, we know that sound is vibrations in the air. And then Boyle and
other people got interested in the question of like, is light vibrations in something? And they
couldn't figure out what it was. Boyle actually did an experiment where he tested whether or not you could
propagate light through a vacuum. He found that you could. You couldn't do it with sound.
So he introduced this idea of the ether. And then for the next 200 or so years, people had all
these kind of conversations about what the ether was and what its nature was. And the Michelson-Morley
experiment was really an experiment to test different theories of the ether against one another.
And in particular, to find out whether or not there was a so-called ether wind. So the idea was that
the earth is passing through maybe this ether wind.
And if it is passing through the ether wind,
sort of this background, and you shoot a light beam
sort of parallel to the direction the ether wind is going in,
it'll get accelerated a little bit.
And if it's being passed back sort of in the opposite direction,
it'll get slowed down a little bit.
And you should be able to see this
in the results of interference experiments.
And what they found, much to their surprise,
I think, was that, in fact,
there was no ether wind.
And that ruled out some theories of the ether, but not all.
And Michelson certainly continued to believe in the ether.
Okay, so this was a shocking part of reading this story from the biography of
Einstein that you recommended by, what was his first name?
Abraham Pais.
Yes.
Subtle Is the Lord.
And then also from Imre Lakatos, The Methodology of Scientific Research Programmes.
The way it's told is that Michelson-Morley proved that the ether did not exist.
Therefore, it created a crisis in physics that Einstein solved with special relativity.
And which you're pointing out is actually it was trying to distinguish between many different theories of ether.
You know, whether you're in space or on Earth, the ether wind is in the same direction.
Or maybe the ether wind is being carried around by the Earth.
And so you can't really experience it on Earth, but if you go to a high enough altitude, you might be able to experience it.
In fact, Michelson's experiments, the famous one is 1887, but he conducted these experiments for basically two decades.
I mean, for longer than that, he conducted them.
I think the first one was in 1881.
But he continued to believe until, I mean, he died.
He died, I think it was like 1929 or so.
It was like the late 20s.
And he was still doing experiments in the 1920s,
sort of about whether or not the ether existed.
And so he continued to believe in the ether to the end of his life.
Right.
I think the last public statement he made is like a year or two before he died.
And he still believed, basically believed at that point.
And in fact, there was another physicist Miller who kept doing these experiments.
And in the 1920s, he thought that if
he went to a high enough altitude, this is on Mount Wilson in California, then: oh, I'm high enough
that the ether wind is not being dragged along by the Earth,
and I've measured the effect of the ether.
And Einstein hears about this and he says, this is where you get the famous quote,
subtle is the Lord, but malicious he is not.
Anyways, I think the reason the story is interesting, for many different reasons, but one is
one of the different ways in which the real history of science is different from this idea
you get of the scientific method is you really can't apply falsification as easily as you might think.
It's not clear what is being falsified.
Is it just another version of the theory of the ether that's being falsified?
Certainly you can't induce the theory of special relativity from the fact that one version of the
ether seems to be disconfirmed by these experiments.
Yeah.
So, I mean, it certainly doesn't show that ideas about falsification are wrong, falsified.
But it does show the most naive ideas, you know, things are often much more complicated than you think.
So, you know, Michelson did this experiment in 1881.
He was a very young man.
And then other people, I think Rayleigh was one of them, pointed out that there were some problems with the way he did it.
So they had to redo it in 1887.
And at that point, like a lot of the leading physicists of the day, leading scientists of the day, basically accepted this result that there was no ether wind.
But what to do about this?
So, yeah, sure, maybe you falsified some theories of the ether.
There are others that you haven't falsified at all at this point.
And people sort of set to work on developing those.
Actually, it is funny.
I mean, people will phrase it as showing that, you know,
the ether didn't exist.
And even just the word "the" there is kind of a misnomer.
You know, you actually had a ton of different theories and a couple of leading contenders.
So, yeah, there's some version of falsification
going on, but like how you respond to this new experiment is very, very complicated.
Yeah. And most people responded, I mean, certainly the leading physicists of the day responded
by saying, okay, this gives us a lot of information about what the ether must be, but it
doesn't tell us that there is no ether. In fact, Lorentz, at the end of the 19th century,
before Einstein, figures out the math of how you convert from one reference frame to another
reference frame, comes up with the Lorentz transformations, which is basically the basis of special
relativity. But his interpretation is that you are converting from the ether reference frame
to these non-privileged other reference frames if you're moving relative to the ether. And his
interpretation of length contraction and time dilation is that this is the effect of moving through
the ether: you have this pressure, and that pressure is warping clocks, it's warping measures
of length. And the interesting thing here is that, experimentally, you cannot distinguish Lorentz's
interpretation from special relativity. Yeah, I think that's a strong statement. I mean, Lorentz introduces
this quantity called local time, which, my understanding is, he's not really
trying to give a physical interpretation of. But it's what Einstein would later just
recognize as time in another inertial reference frame.
And he's not trying to attribute much physical meaning to it.
I think Poincaré gets much closer later on to realizing that, no, actually, this is the time
that's registered by clocks.
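For reference, here is the math in question in modern notation; Lorentz's "local time" is the t′ here (his 1895 "local time" was a first-order version of this, with the exact form appearing in his 1904 paper):

```latex
x' = \gamma\,(x - vt), \qquad
t' = \gamma\left(t - \frac{vx}{c^{2}}\right), \qquad
\gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
```

The same formulas support both readings: Lorentz took t′ as a bookkeeping quantity relative to the privileged ether frame, while Einstein took it to be what a moving clock actually reads.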
But if you think about, you go, what is it, 40-odd years later, people start doing
these muon experiments where they see basically cosmic rays hit the top of the atmosphere.
They produce a shower of muons.
and you can look to see at different heights in the atmosphere, you can look to see how many of those muons remain.
And they decay over time.
And a very strange thing happens, which is that they're decaying way, way, way too slowly.
So you sort of, you expect actually they shouldn't be able to sort of last the whole way through the atmosphere at all.
Their decay rate is just too quick if you were in a classical theory.
But if in fact their time really has slowed down, it's okay.
And in fact, you know, the measured decay rates in 1940 and then there have since been more accurate experiments done match exactly what you expect from special relativity.
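A back-of-the-envelope sketch of the muon arithmetic, using round illustrative numbers I'm supplying here (not the actual 1940 data):

```python
import math

# Back-of-the-envelope muon arithmetic with round illustrative numbers:
# muons created ~15 km up, moving at ~0.995c, with a proper (rest-frame)
# lifetime of ~2.2 microseconds.
c = 3.0e8            # speed of light, m/s
v = 0.995 * c        # muon speed, m/s
d = 15_000.0         # altitude where muons are created, m
tau = 2.2e-6         # muon proper lifetime, s

t_flight = d / v     # lab-frame time to reach the ground, ~50 microseconds

# Classical expectation: survival probability decays as exp(-t/tau),
# so essentially no muons should reach the ground.
survival_classical = math.exp(-t_flight / tau)

# Special relativity: the muon's clock runs slow by the Lorentz factor gamma,
# so its effective lifetime in the lab frame is gamma * tau.
gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
survival_relativistic = math.exp(-t_flight / (gamma * tau))

print(f"gamma                 ~ {gamma:.1f}")                  # ~10
print(f"classical survival    ~ {survival_classical:.1e}")     # ~1e-10
print(f"relativistic survival ~ {survival_relativistic:.2f}")  # ~0.10
```

With these numbers, the classical prediction is that roughly one muon in ten billion survives, while time dilation predicts around one in ten, which is the kind of gap the atmospheric measurements resolved decisively in relativity's favor.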
So, you know, that's the kind of thing where, again, if Lorentz had been alive, he'd been dead 10 or so years at that point.
If he'd been alive, you know, I'm sure he would have tried, well, it seems quite likely that he would have tried to save his
theory by patching it up yet again. But it would have been a massive, I mean, that's a real
setback. It starts to just look like, oh, no, time is, you know, this thing that Lorentz
introduced as a mathematical convenience? No, no, no, that's actually what time is. Right.
For the muons, at least. And then, you know, there's a whole bunch of other experiments
that show this very similar phenomenon. And when was that experiment done?
It was, I think, 1940, or it might have been published in 1941. So maybe then to
rephrase, change my claim. It's not that you could not have distinguished them, but the
scientific community adopted what we in retrospect consider the more correct interpretation
before it was actually empirically or experimentally shown to be preferred. So there's clearly
some process that human science does, which can distinguish different theories. Can I just interrupt?
I mean, you use the word process, and it's interesting to think about that term, like
process kind of carries connotations of, you know, it's something set in advance, it's something,
and it's much more complicated in practice. You have people like Lorenz who, I mean, Einstein
just absolutely utterly admired. And Poincaré, one of the greatest scientists who ever lived,
and Michelson, I mean, another truly outstanding scientist, never reconciled themselves. So it's not
as though there's like some standard procedure that we're all using to like reconcile these things.
No, like great scientists can remain wrong for a very long time after the
scientific community has broadly changed its opinion. But there's nothing, there's no centralized
authority, right, sort of saying or centralized method. Yeah. I mean, that is the interesting thing.
That like there's, there's progress even though it is hard to articulate the process by which it
happens, the heuristics that are used. Anyways, you mentioned Poincaré. And so Lorentz has the math
right, but the interpretation wrong. And you should explain, it seems like Poincaré had the
opposite, where he understood that it's hard to define simultaneity because it requires
a non-circular definition with time or velocity: signals might, you know,
arrive at a midpoint together, but velocity is defined in terms of time. And I find this
interesting, there's a couple other examples we could call on, but like, there is this phenomenon
in the history of science where somebody asks a right question, but then they don't sort of
clinch it. And I'm curious what you think is happening in those cases. I mean, I think you sort of,
you actually do want to go case by case and try and understand it. It's not necessarily clear that
they're doing the same thing wrong in all the cases. I mean, the Poincaré case is amazing.
He seems to have understood the principle of relativity, the idea that the laws of physics
are the same in all inertial reference frames. He seems to have understood that the speed of light
is the same in all inertial reference frames. He doesn't actually phrase it quite that way,
that is my understanding, but I don't speak French. But, you know, and this is, I mean,
these are basically, these are the ideas that Einstein uses to deduce special relativity.
But then he also has this additional sort of misunderstanding where he thinks that length contraction
is a dynamical effect that somehow, you know, sort of particles are being pushed together by,
by some external force, some, something is going on dynamically.
And he doesn't understand that it's purely kinematics, that actually space and time
are different than what we thought, and you need to fundamentally rethink those things.
So it's almost like he knew too much.
He had sort of almost too grand a vision in mind, and Einstein
sort of almost subtracts from that and says, no, no, no, it's space and time are just
different than what we thought. And, you know, here's the correct picture. And there's a paper in,
I think it's 1909 where Poincaré, like, he's still got this dynamical picture of what's going
on with the length contraction. And we just, you know, this is just not necessary. This is,
this is a mistake from the modern point of view. And so why is he doing this? Like, why is he clinging
on to this idea? And I don't know. I've obviously never met the man. It would be fascinating
to be able to talk it over and to try and understand. But he, I mean, his expertise seems
to be getting in the way. He knows so much. He understands so much. And then he's not able to
let go of these things. Actually, a really interesting fact is that a few years prior, so the
1890s, Einstein's a teenager. He believes in the ether too. Like he knows about this stuff.
But like he's just not, he's not quite as attached, obviously, as these older people were.
And maybe they were a little bit prisoner of their own expertise. That's my guess. I mean,
historians of science would certainly disagree.
Well, then there's the obvious stories where Einstein himself later on is said to have not
latched on to the correct interpretations of, um,
quantum mechanics or cosmology because of his own attachments.
Yeah. I think that the bigger question I have is like the muon example is a great example of
these long verification loops, and how progress seems to happen in the scientific community faster than these verification loops imply.
Maybe the clearest example: Aristarchus in the third century BC comes up with the idea of heliocentrism.
The ancient Athenians dismiss it on the grounds that, well, we should see,
if really the Sun is the center of the solar system, as the Earth is moving around the Sun,
the stars should move relative to the Earth.
And the only reason that would not be the case is if the stars are so far away
that you would not observe this.
And it's only in 1838 that stellar parallax is actually measured.
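To put rough numbers on why the Athenians saw nothing, here is a sketch using 61 Cygni, the star whose parallax Bessel measured in 1838 (the round values for distances and naked-eye resolution are my own illustrative assumptions, not from the conversation):

```python
import math

# Rough scale of the stellar-parallax effect, using 61 Cygni,
# the star whose parallax Bessel finally measured in 1838.
au = 1.5e11                 # Earth-Sun baseline, m
ly = 9.46e15                # one light-year, m
d_star = 11.4 * ly          # distance to 61 Cygni, m

# Small-angle approximation: parallax angle ~ baseline / distance.
parallax_rad = au / d_star
parallax_arcsec = math.degrees(parallax_rad) * 3600  # radians -> arcseconds

naked_eye_arcsec = 60.0     # naked-eye angular resolution, ~1 arcminute

print(f"parallax of 61 Cygni ~ {parallax_arcsec:.2f} arcsec")  # ~0.29 arcsec
print(f"below naked-eye resolution by ~{naked_eye_arcsec / parallax_arcsec:.0f}x")
```

Even for one of the nearest bright stars, the shift is a few tenths of an arcsecond, a couple of hundred times smaller than anything the unaided eye can resolve, which is why verifying heliocentrism this way had to wait for 19th-century instruments.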
And so we didn't need to wait until 1838 to have heliocentrism, right?
Like we didn't need to wait for the experimental validation to understand Copernicus is better
in some way.
In fact, when Copernicus first comes up with his theory, it's well known that the Ptolemaic
model was more accurate, because it had all these centuries of adding on these epicycles.
What was maybe less well appreciated:
it was also not in some sense simpler, because Copernicus actually had to add extra epicycles.
It had more epicycles than the Ptolemaic model, because he had this bias that the orbits should
go in perfect circles at uniform speed.
Anyway, I think this is an interesting story because it's like it's not more accurate.
It's not a simpler theory.
So how could you have known ex ante that Copernicus was correct and Ptolemy was not?
I mean, good question.
And I don't know sort of entirely the answer.
I do know.
Well, I mean, I can give you certainly a partial answer that, you know, centuries in the future, you start to find very compelling.
And I'm sure it's sort of part of the historical story at least, which is one of the big things
Newton eventually did.
He was able to explain Kepler's laws of planetary motion.
So you're able to explain sort of the motions of the planets in the sky.
But he also, out of the same theory, his theory of gravitation, was able to explain terrestrial
motion, so he was able to explain why objects move in parabolas on the Earth.
he's able to explain the tides in terms of the moon's and the sun's
gravitational effect on water on the earth. And so you have what seem like three very different
disconnected phenomena, all being explained by this one set of ideas. That, that I think starts to feel
that that's very compelling, at least to me. And I think most people find that very, very satisfying
once they eventually realize it.
Have you read the Keynes biography of Newton?
Oh, you've read an entire book?
No, no, no, the essay.
Yeah, yeah, sure, sure, sure.
I love that.
I mean, this description of him as the last of the magicians is wonderful.
In fact, I think it's maybe worth superimposing
or you should read out that one passage of the thing.
All right.
So it's from, actually, I believe it was a talk that he gave at Cambridge
not long before he died. He'd acquired Newton's papers somehow, and then he gave a lecture,
I think twice about this, or rather his brother Geoffrey gave it the other time because he was too
ill. There's just this wonderful, wonderful quote in the middle. Oh, actually, the whole thing
is really interesting, but I love this particular quote. Newton was not the first of the age of
reason. He was the last of the magicians, the last great mind, which looked out on the visible
and intellectual world with the same eyes as those who began to build our intellectual inheritance
rather less than 10,000 years ago. And like this idea that people have, that Newton was sort of
the first modern scientist is somehow wrong. He, I mean, there's some truth to it, but he really
had this very different way of looking at the world that was part sort of superstitious and part
modern. It was a funny hybrid. He's sort of this transitional figure in some sense.
That phrase, the last of the magicians, I think really really points at something.
The thing I'm very curious about with Newton is whether it was the same program, the same
heuristics, the same biases that he applied to his alchemical work as he did to
the understanding of astronomy. So this is from the Keynes essay. There was extreme method
in his madness. All his unpublished
works on esoteric and theological matters are marked by careful learning, accurate method,
and extreme sobriety of statement. They are just as sane as the Principia if their whole matter
and purpose were not magical. They were nearly all composed during the same 25 years of his mathematical
studies. So clearly there was some aesthetic which motivated people like Einstein to say, reject
earlier ways of thinking and say, no, the ether is wrong and there's a better way to think about
things. Same with Newton. And the question I have is whether similar heuristics towards parsimony,
towards aesthetics, et cetera, would be equally useful across time and across disciplines,
or whether you need different heuristics. And the reason that's relevant is even if you can
build a verification loop for science, maybe if the taste has to point in the same direction,
you can at least encode that bias into the AIs, and that would maybe be enough?
Yeah, I mean, these questions, like, the point is that where we always get bottlenecked
is where the previous processes and heuristics don't apply, right?
Like, that's almost sort of definitionally what causes the bottlenecks.
Because people are smart.
They know what has worked before.
They study it.
They apply the same kinds of things.
And so they don't get stuck in the same places as before.
They keep getting bottlenecked in different places.
I mean, that's overgeneralizing a bit, but I think it's right.
Like, if you're attempting to reduce science to a process,
you're attempting to reduce it to something where there is just a method,
which you can apply and, you know, you turn sort of the crank and out pops insight.
Sure, I mean, you can do a certain amount of that,
but you're going to get bottlenecked at the places where your existing method doesn't apply.
and definitionally, there's no crank you can turn.
You need a lot of people trying different ideas.
And sort of the more difficult the idea is to have, the greater the bottleneck,
but then also sort of the greater the triumph.
Quantum mechanics is like, I mean, it's a great example of this.
It's such a shocking set of ideas.
It's such a shocking theory.
Actually, the theory of evolution in some sense is also quite a shocking idea,
not the principle of the sort of natural selection,
but that it can explain so much, that's a shocking idea.
Existing safety benchmarks claim that, at least for today's top models,
attacks are only successful a few percent of the time.
This sounds great, but Labelbox researchers were able to jailbreak these very same models
about 90% of the time, even the ones that have the strongest reputation for safety.
And the disconnect here is that the prompts which underlie these public safety benchmarks
are all framed in a very naive way.
There's no attempt to disguise harmful intent.
These prompts will just ask models to hack into a secure network
and to do so without getting caught.
But real bad actors don't write like this.
So Labelbox built a new safety benchmark from the ground up.
Their prompts reflect real adversarial behavior
by stripping out obvious trigger phrases
and wrapping their requests in fictional scenarios.
For example, instead of outright asking an LLM
to steal somebody's identity,
the prompt will frame it as a game.
A light bearer who's trying to hide from dark forces needs a handbook on how to disguise themselves as somebody else.
This safety research is linked in the description.
If you think this can be useful for your own work, reach out at labelbox.com/dwarkesh.
So Principia Mathematica is released in 1687.
The origin of species is released in 1859.
At least naively, it seems like Darwin's theory, the theory of natural selection, is conceptually easier
than the theory of gravity.
I asked Terence Tao this question.
But yeah, there was this contemporaneous biologist
with Darwin, Thomas Huxley, who read this and said,
how extremely stupid not to have thought of this.
And nobody ever reads the Principia Mathematica and thinks,
God, why couldn't I have beaten Newton to the punch here?
No.
And so, yeah, what's going on here?
Why did Darwinism take so much longer?
Yeah.
the idea must have been known to animal breeders for a long time at some level.
Or certainly large chunks of the idea were known.
Artificial selection was a thing.
And in some sense, Darwin's genius wasn't in having that idea.
It was understanding just how central it was to biology,
that you can potentially sort of go back
and you can explain a tremendous amount
about all of the variety of what we see in the world
with this as not necessarily the only principle
but certainly a core principle.
And so he writes this wonderful, wonderful book,
The Origin of Species.
And it's just so much evidence
and so many examples
and sort of trying to tease this out
and see what the implications are.
and to connect it to as much else as he possibly can,
to connect it to geology and to connect it to all these other things.
So that's sort of hard work that, you know, making the case
that it's actually relevant all across the biosphere,
you know, is what he's doing there.
He's not just having the idea.
He's making a compelling case that, no, it's intertwined
with absolutely everything else.
Yeah, the motivation of the question was
Lucretius, who is this first-century-BC
Roman poet, has an idea that seems analogous to natural selection, about, you know, species
getting fitted over time to their environments, species being induced to fit their
environment. And so we're like, okay, well, why did this go nowhere for 19 centuries? And then
I looked into it, or more accurately asked LLMs what exactly was Lucretius' idea here. And it actually
is extremely different from what real natural selection is. He thought there was this generative
period in the past where all the species came about and then there was this
one-time filter, which resulted in the species that are around today, and they became
fit to the environment.
He did not have this idea that it is an ongoing gradual process or that there is a tree of
life that connects all life forms on Earth together, which is, by the way, it's an
incredibly weird fact that every single life form on Earth has a common ancestor.
It's not incredibly weird, right?
If you think that the origin of life must have been very hard, like that there's a bottleneck there,
then it's not so surprising.
Yeah. There's also this verification loop aspect where even if Newton might be harder in some sense,
if you've clinched it, you can experimentally, I know validated is the wrong word philosophically,
but you can give a lot of Bayes points to the theory. You can be like, okay, I have this idea of why
things fall on Earth. I have this idea of why orbital periods of planets have a certain pattern.
Let's try it on the moon, which orbits the Earth. Yeah, yeah. And in fact, you know, it's weird.
The orbital period matches what my calculations imply. And the tides work correctly.
Exactly. Yeah. It's just amazing.
Whereas for Darwinism, it takes a ton of work for Darwin to compile all this sort of cumulative evidence, but there's no individual piece that is overwhelmingly powerful.
And there's a whole bunch of problems as well.
Like he doesn't really understand, you know, sort of what the mechanism is.
He doesn't understand genes, like all these things.
The very interesting thing in the history of Darwinism is, this idea which sort of theoretically you could come up with at any time, there is almost identical independent creation of that idea between
Alfred Russel Wallace and Charles Darwin.
So much so that I think Wallace sends his manuscript to Darwin, and is like, what do you think of this idea?
And I was like, fuck.
I don't think that's an exact quote, but I think it's pretty much correct.
And then so they actually end up presenting their ideas together in the spirit of sort of sportsmanship.
And so then, yeah, why was this ripe in the 1850s or 1860s?
Why was that the right time for these ideas?
You can come up with different ideas.
One is geology.
So in the 1830s, I think Charles Lyell figures out that there have been millions
and billions of years of time that exist on Earth.
Then paleontology shows you that actually organisms have existed,
fossils have existed for that entire time.
So life goes back a long time.
And in fact, you can even find fossils for intermediate species
that show you the tree of life.
In fact, between humans and other apes as well,
there are intermediate forms.
There's the age of colonization,
and you have all these voyages
where you can do this biogeography.
And I guess that all must have been necessary
because, in fact, there's a huge history
of parallel innovation and discovery in the history of science.
So maybe it's another piece of evidence that more actually had to be in place for a given idea
to be discovered: if it's not discovered for a long time, and then spontaneously many
different people are coming up with it, that shows you that the building blocks were
in some sense necessary.
Yeah, yeah.
I mean, I think this example of Lyell and other biologists, excuse me, other geologists,
in the early 1800s basically having this idea of deep time,
that does seem to have been crucial.
I know Darwin was very influenced by Lyell.
And if you don't have at least sort of tens or hundreds of millions of years,
evolution just starts to look like a non-starter.
We should be seeing radical change.
In order to make it work on a time scale of, say, 5,000 to 10,000 years, or 6,000 years
à la Bishop Ussher, you would need to see evolution
occurring at a massive rate during human lifetimes, and we're just not seeing that.
So that does seem to have been a blocker.
It's interesting to, I mean, to your question, like, what other blockers were there?
Were there any others?
And I don't know.
Right.
Or, yeah, how much earlier could you, in principle, have come up with that if you were
much smarter?
Actually, let me just go back, zoom out to your original question.
So you're talking about the verification loop in AI.
And, you know, an example I think that should give you pause there is that
the big signature success so far is certainly AlphaFold.
Yeah.
And of course, AlphaFold really isn't only about AI.
You know, a massive fraction of the success there is the protein data bank.
So it's X-ray diffraction, it's NMR, it's cryo-EM, and the several billion dollars that
was spent obtaining whatever it is, 180,000 protein structures.
So it's basically the story of: we spent many, many decades obtaining
protein structures just by going out and looking very hard at the world experimentally.
And then we fitted a nice model at the end of it, and that was like a tiny fraction of the entire
investment.
So, principally, that's a story of data acquisition.
Yeah.
It's not only that.
I mean, the AI bit is very, very impressive.
It's quite remarkable.
But it is only a small part of the total story.
AlphaFold is very interesting,
and philosophically I wonder whether you think of it as a
scientific theory or scientific explanation.
Because over time,
I guess, the world has become harder to understand.
I'm hedging
as I'm saying things, because you're such a
careful speaker,
I say this phrase and I'm like,
is that it? Will he actually buy that premise?
But yeah, you know, we need to fit models of things,
or at least in some domains we're trying to fit models of
things, rather than coming up with underlying principles that explain a broad range of
phenomena. And so compare, say, the theory of general relativity, or any theory which
just nets out to some equations, versus AlphaFold, which is encoding these different
relationships, between things we can't even interpret, over a hundred million
parameters. Are those really the same thing? Because GR can predict things you could have
never anticipated, that it was never meant to do, like why
Mercury's orbit precesses, and AlphaFold is not going to have that kind of explanatory reach.
And I want to get your reaction to that.
Yeah, I think it's an incredibly interesting question.
I mean, maybe a really pivotal question.
In the sense of, so, you know, if you take a very classic point of view,
you want these deep explanatory principles, you want as few free parameters
as you possibly can, you want very simple models
which explain a lot.
And AlphaFold doesn't look anything like that.
And so you might just sort of say,
oh, well, it's nice, it's maybe helpful as a model,
but it doesn't have, it's not a scientific explanation.
So that's kind of, that's like a conservative point of view.
That's sort of, I don't know, answer one to the question.
I think answer two is to say something like,
maybe you shouldn't think about AlphaFold, you know,
as an explanation in the classic sense.
But maybe it contains lots of little explanations
inside it. And so maybe part of what you can get out of like, you know, interpretability work is you can
go into AlphaFold and you can start to extract certain things. Maybe, maybe basically by doing sort of,
you know, archaeology of AlphaFold, we can actually understand a great deal more about these
principles. You can start to extract it. Oh, that circuit does this interesting thing and we
learn this. So I don't know to what extent that's been done with AlphaFold. I know it's been done
a little bit with some of the chess models. I believe it's AlphaZero.
There seemed to be some strategies which were certainly borrowed by Magnus Carlsen, at least,
which he seems to have just taken from AlphaZero.
I mean, I don't think there's any public confirmation of this, but, you know,
some experts have noticed that he changed his game quite radically after
some public forensics were released on how AlphaZero worked.
So that's kind of sort of an example where I think human beings are starting to extract meaning out of these models.
And maybe that starts to lead to sort of, sort of viewing the models as a
source of, potential source of explanations.
You need to do more work because they're not very legible up front, but you can
extract them potentially.
And I think that's kind of, I think that's kind of an interesting intermediate situation
where they're not explanations, but you can extract interesting explanations out of them.
You can use them as kind of a source.
And I think the third and the most interesting possibility is, no, they're
a new type of object in some sense.
They should be taken very seriously as explanations, but in the past, we haven't had the
ability to really do anything with them.
And now we're going to have sort of new, interesting new sort of actions which we can do.
We can merge them.
We can distill them.
We can do all these kinds of things.
And there's going to be almost a new field there; it's a big opportunity in the philosophy
of science to start to do that.
There's sort of an anticipation of this in some sense, I think.
Certainly I know some mathematicians and physicists who, I mean, historically, if you had like a 100-page
equation, and that's the kind of thing that does come up, I mean, there's just nothing
you can do if it's 1920. There is nothing you can do. At that point, you give up on the problem.
And now today, with tools like Mathematica, you can just keep going. And so that's an object now.
That's a thing that you can work with.
And there are examples where people work with these things that formerly were regarded as too complicated.
And sometimes they get simple answers out of the end.
That's just an intermediate working state.
And so I sort of wonder if something similar is going to happen in this particular case,
where you can take these models and sort of just use them a little bit the same way people do with Mathematica,
and take them seriously as, they're not explanations in the classic sense,
but they'll be something else which interesting operations can be done on.
The thing I worry about is, suppose it's 1600, or 1500, and you're training
a model. This is a weird alternate history where we developed deep learning
before we had cosmology.
But suppose we live in that world, and you're observing how the stars don't
seem to move, the planets have all these weird behaviors.
And then you train a model on that, and then you do
some kind of interp on it, trying to figure out, well, what are the patterns we see here?
What you'd see is that you'd just be able to keep building on Ptolemy's model.
You'd see, like, oh, there are more epicycles we didn't notice.
There's another epicycle.
These parameters encode this epicycle; those parameters encode the next epicycle.
So if you were just trying to figure out why the solar system is the way it is from observational data,
you could just keep adding epicycles upon epicycles, but it really took one mind to integrate it all and say,
here's what makes more sense overall.
So, I mean, this is sort of to my point that we don't really understand what to do with the models.
We don't necessarily have the verbs yet. But, you know, it's certainly interesting to think about the question
where you start to apply constraints to the models. It's essentially
saying, what's the simplest possible explanation?
Or, you know, can you simplify?
Can you give me the 90-10 explanation?
Can you go further and further and further in boiling it down?
So it might be that indeed they start out
by providing a very, very complicated, many-parameter model.
But you can force the issue,
and basically that's scaffolding,
you know, the very early days of their attempt to understand something,
but they're forced through that to a much simpler understanding.
Sorry if I'm misunderstanding, but it sounds like you're saying maybe there's some sort of regularizer
or some sort of distillation you could do of a very complicated model that gets you to a truer,
more parsimonious theory.
But yeah, just take Ptolemy versus Copernicus, right?
So you start off with lots of Ptolemaic epicycles, and then you try to distill this model,
and maybe it gets rid of some of the epicycles that are less and less
necessary to get the mean squared error of the orbits to match.
But at some point it has to do this thing, which is, like, switch frameworks entirely.
Yeah, yeah, yeah.
And locally, it actually doesn't make things more accurate.
It's sort of in a global sense that it's a more progressive theory.
Yeah, yeah.
And there's some process, which obviously humanity did over some span of time, that did that regularization
or did that swap.
But with raw gradient descent, I don't really feel like it would do that.
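To make the regularizer idea concrete, here is an illustrative sketch (my own construction, not anything from the conversation): fit a synthetic two-epicycle signal against a bank of ten candidate "epicycle" frequencies with an L1 penalty, and watch the unnecessary epicycles get zeroed out.

```python
import numpy as np

# Synthetic "orbit" signal built from just two epicycles (frequencies 2 and 5 Hz)
t = np.arange(400) / 40.0                      # 10 s sampled at 40 Hz
y = 3.0 * np.cos(2 * np.pi * 2 * t) + 1.5 * np.sin(2 * np.pi * 5 * t)

# Candidate basis: a cos/sin pair for each frequency 1..10 ("possible epicycles")
cols = []
for k in range(1, 11):
    cols.append(np.cos(2 * np.pi * k * t))
    cols.append(np.sin(2 * np.pi * k * t))
X = np.stack(cols, axis=1)                     # shape (400, 20)

# L1-regularized least squares via iterative soft-thresholding (ISTA)
lam = 100.0                                    # sparsity penalty
L = np.linalg.norm(X, 2) ** 2                  # Lipschitz constant of the gradient
beta = np.zeros(X.shape[1])
for _ in range(200):
    grad = X.T @ (X @ beta - y)
    z = beta - grad / L
    beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)

kept = np.nonzero(np.abs(beta) > 0.1)[0]
print(kept)  # only columns 2 (cos at 2 Hz) and 9 (sin at 5 Hz) survive
```

Note the characteristic L1 shrinkage: the surviving coefficients come out near 2.5 and 1.0 rather than the true 3.0 and 1.5. And this is the point being made above: a regularizer prunes epicycles *within* Ptolemy's framework, but nothing in this procedure switches frameworks to Copernicus.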
Well, think about the example of going from Newtonian gravity to Einstein's general
theory of relativity.
And these are, I mean, these are shockingly different theories.
And the question, you know, is like what causes that flip?
And as nearly as I understand the history, you know, what goes on is Einstein develops
special relativity.
And pretty much straight away he understands,
I mean, it's a very obvious observation in special relativity,
that influences can't propagate faster than the speed of light.
And in Newtonian gravity, action is at a distance.
In fact, straight away in special relativity,
you could use Newtonian gravity to do faster-than-light signaling.
You could send information backwards in time.
You could do all kinds of crazy stuff.
And so it's not a big leap to realize, oh, we have a big problem here.
And so, you know, that's kind of the forcing function there.
It's, you've realized that your old explanation is not sufficient, you need something new, and then
you're just going to start by doing the simplest possible
stuff. And it just turns out that a lot of that stuff doesn't work very well, and so you're sort of forced,
in fact, it is interesting, you know, he's sort of forced to go through these steps where it gradually
gets more complicated, and it's wrong in a variety of ways. And the final theory
appears really shockingly simple and beautiful,
but it's gone through some somewhat ugly intermediate stages.
Yeah.
Yeah.
So if you're thinking about what it looks like to have AI accelerate science,
there's one mode for maybe well-understood domains where we just want local solutions, like
how does this protein fold: we just train a raw model using gradient descent.
Then there are things like coming up with general relativity, where you couldn't really just train
on every single observation in the universe
and hope that general relativity pops out.
And so what would it require?
Well, it also certainly wasn't immediately discovered, right?
So it was a lot of decades of thought.
And I guess you need independent research programs
where people start off with these biases
where Einstein is just initially motivated
by this thought experiment of, you know,
can you distinguish the effect of gravity
from just being accelerated upwards?
And then you just need different AI thinkers to start off with these initial biases and see what can germinate out of them.
And then the verification loop for that might be quite long, but you just need to keep all those research programs alive at the same time.
Yeah, I mean, I think there's like, I mean, this point that you make about sort of keeping all the different research programs alive, like that, that I think is very important and somehow central.
I mean, a great example is situations where the same answer has been correct in some circumstances
and wrong in other circumstances.
So the planet Uranus was not in quite the right spot and people very famously predicted
the existence of Neptune on this basis.
Wonderful, massive success for Newtonian gravity.
The planet Mercury is not in quite the right spot,
so you predict the existence of some other distorting planet.
Turns out that doesn't exist.
Actually, the reason Mercury is not in the right spot is because you need general relativity.
And so you've pursued very similar ideas, and it's been very successful in one case,
and it's been completely and utterly unsuccessful in the other case.
And I think, I mean, a priori, you can't tell which of these is the thing to do,
and you actually need to do both.
Yeah.
And so, I mean, this is certainly, it's very true in the history of science,
that, you know, this kind of diversity where you just have lots of people go off and
pursue lots of potentially promising ideas, you just need to support that for a long time.
And it's, I mean, it's hard to do that for a variety of reasons.
But it does seem to be very, very, very important.
So this example of Uranus versus Mercury is very interesting.
For one, I think it illustrates the difficulty of falsification.
Like, the orbit of Uranus is in some sense falsifying Newtonian mechanics, but then you make some ancillary
prediction that says, oh, the reason this is happening is there must be another planet perturbing Uranus's orbit, and I think it's Le Verrier in 1846.
Point a telescope in the right direction, and you find Uranus.
Neptune.
Oh, it's there, yeah, Neptune, yes.
But with Mercury, yeah, it's observed that the ellipse which forms its orbit
is precessing 43 arc seconds more every century than Newtonian mechanics would imply.
So people say there must be a planet inside Mercury's orbit, they call it Vulcan,
and they point telescopes, and it's not there.
But if you're a proper Newtonian, what you do is say, well, maybe there's some cosmic dust
that's occluding this planet.
Or maybe the planet is so small we can't see it.
Or maybe there's some, let's build an even more powerful telescope.
Or maybe there's some magnetic field, which is sort of occluding our measurements.
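For reference, that 43 arc seconds falls straight out of GR's leading-order perihelion-advance formula, δφ = 6πGM/(c²a(1−e²)) per orbit. A quick check using modern values for the constants and Mercury's orbital elements (the numbers are reference values I've supplied, not figures from the conversation):

```python
import math

# Modern reference values (assumed, not from the conversation)
GM_sun = 1.32712e20      # gravitational parameter of the sun, m^3/s^2
c = 2.99792458e8         # speed of light, m/s
a = 5.7909e10            # semi-major axis of Mercury's orbit, m
e = 0.205630             # orbital eccentricity
T_days = 87.9691         # Mercury's orbital period, days

# GR perihelion advance per orbit, in radians: 6*pi*G*M / (c^2 * a * (1 - e^2))
per_orbit = 6 * math.pi * GM_sun / (c**2 * a * (1 - e**2))

# Accumulate over a Julian century and convert radians to arc seconds
orbits_per_century = 36525 / T_days
arcsec = per_orbit * orbits_per_century * (180 / math.pi) * 3600
print(round(arcsec, 1))  # ~43.0 arc seconds per century
```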
And this happens over and over, right?
Like, you know, there's just so many stories which are exactly like this.
Right.
I mean, an example I love, from the 1990s:
some people noticed that the Pioneer spacecraft weren't quite where they were supposed to be.
And so, you know, you can get very excited about this.
Oh, my goodness, general relativity is wrong.
We have like, you know, maybe we're going to discover the next theory of gravity.
And today, the accepted explanation is that, no, actually, there's just a slight asymmetry
in the spacecraft. It turns out that the thermal radiation is slightly larger in one direction
than the other, and that's causing a tiny little acceleration towards the sun. And most of the time,
when there are these apparent exceptions, it's just something like that going on. It's very much
like the Mercury-Vulcan case. But every once in a while, it's not. And a priori, you
can't distinguish these. But I mean, science is just full of these. It's funny, too,
like the way we tell the history of science, it sounds so simple.
Like, oh, you just focus on the right exception and, you know, you realize that you need to throw out the old theory.
Right.
And lo and behold, your Nobel Prize awaits.
But in fact, these exceptions are all over the place.
And 99.9% of the time, it just turns out to be some effect like this thermal acceleration in the case of the Pioneer spacecraft.
So, you know, sort of that, unfortunately, there's a lot of selection bias going into those stories.
And the thing is, there's no ex ante heuristic which tells you which case you're in.
And just to spell out why I think this is important: it's because some people have this idea that
AI is going to make disproportionate progress in science, because it makes disproportionate progress
in domains where there are tight verification loops.
And so it's really good at coding because you can run unit tests.
And science may be similar because you can run experiments.
I think what that doesn't appreciate is, one, that experiments actually don't settle it:
there's an infinite number of theories that are compatible with any given experiment.
And why, over time, we glom onto the one that, at least in retrospect, we think is more correct
is, as we're discussing in this conversation, sort of hard to articulate.
Lakatos actually has all kinds of interesting examples in his book about these kinds of hostile verification loops
that are extremely long-lasting.
So one he talks about is Prout,
I don't know how to pronounce it,
but he's this chemist in 1815.
He hypothesizes that all atomic nuclei
must have whole number weights.
They're basically all made of hydrogen.
And the reason he thinks this is because,
if you look at the measured weights of all the elements,
it does seem that almost all of them do have whole number weights.
But then there's some exceptions.
Like, for example, chlorine comes out at 35.5.
And so then there's all these ad hoc theories that people in this school keep coming up with, like, oh, maybe there's chemical impurities.
But then there's no chemical reaction you can do which seems to get rid of this.
Maybe it's fractions of whole numbers, so 35.5 works because it can be halves.
But actually, you measure chlorine even more precisely,
and it's 35.46.
So it's actually getting further away from the predicted value.
And later on, what is discovered is that what you're actually measuring is a mix of different isotopes, which cannot be chemically distinguished.
They can only be physically distinguished.
So then you just had 85 years, before we realized what an isotope is, where the verification
was actually actively hostile against you, against the correct theory, and you just need this
remnant to keep defending it.
There's no ex ante reason it's the preferred theory.
As a community, we should just have people defend their school of thought, trying to integrate new observations, even if they don't seem to fit with what they believe.
And hopefully, enough of that happens.
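The isotope resolution is easy to check numerically with modern data (the masses and abundances below are reference values I've supplied, not figures from the conversation): each chlorine isotope is individually close to a whole number, exactly as Prout's hypothesis demands, but the chemically measured "atomic weight" is the abundance-weighted mixture.

```python
# Modern isotope masses (atomic mass units) and natural abundances for chlorine
# (assumed reference values, not from the conversation)
isotopes = [
    (34.96885, 0.7576),  # Cl-35: ~75.76% abundant
    (36.96590, 0.2424),  # Cl-37: ~24.24% abundant
]

# Each isotope individually is close to a whole number, as Prout hypothesized...
for mass, _ in isotopes:
    print(round(mass))   # 35, then 37

# ...but any chemical measurement sees only the abundance-weighted average
avg = sum(mass * frac for mass, frac in isotopes)
print(round(avg, 2))     # 35.45 -- the fractional weight chemists kept finding
```

So the measurement was faithfully reporting a real quantity; it was just a quantity no chemical separation could ever decompose.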
Anyways, yeah, I guess the thing that I'm trying to articulate is,
the difficulty with automating science.
Yeah, I mean, the question is, where is the bottleneck?
You know, are we primarily bottlenecked on one thing or one type of thing,
or are we bottlenecked on multiple types of things?
So, you know, certainly talking to structural biology people,
they seem to think that AlphaFold was an enormous advance.
It was a shock.
So at some level, yes, AI can, you know, it seems certain it can help us speed up science.
So it is helping with a certain type of bottleneck.
That doesn't mean, though, as you're saying,
that it's necessarily going to help with all kinds of bottlenecks.
And sort of, I suppose the question you're pointing at is,
like, what are the types of bottlenecks that remain
and what are the prospects for getting past them?
I think even in the case of coding, like it's really interesting.
You know, talking to programmer friends, at the moment,
they're all in this state of shock and high excitement,
and they're all over the place actually kind of talking to them.
You do wonder, like, where is the bottleneck going to move to?
So certainly one thing that a lot of them seem to be bottlenecked on
is now having interesting ideas,
and in particular having interesting design ideas.
So there's not really a verification loop for knowing,
oh, that design idea is very interesting.
So they're no longer nearly as bottlenecked by their ability to produce code,
but they are still bottlenecked by this other thing.
They always were; formerly, they just weren't bottlenecked on it because,
you know, writing code took so much of their time. They could sort of have lots of
ideas while they took their three weeks to implement their prototype, and
then they would implement the next version. Now they're taking three hours to implement the
prototype, and they don't have, you know, as good ideas after that from a design point of view.
Last year, I predicted that by 2028, AI would be able to prep my taxes about as well as a competent
general manager. But we're already getting pretty close. As I shared before, I use Mercury both for my
business and my personal banking. So I recently gave an LLM access to my transaction history across
both accounts through Mercury's MCP. I asked it to go through all my 2025 transactions and flag
any personal expenses that seem like they should actually be charged to the business. And this
worked shockingly well. Mercury's MCP exposed
a bunch of detailed information, things like notes and memos and any JPEGs of receipts and
PDF attachments. So my LLM had plenty of context to work with. One of my favorite examples
happened with the charge to Bay Paddle. If you looked at the vendor alone, you would have had to
assume that it's a personal expense. But the LLM looked at the receipt and the attached note in Mercury
and realized this was actually a team bonding exercise from our last in-person retreat. So a legitimate
business expense. I imagine it will be a while before traditional banks have MCP.
Functionality like this is why I use Mercury. Go to mercury.com to learn more.
Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice
Financial Group and Column N.A., Members FDIC. You have a very interesting take. I think it was a
footnote in one of your essays, and I couldn't find it again, which was that it's very
possible that if we met aliens, they would have a totally different technological stack
than us. And that contradicts, I guess, a common sense assumption I had that I never questioned,
which is that science is the thing you do relatively early on in the history of civilization
where you get to a point and you have a couple hundred years of just cranking through
the basics, understanding how the universe works, et cetera, and you've got it. You've got science.
And then basically everybody would converge on the same, quote-unquote, science. And so I found that a very
interesting idea and I want you to say more about it. Yeah, I mean, I think probably the idea
there that I'm at least somewhat attached to is the idea that the tech tree, or the
science and tech tree, is probably much larger than we realize. I mean, we're in this
funny situation. People will sometimes talk about, you know, a theory of everything as a potential
goal for physics. And then there's this funny situation:
there's this presumption somehow that physics is done once you get there.
And of course, this is not true at all. If you think about computer science,
computer science basically got started in the 1930s when Turing and Church and so on
just laid down what the theory of everything was. They just said, you know, here's how
computation works. And then we've spent the 90 or so years since then just exploring the
consequences of that and gradually building up more and more interesting ideas. And those ideas, to some extent,
you can just regard as technology, but to some extent, insofar as they're sort of discovered
principles inside that theory of computation, I think they're best regarded as science, and in some
cases, very fundamental science. Ideas like public key cryptography are, I mean, they're just
incredibly deep, very non-obvious ideas, which in some sense lay hidden already in the 1930s.
And so my expectation is that there will be different ways of exploring
this tech tree, and we're still relatively low down. We're still at the point where we're just
understanding these basic fundamental theories, and we haven't yet explored them.
Sort of a thing which I think is quite fun is if you look at just the phases of matter.
When I was in school we'd get taught that there are three phases of matter or sometimes
four phases of matter or five phases of matter depending a little bit on what you included.
And then as an adult, as a physicist, you start to realize, oh, we've been adding
to this list.
We've got superconductors and superfluids,
and maybe different types of superconductors,
and Bose-Einstein condensates,
and quantum Hall systems,
and fractional quantum Hall systems,
and on and on and on.
And it's turned out,
it looks like actually there are a lot of phases of matter
to discover.
And we're going to discover a lot more of them.
And in fact, we're going to be able to start
to design them in some sense.
I mean, we'll still be subject to the laws of physics.
But there is this sort of tremendous freedom in there. And this looks to me like, oh, we're down at the
bottom of the tech tree. We've barely gotten started there. And I expect that to be the case
broadly. Certainly in terms of, I think programming is a very natural place to look. The idea
that we've discovered all the deep ideas in programming just seems to be sort of obviously
ludicrous. We keep discovering sort of what seems like deep, new, fundamental ideas. And
I mean, we're very limited.
We're basically slightly jumped up chimpanzees.
So we don't, you know, we're slow and it's taking us time.
But, you know, what do we look like sort of another million years in the future in terms of, you know, all of the different ideas which people have had around how to how to manipulate computers, how to manipulate information?
I think we're likely to discover that actually there are a lot of very deep ideas still to be discovered.
There's a nice, who was it?
I think it was Knuth, in the preface to The Art of Computer Programming.
He says something like, you know, he started this book back in the 60s, and he talked to a mathematician
who was a bit contemptuous and said, look, computer science isn't really a thing yet.
Come back to me when there's a thousand deep theorems.
And Knuth remarks, and he's writing this now decades later in the preface, there clearly are
a thousand deep theorems now.
And that means, like, it's really interesting to think about.
Like, what's the long-term future?
As you get higher and higher up in the tech tree, choices about which direction we go and
how we choose to explore, you know, I think it's potentially the case that
different civilizations or different choices mean that we end up in different parts of that tree.
And in particular, just very basic things: you know,
we're very visual creatures; certain other animals are much more aurally based.
Does that bias sort of the types of thoughts that you have?
And then you extend it, you know, to sort of much more exotic kinds of civilizations
where maybe just sort of their biases in terms of how they perceive
and how they manipulate the world are maybe quite different than ours,
and that might make some significant changes
in terms of how they do that exploration of the tech tree.
It's all speculation, obviously.
No, this is such an interesting take.
I want to better understand it.
So one way to understand it is that there might be some things
which are so fundamental and have such a wide collision area
against reality, that any civilization is inevitably going to discover them, like general relativity.
Numbers.
Numbers.
Yeah, yeah.
Like, of all of the intelligences in the Milky Way galaxy, maybe that number is one; actually, arguably,
we've already increased the number.
But of all of those, what fraction have the concept of counting?
And it does seem very natural.
What fraction have discovered the idea of
some kind of decimal place system?
Interesting question.
And maybe we're missing something really simple and obvious
that's actually way better than that.
What fraction got there immediately?
What fraction sort of had to go through some other intermediate state?
What fraction use linear representations
versus say, I don't know, a two-dimensional
or a three-dimensional representation?
I think the answers to these questions
are just not at all obvious.
There's a lot of design freedom.
On the topic of computer science, this is
going to be extremely naive and arrogant,
but I took Scott Aaronson's, you know, class on complexity theory, and I was by far the
worst student he's ever had.
But what I remember is, like, there was this period, that you were, you know, one of
the pioneers of, where we figured out, here's the class of problems that quantum computers
can solve and how it relates to the classes of problems a classical computer can solve.
It was all, like, groundbreaking, oh, crazy, this works.
And since then, it's been,
there's literally this website called
the Complexity Zoo, which lists out,
here are all the complexity classes,
and if you have this complexity class
with this kind of oracle,
it's sort of equivalent to this other class.
It feels like we're building out that
taxonomy. And so
there's a couple of ways to understand what you're saying.
One, maybe you just disagree with me
that this is actually what's happened with this field.
Another is that, while that might
happen to any one field,
who would have thought,
in 1880,
other than Babbage or something,
that computer science was going to be a thing
in the first place? So we're underestimating how many more fields there could be.
Yeah, yeah, for sure. Or maybe you think both, or maybe a third secret thing, but I'd be curious.
I mean, a very common argument here is sort of the low-hanging fruit argument, the argument that says,
oh, there should be diminishing returns. And in fact, empirically, we see this, right? The number of
scientists in the world has just increased exponentially. And, I mean, I think it's, you know,
worth thinking about why you'd expect diminishing returns and how well that
argument actually applies in practice. And an analogy I like is actually thinking about
going to some event, going to a wedding or whatever, and you go to the dessert buffet. And they've
put out, you know, 30 desserts. And of course, naturally what people do, right, the best desserts go
first. I mean, we don't quite have a well-ordered preference there, so maybe there's some difference.
But human beings are fairly similar.
So the best desserts will go first.
And this is an argument for why you expect diminishing returns
in a lot of different fields.
If it's relatively easy to see what's available
and people have similar preferences,
then the best stuff goes first.
And it just gets worse and worse after that.
And if you take a very static snapshot in time of scientific progress, maybe there's some
truth to that. But if somebody is standing behind the dessert table, replenishing,
restocking the desserts, and keeps adding new ones in, it may turn out that a little bit
later, much better desserts appear. And so you're going to go and eat those
instead. And scientific progress has a little bit of that flavor. You know, we go through
these sort of funny time periods. Computer science is a great example where computer science
basically arose as sort of a side effect of some pretty abstruse questions in the philosophy
of mathematics and logic. And so you've got these people trying to attack these rather esoteric
questions that seem quite high up in some sense in sort of exploration, quite esoteric.
And they discover this fundamental new field and all of a sudden there's an explosion
there. So sort of the diminishing returns argument just didn't apply there. We just weren't able
to see what was there. And this has been the case over and over and over again. So new fields
arrive and all of a sudden, boom, it's actually easy to make progress again. Young people
flood in because you can be 21 and make major breakthroughs rather than having to spend 25 years
mastering everything that's been done before. It's obviously very attractive. And I don't understand,
and I'm not sure anybody understands very well,
the dynamics of that: how to think about why the structure of knowledge is that way,
that these new fields keep opening up.
But it does seem, empirically at least, to be the case.
Despite the fact that that is the case,
take deep learning, right?
Obviously, this is an example of a new field where the 21-year-olds can make progress,
and it's relatively new, 15 years or so since it sort of got back into high gear.
But already we're in a stage where you need billions or tens of billions or hundreds of
billions of dollars to keep making progress at the frontier.
And there's a couple of ways to understand that.
One is that it actually is harder than the kinds of things the ancients had to do, or
at least is more resource-intensive.
Second is it might not have been, but because our civilization's resources are so large,
the number of people is so large, the amount of money is so large,
that we can basically make the kind of progress
that would have taken the ancients forever to make,
almost immediately.
We notice something is productive,
immediately dump in all the resources.
But it's also weird that there's not that many of them.
I feel like deep learning is notable
because it's one big exception; it's hard to think of other examples.
I mean, I think it's a consequence of sort of the architecture of attention, right?
Like at any given time, there's always a sort of most successful thing.
Maybe if deep learning wasn't a thing,
maybe you'd be talking about CRISPR,
maybe you'd be talking about whatever it is.
Maybe we wouldn't think about solving
sort of the protein structure prediction problem
as really a success of AI.
Maybe we would have figured out how to do it
with sort of curve fitting, more broadly construed,
and we'd just be like, oh wow,
we took a lot of computing resources,
but protein structure prediction is an enormously
important thing. So there is always sort of our biggest thing. And I think what you're pointing
at is more a consequence of the way in which attention gets centralized. It's basically fashion
is sort of what I'm saying. It's not just fashion, but there is some dynamic there.
There's a very interesting and important implication of this idea that the branching is so
wide and so contingent and so path-dependent that different civilizations would stumble on
entirely different technology stacks.
There's a very interesting implication that there will be gains from trade into the far,
far future, which might actually be one of the most important facts about the far future,
in terms of how civilizations are set up, how they can coordinate, how they interface with
each other. Like, there's not this "go forth and exploit."
It's actually, there are humongous gains to trade from adjacent colonies or whatever.
That, yeah, sort of. There's a question of what's actually hard. You know, if it's just the ideas, well, those spread relatively quickly. It's relatively easy to share ideas. If it's something more, it's almost sort of a Dan Wang kind of an idea, where there's actually some notion of capacity. You need all the right tech. You need all of the right manufacturing capacity and so on. And so, you know, Civilization A has a very different kind of manufacturing
capacity, and it's just not so easy to build in Civilization B, even if Civilization B is kind of ahead,
then I think that becomes true. There is actually, you know, comparative advantage, which is going
to provide massive benefits from trade in both directions. Eventually,
you're going to expect some diffusion of innovation. It is funny to think about what the barriers
are there. A fun thought experiment I like to think about is sort of GitHub, but for aliens.
So, you know, somebody presents you with all of the code from some alien civilization.
And I mean, I don't even know what code means there, but this is sort of their specification of algorithms.
And it's so interesting, like, it would have many interesting new ideas in there and it would take forever for human beings to dig through and to try and extract all of those.
The origin of this for me was actually thinking about proteins in nature.
We've been gifted just this incredible variety of machines which we don't understand really at all.
And we just have to go and sort of try and understand them on a one-by-one basis.
We're still understanding hemoglobin and insulin and things like this.
And, no doubt, you know, there's hundreds of millions of proteins known.
So it is a little bit like that.
We've been gifted by biology, just this immense library of machines, no doubt containing
an enormous number of very interesting ideas.
And we're just at the very, very, very beginning of understanding it.
So actually, that's, I suppose, kind of your point. I need
to relabel your argument slightly, but you sort of think of that as a gift from an alien civilization,
which obviously it isn't, but you think of it that way.
And it's like, oh my goodness, there's so much in there, and we're going to study it.
And goodness knows how long we could continue to study it.
There's tens of thousands of papers about hemoglobin and things like that.
And we still don't understand them.
And yet we're getting so much out of it.
I mean, just think about insulin alone.
It's such an important thing.
That's an incredibly useful intuition pump that you have on Earth.
I had Nick Lane on, where he had this theory about how life emerged. But whatever theory
you have, basically something like DNA arises four billion years ago, and you have an alien civilization
coming here and being like, there's all these interesting things to learn about materials science,
about, you name it, right? Like about...
Think about kinesin walking along.
I mean, and we know almost nothing about these proteins, and yet the tiny few facts we do know
are just incredible.
The ribosome.
Yeah.
For example, I mean, this miraculous sort of device, a little factory.
And all seeded by just, there's this particular chemistry on Earth with nucleic acids
and carbon-based life forms, and that chemistry gives rise to all of these interesting things,
which an alien civilization would find very interesting. And so that seed, which must be
one among, you know, trillions of possible seeds, I mean, just of general intellectual ideas,
leads to all this fecundity.
That's a very interesting thought.
I want to meditate on this gains from trade thing
because I feel like,
I think there's something actually very interesting
about this idea
that if you have this vision
of how technology progresses
and how it may be different
in different civilizations,
it has important implications
about how different civilizations
might interact with each other.
Like the fact that they're going to be
these huge gains from trade.
It makes friendliness much more rewarding.
Yes.
Right.
Yeah.
That's a very important observation.
Yeah.
I hadn't thought about that at all.
That's really, that is a very interesting observation.
It is funny.
I mean, you know, comparative advantage is something that people, you know, they love to invoke.
And it's a very beautiful idea, obviously.
There are limits to it.
Like, you know, it's kind of, it's a special limited model.
You know, chimpanzees can do interesting things.
We don't trade with them.
And I think it's sort of interesting to think about the reasons why.
Yeah. And part of it is just power, I think.
Like, once there's a sufficiently large power imbalance, very often, not always, but very often groups of people seem to sort of shift into this other mode where they just seek to dominate.
And maybe that's something special about human beings, but maybe it's also sort of a more general sort of a thing.
They're no longer trading. You need all these special things to be true before groups will trade.
Yeah.
And, yeah, it's not necessarily obvious.
I think the big thing going on here is, one, transaction costs.
Yeah.
And two, comparative advantage does not tell you that the terms on which the trade happens are above subsistence for any given producer.
So people often bring this up in the context of, well, humans will be employed
even in a post-AGI world because of comparative advantage.
There are like five different ways that argument breaks down,
but the easiest way to understand it is:
why don't we have horses all around on the roads,
given there's some comparative advantage between cars and horses?
Good example.
Well, one, there's huge transaction costs to building roads
that are compatible with horses and cars at the same time.
In a similar way, AIs thinking at 1,000 times the speed,
which can sort of shoot their latent states at each other,
are going to find it way more costly than the benefit,
just in terms of interacting with you,
to have a human being in the supply chain.
And second, just because horses have a comparative advantage
mathematically does not mean that it is worth paying
100K a year or whatever it costs to sustain a horse in San Francisco;
that subsistence is not going to be worth the benefit you get out of the horse.
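The horse argument can be made concrete with purely made-up numbers; nothing below is an estimate, it only illustrates the logic that gains from trade can exist on paper while the terms of trade fall below subsistence:

```python
# Illustrative (invented) numbers for the horse/comparative-advantage point:
# a comparative-advantage calculation guarantees mutually beneficial trades
# exist, but says nothing about whether a participant's wage covers its upkeep.

horse_marginal_product = 5_000   # $/yr of hauling a horse could still add
horse_upkeep_cost = 20_000       # $/yr to stable, feed, and care for a horse in a city

# The wage a horse can command is capped by its marginal product, so the
# surplus from employing it is at most:
surplus = horse_marginal_product - horse_upkeep_cost
print(f"surplus from employing the horse: ${surplus}/yr")
# → surplus from employing the horse: $-15000/yr
# Negative surplus: the horse goes unemployed even though the math
# assigns it a comparative advantage in some tasks.
```

The same structure is the worry for humans in the post-AGI scenario discussed above: the wage cap set by marginal product can fall below the cost of keeping the worker in the loop.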
I do think it's interesting. My expectation, and my intuition obviously differs a great deal from yours
on this, is that most parts of the tech tree are never going to be explored.
There's just too many interesting ways of combining things.
There's too many sort of deep ideas waiting to be discovered.
And, you know, not only we, but nobody ever, is going to discover most of them.
So choices about how to do the exploration actually matter quite a bit.
Interesting.
It's something I really dislike about sort of technological determinist arguments.
I'm willing to buy it sort of low enough down when, you know, progress is relatively simple.
But higher up, you start to get to shape the way in which you do the exploration.
And it's interesting, you know, we are starting to shape it in interesting ways.
I mean, there's various technologies that have been
essentially banned: you think about DDT, you think about chlorofluorocarbons, you think about
restrictions on the use of nuclear weapons, the nuclear nonproliferation treaty.
Those kinds of things, you know, they weren't done before the fact,
but they're starting to get pretty close in some cases, where we just sort of preemptively
decide, oh, we're not going to go down that path.
So that starts to look like a set of institutions where we are actually
influencing sort of how we explore the tech tree.
Yeah.
On where you would see these gains from trade,
obviously you'd see the most where it's pure information that can be sent back and forth
because the information has a quality where it is expensive to produce,
but cheap to verify and cheap to send.
And so it'll be interesting how much of future productivity or whatever can be distilled down to information.
Right now it's kind of hard to do, because you can't really transfer it.
Like, if China is really good at manufacturing something,
there's this process knowledge that's in the heads of 100 million people involved in the manufacturing sector in China.
But in the future, it might be easier if AIs are doing it.
I mean, the question is about sort of to what extent does fabrication get very uniform and really commoditized.
Like, you know, 3D printers have been the next big thing for at least 20 years now.
Why do they still not work all that well?
Why are they still not actually at the center of manufacturing?
And sort of what comes after that.
You know, it is funny to look at, say, the ribosome by contrast.
It really is at the center of biology
in a whole lot of really interesting ways.
And whether or not that's the future of manufacturing
is something very simple, sort of where everything goes
as throughput through, I don't know,
maybe a bioreactor or something like that.
So you send the information and then you grow stuff
or you have some 3D printer that actually works.
And if they're good enough, then actually it does become much more a pure information problem
and some of this process knowledge becomes much less important.
Jane Street has a lot of compute, but GPUs are very expensive.
And so even optimizations that have a relatively small effect on GPU utilization are still extremely valuable.
Two of Jane Street's ML engineers, Corwin and Sylvan, walk through some of their optimization workflows at GTC.
You're not bottlenecked on the network being too slow;
you're bottlenecked on waiting for a different rank in your training run not having completed the work.
They talked about how Jane Street profiles traces and diagnoses bottlenecks,
and then how they solved them using techniques like CUDA graphs, CUDA streams, and custom kernels.
With these sorts of optimizations, Corwin and Sylvan were able to get their training steps down from 400 milliseconds to 375 milliseconds each.
This 25 millisecond difference might sound small, but given the size of Jane Street's fleet,
that improvement could free up thousands of B200s.
Jane Street open-sourced all the relevant code.
If you want to check it out, I've linked the GitHub repo and the talk in the description below.
And if you find this stuff exciting, Jane Street is hiring researchers and engineers.
Go to janestreet.com/dwarkesh to learn more.
Can I ask a very clumsily phrased question?
So there are these deep principles, and we've discovered a couple of them.
One is this idea that, hey, if there's a symmetry across a dimension, it corresponds to a conserved quantity.
It's a very deep idea.
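This is Noether's theorem. For reference, a compact way to state the simplest mechanical version of the idea being gestured at here:

```latex
\text{If } L\big(q + \epsilon K(q),\ \dot q + \epsilon \dot K\big) = L(q, \dot q)
\ \text{ for all } \epsilon,
\quad\text{then}\quad
Q \;=\; \frac{\partial L}{\partial \dot q}\, K(q)
\ \text{ satisfies }\ \frac{dQ}{dt} = 0 .
```

Time-translation symmetry gives conservation of energy, spatial translation gives momentum, and rotation gives angular momentum.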
There's another, which you've written a lot about,
written a textbook about, in fact:
there are ways to understand
what kinds of things you can compute,
what kinds of physical systems
you can understand with other physical systems,
what a universal computer looks like, et cetera.
And is your view that if you go down to this level of idea,
of Noether's theorem or the Church-Turing principle,
there's an infinite number of extremely deep such principles? Because what makes them special is that they themselves
encompass so many different possible ways the world could be, but no, the world has to be
compatible with actually a couple of these very deep principles.
I don't know. I mean, all I have here is speculation and sort of instinct. My instinct is we keep
finding very fundamental new things. It was, for me anyway, quite
formative to understand, as I say, you know, I gave the example before, there's
these wonderful ideas of Church and Turing and these other people, ideas about universal
programmable devices. And then you understand later, oh, this also contains within it
the ideas of public-key cryptography. And then you understand later, oh, that also contains
within it the ideas, I mean, people refer to it as cryptocurrency or whatever, but there's
a very deep set of ideas there about the ability to collectively maintain an agreed-upon
ledger, which is built upon this. And there are probably many deep ideas there;
it's actually taken many years to figure out the right
canonical form of those. And so just this fact that you keep finding what seem like deep
new fundamental primitives, for me, has been a very important intuition bump.
And it's across, I mean, I've given that particular example, but I think you see that same
pattern in a lot of different areas.
What is your interpretation, then, of this empirical phenomenon where, whatever input you consider into the scientific or technological process,
and economists have studied this in a hundred different ways,
it just seems to require, at actually a very consistent rate, X percent more researchers per year?
So there's a famous paper from a couple years ago by Nicholas Bloom and others where they say,
how many people are working in the semiconductor industry and how does it increase over time through the history of Moore's law?
And I think they find Moore's Law means computing increases 40% a year,
or transistor density increases 40% or thereabouts.
To keep that going, the number of scientists
in the industry has increased 9% a year,
and they go through industry after industry
with this observation.
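A quick way to see what that combination implies, using the numbers as quoted here (the paper's own estimates differ somewhat): if the delivered growth rate stays constant while the headcount needed to sustain it compounds, productivity per researcher must fall by exactly the factor the headcount rose.

```python
# Back-of-envelope "ideas are getting harder to find" arithmetic,
# following the figures quoted in the conversation (illustrative only).

years = 20
researcher_growth = 0.09   # researcher headcount grows ~9% per year

headcount = (1 + researcher_growth) ** years
# With a constant output growth rate, per-researcher productivity
# shrinks by the same factor the headcount grew.
productivity = 1 / headcount

print(f"after {years} years: {headcount:.1f}x researchers, "
      f"each ~{productivity:.0%} as productive as at the start")
```

Over 20 years this compounds to roughly 5.6x the researchers, each under a fifth as productive, which is the shape of the decline Bloom et al. report.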
And so is your view that there are these deep ideas,
but they keep getting harder to find?
Or is it, no, there's another way to think
about what's happening with these empirical observations?
I mean, so first of all, their examples are narrow, right?
They pick a particular thing and then they look at some particular metric.
You know, GPUs don't show up anywhere in that.
Right? Like in the sense of, oh, yeah, all of a sudden you get this ability to parallelize.
And that's really interesting.
So there's sort of a lot of external consequences that are just elided, because, basically,
you know, they have these simple quantitative measures.
They look at it in agricultural productivity.
They look at it in a whole lot of different ways.
But you do have to focus narrowly.
And I suppose I'm certainly interested, as I say, in this fact that just new types of progress keep becoming possible.
But there is still, I think even there, there does seem to be some phenomenon of diminishing returns.
Is that intrinsic?
Is that something about the structure of the world?
What is it? Well, one thing which hasn't changed that much is, you know, sort of the individual
minds which are doing this kind of work. And maybe those should be
being improved as well, or there should be some sort of feedback process going on there.
And maybe that changes the nature of things. I suppose I look at
scientific progress up until, let's say, 1700, something like that. And it was very, very
slow and also very irregular. You had the Ionians back sort of five centuries before Christ
doing these quite remarkable things. And so much knowledge would get lost and then it would
be rediscovered and then it would be lost again. And you'd have to say that progress was very
slow. And there it's partially just bound up with the fact that there were some very good ideas
that we just didn't have. Even once you've had the ideas, then you need to build institutions
around them, you actually need to solve a whole lot of different problems about training,
about allocation of capital, about all these kinds of things, even just about basic sort of security
for researchers, so they're not worried about the Inquisition or things like that. So there's all
these kind of complicated problems. You solve all those complicated problems, and then all of a sudden,
boom, there's a massive sort of burst of scientific progress. If you're not changing it, if there's
some kind of stagnation there, if you're not changing those external sort of circumstances, yes,
like you may start to get sort of diminishing returns again.
But that doesn't mean there's anything intrinsic about the situation.
Maybe something just external needs to change again.
Obviously a lot of people think AI is potentially going to be a driver.
I mean, it certainly will at some level.
In fact, you can think of a lot of modern scientific instrumentation
as really, at some level, kind of robots. You know, what is the
James Webb Space Telescope?
Well, you know, it's unconventional maybe to describe it as a robot,
but it's not completely unreasonable either.
You know, it is an example of a highly automated,
very sophisticated system with electronically mediated sensors and actuators,
where machine learning in fact is being used to process the data.
So in that sense, we're already starting to sort of see that transition.
We've been seeing it for decades.
I have this smoke-a-joint-and-take-a-puff kind of thought, which...
I think we've had a few.
Yeah, yeah.
Well, I think we're going to do that part of the conversation, and you can help me get my foot out of my mouth and figure out a more concrete way to think about it.
So, to your point:
there's the Scientific Revolution, the Enlightenment, and now there's AI, and each might be a different pace or a different way in which science happens.
If you think about the pace at which such transitions have been happening, you can draw,
over the long span of human history, a hyperbolic curve where the rate of growth is increasing.
Go back much further, to however long humanity has been around,
let's say millions of years. Then 100,000 years ago, the Stone Age; 10,000
years ago, the agricultural revolution; 300 years ago, the Industrial Revolution, each marked
by an increase in the rate of exponential growth.
And then people think it's gonna happen again
with AI, but that would happen potentially even faster.
It would not have occurred to somebody
at the beginning of the Industrial Revolution
that the next demarcation in this trend
will be artificial intelligence.
And so if things are getting faster
and it's hard to anticipate what the next transition will be,
I guess we just think of this singularity
between now and AI, and that's really what distinguishes
the past from the future.
But just applying the same heuristic
that maybe people in the past have had,
maybe the intelligence age is also quite short.
And the next thing after that is,
we don't even have the ontology to describe what it is,
but the future will not think of the past
as, like, there was pre-AI and post-AI.
No, that seems, I mean, obviously we can't prove this,
but it certainly seems quite plausible.
I mean, part of the issue, of course, is just that the substrate we have available to conceive
of it seems all wrong.
You can't speculate with a bunch of chimpanzees about what it would be like to have language.
Just to sort of pick a major transition in the past, the transition itself is the thing.
And it seems likely.
If we're talking about take-a-puff kinds of thoughts,
you know, I'm certainly amused by the idea
that there's going to be some transition
involving artificial general intelligence
using classical computers.
But actually there'll be an interesting transition
with quantum computers as well.
They're probably capable of a strictly larger class
of potentially interesting computations.
So maybe actually the character of sort
of a QGI or whatever it should be called is actually qualitatively different.
So maybe there's sort of a brief period between those two things.
Interesting.
I mean, as I say, you know, this is just speculation, but it's certainly amusing.
Is there a reason to think that?
Because from what I understand, for decades, people like you have put pretty
tight bounds on the kinds of things quantum computers are going to do.
And so it'll speed up search somewhat.
It will do.
And the kinds of things it extremely speeds up, like Shor's algorithm, it seems like,
again, maybe this is to your point that we can't predict in advance what's down the tech tree,
but at least from here it seems like you break encryption, but what else are you using it for?
Shor's algorithm.
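For context on what Shor's algorithm actually buys: the quantum speedup is entirely in the order-finding step; the wrap-around that turns a period into factors is classical number theory. A minimal sketch, with the order found by brute force (the part a quantum computer does exponentially faster via phase estimation):

```python
from math import gcd

def order(a, N):
    """Smallest r > 0 with a**r ≡ 1 (mod N): the step a quantum
    computer accelerates; here it's brute-forced classically."""
    r, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

def shor_classical(N, a):
    """Classical scaffolding of Shor's factoring algorithm."""
    r = order(a, N)
    if r % 2:            # need an even order; otherwise retry with new a
        return None
    y = pow(a, r // 2, N)
    if y == N - 1:       # trivial square root; retry with new a
        return None
    return gcd(y - 1, N), gcd(y + 1, N)

print(shor_classical(15, 7))  # → (3, 5); a = 7 has order 4 mod 15
```

Breaking RSA this way needs the order-finding step to be fast for thousand-bit moduli, which is exactly where the quantum part comes in.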
Yeah, I mean, we've only been thinking about it for 30 years or whatever, 40 or so years, not for very long.
And we sort of haven't, in some sense, thought that hard about it as a civilization.
So, yeah, does it turn out that it's very narrow? Maybe. Does it turn out that it's very broad? That's also, you know, a really
radical expansion that seems distinctly possible. Like, keep in mind as well, we've been doing it
without the benefit of having the devices. Right. Right. Like, that's a pretty big bottleneck to have.
If you're thinking about computer science in the 1700s, and you're like, okay, AND and OR,
yeah, yeah. What are you going to do? You can't anticipate Bitcoin. You can't
anticipate deep learning. No, no. I mean, maybe you could if you were, you know, sufficiently bright, but it is a
pretty hard situation, right. What is your inside view, having been in and contributed to quantum
information and quantum computing back in the 90s and 2000s? What is your telling of the history:
what was the bottleneck, what was the key transition that made it a real field? And how do you
rank sort of the contributions, from Feynman to Deutsch to everybody else that came along? Yeah. So,
So I mean, let's just focus on sort of the question of what, you know, what actually
changed.
So why was quantum computing not a thing in the 1950s, right?
Like it could have been.
Yeah.
Yeah.
You know, somebody like, I don't know, John von Neumann, good example, absolutely pioneering
computation also wrote a very important book about quantum mechanics and was deeply interested
in quantum mechanics.
Like he could have invented quantum computing at that time.
And I think there were quite a number of people who potentially could have.
So why do we have these papers by people like Feynman and Deutsch in the 80s?
And those are, I think, fairly regarded as the foundation of the field.
There are some partial anticipations a little bit earlier, but they were nowhere near as comprehensive
and nowhere near as deep.
And well, you should ask David.
You can't ask, you can't ask Feynman, unfortunately, but he'll know much better than I do.
A couple of things that I think are interesting.
One is that, of course, computation became far more salient,
sort of late 70s, early 80s.
It just became a thing which many more people were interested in,
partially for very banal reasons.
You could go and buy a PC.
You could buy an Apple 2.
You could buy a Commodore 64.
You could buy all these kinds of things.
It became apparent to people that these were very powerful devices,
very interesting to think about.
At the same time, in the quantum case,
that was also the time of the Paul trap and the ability to trap single ions and so on.
And up to that point, we hadn't really had the ability to manipulate single quantum states.
So you kind of got these two separate things that, just for historically contingent reasons,
had both sort of matured around, let's say, 1980 or so.
And somebody like von Neumann could have had the idea earlier.
But there is, I think, quite an interesting story about Richard Feynman.
He went and got one of the first PCs, around 1980 or 1981.
And he was apparently just so excited with this device
that he actually tripped and hurt himself quite badly carrying his brand-new computing device.
You know, that's a very historically contingent sort of a coincidence. But having somebody
who's very, very talented in his understanding of quantum mechanics, also just very excited about
these new machines,
it's not so surprising perhaps that he was thinking along those lines.
What similar story could you have told 10 years earlier?
The conditions just didn't exist for it.
So I think that's, I mean, it's quite a banal story.
One of the things we were going to discuss was this idea you had about the market for follow-ups.
And I think this is actually the perfect story to discuss it with, because you wrote the textbook of the field, right?
Mike and Ike is the definitive textbook on quantum information.
And so you presumably came in after Deutsch,
but somehow in the 90s you identified it
as the thing worth following up on and building on.
And instead of talking about it more abstractly,
I'd love to just hear the first-hand story
of how you knew that this was the thing,
of all the things that were happening in physics and computing, et cetera,
that you wanted to think about.
Sure, sure.
So, yeah, Richard Feynman writes this great paper in 1982,
David Deutsch writes an absolutely fantastic paper in 1985,
sort of sketching out a lot of the fundamental ideas of quantum computing.
So I'm 11 in 1985.
I'm not thinking about this.
I'm playing soccer and doing whatever.
But in 1992, I took a class on quantum mechanics,
a really terrific one given by Gerard Milburn.
And I just went and asked Gerard one day,
after like the fifth lecture or something.
I said, do you have anything, you know, sort of papers or whatever that you could give me?
And he said, come by my office in a couple of days' time.
And I did.
And he presented me with a giant stack of papers, which included the Deutsch paper,
and included the Feynman paper, and included a whole bunch of other sort of very fundamental papers
about quantum computing and quantum information.
At a time when essentially nobody in the world was working on it, he was.
He'd actually, I think, written the very first paper that proposed a practical
approach to quantum computing.
It wasn't very practical, but it was actually in a real system.
And so in some sense, you know, I was benefiting from the taste of this other person.
But as soon as I read the papers, or took a look at them, like, these are exciting papers.
You know, they're asking very fundamental questions, and you're sort of like, oh, I can
make progress here. These are things that one could potentially work on. Deutsch has this
sort of conjecture, I don't know what the right term for it is,
a thesis or what you would call it, that a universal quantum Turing machine should be
capable of efficiently simulating any system, any physical system at all. This is a very
provocative idea. I think in that paper he more or less claims that he's proved it. I'm not
sure that necessarily everybody would agree with that. There are questions about whether or not
you can simulate quantum field theory efficiently. And that kind of question is, I think, very
interesting and very exciting. It's obviously a fundamental question about the universe.
And there are some wonderful ideas in there about quantum algorithms and where they
come from and what they mean, and how they relate to the meaning of the wave function, questions
like this, which are still not agreed upon amongst physicists.
So, yeah, there's just some sense of, oh, I am in contact with something which is, A, deeply important, and B, something we as a civilization don't have yet.
And so of course you start to focus your attention a little bit there.
I'm not sure I got the answer to the question that...
Maybe I misunderstood the question.
Let me think about how to raise it.
Maybe I'll explain the motivation first.
So in a previous conversation we were discussing,
how could you have known, in the 1940s, about Shannon's theorems?
Yeah.
And Shannon's way of thinking about a communication channel
is a deep idea that goes beyond the problems with pulse code modulation
that Bell Labs was trying to solve at the time,
and it applies to everything from quantum mechanics to genetics to computer science, obviously.
And there's an idea you stated that we didn't get a chance to talk about yet.
There's this idea, well, Shannon publishes a paper. There's all these other papers, but people gravitate to and build upon Shannon's work. How do they realize that that's the thing to do, and how does that process happen? And so I guess you gave your local answer: you read these papers
and you immediately realized, okay, there's work to be done here, there's low-hanging fruit, there's some deep provocative idea that I need to better understand, and I could, you know, tractably make progress on it. Yeah, I mean, so, you know, to some extent,
you're sort of saying, okay, I wanted to get into this game of contributing to humanity's sort of, you know, understanding of the universe, and you are applying sort of this low-hanging fruit algorithm. You're like, relative to my particular set of interests and abilities, where should I pick up my shovel and start digging? And there it was like, oh, this looks like quite a good place to start digging. And different people, of course, chose very differently. It was a very unusual choice
at the time. It was 1992. Very few people were thinking about that. Yeah. Fast-forwarding a bit.
So you've been, I don't know how you think about your work on the open science movement now.
But did it work? Like, what would success have looked like? What is it that that movement is trying to accomplish? Yeah, I mean, the set of ideas about open science.
I mean, it's interesting. You didn't stop and define open science.
there, which I think 20 years ago you would have had to do. People recognize the phrase.
People have some set of associations with it. Most often they have a relatively simple set of
associations. It means maybe something about making scientific papers open access. Very often,
they have some set of notions about maybe it means also making code openly available.
Maybe it means making data openly available. But already, those are, I think, very large successes of the open science movement: making those salient issues, issues on which people have opinions. And then there are relatively common arguments, an argument like, so this is sort of the meme version, you know, publicly funded science should be open science.
That's a distillation of a set of ideas, which you might be able to contest, but if you can get
people actually sort of thinking about it and engaged with that kind of argument,
yeah, that's a very fundamental kind of an issue to be considering in the whole political
economy of science.
If you go back, say, three centuries, there was a very similar kind of an argument prosecuted,
which is the question, do we publicly disclose our scientific results or not?
So if you look at people like Galileo and Kepler and so on, the extent to which they publicly
disclosed, like, it was done in a very odd kind of a way. Sometimes they did bizarre things where, you know, famously they published some of their results as anagrams. So basically, you know, they'd find some discovery. They would write down the result in sort of a sentence, like, you know, the discovery of the... I'm trying to think of an example. The moons of Mars, I think, was one such example.
I'm getting it wrong.
Was it Hooke's law?
Anyway, it doesn't matter.
The point was they'd write it down, but then they'd scramble it, publish that,
and then if somebody else later made the same discovery,
they would unscramble the anagram and say, oh, you know, I actually did it first.
This is not an ideal way.
It's not an ideal foundation for a discovery system.
And then it took, I mean, a very long time, sort of over a century,
I think, to obtain more or less the modern ideals in which what you do is you disclose
the knowledge in the form of a paper. There is then an expectation of attribution, and so there's
a kind of reputation economy which gets built, and so basically, oh, such and such did this
work, so they deserved the credit for that, and that's then the basis for their careers. So this is
sort of the underlying political economy of science, and that made a lot of sense when what you've got
is a printing press and the ability to do scientific journals. Then you transition to this modern
situation where in fact you can start to share a lot more, you can start to share your code,
you can start to share your data, you can start to share in-progress ideas.
But there's no direct credit associated to those.
It's not at all obvious, sort of, you know, how much reputation should be associated to
them.
That's all constructed socially.
And so making it a live issue is, I think, a very important thing to have done.
And that, I view, anyway, as one of the main positive outcomes of work on open science.
I'll give you a really practical sort of example to illustrate the problem.
For a long time in physics, there was a preprint culture in which people would upload preprints
to the preprint archive.
And in biology, this didn't happen.
There was no preprint culture.
That's changing now.
But for a long time, this was the case.
And I used to sort of amuse myself by asking physicists and biologists why this was the case.
And what I would hear sometimes from biologists was they would say, well, biology is so much more competitive than physics that we need to protect our priority.
And so we can't possibly upload to the archive.
We have to just publish in journals.
And then we'd sometimes hear from physicists, physics is so much more competitive than biology
that we need to establish our priority by uploading as rapidly as possible to the preprint archive.
We can't possibly wait to do it with the journals.
And I think this emphasises the extent to which this kind of attribution economy is just something we construct.
It's just something which we do by sort of agreement.
And so any attempt to sort of change that economy results in a different system by which we construct knowledge.
And so there is sort of this very fundamental set of problems.
around the political economy of science.
You know, sort of we've got this collective project
and how we mediate it depends upon the economy
we have around ideas.
One of the sort of things you've emphasized
as a part of this project of open science
is collective science or groups of people
making progress on a problem where no individual
understands all the logical and explanatory levels
necessary to make a leap or a connection. Outside of mathematics, what is the best example
of such a discovery? I mean, I'm not sure I have a well-ordering of them to give you a best,
but I mean, yeah, an example that I think is very interesting is the LHC, where it's just this
immensely complicated object. I actually, years ago, I snuck into an accelerator physics
conference, I didn't know anything at all about accelerator physics, but I was just kind of curious to see
what they were talking about. And this particular group of people were experts on numerical methods,
in particular on inverse methods. And so it basically turns out, you know, inside these accelerators,
you have these cascades, so a particle, you know, will be massively accelerated, maybe it'll be
collided, and then you'll get a shower of particles which decays and decays and decays. And there's just this
incredible sort of consequential shower, which is ultimately what you see at the detector,
and then you have to retroactively figure out what produced it. And so there's these very,
very complicated sort of inverse problems that need to be solved. You've got this final data,
but you need to figure out what produced it, and that's how you look for sort of signatures of
these. And what many of these people were was incredibly deep experts on simulation methods
for sort of following particle tracks.
And like this was really deep and difficult stuff.
And I'm like, wow, you could spend a lifetime just learning
sort of how to do this and how to solve some of these inverse problems.
And you would know nothing about, or you would know very little about quantum field theory.
You would know very little about detector physics.
You would know very little about vacuum physics, all these other things that are absolutely,
or very little about data processing, very little about all these things that are absolutely essential.
to understanding, say, the Higgs boson.
And I don't think it's possible for one person to understand everything in depth.
Lots of people understand broadly a lot of these ideas,
but they don't understand sort of everything in the depth that is actually utilized.
That's why there's these papers with well over 1,000 authors.
And those people can, you know, they can talk to one another at a high level,
but they don't understand each other's specialties
in that much depth.
I mean, things like, as I say,
detector physics, vacuum physics,
these kinds of inverse-problem solving.
Like, this stuff is incredibly different from each other.
And, you know, to understand it in real detail
is serious work.
How do you think about prolificness versus depth?
Where, I don't know, maybe Darwin's an example of somebody
who's like just gestating on something for many decades.
There's other examples where Einstein, during the year he comes up with special relativity,
is just doing a bunch of different things.
Pais talks about how they were all relevant to the eventual buildup.
Yeah, I mean, it's something I stress about a lot.
Sometimes I feel like I'm too slow.
Actually, it's funny that, I mean, the Darwin example is really interesting.
Like, you know, prolific at what?
Like, I mean, God knows how many letters he wrote.
It must have been an enormous number.
So he was certainly very active.
There's also sort of two types of work that tend to be involved in any kind of creative project.
There's routine stuff.
And there, you just want to avoid procrastination.
You just want to, like, you know, how do I get good at this?
Or how do I outsource it and how do I do it as rapidly as possible?
And just avoid, you know, like getting into a situation where you're prolonging it.
And then there's high variance stuff where you actually need to
be willing to, you know, take a lot of time.
You need to be willing to go to the different places and talk to the different people
where in any given instance, most of it's just not, it's not going to be an input.
And somehow sort of balancing those two things.
I think a lot of people are very good at doing one or the other, but it's hard to, you know,
it's almost like a personality trait, sort of, you know, which one you prefer.
And people tend to end up doing a lot of one and not enough of the other.
So I certainly try and balance those two things.
I mean, it's such an interesting example.
I mean, 1905 is just this extraordinary year.
Like you can delete special relativity entirely and it's an extraordinary year.
You can delete special relativity and you can delete the photoelectric effect, for which he won the Nobel Prize, and it's still an extraordinary year, like plausibly a multi-Nobel-Prize-winning year.
So what was he doing?
Yeah, I mean, maybe the answer is just he's smarter than the rest of us.
And there's a lot of luck as well.
But certainly for myself anyway, like trying to identify those things that are routine,
that I should get good at, and then just try and do as quickly as possible.
I think that's yielded a certain amount of returns, but also being willing to bet a little bit more
on myself on sort of the variance side has also been very, very, very helpful. That's really
hard. Intrinsically, you're putting yourself in situations where you don't know what the outcome
is going to be. And so if you're very driven to be productive and whatever, and actually
mostly it's not working over there, you're like, let's reduce this. Like, it doesn't feel right.
When I worked in San Francisco, actually, a practice I used to have each day was, instead of taking the 15-minute walk to work, I would take the more beautiful 30-minute walk to work, partially just because it was beautiful, but partially also as a reminder that there are real benefits to not being efficient. But it's not an answer to your question. I mean, really, I think all I'm
saying is I struggle a lot with the question. I mean, there's, Dean Keith Simonton, I forget his exact name. Yeah, yeah, I know who you mean. He has this famous equal-odds rule, where he says the probability that any given thing a person releases, any paper, book, whatever, will be extremely important is not that different across their lifetime.
And what really determines in what era they are the most productive is how much they're publishing.
Any given thing has equal odds of being extremely important.
Maybe just think of some of the most successful creatives or scientists.
They're just doing a lot.
like Shakespeare's just publishing a lot.
Yeah, yeah, yeah.
And of course, then there's counterexamples, you know, Gödel publishing almost nothing.
Yeah.
But, you know, broadly speaking, I think you need a very good reason to not do that.
It's funny.
I mean, I've met a lot of people over the years who, when you talk to them, are clearly brilliant,
and they're just obsessed that they are going to work on the great project that, you know, makes them famous,
and they never do anything.
And that seems connected, like it's a type of aversiveness.
I think very often they just don't want public judgment.
Something that I would love to see,
there's an awful lot of biographies and memoirs
and histories of people who achieve a lot.
I wish there was like a very large number of biographies
of people who are fantastically talented.
That's great.
Who, you know, just missed.
Like, you know, absolutely.
I've known people who won gold medals at IMOs and things like that, who then tried to become
mathematicians and failed.
Like what happened?
What was the reason?
I suspect in many cases that's actually more informative than anything else.
You have this essay that I was reading before this interview about how you think about what the work you're doing is. "Writer" doesn't seem like the right label; as you say, was Charles Darwin a writer, right? What exactly is that label? I'm a podcaster, right? And in a way, obviously, our work is very different.
But I also think a lot about what is this work and how do I get better at it? And in particular,
how I can make sure there's some compounding between the different people I talk to on the podcast,
where I worry that instead of this kind of compounding, there's actually, I build up some understanding
that's somewhat superficial about a topic,
and then it depreciates,
and I move on to the next topic,
and that sort of depreciates too.
And so I think there's this question.
There's a lot of podcasters in the world
who will interview way more experts
than I ever have,
and I don't think they're much the wiser
or more knowledgeable as a result.
So it's clearly possible to mess this up.
And I wonder if you have thoughts or takes or advice
on how one actually learns
in a deeper way from this kind of work.
Yeah.
I mean, sort of an incredibly complicated and rich question.
I mean, it just seems like sort of the question is, like, you know,
how do you make it a higher growth context?
How do you make it a more demanding context?
And sort of, you know, you can do that in like relatively small ways,
but that might yield compounding returns,
or you can do something that is maybe more radical.
Maybe it means actually, you know, starting sort of a parallel project in which you do something
that is actually quite a bit different.
There is something I think really interesting about how being very demanding can simply
change your response to something.
Something that I would sometimes do with students and sometimes with myself, it was really
aimed more at myself, was they would say some week, oh, you know, I'm going to try and do,
you know, this work over the coming week and then the next week would come by and they,
you know, they hadn't solved the problem or whatever.
And you sort of ask, you know, if a million dollars had been at stake, would you have put the same effort in? And the answer is no, invariably. They've tried, but they haven't really tried. I think that's a very familiar feeling for all of us. You often could do a lot more if you had just the right sort of demanding taskmaster standing by you and saying, look, you're barely operating here. And so I do sort of wonder a little bit about, like, you know,
what's the demanding taskmaster? What can they ask you that is going to make your preparation
way more intense? The most helpful thing, honestly, is for some subjects, it is very clear how I prep.
Like, I'm doing an upcoming episode on chip design with the founder of a chip design company, and he wrote a textbook on chip design. Yesterday I went over to his office and we brainstormed five sort of roofline analyses I can do.
And if I understand that, I have some good understanding.
The problem is, with almost every other field, there's not this. I don't know, when I interviewed Ilya three, four years ago, it's like, implement the transformer.
And if you implement it, like, you have some nugget of understanding, you have clamped down.
And with other fields, it's just like, I vaguely understand this.
It's not clamped.
I vaguely understand this, I vaguely know a little bit about this, but there's no forcing function, no exercise such that if you do it, you will understand.
Yeah, so, I mean, really what you're sort of saying is you can do a good job at podcasting
without actually attaining this kind of understanding, and that's the problem from your point of view.
You want to sort of change your job description so that you are internalizing these chunks
and just getting this kind of integration each time.
And it seems to me, you know, what that means is you actually want to change the structure of the work output at some level.
I mean, lots of people think, there's this terrible idea people have that they should be in flow all of the time. Because as far as I can tell, high performers just don't believe this at all.
They're in flow some of the time.
Like you certainly see this with athletes.
When they're actually out there,
playing basketball or tennis or whatever,
ideally they are in flow much of the time.
But when they're training, they're not.
They're stuck a lot of the time or they're doing things badly.
And I suppose I wonder what that looks like for you.
That I would be extremely satisfied with.
The problem is, I just don't know what the equivalent of "do the 64 laps" is for almost anything.
And so is this sort of a thing you can change by choosing guests where there is a legible curriculum?
And so maybe it's a mistake to not have done that.
Or also, like, there's no real way to prep for Terence Tao or something, and there's no curriculum that's like a plausible one. I think, well, there's many failure modes, but one dynamic I'm worried about, a long-term dynamic, is that you can have a good podcast and it's a local maximum, but for no particular guest or topic are you going deep enough. I think my model of learning is, if you don't really understand the deeper mechanism, you're just mapping inputs and outputs of a black box. Yeah, yeah. And that just fades incredibly fast, or is not worth it in the first place, and you kind of just move on and it's over.
And you kind of need to build the intermediate connection.
And it's unclear.
I think actually AI in a weird way is really easy for that reason because there is a clear
thing you can do, just implement it, right?
And then you understand it.
Whereas, if I applied that criterion elsewhere, what, do I just not do
history episodes?
Ada Palmer, like what?
Ada Palmer, you know, wonderful to talk to, incredibly interesting, but for you personally,
like, what changed?
Right.
Yeah, there's some things I learned.
I think I could have done more if I had maybe allocated more time, especially after the interview, to, like, let's write up 2,000 words on everything I learned and how it connects to other things I know. And maybe that's a thing worth doing: spreading out the episodes more and spending more time afterwards consolidating.
But yeah, I think I would pay basically infinite amounts of money.
if there's somebody who is really good at coming up with,
here's the curriculum and here's the practice problems you need to do
and here's the exercise you do after the interview
to clamp what you have learned.
Have you tried doing that with somebody?
It's hard to find someone.
I mean, maybe I haven't tried super hard, but it seems like,
isn't it a bit tough to find somebody who could do that for every single
kind of discipline?
Maybe I should just hire different ones for different topics.
Maybe, or...
There's something about, like, I mean, what problem, you know,
are you solving sort of for each episode?
And I mean, as far as I can tell, like, that's the only way I really understand anything is that, you know, I get interested in something.
At first, I don't even have a problem, but there's just some sense of there's some contribution to make here.
And gradually, you home in and there's a problem.
And then, I mean, funnily enough, I mean, spending time stuck is incredibly important.
And, you know, that used to just be annoying.
Now it seems like, oh, this is actually maybe even the most important part of the whole process.
But the very hard-won-ness of it means that, you know, I internalize it afterwards.
I often find, actually, if I, you know, I've written sometimes 10,000 word essays in, you know,
a couple of days and I've written them in, you know, three months or six months.
I feel like I didn't learn very much from the ones that only took a couple of days.
Whereas, you know, some of the ones that took three months, I'll be, you know,
15 years later, I'll still remember.
Yeah, can you describe, outside of physics, how you learned the ones that took three months?
I mean, by far the most common thing is, you know, there's always some creative artifact.
Sometimes it's a class.
Sometimes it's engagement with a group of people who, you know, there's some collective creative
artifact that you're working on together.
I mean, you might not even be aware of it, but you're acting as an input to their creative ends in some way.
And sometimes it's just, you know, it's an essay or a book or whatever.
You know, it's one of the reasons why I often quite enjoy doing podcasts.
I mean, particularly, I mean, I said yes to come here partially because I know you ask unusually demanding questions.
And so that's an attempt to get this sort of perspective from a different kind of forcing function. So you're trying to pick sort of the most demanding creative
context. Yeah. So for this interview, I went through like three lectures of Susskind's special relativity book. The problem is that there's almost no practice
problems in it. And so I hired a physicist friend who's going to, well, I haven't done it yet.
It's just, for every lecture I want a bunch of practice problems to go with it, and I'm
planning on being appropriately humbled. How do you make it as jugular as possible, right?
Like, the higher you can raise the stakes, the better.
I mean, the interview is, in some sense, high stakes,
but also it doesn't necessarily test deep understanding.
Yeah, but I don't think the interview is that high stakes, right?
You're not writing a book about special relativity,
and you're not trying to write a book that replaces the current,
whatever the existing standard textbook is.
Like, that's a really, really high stake.
A phrase that I sort of find particularly difficult
and it's a funny one.
People will talk about going deep on a subject.
And it turns out different people have different ideas
of what this means.
Some people mean they read a couple of blog posts.
Some people it means they read a book about it.
Some people it means they wrote a book about it.
And I think like sort of what your standard is,
sort of the standard you hold yourself to determines a lot
about your ability to integrate knowledge in this way.
I don't know what your experience has been, but I've found that, in some sense, I'm able to move much faster on some things with the help of AI, but I don't know if I'm learning better.
And I think it's probably because
the hardest thing,
the thing that is most demanding
is so aversive
that you try to take any excuse you can
to get out of it.
And just having back and forth conversation
where you gloss over.
It's entertaining,
but not necessarily anything else.
Yeah, so it's such an easy way to get out of the thing.
Yeah.
In fact, it makes it easier because instead of doing some intermediate thinking,
there's always a next question you can ask a chatbot.
Yeah.
And it's somewhat valuable.
Like, it's not, I mean, that's part of the seductiveness, of course.
Like, it's not actually useless.
But yeah, it can sort of substitute for actually doing the thing that maybe you should be doing.
It's interesting, the extent to which you should be outsourcing that kind of stuff. There's some sort of interesting judgment call there: there is a whole bunch of routine work that you want done, and in fact it's low value for you, so if you can get a chatbot to do it, you may as well. Somebody interviewed the pioneering computer scientist Alan Kay years ago, and he was asked what he thought about, basically, Linux. If I remember his answer correctly, he basically said, look, you know, it doesn't have anything to do with computer science. It's just a great big ball of mud. There's a few interesting ideas in there which are worth understanding, but mostly all you're learning is stuff about Linux. You're not actually learning anything which is transferable.
I thought that was very interesting. There's a certain kind of seductiveness to some things where
you know, it's sort of a Rube Goldberg machine.
You can just sort of learn about all the bits and it feels kind of entertaining.
But if you step back and think about the question, you know, what am I actually doing here?
It might not actually be meeting your objectives.
Maybe you want to become, you know, a sysadmin, and learning Linux is a great use of your time.
Right.
There's no, no harm in that at all.
But if your answer is, if your objective is to understand the fundamentals of computing,
it's much less, much less clear that that's a good use of your time.
That was certainly an answer I've thought a lot about, where, for a certain type of mind, there is a seductiveness in just learning systems and confusing that with understanding.
Okay, I'll keep you updated on how this goes. I owe you a text in a month or so about some relevant learning system.
I'll be really curious if you, I mean, it's also true, right?
Like, tiny incremental improvements in this, I mean, they're just worth so much.
I know, yeah. It's sort of the main input into the podcast, you know. It's great that the bookshelves are fancy and I've got the Blackford or whatever. But really, the thing that makes the podcast better is if I can improve the learning I do. So it's worth every little improvement.
Yeah. All right, thanks for the therapy session.
Thanks, Michael. All right, thanks everyone.
