Lex Fridman Podcast - #153 – Dmitry Korkin: Evolution of Proteins, Viruses, Life, and AI
Episode Date: January 11, 2021

Dmitry Korkin is a professor of bioinformatics and computational biology at WPI. Please support this podcast by checking out our sponsors:
- Brave: https://brave.com/lex
- NetSuite: http://netsuite.co...m/lex to get free product tour
- Magic Spoon: https://magicspoon.com/lex and use code LEX to get $5 off
- Eight Sleep: https://www.eightsleep.com/lex and use code LEX to get special savings

EPISODE LINKS:
Dmitry's Website: http://korkinlab.org/
Dmitry's Twitter: https://twitter.com/dmkorkin

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/LexFridmanPage
- Medium: https://medium.com/@lexfridman

OUTLINE:
Here are the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(07:21) - Proteins and the building blocks of life
(14:23) - Spike protein
(21:11) - Coronavirus biological structure explained
(26:09) - Virus mutations
(32:39) - Evolution of proteins
(42:25) - Self-replicating computer programs
(50:02) - Origin of life
(57:34) - Extraterrestrial life in our solar system
(59:31) - Joshua Lederberg
(1:05:30) - Dendral
(1:08:24) - Why did expert systems fail?
(1:10:35) - AlphaFold 2
(1:32:13) - Will AI revolutionize art and music?
(1:39:12) - Multi-protein folding
(1:43:39) - Will AlphaFold 2 result in a Nobel Prize?
(1:46:10) - Will AI be used to engineer deadly viruses?
(2:01:17) - Book recommendations
(2:11:00) - Family
(2:13:38) - A poem in Russian
Transcript
The following is a conversation with Dmitry Korkin, his second time on the podcast.
He's a professor of bioinformatics and computational biology at WPI, where he specializes in bioinformatics
of complex disease, computational genomics, systems biology, and biomedical data analytics.
He loves biology, he loves computing, plus he is Russian and recites a poem in Russian at the end of the podcast.
What else could you possibly ask for in this world?
Quick mention of our sponsors.
Brave browser; NetSuite, business management software; Magic Spoon, low-carb cereal; and
Eight Sleep, self-cooling mattress.
So the choice is browsing privacy, business success,
healthy diet, or comfortable sleep.
Choose wisely, my friends, and if you wish,
click the sponsor links below to get a discount
and to support this podcast.
As a side note, let me say that to me,
the scientists that did the best,
apolitical, impactful, brilliant work of 2020
are the biologists who study viruses
without an agenda, without much sleep, to be honest, just a pure passion for scientific
discovery and exploration of the mysteries within viruses.
Viruses are both terrifying and beautiful, terrifying because they can threaten the fabric
of human civilization,
both biological and psychological. Beautiful because they give us insights into the nature
of life on earth and perhaps even extraterrestrial life of the not so intelligent variety that
might meet us one day as we explore the habitable planets and moons in our universe.
If you enjoyed this thing, subscribe on YouTube, review it on Apple Podcasts, follow on Spotify,
support on Patreon, or connect with me on Twitter at Lex Fridman.
As usual, I'll do a few minutes of ads now, and no ads in the middle.
I try to make these interesting, but I give you timestamps, so go ahead and skip if you
must.
But please, still check out the sponsors by clicking the links in the description.
It is, in fact, the best way to support this podcast.
This show is sponsored by Brave, a fast, privacy-preserving browser that feels like Google Chrome,
but without ads, or the various kinds of tracking that ads can do.
I love using it more than any other browser, including Chrome.
If you like, you can
import bookmarks and extensions from Chrome just as I did. The Brave browser is
free, available on all platforms, and actively used by over 20 million people.
Speedwise, it just feels more responsive and snappier than other browsers, so I
can tell there's a lot of great engineering behind the scenes. It has a lot of privacy-related
features that Chrome doesn't have. For example, it includes options such as a private window with
Tor, for those seeking advanced privacy and safety. The tech behind Tor in fact is pretty fascinating
and I'm sure I will explore it on a future podcast. Get this awesome browser at brave.com slash Lex
and it might become your favorite browser as well.
That's brave.com slash Lex.
This show is also sponsored by NetSuite.
This one's for the business owners.
Running a business is hard.
If you own a business, don't let QuickBooks
and spreadsheets make it even harder than it needs to be.
You should consider upgrading to NetSuite.
It allows you to manage financials, human resources, inventory, e-commerce, and many more business-related
details all in one place.
I dislike the bureaucracy that companies sometimes build up around this.
NetSuite can probably help, but I'm sure bureaucracy can still flourish if you're not careful.
To me at least efficiency and excellence are essential.
NetSuite or not.
Anyway, whether you're doing a million or hundreds of millions of revenue, save time and money
with NetSuite.
24,000 companies use it.
Let NetSuite show you how they'll benefit your business with a free product tour at NetSuite.com slash
Lex. My reading engine is not functioning properly today. It requires more coffee. If you own a
business, try them out. Schedule your free product tour, in all caps, RIGHT NOW, at NetSuite.com slash
Lex. Right now, because they want to create an artificial sense of urgency. NetSuite.com slash Lex, go there.
This episode is also sponsored by Magic Spoon, low-carb, keto-friendly cereal. This is one of the more fun and colorful sponsors
this podcast has. I've been on a mix of Keto, Carnivore diet for a long time now. That
means very few carbs. I do unfortunately
binge eat cherries or apples sometimes and regret it later, but love it in the moment.
Just like I used to regret eating cereal, because most have crazy amounts of sugar, which
is terrible for you. But magic spoon is a totally new thing. Zero sugar, 11 grams of protein,
and only 3 net grams of carbs.
I personally like to celebrate little accomplishments in productivity with a snack of magic spoon.
It feels like a cheat meal, but it is not.
It tastes delicious, it has many flavors including cocoa, fruity, frosted, and blueberry.
I think they've been adding some new ones.
To me they're all delicious, but if you know what's good for you,
you'll go with Cocoa, my favorite flavor,
and the flavor of champions.
I'm just now realizing that Cocoa reminds me of Joey Coco Diaz,
who perhaps might be fun to have on this podcast one day.
Click the magicspoon.com slash Lex, link in the description,
and use code Lex at checkout for free shipping.
Finally, this episode is also sponsored by Eight Sleep and its Pod Pro mattress.
A product I enjoy every single day, sometimes in the afternoon as well.
It controls temperature with an app, is packed with sensors, and can cool down to as low
as 55 degrees on each side of the bed separately.
It's been a game changer for me.
I just enjoy sleep and power naps way more.
I feel like I fall asleep faster
and get more restful sleep.
The combination of a cool bed and a warm blanket is amazing.
Now, if you love your current mattress,
but are still looking for temperature control,
Eight Sleep's new Pod Pro Cover
adds dynamic cooling and heating capabilities
onto your current
mattress. This thing can cool down to as low as 55 degrees or heat up to 110 degrees.
The latter feature I did not use but I know some of you probably will and you
could do this kind of cooling and heating on each side of the bed separately.
Also can track a bunch of metrics like heart rate variability, but honestly, cooling alone is worth the money.
Go to eightsleep.com slash Lex, and when you buy stuff there, you'll get special savings
as listeners of this podcast.
Once again, that's eightsleep.com slash Lex.
And now, here's my conversation with Dmitry Korkin.

It's often said that proteins and the amino acid residues that make them up are the building
blocks of life.
Do you think of proteins in this way as the basic building blocks of life?
Yes and no.
So the protein indeed is the basic unit, the biological unit, that carries out an important function of the cell. However, through studying the proteins
and comparing the proteins across different species, across these different kingdoms,
you realize that proteins are actually much more complicated. So they have so-called modular complexity.
And so what I mean by that is an average protein
consists of several structural units.
So we call them protein domains.
And so you can imagine a protein as a string of beads, where each bead is a protein domain.
And you know, in the past 20 years, scientists have been studying the nature of the protein
domains, because we realized that it's the unit, because if you look at the functions, right?
So many proteins have more than one function.
And those protein functions are often carried out
by those protein domains.
So we also see that in the evolution,
those protein domains get shuffled.
So they act actually as a unit,
also from the structural perspective.
So some people think of a protein as a sort of a globular molecule,
but as a matter of fact, the globular part of this protein is a protein domain.
So we often have this, you know,
again, the collection of these protein domains
aligned on a string as beads.
And the protein domains are made up of amino acid residues.
So this is the basic building block.
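The beads-on-a-string picture can be sketched as a toy data structure. Everything here, the class names, the domain families, and the residue spans, is invented purely for illustration; real domain annotations come from curated databases, not from code like this:

```python
# Toy sketch of "beads on a string": a protein as a residue sequence
# plus ordered domain annotations, with flexible linkers in between.
# All names and boundaries below are made up for illustration.
from dataclasses import dataclass

@dataclass
class Domain:
    family: str   # hypothetical protein domain family name
    start: int    # first residue index (inclusive)
    end: int      # one past the last residue index (exclusive)

@dataclass
class Protein:
    sequence: str   # amino acid residues, one letter each
    domains: list   # the "beads", ordered along the sequence

    def linkers(self):
        """Flexible regions between consecutive domains (the 'string')."""
        gaps = []
        for a, b in zip(self.domains, self.domains[1:]):
            if b.start > a.end:
                gaps.append((a.end, b.start))
        return gaps

# A made-up two-domain protein: most proteins are multi-domain,
# and each domain can carry its own function.
p = Protein(
    sequence="M" + "A" * 120 + "G" * 15 + "L" * 100,
    domains=[Domain("domainA", 1, 121), Domain("domainB", 136, 236)],
)
print(p.linkers())  # [(121, 136)]  <- the linker between the two beads
```

The point of the sketch is only that domains are spans with identity (a family) while linkers are just the leftover flexible stretches between them.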
So you're saying the protein domain is the basic building block of the
function that we think about proteins doing. So of course, you can always talk about different
building blocks. It's turtles all the way down. But there's a point
in the hierarchy where it's the cleanest elemental block, based on which you can put them together in
different kinds of ways to form complex function. And you're saying protein domains. Why is that not
talked about as often in popular culture? Well, you know, there are several perspectives on this.
And one of course is a historical perspective, right? So, historically, scientists have been able to structurally resolve, to obtain these 3D coordinates of, a protein for
smaller proteins. And smaller proteins tend to be single-domain proteins. So we have a protein equal to a protein domain.
And so because of that, the initial suspicion was that the proteins
are, they have globular shapes.
And the more smaller proteins you obtained structurally,
the more you became convinced that that's the case.
And only later when we started having
alternative approaches,
so the traditional ones are X-ray
crystallography and NMR spectroscopy.
So this is sort of the two main techniques that give us the 3D coordinates.
But nowadays there is a huge breakthrough in cryo-electron microscopy.
So the more advanced methods that allow us to get into the 3D shapes of much larger molecules, molecular complexes. To give you one of the common examples
from this year: the first experimental structure of the S protein, the spike protein.
It was solved very quickly, and the reason for that is that the advancement of this technology
is pretty spectacular.
How many domains is it? More than one domain?
Oh yes.
So it's a very complex structure. And on top of the complexity
of a single protein, right, this structure
is actually a complex, a trimer. So it needs to form a trimer in order to function properly. What's
a complex? So a complex is an agglomeration of multiple proteins. And so we can
have the same protein, you know, made up in multiple copies and forming
something that we call a homo-oligomer. Homo means the same, right? So in this case, the spike protein is an example of a homo-
tetra-- homotrimer, sorry. So it's three copies. Three copies in order to-- Exactly.
We have these three chains, the three molecular chains, coupled together and performing the
function. That's why, when you look at this protein from the top,
you see a perfect triangle.
But other complexes are made up of different proteins.
Some of them are completely different.
Some of them are similar.
The hemoglobin molecule, right?
So it's actually, it's a protein complex. It's made of four
basic subunits. Two of them are identical to each other, and the other two are identical to each other,
but they are also similar to each other, which sort of gives us some ideas about the evolution of
this molecule. And perhaps one of the hypotheses is that in the past it was just a homotetramer,
right?
So four identical copies, and then it became modified, it became mutated over time and
became more specialized.
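The homo- vs. hetero-oligomer distinction here (the spike as a homotrimer, hemoglobin as two pairs of subunits) can be captured in a tiny sketch. The `classify_complex` function and the subunit labels are hypothetical, made up for illustration:

```python
# Toy classifier for oligomeric complexes, following the naming in the
# conversation: "homo" = identical copies of one chain, "hetero" = a mix.
from collections import Counter

def classify_complex(chains):
    """Classify an oligomer from a list of subunit names, one per chain."""
    kinds = Counter(chains)
    size = len(chains)
    prefix = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta"}.get(size, f"{size}-")
    if len(kinds) == 1:
        return f"homo{prefix}mer"    # identical copies, like the spike trimer
    return f"hetero{prefix}mer"      # different subunits, like hemoglobin

print(classify_complex(["S", "S", "S"]))                     # homotrimer
print(classify_complex(["alpha", "alpha", "beta", "beta"]))  # heterotetramer
```

Hemoglobin's two alpha and two beta chains come out as a heterotetramer, consistent with the evolutionary story above: similar-but-diverged subunits suggest an ancestral homotetramer that specialized over time.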
Can we linger on the spike protein for a little bit?
Is there something interesting or like beautiful
you find about it?
I mean, first of all, it's an incredibly challenging protein.
And so we as a part of our research
to understand the structural basis of this virus
to sort of decode, structure decode every single protein in its proteome.
We've been working on this spike protein, and one of the main challenges was that cryo-EM data allows us to reconstruct, or to obtain the 3D coordinates of, roughly two-thirds
of the protein.
The rest of the one-third of this protein is a part that is buried into the membrane
of the virus and of the viral envelope. And it also has a lot of unstable structures around it.
So it's chemically interacting somehow with whatever the heck it's connecting to?
Yes, so people are still trying to understand.
So the nature of the role of this one-third. Because the top part, you know, the primary function is to get attached to the,
you know, ACE2 receptor, the human receptor. There is also beautiful, you know, mechanics of how
this thing happens, right? So because there are three different copies of this chain, there are three different domains.
We're talking about domains.
This is the receptor binding domains, RBDs, that gets untangled and get ready to get
attached to the receptor.
Now, they are not necessarily going in a sync mode,
as a matter of fact.
Meaning synchronized?
So yes, and this is where another level of complexity comes into play,
because right now what we see is,
we typically see just one of the arms going out and getting ready to be attached to the ACE2 receptor.
However, there was a recent mutation that people studied in that spike protein, and very recently
a group from UMass Medical School,
which we happen to collaborate with,
this is the group of Jeremy Luban
and a number of other faculty,
they actually solved the mutated structure of this spike,
and they showed that because of these mutations you have more than one
arm opening up. And so now the frequency of two arms going up increased quite drastically.
How interesting. Does that change the dynamics somehow? It potentially can change the dynamics,
because now you have two possible opportunities to get attached to the ACE2 receptor. It's a very
complex, molecular process, mechanistic process. But the first step of this process is the attachment of this spike protein, of the spike
trimer, to the human ACE2 receptor. So this is a molecule that sits on the surface
of the human cell. And that is essentially what initiates, what triggers, the
whole process of encapsulation. If this was dating, this would be the first date.
So is it possible to have the spike protein just like floating about on its own, or does
it need that interactability with the membrane?
Yeah, so it needs to be attached at least as far as I know, but when you get this thing attached on the surface, there is also a lot of dynamics on where, how it sits on the surface.
So, for example, there was recent work, again, where people used cryo-electron microscopy to get the first glimpse of the overall structure. It's very low-res, but you still get some interesting details
about the surface, about what is happening inside, because we have literally no clue
until recent work about how the capsid is organized.
So capsid?
So capsid is essentially the inner core of the viral particle, where there is the RNA of the virus,
and it's protected by another protein, the N protein,
that essentially acts as a shield.
But now we are learning more and more,
so it's actually not just a shield,
it's potentially used for the stability of the outer shell
of the virus, so it's pretty complicated.
And I mean, understanding all of this
is really useful for trying to figure out
like developing a vaccine or some kind of drug
to attack it with, right?
So I mean, there are many different implications to that.
And first of all, you know, it's important to understand the virus itself.
So in order to understand how it acts, what is the overall mechanistic process of this virus,
the replication of this virus, its proliferation into the cell, right? So that's one aspect.
The other aspect is, you know, designing new treatments.
So one of the possible treatments is, you know,
designing nanoparticles.
And so, so nanoparticles that will resemble the viral shape
that would have the spike integrated.
And essentially would act as a competitor to the real virus
by blocking the ACE2 receptors
and thus preventing the real virus entering the cell.
Now, there is a very interesting direction
in looking at the membrane,
at the envelope portion of the protein and attacking its M protein.
So, to give you a brief overview, there are four structural proteins.
These are the proteins that made up a structure of the virus. So spike as protein that acts as a trimer, so it needs three copies,
E, envelope protein, that acts as a pentamer, so it needs five copies, no act properly.
M is the membrane protein; it forms dimers, and actually it forms a beautiful lattice, and this is something
that we've been studying and we are seeing it in simulations. It actually forms a very
nice grid, you know, threads of different dimers attached next
to each other.
So they copy each other, and then naturally, when you have a bunch of copies of each other,
they form an interesting structure.
Exactly. And you know, if you think about it, right, so this complex, you know,
the viral shape, needs to be organized somehow, self-organized somehow, right?
Because if it was a completely random process, you probably wouldn't
have the envelope shell of the ellipsoid shape. You would have
something pretty random, right? So there is some regularity in how these M dimers get
attached to each other in a very specific, directed way.
Is that understood at all?
It's not understood.
We have been working on it in the past six months, since we met, actually. This is when we started
working on trying to understand the overall structure of the envelope and the key components
that make up this structure.
Is the envelope also part of the virus structure?
So the envelope is essentially is the outer shell
of the viral particle.
The N, the nucleocapsid protein,
is something that is inside, but, get this:
The N is likely to interact with M.
And M and E, like, where's the E?
So, e, those different proteins, they
occur in different copies on the viral particle.
So, e, this pentamer complex, we only have two or three,
maybe, per each particle.
We have 1,000 or so M dimers that essentially make
up the entire outer shell. So most of the outer shell is the M dimer.
The M protein. And when we say particle, that's the individual virus? A single element of the virus.
It's a single virus.
Single virus.
Right.
And we have roughly 50 to 90 spike
trimers, right?
Per virus particle?
Per virus particle.
So what did you say,
50 to 90?
50 to 90, right?
So this is how this thing is organized. And so now typically,
right? So you see these antibodies that target spike proteins, certain parts of the spike
protein. But there could be also some treatments, right? So these are small molecules that bind strategic parts of these
proteins, disrupting their function. So one of the promising directions, one of the
newest directions, is actually targeting the M dimer, targeting the proteins that make up this outer shell,
because if you're able to destroy the outer shell, you're essentially destroying the viral particle itself,
thus preventing it from, you know, functioning at all.
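The structural inventory Dmitry lays out here can be collected into one small table. The oligomeric states and copy numbers below are the rough figures from the conversation itself, not precise measurements, and the dictionary layout is just an illustrative sketch:

```python
# Rough structural inventory of a single SARS-CoV-2 virion, using the
# approximate numbers mentioned in the conversation (not precise values).
structural_proteins = {
    # name: (oligomeric state, approximate oligomer copies per particle)
    "S (spike)":        ("trimer",   "50-90"),       # the "arms" on the surface
    "E (envelope)":     ("pentamer", "2-3"),         # only a couple per particle
    "M (membrane)":     ("dimer",    "~1000"),       # forms the outer lattice
    "N (nucleocapsid)": ("monomer",  "inside, with the RNA"),
}

for name, (state, copies) in structural_proteins.items():
    print(f"{name}: forms a {state}, {copies} per virion")
```

Seen this way, the argument for targeting M is simply one of surface area: the M dimer lattice dominates the shell, so disrupting it attacks the particle's structure rather than one of the comparatively few spikes.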
So that, you think, from a sort of cybersecurity perspective, a virus security perspective,
is the best attack vector?
Or like, a promising attack vector.
I would say yes.
So I mean, it's still tons of research needs to be done.
But yes, I think, you know, so there's more attack surface, I guess.
More attack surface.
But, you know, from our analysis, from other
evolutionary analyses, this protein is evolutionarily more stable compared to the spike protein.
Stable means a more static target? Well, yeah, so it doesn't change, it doesn't evolve,
from the evolutionary perspective, so drastically
as, for example, the spike protein.
There's a bunch of stuff in the news about mutations of the virus in the United Kingdom.
I also saw in South Africa something, maybe that was yesterday.
You just kind of mentioned stability and so on. Which aspects of this are mutable, and which
aspects, if mutated, become more dangerous, and maybe even zooming out what are your thoughts
and knowledge and ideas about the way it's mutated, all the news that we've been hearing.
Are you worried about it from a biological perspective? Are you worried about it from a human perspective? So, I mean, mutations are sort of a general way for these viruses to evolve.
So it's essentially this is the way they evolve.
This is the way they were able to jump from one species to another. We also see some recent jumps. There were
some incidents of this virus jumping from human to dogs. So there is some danger in those
jumps because every time it jumps, it also mutates. So when it jumps to the species and jumps back, so it
acquires some mutations that are driven by the environment of a new host. And it's different
from the human environment. And so we don't know whether the mutations that are acquired in the new species are
neutral with respect to the human host or maybe, you know, maybe damaging.
Yeah, change is always scary.
So are you worried, I mean, it seems like the spread during winter now seems to be exceptionally high, and
especially with the vaccine just around the corner, already being actually deployed, there's
some worry that this puts evolutionary pressure, selective pressure, on the virus
to mutate.
Is that a worry? Well, I mean, there is always this thought, you know,
in the scientist's mind, you know, what will happen, right?
So I know there have been discussions
about sort of the arms race between the, you know,
the ability of the humanity to get vaccinated faster than the virus, essentially becomes resistant
to the vaccine.
I don't worry that much, simply because there is not that much evidence of that.
Of aggressive mutation around the vaccine. Exactly. Obviously, there are mutations around
vaccines. That's the reason we get vaccinated every year against the seasonal flu, because of the mutations.
But I think it's important to study it, no doubts, right?
So I think one of the, to me, and again, I might be biased
because we've been trying to do that as well.
But one of the critical directions
in understanding the virus is to understand its evolution
in order to sort of understand the mechanisms,
the key mechanisms that lead these viruses to jump
from one species to another,
the mechanisms that lead the virus to become resistant to vaccines,
also to treatments. And hopefully that knowledge will enable us to forecast the evolutionary
traces, the future evolutionary traces, of those viruses.
I mean, from a biological perspective, this might be a dumb question, but are there
parts of the virus that, if souped up through mutation, could make it more effective at doing
its job? We're talking about this specific coronavirus.
Like, because we were talking about the different ones, like the membrane, the M protein, the E protein, the N, and the spike.
And there's some 20 or so more in addition to that?
But is that a dumb way to look at it?
Like, which of these, if mutated, could have the greatest impact, potentially damaging impact, on the effectiveness of the virus?
So it's actually a very good question, because the short answer is we don't know yet,
but of course there is capacity of this virus to become more efficient. The reason for that is
if you look at the virus, it's a machine. So it's a machine that does a lot of different functions.
And many of these functions are sort of nearly perfect, but they're not perfect.
And mutations can make those functions more perfect. For example, the attachment to the ACE2 receptor, right, of the spike. So, you know, has this virus reached the efficiency with which the attachment
is carried out?
Or are there some mutations still to be discovered, right, that will make this attachment stronger, or something more efficient, from the point of view of this
virus functioning.
That's the obvious example, but if you look at each of these proteins, it's there for a
reason; it performs a certain function. And it could be that
certain mutations will enhance this function. It could be that some mutations will make this
function much less efficient. So that's also the case.
Let's, since we're talking about the evolutionary history of a virus, let's zoom back out and look
at the evolution of proteins.
I glanced at this 2010 nature paper on the quote, ongoing expansion of the protein universe.
And then, you know, it kind of implies and talks about how proteins started from a common ancestor, which is kind of interesting.
It's interesting to think about like even just like the first organic thing that started
life on earth.
And from that, there's now, you know, what is it,
3.5 billion years later, there's now millions of proteins, and they're still evolving.
And that's, you know,
in part one of the things that you're researching. Is there something interesting to you about the
evolution of proteins from this initial ancestor to today? Is there something beautiful and
insightful about this long story? So I think, you know, if I were to pick a single keyword about protein evolution,
I would pick modularity, something that we talked about in the beginning, and that's the fact that
the proteins are no longer considered as, you know, a sequence of letters. There are hierarchical complexities in the way
these proteins are organized, and these complexities are actually going beyond the protein sequence. It's
actually going all the way back to the gene, to the nucleotide sequence. And so, again, these protein domains,
they are not only functional building blocks.
They are also evolutionary building blocks.
And so what we see in the later stages of evolution,
I mean, once these stable structural and functional building
blocks were discovered, they essentially,
they stay, those domains stay as such.
So that's why if you start comparing different proteins, you will see that many of them
will have similar fragments.
And those fragments will correspond to something that we call protein domain families.
And so they are still different, because you still have mutations, and the different mutations
are attributed to diversification of the function of these protein domains. However, you very rarely see, you know, the evolutionary
events that would split this domain into fragments, because once you have the
domain split, you can completely cancel out its function, or at the very least you can
reduce it. And that's not efficient from the point of view of the cell's function. So the
protein domain level is a very important one. Now, on top of that, right?
So, if you look at the proteins, right?
So, you have these structural units
and they carry out the function.
But then, much less is known about things that connect this protein domains.
Something that we call linkers.
And those linkers are completely flexible, you know, parts of the protein that nevertheless carry out a lot of function.
Like little tails or heads?
So we do have tails; they're called termini, the C- and N-terminus.
So these are things on one end and the other end of the protein sequence. They are also very important.
They contribute to very specific interactions
between the proteins.
So you refer to the links between the domains.
That connect the domains.
And from a simple perspective,
if you have a very short linker,
you have two domains next to each other;
they are forced to be next to each other.
If you have a very long one,
you have the domains that are extremely flexible,
and they carry out a lot of sort of spatial reorganization,
right?
That's awesome.
But on top of that, right, just this linker itself, because it's so flexible, it actually can adapt to a lot of different shapes.
And therefore, it's a very good interactor when it comes to interactions between this protein and other proteins.
They in a way have different driving laws that underlie their evolution, because they no longer need to preserve a certain structure, unlike protein domains.
And so on top of that, you have something that is even less studied.
And this is something that relates to the concept of alternative splicing. Alternative
splicing, so it's a very cool concept. It's something that we've been fascinated by for over a decade in my lab and have been trying to do research
with that. So typically, a simplistic perspective is that one gene equals one protein product.
So you have a gene, you transcribe it and translate it and it becomes a protein. In reality, when we talk about
eukaryotes, especially more recent eukaryotes that are very complex, the gene is no longer equal to one protein. It actually can produce multiple,
functionally active protein products. And each of them is called an alternative
splice product. The reason it happens is that if you look at the gene, it actually has
blocks. And it essentially goes like this. So we have a
block that will later be translated, we call it an exon; then we'll have a block that is not translated,
cut out, we call it an intron. So we have exon, intron, exon, intron, et cetera, et cetera.
Right?
So sometimes you can have dozens of these exons and introns.
So what happens is during the process
when the gene is converted to RNA,
we have things that are cut out,
the introns that are cut out, and the exons that now get assembled together.
And sometimes we will throw out some of the exons.
And the remaining protein product will be cut.
So it will be the same.
Oh, different.
Right, so now you have fragments of the protein that are no longer there.
They were cut out with the introns.
Sometimes you will essentially take one
exon and replace it with another one.
So there's some flexibility in this process.
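The exon/intron assembly described above can be sketched in a few lines of Python. This is a toy, hypothetical gene (the sequences and the `splice` helper are made up for illustration, not real biology code): introns are always dropped, and different choices of exons yield different products from the same gene.

```python
# Toy sketch of alternative splicing: a gene is a series of exons (kept)
# and introns (cut out); different exon choices give different transcripts.
gene = [("exon", "ATGGCT"), ("intron", "GTAAGT"), ("exon", "TTTCCA"),
        ("intron", "GTCAGT"), ("exon", "GGGTAA")]

def splice(gene, keep_exons):
    """Drop all introns; keep only the exons whose index is in keep_exons."""
    exons = [seq for kind, seq in gene if kind == "exon"]
    return "".join(seq for i, seq in enumerate(exons) if i in keep_exons)

# The same gene produces different mature transcripts (hence protein products):
isoform_a = splice(gene, {0, 1, 2})  # all exons kept
isoform_b = splice(gene, {0, 2})     # exon 1 skipped ("exon skipping")
```

Skipping or swapping an exon changes the product while the gene itself stays the same, which is the flexibility being described.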
So that creates a whole new level of complexity.
Because now...
This is random, no?
It's not random.
And this is where I think the appearance of modern single-cell,
and before that tissue-level, next-generation sequencing techniques such as RNA-seq
allows us to see that these are dynamic events that often happen in response to disease
or in response to a certain developmental stage of a cell.
And this is an incredibly complex layer
that also undergoes, I mean, because it's at the gene level,
right, so it undergoes certain evolution, right?
And now we have this interplay between what is happening
in the protein world and what is happening in the gene
and RNA world.
And for example, we often see that the boundaries
of these exons coincide with the boundaries of the protein domains. Right? So there
is, you know, a close interplay there. It's not always like that, I mean, you know, otherwise
it would be too simple, right? But we do see the connection between
those sorts of machineries. And obviously evolution will pick up this complexity and
select for whatever succeeds. We see that complexity in play, and it makes this question
more complex, but more exciting.
As a small detail, I don't know if you think about this,
but in the world of computer science,
there's Douglas Hofstadter, I think,
who came up with the name quine,
which I don't know if you're familiar with,
but it's computer programs that have,
I guess, exons and introns, and the whole purpose
of the program is to copy itself.
So it prints copies of itself, but it can also carry information inside of it.
So it's a very kind of crude, fun exercise of, can we sort of replicate these ideas from
cells?
Can we have a computer program that, when you run it,
just prints itself, the entirety of itself, and does it in different programming languages and so on?
I've been playing around and writing them. It's a kind of fun little exercise.
You know, when I was a kid, it was essentially one of the main stages in mathematical olympiads that you have to reach in order to be any
good: you should be able to write a program that replicates itself.
And so the task then becomes even more complicated.
So what is the shortest program?
What is the shortest?
And of course, it's a function of a programming language.
But yeah, I remember long, long, long time ago,
when we tried to make it shorter and shorter and find the shortest
kind of thing.
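The self-printing programs being discussed are quines. A minimal sketch in Python (one of many possible forms; comments are excluded here because the quine's output must match its source exactly):

```python
s = 's = %r\nprint(s %% s)'
print(s % s)
```

The trick is that `s` is a template of the whole program: `%r` re-inserts the quoted string itself, and `%%` escapes the percent sign, so `s % s` reproduces both lines verbatim.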
There's actually a Stack Exchange.
There's an entire site called Code Golf, I think, where the whole point is
the competition. People just come up with whatever task, I don't know, like write code that
reports the weather today, and the competition is about, in whatever programming language, what
is the shortest program. People should check it out, because it makes you realize there's some
weird programming languages out there. But, you know, just to dig on that a little
deeper, do you think...
You know, in computer science, you don't often think about
programs this way. There's like the machine learning world now that's still kind of basic programs. And then there's humans that replicate themselves, right? And there's these mutations
and so on. Do you think we'll ever have a world where there's programs that kind of have an
evolutionary process? So I'm not talking about evolutionary algorithms,
but I'm talking about programs that kind of meet
with each other and evolve and like,
on their own replicate themselves.
So this is kind of the idea here is, you know,
that's how you can have a runaway thing.
So we think about machine learning as a system
that gets smarter and smarter and smarter and smarter,
at least the machine learning systems of today are like, it's a program that you can like turn off.
As opposed to throwing a bunch of little programs out there and letting them like multiply and mate and evolve and replicate.
Do you ever think about that kind of world, you know world when we jump from the biological systems that
you're looking at to artificial ones?
I mean, it's almost like you take the sort of the area of intelligent agents, right?
Which are essentially the independent sort of codes that run and interact and exchange
the information.
So I don't see why not.
I mean, it could be sort of a natural evolution in this area of computer science.
I think it's kind of an interesting possibility.
It's terrifying too, but I think it's a really powerful tool.
Like they have agents that interact.
You know, we have social networks with millions of people and they interact. I think it's interesting to inject into that,
we're already injecting into that, bots, right? But those bots are pretty dumb, you know,
they're probably pretty dumb algorithms. You know, it's interesting to think that there
might be bots that evolve together with humans. And there's the sea of humans and robots that are operating first in a digital space.
And you can also think, I love the idea, some people work on this,
I think at Harvard, at Penn, there's robotics labs that take as a fundamental task
building a robot that, given extra resources, can build another copy of itself.
Like in the physical space, which is super difficult to do, but super interesting.
I remember there's like research on robots that can build a bridge.
So they make a copy of themselves and they connect themselves.
So there's like self-building bridge based on building blocks, you can imagine
like a building that self-assembles.
So it's basically self-assembling structures
from robotic parts, but it's interesting
to within that robot add the ability to mutate
and do all the interesting little things
that you're referring to in evolution to go from
a single origin protein building block to like, well, we're complex.
And if you think about this, I mean, you know, the bits and pieces are there, you know,
so you have evolutionary algorithms, right?
You know, so this is sort of, yeah, and maybe the goal is in a way different.
So the goal is essentially to optimize your search.
But the ideas are there.
So people recognize that the recombination events lead
to global changes in the search trajectory,
and the mutation events to more refined steps in the search.
Then you have other sorts of nature-inspired algorithms, right?
So one of the, I think, funnest ones is the slime mold algorithm, right?
I think it was first introduced by a Japanese group, where it was able to solve some
pretty complex problems. And then, I think there are still a lot of things
we have yet to borrow from nature.
So there are a lot of ideas that nature gets to offer us
that it's up to us to grab it and to get the best use of it.
Including neural networks, you know, we have a very crude inspiration from nature in neural
networks.
Maybe there's other inspirations to be discovered in the brain or other aspects of
various systems, even like the immune system, the way it interplays.
I recently started to understand that the immune system
has something to do with the way the brain operates.
Like there's multiple things going on in there,
none of which are modeled
in artificial neural networks.
And maybe if you throw a little bit of that biological
spice in there, you'll come up with something cool.
I'm not sure if you're familiar with the Drake equation.
I just did a video on it yesterday because I wanted to give my own estimate of it.
It's an equation that combines a bunch of factors to estimate how many alien civilizations are
in the galaxy.
I've heard about it. Yes.
So one of the interesting parameters, you know, is like how many
stars are born every year, how many planets there are on average per star, from this, how many habitable planets there are. And then the one that starts being really interesting is the probability that life emerges on a habitable planet. So like, I don't
know if you think about, you certainly think a lot about evolution, but do you think about
the thing which evolution doesn't describe, which is like the beginning of evolution, the
origin of life. I think I put the probability of life developing on a habitable planet at 1%. This is very scientifically
rigorous.
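The Drake equation multiplies exactly the factors listed above: N = R* · fp · ne · fl · fi · fc · L. A small sketch of the arithmetic; all parameter values here are hypothetical placeholders (not the actual estimates from the video), except `f_l = 0.01`, the "1% chance life emerges" figure from the conversation.

```python
def drake(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """Estimated number of communicating civilizations in the galaxy."""
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime

n = drake(r_star=2,       # star formation rate (stars per year) - placeholder
          f_p=1.0,        # fraction of stars with planets - placeholder
          n_e=0.5,        # habitable planets per star with planets - placeholder
          f_l=0.01,       # probability life emerges on a habitable planet (the 1%)
          f_i=0.1,        # fraction that develops intelligence - placeholder
          f_c=0.1,        # fraction that becomes detectable - placeholder
          lifetime=1000)  # years a civilization stays detectable - placeholder
```

With these placeholders the product comes out to 0.1, which illustrates how strongly the bottom-line estimate depends on the life-emergence probability being debated here.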
Okay.
Well, first, at a high level, for the Drake equation, where would you put that percent,
for Earth?
And in general, do you have something, do you have thoughts about how life might have
started, you know, like the proteins being the first kind of one of the early jumping points.
Yes, so, so I think back in 2018, there was a very exciting paper published in Nature, where they
found one of the simplest amino acids, glycine, in a comet dust.
So this is, I apologize if I don't pronounce it right, it's a Russian-named comet, I think:
Churyumov-Gerasimenko.
This is the comet where, there was this mission to get close to this comet and get the stardust from its tail.
And when scientists analyzed it, they actually found traces of glycine, which is
one of the 20 basic amino acids that make up proteins.
So that was kind of exciting.
That's exciting, right?
But the question is very interesting, right?
So if there is some alien life, is it gonna be made of proteins?
Or maybe RNAs, right?
So we see that the RNA viruses
are certainly very well established
sort of group of molecular machines, right?
So yeah, it's a very interesting question. What probability would you put?
Like, how hard is this jump?
Like, how unlikely just on earth do you think this whole thing is that we got going?
Like, is that we're really lucky or is it inevitable?
Like, what's your sense when you sit back and think about life on earth?
Is it higher or lower than 1%?
Well, because 1% is pretty low but it's still like
damn, that's a pretty good chance. Yes, it's a pretty good chance. I mean, I would
personally, but again, you know, I'm probably not the best person to do
such estimations, but intuitively I would probably put it lower. But still, I mean, you know, we got
really lucky here on earth.
I mean, or the conditions are really good.
I mean, I think that there was everything was right in a way, right?
So, still, the conditions were not like ideal, if you try to look at what it was several billion years ago
when life emerged. So there is something called the rare earth hypothesis that,
counter to the Drake equation, says that the conditions of earth, if you actually were to describe earth, it's quite a special place.
So special it might be unique in our galaxy, and potentially close to unique in the entire universe.
It's very difficult to reconstruct those same conditions. What the rare earth hypothesis
argues is that all those different conditions are essential for life.
So that's sort of the counter, you know, to all of us
thinking that earth is pretty average. I mean, I can't really, I'm trying to remember to go
through all of them, but just the fact that it is shielded from a lot of asteroids, obviously the distance to the sun,
but also the fact that it's like a perfect balance
between the amount of water and land,
and all those kinds of things.
I don't know, there's a bunch of different factors
that I don't remember, there's a long list,
but it's fascinating to think about if in order
for something like proteins and DNA and RNA to emerge,
you need, and basically living organisms, you need to be very close to an Earth-like planet,
which will be sad.
Well, exciting, I don't know. If you ask me, you know, in a way I draw a parallel with
our own research. I mean, from the intuitive perspective, you have those two extremes,
and reality very rarely falls into the extremes.
The optimum is always reached somewhere in between.
And that's what I tend to think.
I think that we're probably somewhere in between.
So we're not unique, but again, the chances are, you know, reasonable, just
small.
The problem is we don't know the other extreme is like,
I tend to think that we don't actually understand
the basic mechanisms of, like, what this all originated from.
Like, it seems like we think of life as this distinct thing,
maybe intelligence as a distinct thing,
maybe the physics from which planets and suns are born is a distinct thing,
but it could be, it's like the Stephen Wolfram thing,
from simple rules emerges greater and greater complexity. So, you know, I tend to believe that
just life finds a way. Like, we don't know the other extreme of how common life is, because it could be that life is everywhere.
So everywhere that it's almost laughable,
that we're such idiots to think we're unique. It's ridiculous to even think it.
It's like ants thinking that their little colony is the unique thing and everything else
doesn't exist. I mean, it's also very possible that that's the extreme,
and we're just not able to comprehend the nature of that life. I mean, just
to stick on alien life for just a brief moment more. There are some signs of life on Venus in gaseous form. There's hope for life on Mars, probably
extinct. We're not talking about intelligent life, although that has been in the news recently.
We're talking about basic, like, you know, bacteria. Yeah. And then also, I guess, there's
a couple of moons, like Europa,
Jupiter's moon. I think there's another one. Is that exciting or terrifying
to you, that we might find life? Do you hope we find life? I certainly do hope that we find life.
I mean, it was very exciting to hear this news about the possible life on Venus. It would be nice to have hard evidence of something, which is what the hope is for Mars and Europa.
But do you think those organisms would be similar biologically or would they even be sort of carbon-based if we do find them?
I would say they would be carbon-based.
How similar is a big question, right?
So it's the moment we discover things outside Earth, right?
Even if it's a tiny little single cell, I mean, there's so much.
Just imagine that. I think that would be another turning point for science.
And if its specifics are different in some very new way, that's exciting, because that's a definitive,
not a definitive, but a pretty strong statement that life is
everywhere in the universe. To me, at least, that's really exciting. You brought up Joshua
Lederberg in an offline conversation. I think I'd love to talk to you about AlphaFold, and this
might be an interesting way to enter that conversation, because he won the 1958 Nobel Prize in Physiology or
Medicine for discovering that bacteria can mate and exchange genes. But he also did
a ton of other stuff, like we mentioned, helping NASA find life on Mars, and the... Dendral.
Dendral, the chemical expert system.
Expert systems, remember those?
Do you, what do you find interesting about this guy and his ideas
about artificial intelligence in general?
I have a kind of personal story to share.
So I started my PhD in Canada back in 2000. And so essentially my PhD was,
so we were developing sort of a new language for symbolic machine learning. So it's different
from the feature-based machine learning. And one of the sort of cleanest applications of this approach, of this formalism, was to cheminformatics and computer-aided drug design. So essentially, this is part of my research. I developed a system
that essentially looked at chemical compounds of, say, the same therapeutic category, you know, male hormones,
right, and try to figure out the structural fragments that are the structural building blocks
that are important, that define this class, versus structural building blocks that are there just,
you know, to complete the structure, but are not essentially the ones that make up
the key chemical properties of this therapeutic category. And, you know, for me, it was
something new. I was trained as an applied mathematician, you know, with some machine-learning
background, but computer-aided drug design was a completely new territory.
So because of that, I often find myself asking lots of questions on one of these sort of central
forums. Back then, there were no Facebooks or stuff like that. There was a forum.
It's a forum.
It's essentially like a bulletin board.
Yeah.
Why didn't you get?
Yeah.
So you essentially, you have a bunch of people and you post a question and you get an
answer from different people.
And back then, one of the most popular forums was CCL, I think Computational Chemistry Library,
not library, but something like that. But CCL, that was the forum. And there, I
asked a lot of dumb questions. Yes, I asked questions. Also shared some information about our formalism, what we do, and whether whatever we do makes sense.
And so I remember that one of these posts,
I mean, I still remember, I would call it
desperately looking for a chemist advice, something like that.
And so I post my question, I explained how our formalism is, what it does, and what kind of
applications I'm planning to do.
It was in the middle of the night and I went back to bed. And next morning, I have a phone call from my advisor,
who also looked at this forum.
It's like, you won't believe who replied to you.
And it's like, who?
And he said, well, there is a message to you
from Joshua Lederberg.
And my reaction was like, who's Joshua Lederberg?
Here at Pfizer hung up.
So essentially, Joshua wrote me that we had conceptually
similar ideas in the Dendral project.
You may want to look it up.
We should also, sorry, as a side comment, say that
he won the Nobel Prize at a really young age, in '58. He was, I think, what, 33?
Yeah, it's just crazy. So anyway, hence, hence, responding to young
whippersnappers on the CCL forum.
Back then he was already very senior.
I mean, he unfortunately passed away back in 2008.
But back in 2001, he was a professor emeritus
at Rockefeller University.
And that was actually, believe it or not, one of the reasons I decided to join as a postdoc
the group of Andrej Sali, who was at Rockefeller University, with the hope that I could actually
have a chance to meet Joshua in person.
I met him very briefly, right? Just because he was walking, you know, there is a little bridge that connects
the research campus with the skyscraper that Rockefeller owns, where, you know,
postdocs and faculty and graduate students live.
And so I met him, you know, and had a
very short conversation, you know. But
so I started, you know, reading about Dendral, and I was amazed. You know, we're talking about 1960, right?
The ideas were so profound. Well, what's the fundamental idea?
The reason he made this is even
crazier. So Lederberg wanted to make a system that would help him study
extraterrestrial molecules. So the idea was that, you know, the way you study extraterrestrial molecules is you do mass spec analysis, right?
And the mass spec gives you sort of bits, numbers, essentially gives you ideas about the possible fragments, you know, atoms, and maybe little fragment pieces of this molecule that make it up.
So now you need to sort of decompose this information
and figure out what the whole was
before it became fragments, bits and pieces.
So in order to have this tool, the idea of Lederberg was to connect chemistry and computer science,
and to design this so-called expert system that takes as an input the mass spec data and a database
of possible molecules, and essentially tries to induce the molecule that would correspond
to this spectrum.
Or, you know, what this project ended up being was that it would provide a list of candidates that a chemist would then look at and make the final decision.
But that was the original idea back then. Can you believe that? It's amazing.
I mean, it still blows my mind, you know, that this was essentially the origin of modern bioinformatics and cheminformatics, you know, back in the 60s.
Yeah.
So every time you deal with projects like this,
with research like this, you just, you know,
the power of the intelligence of these people is just, you know,
overwhelming. Do you think about expert systems,
and is there a reason why they kind of didn't become successful, especially in the space of bioinformatics,
where it does seem like there is a lot of expertise in humans,
and it's possible to see that a system like this could be made very useful?
Right.
It's actually, it's a great question.
And this is something.
So, at my university, I teach artificial intelligence.
And we start, my first two lectures are on the history of AI.
And there we try to go through the main stages of AI.
And so the question of why expert systems failed
or became obsolete, it's actually a very interesting one.
And if you try to read the historical perspectives,
there are actually two lines of thoughts.
One is that they were essentially not up to the expectations.
And so therefore, they were replaced by other things.
The other one was the completely opposite one: that they were too good.
And as a result, they essentially became sort of a household name and then essentially
they got transformed.
I mean, in both cases, the outcome was the same.
They evolved into something.
Yeah.
Right?
And that's what I, if I look at this, right?
So the modern machine learning, right?
So echoes in the modern machine learning to that.
I think so.
I think so.
Because if you think about this, and how we design, you know, the most
successful algorithms, including AlphaFold,
right? You build in the knowledge about the domain that you study, right? So
you build in your expertise. So speaking of AlphaFold, DeepMind's AlphaFold 2
recently was announced to have, quote unquote, solved protein folding.
How exciting is this to you?
It seems to be one of the exciting things that have happened
in 2020.
It's an incredible accomplishment from the looks of it.
What part of it is amazing to you?
What part would you say is overhyped or maybe misunderstood?
It's definitely a very exciting achievement. To give you a little bit of perspective,
in bioinformatics, we have several competitions. And the way you often hear these
competitions explained to non-bioinformaticians is they call them the bioinformatics
Olympic games.
And there are several disciplines, right?
So the historical one of the first one
was the discipline in predicting the protein structure,
predicting the 3D coordinates of the protein.
But there are some others.
So the predicting protein functions,
predicting effects of mutations on protein functions,
then predicting protein-protein interactions.
So the original one was CASP,
or the Critical Assessment of protein Structure Prediction.
And the, you know, typically what happens
during these competitions is, you know, scientists,
experimental scientists solve the structures, but don't put them into the protein
data bank, which is the centralized database that contains all these 3D coordinates.
Instead, they hold it and release protein sequences.
And now the challenge for the community is to predict
the 3D structures of these proteins,
and then the experimentally resolved structures are used
to assess which prediction is the closest one.
Right.
And this competition, by the way, sorry, just a bunch of different tangents,
and maybe you can also say what protein folding is,
but this CASP competition has become the gold standard, and that's what was used to
say that protein folding was solved. So I just added a little, yeah, just a bunch. So if you could,
whenever you say stuff, maybe throw in some of the basics for the folks that might be outside of the field. Anyway, sorry.
So, yeah, so the reason it's relevant to our understanding of protein folding is because
we've yet to learn how the folding mechanistically works.
So there are different hypotheses about what happens as it folds.
For example, there is a hypothesis that the folding happens in a modular
fashion, right?
So that, you know, we have protein domains that get folded independently, because their structure
is stable, and then the whole protein structure gets formed.
But within those domains, we also have
so-called secondary structure, the small alpha helices,
beta sheets, so these are elements that are structurally stable.
And so the question is, when do they get formed?
Because some of the secondary structural elements,
you have to have, you know, a fragment in the beginning and, say, a fragment in the middle,
right? So you cannot potentially have the full fold from the get-go, right? So it's still,
you know, it's still a big enigma what happens. We know that it's an extremely efficient
and stable process.
So there's this long sequence
and the fold happens really quickly.
Exactly.
Well, that's really weird, right?
And it happens the same way almost every time.
Exactly, exactly.
That's really weird.
So that's freaking weird.
It's the same thing.
That's the same thing.
It's such an amazing thing. But most
importantly, right? So, you know, when you look at the
translation process, right? So when you
don't have the whole protein translated yet, right, it is
still being translated, you know, getting out from
the ribosome, you already see some structure
forming. So folding starts happening before the whole protein gets produced.
Right. And so this is obviously, you know, one of the biggest questions in, you know,
modern molecular biology. Not what happens, that's not as big a question as the question of,
like, the deeper fundamental idea behind folding. Yes, behind folding. Exactly. Exactly. So obviously, if we
are able to predict the end product of protein folding,
we are one step closer to understanding the mechanisms of protein folding,
because we can then potentially start probing what are the critical parts of this process
and what are the not so critical parts. So we can start decomposing
this. So in a way, this protein structure prediction algorithm can be used as a tool.
So you modify the protein, you get back to this tool, it predicts, okay, it's completely unstable.
Which aspects of the input will have a big impact on the output?
Exactly. So what happens is, we typically have some sort of incremental advancement.
At each stage of this CASP competition, you have groups with
incremental advancements, and, you know, historically, the top-performing groups were,
you know, not using machine learning. They were using very advanced
biophysics, combined with bioinformatics, combined with data mining, and that would enable them
to obtain protein structures of those proteins that don't have any structurally solved relatives.
Because if we have another protein, say the same protein, but coming from a different species,
we could potentially derive some ideas, and that's so-called homology or comparative modeling,
where we'll derive some ideas from the previously known structures, and that would help us tremendously in reconstructing the 3D structure overall.
But what happens when we don't have these relatives?
This is when it becomes really, really hard.
So that's the so-called de novo protein structure prediction.
And in this case, those methods were traditionally very good.
But then what happened: the original AlphaFold came in,
and all of a sudden it's much better than everyone else.
This was 2018.
Yeah.
And the competition is only every two years,
I think. And then, you know, it was sort of
a shockwave to the bioinformatics community that, you know, we have a state-of-the-art
machine learning system that does, you know, structure prediction. And essentially what it does, if you look at it, it actually predicts
the contacts. So the process of reconstructing the 3D structure starts by predicting the
contacts between the different parts of the protein. And the contacts essentially are the parts of
the proteins that are in close proximity to each other.
Right. So it's actually the machine learning part seems to be estimating. You can correct me if I'm wrong here,
but it seems to be estimating the distance matrix, which is like the distance between the different parts.
Yeah. So we go to the contact map.
Contact map. Right. So once you have the contact map, the reconstruction is becoming more straightforward.
Right.
But so the contact map is the key.
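The contact map idea can be sketched directly: given per-residue 3D coordinates (toy, made-up values below, with an illustrative distance threshold), mark residue pairs that are close as contacts. Structure prediction à la AlphaFold works in the reverse direction, predicting contacts/distances first and then rebuilding the 3D coordinates from them.

```python
import math

# Toy per-residue coordinates (hypothetical, not from a real protein).
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (3.8, 3.0, 0.0)]

def contact_map(coords, threshold=4.0):
    """Binary matrix: 1 if two residues are within `threshold` of each other.
    (Real contact maps typically use something like an 8-angstrom cutoff;
    the threshold here is chosen only to make the toy example interesting.)"""
    n = len(coords)
    return [[1 if math.dist(coords[i], coords[j]) <= threshold else 0
             for j in range(n)] for i in range(n)]

cmap = contact_map(coords)
# Residues 0 and 2 are 7.6 apart: no contact; 1 and 3 are 3.0 apart: contact.
```

Note that the map is symmetric by construction, which is one reason it is a convenient intermediate target for a learning system.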
And so, you know, that's what happened.
And now, in the current stage, in the most recent one, we started
seeing the emergence of these ideas in other people's works.
But yet, here's AlphaFold 2 that again outperforms everyone else.
And also by introducing yet another way of the machine learning ideas.
First of all, the paper is not out yet, but there's a bunch of ideas already out.
There does seem to be an incorporation of this other thing.
I don't know if it's something that you could speak to, which is like the incorporation
of like other structures, like evolutionary similar structures that are used to kind of give you hints.
Yes, so evolutionary similarity is something that we can detect at different levels,
right? So we know for example that the structure of proteins is more conserved than the sequence.
The sequence could be very different, but the structural shape is actually still very conserved.
So that's sort of the intrinsic property that in a way related to protein folds,
to the evolution of the protein domains, etc.
But we know that; there have been multiple studies. And ideally, if you have structures,
you should use that information. However, sometimes we don't have this information. Instead,
we have a bunch of sequences. Sequences, we have a lot of. So we have hundreds, thousands of
different organisms sequenced, right? And by taking the same protein, but in different organisms,
and aligning them, so that the corresponding positions are aligned,
we can actually say a lot about what is conserved in this protein,
and therefore structurally more stable, and what is diverse in this protein.
So on top of that, we could provide the information about the secondary structure of this protein,
etc. So this information is extremely useful, and it's already there. So while it's tempting to do a complete
ab initio, so you just have a protein sequence and nothing else,
the reality is such that we are overwhelmed with this data.
So why not use it?
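The conservation idea Dmitry describes, that aligned columns which rarely change across organisms are often structurally important, can be sketched in a few lines. The toy alignment and entropy scoring below are purely illustrative, not the actual pipeline any group uses:

```python
from collections import Counter
from math import log2

# Toy multiple sequence alignment: the "same" protein from four
# hypothetical organisms (real MSAs span hundreds to thousands of rows).
msa = [
    "MKVLA",
    "MKVIA",
    "MRVLA",
    "MKVLG",
]

def column_entropy(column):
    """Shannon entropy of one alignment column; 0 = perfectly conserved."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Low-entropy columns are conserved (often structurally critical);
# high-entropy columns are variable.
for i, col in enumerate(zip(*msa)):
    print(i, "".join(col), round(column_entropy(col), 2))
```

Column 0 here (all `M`) scores 0, i.e. perfectly conserved, while the variable columns score higher; features like these, plus predicted secondary structure, are the kind of evolutionary signal folding predictors feed on.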
And so, yeah, I'm looking forward to reading this paper. It does seem like, in the previous version of AlphaFold,
for their evolutionary similarity thing, they didn't use machine learning.
Or rather, they used it as, like, the input to the entirety of the neural net,
like, the features derived from the similarity.
It seems like there's some kind of, quote unquote, iterative thing, where part of the learning process seems to be the
incorporation of this evolutionary similarity.
Yeah, I don't think there is a bioRxiv paper, right?
There's nothing at all.
There's a blog post that's written by a marketing team essentially,
which, you know, it has some scientific similarity probably
to the actual methodology used, but it could be, it's like
interpreting scripture.
It could be just poetic interpretations of the actual work
as opposed to direct connection to the work.
So now speaking about protein folding, right? So you know in order to answer the question whether or
not we have solved this, right? So we need to go back to the beginning of our conversation.
You know, with the realization that, you know, an average protein... Typically, what
CASP has been focusing on, this competition has been focusing on single, maybe two-domain proteins that are still very compact.
And even those are extremely challenging to solve, right? But now we talk about, you know, an average protein that has two, three protein domains. If you look at the
proteins that are in charge of the processes in the neural system, right?
Perhaps one of the most recently evolved sort of systems in the organism, right? All of them, well, the majority of them, are highly multi-domain proteins.
Some of them have five, six, seven, you know, and more domains, right? And, you know, we are very far away from understanding how these proteins fold.
So the complexity of the protein matters here,
the complexity of the protein modules
or the protein domains.
So you're saying...
solved, so the definition of solved here
is particularly the CASP competition:
achieving human-level, not human-level,
achieving experimental-level
performance on these particular sets of proteins that have been used in these competitions.
Well, I mean, I do think, especially with regards to AlphaFold, it is able to solve, at the near-experimental level, a pretty big majority of the more compact
proteins, or protein domains. Because, again, in order to understand how the overall multi-domain protein folds, we do need to understand the structure of its individual domains.
I mean, unlike, if you look at AlphaZero, or even MuZero, if you look at that work, it's nice. Reinforcement learning, the self-play mechanism, is nice because it's all in simulation,
so you can learn from just huge amounts of experience. You don't need data. The problem with proteins,
like, the size, I forget how many 3D structures have been mapped, but the training data is very small,
no matter what. It's like, maybe one or two million, something like that. A very small number, and it doesn't seem that scalable.
There has to be, I don't know, it feels like you want to somehow
10X the data or 100X the data somehow.
Yes, but we can also take advantage of homology models, right?
So, the models that are of very good quality because they are essentially obtained based
on the evolutionary information.
So there is a potential to enhance this information and, you know, use it again to augment the training set.
And I think I am actually very optimistic.
I think it's been one of these sort of, you know, turning points, where you have a machine
learning system that is truly better than the more conventional biophysics-based
methods.
Yeah, that's a huge leap.
This is one of those fun questions, but where would you put it in the ranking of the greatest
breakthroughs in artificial intelligence history?
Okay, so let's see who's in the running; maybe you can correct me.
So you've got, like, AlphaZero and AlphaGo beating, you know, beating the world champion at the game of Go.
Thought to be impossible, like, 20 years ago.
Or at least the AI community was highly skeptical.
Then you've got, also, Deep Blue, the original, beating Kasparov.
You have deep learning itself, like, maybe, what would you say, the AlexNet ImageNet moment.
So the first, you know, AI work
achieving human-level performance. Super... no, that's not true. Achieving, like, a big leap
in performance on the computer vision problem. There is OpenAI, the whole, like, GPT-3, that
whole space of transformers and language models, just achieving this incredible
performance in the application of neural networks to language models. Boston Dynamics, pretty cool.
Like, robotics. People are like, there's no AI, no machine learning currently. But AI is much bigger than machine learning.
So that, just the engineering aspect, I would say it's one of the greatest accomplishments
in engineering side, engineering, meaning like mechanical engineering of robotics ever.
Then, of course, autonomous vehicles. You can argue for Waymo, which is, like,
the Google self-driving car, or you can argue for Tesla, which is, like, actually being
used by hundreds of thousands of people on the road today, a machine learning system.
And I don't know, what else is there? But I think that's it. So then AlphaFold,
maybe it's worth saying, is up there,
potentially number one. Would you put it at number one?
Well, in terms of the impact on the science and on society beyond, it's definitely, you know, to me, it would be one of the, you know, top three, if you want. I mean, I'm probably not the best person to answer that. You know, but, you know,
I do remember, you know, back in, I think, 1997, when Deep Blue beat Kasparov. It was a shock. I mean, it was, I think, for
a pretty substantial part of the world, especially people who have some,
you know, some experience with chess, right?
And realizing how incredibly human this game is,
how much brain power you need
to reach those grandmaster levels, right?
And how good Kasparov was.
And again, Kasparov is arguably one of the best ever, right?
And you get a machine that beats him.
All right, so it's the
first time a machine probably beat a human
at that scale of a thing, of anything.
Yes.
Yes.
So to me, that was like, you know,
one of the groundbreaking events in the history for you.
Yes, probably number one. It's hard to remember.
It's like Muhammad Ali versus, I don't know, Mike Tyson, something like that.
It's like, you've got to put Muhammad Ali at number one.
Same with Deep Blue, even though it's not machine learning based.
Still, it uses advanced search, and search is an integral part of it.
Yeah, right. It's sad, people don't think of it that way. At this moment,
in vogue currently, search is not seen as a fundamental aspect of intelligence,
but it very well, and very likely, is. In fact, I mean, that's what neural networks are.
They're just performing search on the space of parameters.
And it's all search.
All of intelligence is some form of search,
and you just have to become clever and clever at that search problem.
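The claim that training is itself a search over parameters can be made concrete with a minimal gradient-descent sketch; the one-dimensional loss here is purely illustrative:

```python
# Minimal illustration: training as search over parameter space.
# We "search" for the w minimizing the 1-D loss (w - 3)^2 by following
# the gradient -- the same principle, vastly scaled up, drives neural nets.

def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0                      # arbitrary starting point in parameter space
for _ in range(100):
    w -= 0.1 * grad(w)       # step "downhill" through the search space

print(round(w, 4))           # converges near the optimum w = 3
```

Each update is one step of a search guided by local information; a billion-parameter network is doing the same walk through a vastly higher-dimensional space.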
And I also have another one that you didn't mention that's one of my favorite ones.
So you probably heard of this.
It's, I think it's called Deep Rembrandt. It's the project where, I think, there was a
collaboration between the experts in Rembrandt painting in the
Netherlands and an artificial intelligence group, where they
trained an algorithm to replicate the style of Rembrandt, and they actually printed a
portrait that never existed before in the style of Rembrandt.
I think they printed it on a canvas, you know, using pretty much
the same types of paints and stuff. To me,
it was mind blowing.
In the space of art, that's interesting. Maybe that's it, but I think there hasn't been an ImageNet moment
yet in the space of art.
We haven't been able to achieve superhuman-level performance in the space of art, even though there was, you know, a big famous thing where a piece of art was purchased,
I guess, for a lot of money.
Yes. Yeah.
But it's still, you know, in the space of music at least, it's clear that human-created pieces are
much more popular. So there hasn't been a moment where it's like, oh. Now, in the
space of music, what makes a lot of money, and I mean serious money,
it's music and movies, or, like, shows and so on, and entertainment.
There hasn't been a moment where AI was able to create a piece of music,
or a piece of cinema, or, like, a Netflix show,
that's sufficiently popular
to make a ton of money.
Yeah.
And that moment would be very, very powerful
because that's like a, that's an AI system
being used to make a lot
of money.
And, like, directly. Of course, AI tools, like even Premiere, audio editing, everything I
do as the editor of this podcast, there's a lot of AI involved.
Actually, there's this program,
I want to talk to those folks just because I'm going to nerd out.
It's called iZotope.
I don't know if you're familiar with it.
They have a bunch of tools of audio processing.
And they have, I think they're Boston based.
Just, it's so exciting to me to use it
like on the audio here,
because it's all machine learning.
It's not like most audio production stuff,
where any kind of processing you do
is very basic signal processing
and you're tuning knobs and so on.
They have all of that of course, but they also have all of this machine learning stuff.
Like where you actually give it training data, you select parts of the audio you train
on, you train on it and it figures stuff out.
It's great.
Its ability to separate voice and music, for example, or
voice and anything, is incredible.
It's just clearly exceptionally good at applying these different network models to separate
the different kinds of signals from the audio.
That's really exciting. Photoshop, Adobe people also use it. But to generate a piece
of music that will sell millions, a piece of art, yeah.
Now I agree, and, as I mentioned, I offer my AI class, and an integral part of it is a project, right?
So it's my favorite, ultimate favorite part because typically we have these project presentations
the last two weeks of the classes right before the Christmas break.
It adds this cool excitement. And every time I'm amazed, you know,
with some projects that people, you know, come up with.
And so, and quite a few of them are actually, you know,
they have some link to arts.
I mean, you know, I think last year,
we had a group who designed an AI producing
haikus, Japanese poems.
Oh wow.
So, you know, it got trained on English-language haikus, right? And some of them, you know,
they got to present, like, the top selection. They were pretty good. I mean, you know, of course, I'm not a specialist, but you read them and you see it's profound.
Yes, yeah, it seems, it's kind of cool.
We also had a couple of projects where people tried to teach AI how to play rock music, classical music,
I think, and popular music.
Interestingly enough, with classical music, there are the grandmasters of music like Bach, right?
So there is a lot of, it's almost math.
He's very mathematical.
Exactly.
So I would imagine that at least some style of this music could be picked up,
but then you have this completely different spectrum of classical composers.
And so, you know, it's almost like you don't have to sort of look at the data,
you just listen to it and say, nah, that's not it, not yet.
That's not it. Not yet. Yeah, that's how I feel too.
There's an OpenAI, I think, MuseNet
or something like that, the system.
It's cool, but it's like, yeah,
it's not compelling for some reason.
It could be a psychological reason too.
Maybe we need to have a human being, a tortured soul,
behind the music.
I don't know.
Yeah, no, absolutely, I completely agree.
But yeah, whether or not we'll have,
one day we'll have a song written by an AI engine
to be in like in top charts, musical charts.
I wouldn't be surprised.
I wouldn't be surprised.
I wonder if we already have one and it just hasn't been announced. We would know. How hard is the multi-protein folding
problem? Is that kind of something you've already mentioned, which is baked into this
idea of greater and greater complexity of proteins, multi-domain proteins, that basically become multi-protein complexes?
Yes, you got it right.
It has the components of both, of protein folding and protein-protein interactions.
Because in order for this domain...
I mean, many of these proteins,
actually, they never form a stable structure.
One of my favorite proteins,
and pretty much everyone I know
who works with proteins,
they always have their favorite proteins.
So one of my favorite proteins, probably my favorite protein, one that I worked on when I was a
postdoc, is the so-called postsynaptic density 95, the PSD-95 protein. It's one of the key actors
in the majority of neurological processes at the molecular level.
Essentially, it's a key player in the postsynaptic density.
So this is the crucial part of the synapse where a lot of these chemical and molecular processes are happening.
So it has five domains, right?
So five protein domains.
It's a pretty large protein, I think 600-something residues,
but the way it's organized,
it's flexible, right?
So it acts as a scaffold: it is used to
bring in other proteins, so they start acting in an orchestrated manner, right? And the
shape of this protein, in a way, there are some stable parts of this
protein, but there are some flexible ones.
And this flexibility is built into the protein in order for it to become sort of this multifunctional
machine.
So do you think that kind of thing is also learnable through the AlphaFold 2 kind of approach?
I mean, the time will tell.
Is it another level of complexity?
Is it like how big of a jumping complexity
is that whole thing?
To me, it's yet another level of complexity,
because when we talk about protein-protein interactions,
and there is actually a different challenge for this,
called CAPRI.
And that one is focused specifically
on macromolecular interactions:
protein-protein, protein-DNA, et cetera.
But, you know,
there are different mechanisms
that govern molecular interactions,
and those need to be picked up,
say, by a machine learning algorithm.
Interestingly enough, we actually participated for a few years in this competition.
We typically don't participate in competitions.
I don't know, we don't have enough time, you know, because it's a very intensive process.
But we participated back in, you know, about 10 years ago.
And the way we entered this competition,
we designed a scoring function, right?
So, the function that evaluates whether or not your protein-protein
interaction model looks like an experimentally solved one, right? The scoring function is a very critical part of
the model prediction. So we designed it to be a machine learning one, and it
was one of the first machine learning based scoring functions used in CAPRI. And, you know, we essentially
learned what should contribute,
what are the critical components contributing
to the protein-protein interaction.
So this could be converted into a learning problem
and thereby it could be learned.
I believe so, yes.
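A hedged sketch of what such a learned scoring function might look like. The feature names, weights, and logistic form below are hypothetical stand-ins, not the actual function the lab entered in CAPRI, which would be trained on many near-native versus decoy interfaces:

```python
import math

# Hypothetical interface features and "learned" weights for scoring a
# protein-protein docking model (illustrative values, not a real scorer).
WEIGHTS = {
    "interface_contacts": 0.8,   # residue-residue contacts across the interface
    "buried_surface":     0.5,   # buried surface area on binding
    "clash_count":       -1.2,   # steric clashes penalize the model
}
BIAS = -1.0

def score(features):
    """Logistic score in (0, 1): how near-native does this docking model look?"""
    z = BIAS + sum(WEIGHTS[k] * v for k, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))

good_model = {"interface_contacts": 3.0, "buried_surface": 2.0, "clash_count": 0.2}
bad_model  = {"interface_contacts": 0.5, "buried_surface": 0.3, "clash_count": 3.0}

print(score(good_model) > score(bad_model))  # the learned function ranks candidate models
```

The point is the shift in design: instead of hand-tuning an energy function, the weights over physically meaningful features are fit from data, which is what made it one of the early machine-learning scoring functions.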
Do you think AlphaFold 2, or something similar to it
from DeepMind or somebody else, will result in a Nobel Prize, or multiple Nobel Prizes?
So, like, obviously, maybe not so obviously, you can't give a Nobel Prize to the designers of that program. But do you see one or multiple
Nobel Prizes where AlphaFold 2 is, like, a large percentage of what that prize is given for?
Would it lead to discoveries at the level of Nobel Prizes?
I mean, I think we are definitely destined to see the Nobel Prize
evolving with the evolution of science. And science as such
is now becoming really multifaceted, right? Where you don't really have a unique discipline, you have a lot of cross-disciplinary talks
in order to achieve really big advancements.
So, I think the computational methods will be acknowledged in one way or another.
And as a matter of fact, they were first acknowledged back in 2013, right?
Where three people were awarded the Nobel Prize for studying protein folding, right? The principle. And, you know, I think all three of them are computational
biophysicists. So, you know, that I think is unavoidable. It will come
with time. The fact is, with AlphaFold and similar approaches, it's a matter of time before people
will embrace this principle, and we will see more and more such tools coming into play. But, you know, these methods will be critical in scientific discovery.
No, no doubts about it.
On the engineering side, maybe a dark question, but do you think it's possible to use these machine learning methods to start to engineer proteins? And the next question is something quite a few biologists are against,
some are for, for study purposes: to engineer viruses. Do you think machine learning,
something like AlphaFold, could be used to engineer viruses?
So, to answer the first question, you know, it has been a part of the research in protein science.
Protein design is a very prominent area of research.
Of course, one of the pioneers is David Baker, with Rosetta,
the algorithm that essentially was doing de novo design and was used to design new proteins.
And designing proteins means designing a function.
So, like, when you design a protein, you can control...
I mean, the whole point of a protein: with the protein structure
comes a function, like, it's doing something.
Correct.
So you can design different things.
Well, you can look at proteins
from the functional perspective.
You can also look at proteins from the structural perspective, right?
So, the structural building blocks. If you want to have a building block of a certain shape, you can try to achieve it; that's one of the natural applications of these algorithms.
Now, talking about engineering, a virus with machine learning.
With machine learning, right?
So, luckily for us, I mean, we don't have that much data.
Right?
Yeah.
We actually, right now, one of the projects that we are carrying on in the lab is we're
trying to develop a machine learning algorithm that determines whether or not the current
strain is pathogenic.
And the current strain of the coronavirus?
Of the virus. I mean, there are applications to
coronaviruses, because we have strains of SARS-CoV-2, also SARS-CoV and MERS-CoV, that are pathogenic,
but we also have strains of other coronaviruses that are not pathogenic,
I mean, the common cold viruses and some other ones.
Pathogenic meaning spreading.
Pathogenic meaning is actually inflicting damage.
There are also some seasonal versus pandemic strains of influenza, right? And so,
determining what are the molecular determinants, right,
that are built into the protein sequence, into the gene
sequence, and whether or not the machine learning
can determine those components, right? Oh,
interesting. So, using machine learning to do that, that's
really interesting:
given the input, like the protein sequence, determine if this thing is going to be able to do damage to a biological system.
Yeah.
So, I mean, good machine learning... you're saying we don't have enough data for that?
I mean, for this specific one, we do.
We might actually have to back up on this.
We're still in the process.
There was one work that appeared on bioRxiv by Eugene Koonin, who is one of the pioneers
in evolutionary genomics.
And they tried to look at this, but the methods were sort of standard supervised learning methods.
And now the question is, can you advance it further by using not-so-standard methods? So there's obviously a lot of hope in transfer learning, where you can actually try to transfer the information that the machine learning model learns about the proper protein sequences, right?
And, you know, so there is some promise in going this direction, but if we have this, it would be extremely useful, because then we could essentially forecast the potential mutations that would make a current strain more or less pathogenic.
Anticipate them for vaccine development, for treatment, for antiviral drug development.
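A toy sketch of the classification idea being described: represent a viral protein sequence by simple k-mer features and compare it against profiles of known pathogenic versus non-pathogenic strains. The sequences, labels, and similarity rule below are entirely synthetic placeholders; the real project would use full genomes and far richer models, including the transfer-learning approaches mentioned:

```python
from collections import Counter

def kmer_profile(seq, k=2):
    """Normalized k-mer frequency vector (as a dict) for a sequence."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}

def similarity(p, q):
    """Dot product of two sparse frequency vectors."""
    return sum(p.get(x, 0.0) * q.get(x, 0.0) for x in set(p) | set(q))

# Synthetic reference profiles (real work would aggregate many strains).
pathogenic_profile    = kmer_profile("MKLLVVLACLAAA")  # made-up "pathogenic" sequence
nonpathogenic_profile = kmer_profile("MGGTSSNNGGTSS")  # made-up "non-pathogenic" one

# Hypothetical new strain's protein fragment: classify by nearest profile.
query = "MKLLVALACLAA"
is_pathogenic = (similarity(kmer_profile(query), pathogenic_profile)
                 > similarity(kmer_profile(query), nonpathogenic_profile))
print(is_pathogenic)
```

With a trained version of this idea, scoring hypothetical mutants of a current strain is exactly the forecasting use case described above.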
That would be a very crucial task. But you could also use that system to then say,
how would we potentially modify this virus
to make it more pathogenic?
That's true, that's true.
I mean, you know,
again, the hope is, well, several things, right?
So one is that, you know,
even if you design a sequence,
to carry out the actual experimental biology,
to ensure that all the components are working,
is a completely different matter.
A difficult process.
Yes.
Then, as we've seen in the past,
there could be some regulation the moment
the scientific community recognizes that it's now
becoming no longer sort of a fun puzzle for machine learning.
It could be, well...
Yes, so then there might be some regulation.
So I think back in, what, 2015, there was, you know,
an issue of regulating the research on influenza strains, right,
where, you know, several groups used sort of the mutation analysis to determine
whether or not a strain would jump from one species
to another.
And I think there was, like, a half-a-year moratorium on the research, on the papers
published, until, you know, scientists analyzed it and decided that it's actually
safe.
I forgot what that's called, something-of-function... gain of function?
Gain of function, yeah, gain of function, loss of function.
That's right.
Sorry.
It's like, let's watch this thing mutate for a while to see what kind of
things we can observe.
I guess I'm not so much worried about that kind of research.
If there's a lot of regulation, and if it's done very well, with competence, seriously. I'm more worried about, kind of, you know, the
underlying aspect of this question, which is more like 50 years from now. Speaking to
the Drake equation, one of the parameters in the Drake equation is how long
civilizations last. And that seems to be the most important value, actually,
for calculating if there are other alien intelligent
civilizations out there; that's where there's the most variability.
Assuming, like, if that percentage, that life can emerge,
is not zero, like, if we're not super unique,
then how long we last is basically the most important thing.
So from a selfish perspective, but also from a Drake equation perspective,
I'm worried about our civilization lasting.
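The point about civilization lifetime dominating can be made concrete with the Drake equation itself, N = R* · fp · ne · fl · fi · fc · L. Every parameter value below is an illustrative guess, not a measurement; the takeaway is only that N scales linearly with L:

```python
# Drake equation: expected number of communicating civilizations N.
# All default values are illustrative guesses for the sketch.
def drake(L, R_star=1.0, f_p=0.5, n_e=1.0, f_l=0.1, f_i=0.1, f_c=0.1):
    """N = R* * fp * ne * fl * fi * fc * L, for civilization lifetime L (years)."""
    return R_star * f_p * n_e * f_l * f_i * f_c * L

# Holding every other factor fixed, civilizations that last 100x longer
# mean 100x more of them out there at any given time.
print(drake(L=1_000))    # short-lived civilizations
print(drake(L=100_000))  # long-lived civilizations
```

Because N is linear in L while the other factors are bounded fractions, uncertainty in L swamps everything else, which is the sense in which "how long we last" is the most important term.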
And you kind of think about all the ways in which machine learning can be used to design
greater weapons of destruction.
And, I mean, one way to ask that: if you look, sort of, 50 years from now, 100 years from now,
would you be more worried about natural pandemics or engineered pandemics?
Like, who's a better designer of viruses, nature or humans, if we look down the line?
I think, in my view, I would still be worried about the natural pandemics,
simply because of, I mean, the capacity of nature producing these.
Yeah, it does a pretty good job, right?
Yes.
And the motivation for engineering viruses as a weapon is a weird one,
because, maybe you can correct me on this, but it seems very difficult to target a virus, right?
The whole point of a weapon, the way a rocket works: you have a starting point,
you have an endpoint, and you're trying to hit a target.
To hit a target with a virus is very difficult. It's basically just,
right, the target would be the human species.
Oh man. Yeah. I have hope in us. I'm forever optimistic that we will not...
There's insufficient evil in the world to lead to that kind of destruction.
Well, you know, I also hope that, I mean, that's what we see: with the way the world is getting connected, I think it helps the world to become more transparent. So the information spread is, you know, I think, one of the key things for society
to become more balanced, in one way or another.
This is something that people disagree with me on, but I do think that the kind of secrecy that governments have... So you're kind of speaking more to the other aspects, like the research
community being more open, companies being more open. Government is still, like, we're
talking about military secrets. I think military secrets of the kind that could destroy the world will become
a thing of the 20th century. It will become more and more open. I think nations will
lose power in the 21st century, like, lose sufficient power to maintain secrecy. Transparency
is more beneficial than secrecy. But of course, it's not obvious. Let's hope so. Let's hope that, you know,
the governments will become more transparent. So we last talked, I think, in March or April,
what have you learned? How has your philosophical, psychological, biological worldview changed since then?
You've been studying this virus nonstop from a computational biology perspective.
How have your understanding and thoughts about it changed over those months, from
the beginning to today?
One thing I was really amazed at is how efficient the scientific community was.
I mean, and, you know, even just judging on this very narrow domain of, you know,
protein structure, understanding the structural characterization of this virus, from the
component point of view, the whole virus point of view.
If you look at SARS, right, something that happened, you know,
less than 20, but close enough to 20 years ago, and you see
what the response by the scientific community was when it happened,
you see that the structural characterization did occur, but it took several years, right?
Now, the things that took several years are a matter of months, right? So we see that, you know,
the research pops up; we are at an unprecedented level in terms of the sequencing.
Right?
Never before have we had a single virus sequenced so many times,
which allows us to actually trace very precisely the sort of evolutionary nature of this virus, what happens. And it's not just
this virus independently of everything; it's, you know, the sequence of
this virus linked, anchored, to a specific geographic place, to specific people, because our genotype also influences the evolution of this.
It's always a host-pathogen co-evolution that occurs.
It'd be cool if we also had a lot more data about the spread of this virus. Well, it'd be nice if
we had it for contact tracing purposes for this virus, but it'd also be nice if we had
it for the study of future viruses, to be able to respond and so on.
But it's already nice that we have geographical data and the basic data from individual humans.
Yeah, exactly. No, I think contact tracing is obviously a key component
in understanding the spread of this virus. There are also a number of challenges,
right? XPRIZE is one of them. We just recently took part in this competition. It's the prediction of the number of infections in different regions.
And, you know, obviously AI is the main topic in those predictions.
Yeah, but it's still the data. I mean, that's a competition, but the data is weak on the training.
Like, it's great.
It's much more than probably before, but it would be nice if it was really rich.
I talked to Michael Mina from Harvard.
I mean, he dreams that the community comes together with, like, a weather map of viruses, right?
Like, really high-resolution sensing of how,
from person to person, the virus travels,
all the different kinds of viruses, right?
Because there's a ton of them.
And then you'll be able to tell the story
that you've spoken about of the evolution
of these viruses like day to day mutations that are occurring.
I mean, that would be fascinating,
just from a perspective of study,
and from the perspective of being able to respond
to future pandemics, that's ultimately what I'm worried about.
People love books.
Are there three, or whatever number of books,
technical, fiction, philosophical, that brought you joy
in life, had an impact on your life?
And maybe some that you would recommend to others.
So I'll give you three very different books.
And I also have a special runner-up,
an honorable mention.
It's an audiobook.
And, yeah, there's some specific reason behind it.
So, you know, the first book is, you know, something that sort of impacted the
earlier stage of my life. And I'm probably not going to be very original here.
It's Bulgakov's Master and Margarita.
So that's probably, you know, for a Russian, maybe it's not super original,
but it's a really powerful book, even in English.
So I read it in English.
It is incredibly powerful.
And I mean, it's the way it ends, right?
So I still have goosebumps when I read the very last
sort of, it's called the epilogue, which is just so powerful.
What impact did they have on you,
what ideas, what insights did you get from it?
I was just taken by, you know, the fact that
you have those parallel lives,
apart by many centuries, right, and somehow they got sort of intertwined into one story.
And that to me was fascinating. And, you know, of course,
the romantic part of this book, it's the romance empowered by magic, right?
And maybe on top of that,
you have some irony, which is unavoidable, right?
Because it was, you know, the Soviet time.
But it's very deeply Russian.
The way the humor, the pain, the love, all of that. It's one of the books that kind of captures something about Russian culture that people outside of Russia should probably read.
What's the second one?
The one that, it so happened, I read later in my life; I think I read it the first time when I
was a graduate student. And that's Solzhenitsyn's Cancer Ward.
That is amazingly powerful book.
What is it about?
It's about, I mean, it's essentially based on the fact that, you know,
Solzhenitsyn was diagnosed with cancer when he was
reasonably young, and he made a full recovery.
So this is about a person
who was sentenced for life to one of these, you know,
camps.
And he had cancer, so he was transported back to one of these Soviet
republics, I think, Central Asian republics. And the book is about his experience being a prisoner, being a, you know, a patient in the cancer
clinic, in a cancer ward, surrounded by people, many of whom die, right? But in a way, you know, the way it reads, I mean, first of all, later on I read the accounts
of the doctors who described the experiences in the book by the patient
as incredibly accurate.
Right?
So, I read that there was some doctor saying that every single
doctor should read this book to understand what the patient feels. But again,
as with many of Solzhenitsyn's books, it has multiple levels of complexity. And obviously, if you look above the cancer and the patient,
I mean, the tumor that was growing is, allegorically, the Soviet regime. And he actually
agreed; you know, when he was asked, he said that this is what made him think about
how to combine these experiences: him being a part of, you know, the Soviet regime, also being someone sent to a Gulag camp,
and also someone who experienced cancer in his life. The Gulag Archipelago
and this book, these are the works that actually made him, you know, receive the Nobel Prize. But, you know, to me,
I've read different other books by Solzhenitsyn; this one is, to me, the most powerful
one that I've read. And by the way, both this one and the previous one, you read in Russian?
Yes, yes.

So now the third book is an English book, and it's completely different. We're switching gears completely. It's not even a book, it's an essay by John von Neumann called The Computer and the Brain.
And that was the book he was writing knowing that he was dying of cancer. It's a very thin book, right? But the intellectual power in this book, in this essay, is incredible. You probably know that von Neumann is considered to be one of the biggest thinkers. His intellectual power was incredible, and you can actually feel that power in this book, where the person is writing knowing that he will die. The book was actually published only after his death, back in 1958; he died in 1957.
But he tried to put in as many of the ideas that he still hadn't realized. So this book is very difficult to read, because every single paragraph is densely packed with ideas, and the ideas are incredible. He tried to draw parallels between the computing power of the brain, the neural system, and the computers of the time.
He was working on this around '57?

'57. So that was right when he was diagnosed with cancer, and he was essentially...

Yeah, he's one of those people. As a few folks have mentioned, I think Ed Witten is another, where everyone that meets them says he's just an intellectual powerhouse.
Yes.
Okay.
So who's the honorable mention?
So, the reason I put it in this separate section is because this is a book that I listened to fairly recently. It's an audiobook, a book called Lab Girl by Hope Jahren. Hope Jahren is a scientist, a geochemist who studies fossil plants. She uses the chemical analysis of fossil plants to understand what the climate was like hundreds of thousands of years ago. And something about this book incredibly touched me: it was narrated by the author herself. And it's an incredibly personal story, incredibly.
So, certain parts of the book, you could actually hear the author crying.
And that, to me, I mean, I never experienced anything like this reading a book; it was like a direct connection between you and the author. I think this is really a must-read, or even better, a must-listen audiobook, for anyone who wants to learn about academia, science, and research in general, because it's a very personal account of her becoming a scientist.
So we're just before New Year's, and we talked a lot about some difficult topics, viruses and so on. Do you have some exciting things you're looking forward to in 2021? Some New Year's resolutions, maybe silly or fun, or something very important and fundamental to the world of science, or something completely unimportant?
Well, I'm definitely looking forward to things becoming normal.
Right?
So, yes, so I really miss traveling. Every summer I go to an international summer school.
It's called the School of Molecular and Theoretical Biology.
It's held in Europe, organized by very good friends of mine.
And this is the school for gifted kids from all over the world
and they're incredibly bright.
It's like every time I go there, it's like, you know,
it's a highlight of the year.
And we couldn't make it this August,
so we did this school remotely, but it's different.
So I am definitely looking forward to next August, to coming there.
Also, one of my personal resolutions: being in the house and working from home, I realized that apparently I missed a lot, you know, spending time with my family, believe it or not. Typically, with all the research and teaching and everything related to academic life, you get distracted, so you don't feel that being away from your family affects you, because you are naturally distracted by other things. This time I realized that that's so important, right? Spending time with your kids. So that would be my New Year's resolution: actually trying to spend as much time with them as possible, even when the world opens up.

Yeah, that's a beautiful message, that's a beautiful reminder. I asked you if there's a Russian poem you could read that I could force you to read, and you said, okay, fine, sure.
Do you mind reading?
You said that no paper needed.
So, yeah, this poem was written by my namesake, another Dmitry, Dmitry Kimelfeld. It's a recent poem, and it's called Sorceress, "Ved'ma" in Russian, or actually "Koldunya," which is another connotation of sorceress, or witch. I really like it, and it's one of just a handful of poems I can actually recall by heart.
I also have a very strong association, when I read this poem, with The Master and Margarita, with the main female character, Margarita.
And also, it's set around the same time as we're talking now, just before Christmas.

[Reads the poem in Russian. In rough translation: "I want you, in the white night, in the white haze, mad one, to fly on a broomstick above the lights of sleeping cities, shivering from the January cold. Christmas Eve smelled of orange and cod, the wind withered the New Year ornaments, and under your sorcerous power souls perished and worlds grieved. You so burned eyes with your fury that anyone on whom grace descended was ready, for that witch's bond, to give his soul to the devil without looking back. The crows made merry in the graveyards, but I, without prejudice or shirt, would run out to feel your astonished breath on my lips."]

That's beautiful. I love how it captures a moment of longing.
And maybe love, even.
Yes.
To me, it has a lot of meaning about, you know, this,
something that is happening, something that is far away,
but still very close to you.
And yes, it's the winter.
The winter. Something magical about winter, isn't it? Yes.
I don't know how to translate it, but a kiss in winter is interesting. Lips in winter, and all that kind of stuff. It's beautiful.
I mean, Russian poetry... I'm a fan of poetry in both languages, but English doesn't capture some of the magic that Russian seems to. So thank you for doing that, that was awesome. Dmitry, it's great to talk to you again. It's contagious how much you love what you do, how much you love life. So I really appreciate you taking the time to talk today.
And thank you for having me.
Thanks for listening to this conversation with Dmitry Korkin, and thank you to our sponsors: Brave browser, NetSuite business management software, Magic Spoon low-carb cereal, and Eight Sleep self-cooling mattress. So the choice is: browsing privacy, business success, healthy diet, or comfortable sleep. Choose wisely, my friends. And if you wish, click the sponsor links below to get a discount and to support this podcast.
Now let me leave you with some words from Jeffrey Eugenides: "Biology gives you a brain. Life turns it into a mind." Thank you for listening, and hope to see you next time.