Future of Coding - Beyond Efficiency by Dave Ackley
Episode Date: March 4, 2024

Dave Ackley's paper Beyond Efficiency is three pages long. With just these three pages, he mounts a compelling argument against the conventional way we engineer software. Instead of inflexibly insisting upon correctness, maybe allow a lil slop? Instead of chasing peak performance with cache and clever tricks, maybe measure many times before you cut. So in this episode, we're putting every CEO in the guillotine… (oh, that stands for "correctness and efficiency only", don't put us on a list)… and considering when, where, and how to do the robust thing.

Links
$ patreon.com/futureofcoding — The most recent bonus episode is a discussion with Stefan Lesser about new "laws of physics" we can invent inside the computer. Don't destroy the earth, then make sure your thing can't be destroyed, then don't destroy your data, and finally, do your damn job, AAAAAAAAAAAAAAAAAAAAAAAAAAA.
A Software Epiphany, and the accompanying HN discussion — giga viral, so sick
PartyKit? Nice!
What started as a simple todo list turned into an ocean of tech boy milk and, ultimately, the AI apocalypse.
Jepsen is a rough, rugged, deeply thoughtful and fantastically cool approach to distributed systems testing, by Kyle Kingsbury. Also, we didn't talk about it, but his reversing / hexing / typing / rewriting / unifying technical interview series is essential reading.
Ivan's examples of robustness vs efficiency were RAID, the CAP theorem, Automerge, the engineering of FoundationDB, and Byzantine fault tolerance — all of which stake out interesting territory in the efficiency/robustness tradeoff spectrum, all of which are about distributed systems.
Can programming be liberated from the von Neumann style?, a paper by John Backus.
We Don't Really Know How to Compute!, a talk by Gerald Sussman.
The Robust-First Computing Creed is rock solid.
The Wikipedia article on von Neumann architecture did not come through with the goods.
Ivan works with Alex Warth now, and thus may fairly speak in half-truths like "I've been working with constraints recently…"
The Demon Horde Sort
Bogosort is never coming to Dreamberd
The Witness was made by Jonathan Blow, who has Aphantasia, but he also made a game called Braid, and Braid is good.
Datamosh is a creative misuse of the lack of robustness that comes from storing diffs instead of full state snapshots. Here's a lovely gallery of examples.
Abstraction by xkcd
Reverse Engineering the source code of the BioNTech/Pfizer SARS-CoV-2 Vaccine
Can't let Lu get through the above without derailing onto Fiverr, PCP, Fight Club, and the Dust Brothers.
Randy Newman was nearly quoted in Ackley's Indefinite Scalability for Living Computation — god help you if you read our show notes and don't listen to the episode.
"It is difficult", says Upton Sinclair when asked about Jimmy Miller being Jimmy Miller, and how we all ought to approach our own sense of Jimmy Miller.

Music featured in this episode:
Hawker News by user: spiralganglion
Corporate World by the Dust Brothers

No more jokes! Find us at these normal places:
Ivan: Mastodon • Website
Jimmy: Mastodon • Website
Lu: Mastodon • Website
Dave: Mastodon • Website

Send us email, share your ideas in our Slack, and support the show on Patreon. Yes, do all three please.
http://futureofcoding.org/episodes/70
Support us on Patreon: https://www.patreon.com/futureofcoding
See omnystudio.com/listener for privacy information.
Transcript
so we had a crazy week jimmy a like wild week like unprecedented amounts of buzz around what
you and i have been doing we had somebody write a blog post about one of our podcast episodes and
that made it to the front page of hn yeah i was so sad i didn't catch it when it happened wait
no really yeah really yeah like we went gigaviral this week it's it's incredible
no no no no hang on a second hang on no i'm actually deadly serious and we got we got 400
downloads from it 400 downloads from front paging hn what i don't even see this yeah no it's uh it
was because it was a blog post titled i had an epiphany which is not something
i'm ever gonna click on yeah just to be honest yeah right i mean i don't browse that vile website
but it has a great search function yeah a great search function yeah if you search the word
epiphany it was yesterday this blog post um do you want to do a dramatic reading of it lou a software epiphany yeah yeah john wiles oh my god episode 61
the programming is theory but okay which is exciting for me i just want to put that out there
suddenly i understand software even though i don't respect podcasts as an information delivery system guys you've just been called a delivery
system i recently listened to a podcast that felt like having an epiphany the podcast in question
was episode 61 of future of coding so i guess they're saying that that specific episode of a podcast was a podcast. Yeah, fair enough.
It is essentially a read-through slash review of two papers with which I was unfamiliar,
but which I believe are actually quite famous and influential.
That feels like a roast of these.
All right.
Why did listening to-
You made it past the funny part.
The funny part is the whole like yeah i don't
respect podcasts but i listened to this podcast and it was good actually which is i think it's
great to be fair i i think that is the correct opinion i agree that's my basic view of podcasts
including this one well there's a reason you're on this one and not on you know lex fridman's
podcast or what have you right yeah though i did just
unblock him today on twitter really yeah i unblocked him why i realized that so me and
steve my boss we sort of 50 50 run the tldraw twitter account right and i'm realizing that a
lot of the notifications that come into it are from people that I've blocked on my main account,
my personal account.
And it just made me feel a bit awkward.
So I've just gone and unblocked everyone.
So if I've blocked you in the past, this is your second chance.
You have one chance.
I just mute people.
I don't block them.
Like, it's fine if they go look at my stuff.
It's just like, I don't really care about that meme account.
That's not what I'm on here for.
I'm just going to mute it so that they stop getting...
Or like this week, it's like,
you know, oh, SpaceX did something.
Now all of Twitter is just SpaceX.
Definitely just SpaceX.
There was nothing else that happened on Twitter this week.
Yeah, no.
For you, it was like,
oh yeah, programming doesn't exist anymore.
There's only SpaceX.
But yes.
Twitter's a vile website and nobody should use it.
Yeah.
The rest of this, I just want to put it,
the rest of this blog post is really, really nice.
It's a person going and listening to our episode on
programming is theory building and i also threw in gilbert ryle uh as like background reading for
that and they realized like oh yeah naur had it right about programming is theory building
and it explains so much of what this person sees in real life software and i totally
agree it's uh it was it was nice i i this is my favorite paper i think that more people should
have this epiphany and it's it's so typical hacker news style where like most of the comments are
like this isn't anything there's not really a big epiphany here i don't know how you don't realize
these things and then they don't realize these things.
And then they don't state the things
that the paper actually states.
It's great.
It's like A+.
Title is a bit overblown for the actual epiphany.
This is manifestly not the case.
I don't see anything particularly unique
to software in this thesis.
Epilepsy warning.
I also think this is way off base.
Made me seasick.
You might understand software, but you clearly don't understand web design.
The pop-ups are terrible.
And I found the first sentence very off-putting.
The value of computers is that you can write code once, and the computer can execute it repeatedly ad infinitum.
As a programmer, your mental theory of the codebase has value to its owners, but it's
not the product.
The codebase is not the product.
The product is whatever the output is created and consumed by the relevant stakeholders.
Oh, bud. Not only is this like quintessentially hacker news, this is actually almost identical to Donald
Knuth's reply to Naur, which is just hilarious. Like, this is just the way that everyone
reads it. They say, first, I agree, and yet I don't agree with the conclusion, which... that's not how logic works. Like, if
Naur is making his argument the way he thinks he's making it, this is a logical conclusion that
you should follow and i just it's just my favorite i love that like so many people yeah so here i have
donald knuth saying my experience agrees well with peter naur's hypothesis that programming
is theory building, and I
strongly support his conclusion that programmers
should be accorded high professional status.
But I do not share Naur's unproved
assertion that reestablishing
the theory of a program merely
from its documentation is strictly impossible.
On the contrary, I believe that improved
methods of documentation are able to communicate
everything necessary for the maintenance and modification
of programs. So a slightly different different focus here but amounts to the same thing that like
the theory is not the main part that the software itself and the documentation are all we need
uh and it's just i don't know i i wish naur wrote that paper a little bit more clearly but i'm happy
that uh that we could do the service of of
talking about it too and other people could grab on to the concept you know it sounds trite when
you just say the conclusions without actually diving into all the details so i'm not surprised
that hacker news took away that it's just a trite conclusion my favorite is one person who said
made with party kit if anybody wants to make their own
party kit yeah party kit nice share an office with those guys yeah yeah i was like oh that's nice
this is my favorite hacker news comment this is a top level comment there's this seminal paper by
peter naur called programming is theory building that arrived at the exact same conclusion when it comes to programming oh so good that's how he knows he's
summarized it well though right yeah yeah yeah it's just it was just so good it's like chinese
whispers but there were no mistakes uh-huh you know yeah i like that one which is like it is true
yeah um programming is theory building i mean if, to that person's, you know, not credit, but credit,
like he doesn't link to the paper directly, but still, yeah.
Yeah, it was honestly, this is, for me at least,
this was super satisfying to see.
Like I'm excited that people are listening
and enjoying the things that we're talking about,
and especially for this paper.
This was one of those papers that I worried we wouldn't do justice to and it's been a well
received episode so i'm very excited and that that comment gets an upvote from me that's the
stuff i come to hackernews for nice lou how was your week i'm not even joking i want to know
what's it been like uh it's been crazy. I was also on Hacker News.
A little bit.
The title was equally unhelpful.
The title was, I think I need to go lie down.
Yes. So on Monday, my boss, well, you know, at tldraw, we've just finished like working on some kind of, I would say, kind of boring stuff.
Like we're just getting the library ready
changing a few apis it's it's not necessarily visually flashy so you know we thought we'd
earned a bit of fun so my for everyone everyone got some you know pick something fun to do
my fun task was steve said to me hey lou can you that's my my impression of steve he said hey lou hey lou can you get
like 200 likes on twitter like and that was literally my to-do list for the week i think
i posted it up on mastodon yeah you know part one step one write a to-do list step two um get 200 likes on twitter step three speak to gitbook lovely guys um and
and yeah i got 200 likes and then we got about i don't know 10 000 um
and so yeah it's been a fun week so for for anybody who doesn't know because this podcast is probably coming out like maybe
two to 18 months after these events do you want to tell them what this is about yeah so like um
the gpt vision api came out so what you can do is just chuck the ai a uh a picture
of of a little user interface of a little website, and it can turn that into a
website. But the fun thing that we've been doing is then it places that website back on the canvas,
which you can then annotate again, like draw some new things on, and then give that back.
And you get in this cycle where you're like having this back and forth visual conversation
by drawing stuff.
So like turning a picture into a website, that's not new.
But the new thing is being able to then draw on whatever they give back to you, which feels
really weird, feels really weird.
And everyone's freaking out.
Why does it feel weird like i tried
to use it and for whatever reason my api key was just rejected repeatedly probably because of you
know servers falling over who knows what let me in bang bang bang i think you need to you need to
put five dollars on your account oh i put i put i put not five hundred dollars but more than five
dollars on my account oh well there you go maybe yeah maybe they just shut it down i think it's because
they invented agi who fired sam and so you know they know that you might take down the system so
they're rejecting you your comments against ai have you know made you get banned so that's the
the like the domino stuff that happened is like hey lou can
you get 200 likes that's like monday or whatever right yeah wednesday it's like oh my god we're on
the front page of every internet um you know people are using this to like make games and
everybody's having this this giant fit and it's it's incredible and terrifying and this whole cluster of emotions
all at once friday open ai implodes and i see these as like there's like a linear you know
straight line path from steve saying hey lou can you get 200 likes to uh you know monday it's going
to be like i don't know sam altman is dead or something like that right like this is going to be yeah this is like who knows by the time this episode goes out
will we even have jobs i don't know it'll be like the dyson sphere has happened or something like
that and we're all living as like an energy cluster orbiting somewhere in a spheroid around
the sun yeah that's the that's the accelerationist view that these you
know silicon valley hacker news e/acc types want, trying to reshape society into yeah
yeah which is why i was on the front page of hacker news um but hey do you know what should
have been on the front page of hacker news but wasn't. Lou, it's your first segue.
It should have been Beyond Efficiency by David H. Ackley.
Dave, that should have been
on the front page of Hacker News.
With any luck, somebody will find this episode
a year after it comes out
and write a blog post
titled i had an epiphany yeah a hardware epiphany maybe this time maybe and then people will finally
discover this this very short highly readable super approachable i don't know what to call it
because it's not a traditional sort of essay like you'd get in a
journal even though it it was um uh submitted for we're reading a pre-print but it was um
accepted to uh communications of the acm so it's in the it's in the acm but it's it's this like
three column like three page and then some references the references were actually
interesting basically
like uh not not i don't i'm trying to avoid saying a screed because it's not that angry but it has
that like flavor of like like uh like demagoguery like it's like trying to whip up people's enthusiasm
and energy and like get them to you know change how they approach their work in a way that i find
very appealing so yeah i don't
know what to call this but it's uh it's short and sweet i believe uh it's called a viewpoint essay
okay which is like i guess that's like an opinion piece i don't know a letter to the editor yeah
yeah a letter to the editor yeah uh yeah i I think of all of the papers we've read,
this one I can definitively say did not need an editor.
This did not need anyone to shorten it.
There were no bits that needed to be cut out.
Everything in here is essential to the argument being made.
If anything, personally, I would have liked a little bit more.
I did go and watch a
bunch of or listen to a bunch of david ackley um uh youtube videos in the robust first playlist
so that helped me kind of get this mindset i will say that uh i'm i'm not i haven't absorbed it
enough i feel like i'm a little like I hope this conversation will help me kind of get more where this is.
Because it's a big shift from where we are.
And I feel like I don't quite understand all that entails.
And speaking of chain reactions from a bit earlier i would say that this paper or like this series of papers
that dave ackley made um a while back is actually what has made me sit here today like
i got into coding again like i learned it when i was younger and i picked it up again
when i was learning about all this robust first this simulation style of computing and that's what
got me back into coding and that's why i'm here and that's uh why sam altman got fired right and it's you know all that gpt-4 needed was tldraw to become sentient and
that's why we're gonna you know the heat death of the universe is upon us all all because we went
beyond efficiency exactly it's all connected yeah which almost sounds like beyond efficiency
almost sounds like we're getting even more efficient than efficiency.
That's like what my first thought when I saw beyond efficiency was, but it's actually kind of the opposite.
This is a paper kind of arguing that we should throw out efficiency.
That efficiency is not always good and that efficiency comes with some real trade-offs
around what he calls robustness and i think on the face of it that is going to sound
very unpalatable to a lot of the people in our general community i think of you know
i don't know that of course the person that comes to mind
is somebody who gets named too much on this podcast,
which is Jonathan Blow.
I mean, if you say that we shouldn't care about it,
that we should care about efficiency
even less than we do right now, to a lot of people,
they're going to feel like you're really out of touch with how little we care
about efficiency but this is i guess like way beyond that kind of take to like we should get
rid of cpus and ram yeah yeah like this is a it starts from a a very simple place of robustness
which is broadly defined as like resilience to error
whether that error comes about as the result of you know a cosmic ray hitting your cpu
or because there's a malicious hacker who broke in and sprinkled malware through your system
or there's a coding error or any of the other reasons you know bad data or something like that
any of the reasons that a computer program might fail to
perform its intended purpose, those failures are, you know, robustness is our bulwark against that
insanity, against those failures. And that in the drive to make computers more efficient,
and specifically to make the execution of software more efficient
we have had to give up robustness and we had more robustness in the past and there's some things
you know in the analog computer era say computers were more robust because they had to be you know
analog and and very much more physical in nature but as we switch to digital computers and you
could be more certain
that uh that there was like perfect fidelity of data transmission and that sort of thing
we have gradually bled the robustness out of our systems in the name of efficiency and so this is
you know that's the the small seed of the idea is like recognize that robustness and efficiency
have this relationship where you can trade one off to
gain the other. And now that you recognize that relationship, let's look at some of what we've
lost and let's consider how we might recover it. I think that's a really provocative position to
start from. Yeah. And I think that's what got me really interested in this idea in the first place
you know I think there might be listeners listening surprisingly thinking no that's
ridiculous what a crazy idea and I think that's the point you know like it's it's framed in a way
and it's presented in a way that is intentionally surprising and provocative. It's this viewpoint essay. It's
supposed to capture your attention. It's supposed to make you think, wait, what, really? So don't
be put off by the fact that it sounds a bit ridiculous. I think that's actually the point.
It's putting out a bit of intrigue into your mind when it first hits you.
So we should hit them with
the bubble sort example because that's like we've been saying hey everybody you know this is going
to be a little bit absurd and we haven't yet any we haven't given an example to show like
absurd how like what do you mean oh it's all well and good to say abstractly yeah robustness and
efficiency of this trade-off and it'd be nice nice to try and be conscious of that trade-off.
But there's this example where bubble sort
as opposed to merge sort and quick sort,
it's like significantly slower
because bubble sort does just a ton more comparisons, right?
Like it always takes a pair of adjacent elements
in a list that it's trying to sort.
And it compares that pair of adjacent elements. And if they're in the wrong order it swaps them and it just does that over and over and over again and the algorithm is like just
super duper simple whereas quick sort and merge sort are a little bit more clever and through
that cleverness they avoid doing a whole bunch of redundant checks right because bubble sort as the
list gets more and more sorted, you're going
to be frequently comparing elements that are already in the right order. And merge sort and
quick sort try not to do that, right? If your list is already mostly sorted, quick sort and merge
sort are going to just blast through it. But bubble sort is so much more tolerant to faults
that can occur in the hardware that's executing it or that can occur
in a component that collaborates with it and as an example um ackley suggests like what if you had
sorting happening where the function that is checking the elements to see whether they're
like that checks a pair of elements to see whether they're sorted correctly,
returned the wrong answer 10% of the time.
And if you run quick sort or merge sort with that kind of faulty comparator,
you end up getting tons and tons and tons of errors.
So if you have 52 items, like a deck of cards,
and you sort them all,
and then measure like how many errors do you get?
This is total card positioning error
so it's like how far off are these cards is the idea so like this is five positions off or this
you know it should have been five up or down in an order quick sort gets like 250 something errors
merge sort gets like 180 something errors and bubble sort gets like 10
errors because that little bit of extra work that bubble sort does where it does all those extra
comparisons just gives it more chances to recover from that erroneous result from that faulty
comparator but even if you run quick sort and merge sort multiple times in a row, the error rates don't
improve all that much.
They improve maybe another 50% or something like that.
So there's something about bubble sort that is just like inherently more robust to failure
than quicksort and merge sort.
Yeah.
And do you know what I like is earlier on, ackley says about how you know when sorting a long list either one merge sort or quick
sort blows the doors off bubble sort later on he shows this graph of bubble sort being way more
robust than those two and follows with the line now whose doors have fallen off? It's like a very long, long setup.
And he actually does say that if you repeat about six times with a quicksort or merge sort and take the most common answer, you do get competitive with bubble sort.
But he calls that a cheap hack, an after-the-fact fix tailored to one situation.
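For anyone who wants to play along at home, here is a rough sketch of the experiment being described. This is not Ackley's actual harness; the 10% fault rate and the "total positional error" metric come from the discussion above, the function names are made up, and exact numbers will vary from run to run.

```python
import random

def flaky_less_than(a, b, fault_rate=0.10):
    """Comparator that returns the wrong answer roughly 10% of the time."""
    correct = a < b
    return (not correct) if random.random() < fault_rate else correct

def bubble_sort(items, lt):
    """Dumb and redundant: keep comparing adjacent pairs, pass after pass."""
    xs = list(items)
    for _ in range(len(xs)):
        for i in range(len(xs) - 1):
            if lt(xs[i + 1], xs[i]):          # adjacent pair out of order?
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs

def merge_sort(items, lt):
    """Clever: each pair is compared only as often as strictly needed."""
    xs = list(items)
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid], lt), merge_sort(xs[mid:], lt)
    out = []
    while left and right:
        out.append(left.pop(0) if lt(left[0], right[0]) else right.pop(0))
    return out + left + right

def quick_sort(items, lt):
    """Also clever: one bad comparison puts an element on the wrong side for good."""
    xs = list(items)
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    less = [x for x in rest if lt(x, pivot)]
    more = [x for x in rest if not lt(x, pivot)]
    return quick_sort(less, lt) + [pivot] + quick_sort(more, lt)

def total_positional_error(xs):
    """Sum of how far each card sits from where it should be."""
    return sum(abs(i - x) for i, x in enumerate(xs))

deck = list(range(52))            # a deck of 52 "cards"
random.shuffle(deck)
for name, sort in [("bubble", bubble_sort), ("merge", merge_sort), ("quick", quick_sort)]:
    print(name, total_positional_error(sort(deck, flaky_less_than)))
```

The redundant comparisons that make bubble sort slow are exactly what give it extra chances to undo a bad answer from the flaky comparator; the clever algorithms trust each comparison once and carry the mistake to the end.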
But I think this is interesting because in the worst case with quicksort,
you get 250 positions off.
But when we're talking about comparing efficiency between quicksort and bubble sort,
he tells us that if you try to sort a billion numbers and you try to do it
with bubble sort
it's just going to take forever
whereas if you try to do it with quick sort
he says it took under 10 minutes on my laptop
I looked up I see people having
like you can do in Java in like 30 seconds
on modern things
this is a very very fast thing
whereas if you still try to do bubble sort
you're basically never gonna get an answer at a billion things so i think this is makes this even
more stark to me which is like quick sort and merge sort yes they did they were less robust
but they weren't like exponentially less robust in a lot of ways. Yes, they were for this small thing,
but if we amplify numbers, it's not as bad.
And yet, still the argument here is
their efficiency gains aren't worth it,
even though they're quite a lot more efficient.
We don't want that.
We want robustness.
And I think the thing that was a little confusing to me
at this point is, like, why?
And I'm curious if any of you all can articulate, like, why do we care that bubble sort can do this? I think, a bit, it's that this viewpoint essay is introducing a new kind of way of measuring software. Ackley introduces this concept of CEO, and he states that
currently, at the point of writing, the goal of software is CEO, which stands for correctness and efficiency only.
And, you know, the first example, comparing the efficiency of these different sorting algorithms,
he's looking at efficiency. He's then presenting, we can measure these in another way that you may
not have thought about, called robustness. And it's sort of just introducing this new idea that I guess
I had never thought about. And in my experience, people often haven't thought about what if we
have some error? I think in the video, he says, or he opens it with, why do computers crash?
Is it something we just have to deal with? When is it okay to deal with it? When
is it actually really important that it doesn't happen? And is there a way that we can rate
different software in that scale? Not just speed, not just correctness, assuming nothing goes wrong.
Because in reality, things go wrong. In the myths and misconceptions episode we talked about um the fuzzy edges of
software and how they have to interact with the real world and this robustness concept i think
is trying to tackle that i think that's part of the why it is vague though he says it could be
for any reason yeah i'm kind of reminded of you know the kind of erlang let it crash uh mentality but
on steroids right like this is like erlang's like okay we have these little processes and they can
crash and we bring them back up but this is saying like let's go even further let's assume that like
we have software and like we hit it with x-rays and random bits just change.
Let's assume that we have software and we have hardware that we can't actually depend on giving us a deterministic answer.
We have any number of things could just break at any moment.
How can our software deal with this?
As I was looking more, and I don't think it's actually ever explicitly
stated in in this paper but the idea is like this is the only way uh that we're going to i guess it's
stated a little bit but that we're going to like get beyond the scale of computing we're at right
now right now everything we do even with our supercomputers, even with OpenAI's
clusters, etc., is, according to Dave Ackley, like small-scale computing. We need to get beyond this
to get these massive computations. And at that point, the centralized manner of these things
won't work. We cannot make decisions where we have to submit everything to CPU
or we have to grab everything out of RAM.
And anything we have right now is a hack on top of that.
If you look at the Jepsen tests,
look at all these databases and all the failure cases
they have in these distributed systems,
I think you could see this as the hack.
We're trying to take these systems that are supposed to be deterministic
and somehow make them work in the face of all this indeterminism.
This paper is arguing we should start from the other point.
What if our fundamental primitives were unreliable from the beginning
and we had to program to adapt to all of this?
What systems would we get out at the end?
One of the things I had a lot of fun doing
as I was reading this
was trying to think of some of my own examples
of places where this trade-off has come up
and then trying to find like,
okay, what are the common themes
about those places I'm thinking up?
Which is of course like,
yeah, of course there's going to be themes
because I'm the one thinking it up
and I'm going to get onto a train of thought and that's where all the examples are going to come from so there's
probably great examples that i totally missed but some of the very first ones that i thought of were
like raid so if you have a raid array of discs you have to choose depending on which raid mode you pick do you want the space to be used redundantly so that you can be resilient to
failure of a disk right every every disk in your raid array gets mirrored so if one of them fails
you have a backup copy versus do you want to stripe the data between your disks so that you
can read that data out twice as fast because you can parallelize those
reads between your disks which is like you know um even in the era of the ssd that's still a huge
performance win and so there's that trade-off right do you want it to go faster or do you want
it to be more resilient to failure another one is cap theorem right if you have a distributed system
a bunch of nodes that need to communicate and there's some admission that like okay sending
a message from one node to the other is going to take time and it's going to go over an unreliable
connection and so there's a likelihood that that communication could fail there's different ways
you can structure the communication so that you either have like if it's a web service right and
there's requests coming in and this is you know these servers need
to occasionally coordinate to be like oh somebody just added an item to their shopping cart i need
to like update the the database and then the database nodes need to all communicate with each
other and say oh yeah there's a new item in the shopping cart everybody catch up with that do you
want those distributed database nodes that have to like gossip changes to each other to be more available
so that if you know there's a network failure or something like that nodes will still respond even
if they have partial data right like the new item in the shopping cart hasn't been updated everywhere
do you want availability so that those nodes will still respond with the shopping cart contents and
some of them will say oh it doesn't have that item and other ones will say yes it does or do you want like consistency
and correctness where you say hey we're not going to actually admit that the shopping cart has that
item in it until all the nodes know about that item and we're going to wait until we've got
confirmation that everybody knows it before we say yeah the shopping cart has that thing and that trade-off between availability
and consistency you have to choose you can't have both that's the cap theorem
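A toy sketch of the shopping-cart choice just described, under assumed names (this is not any real database): the "available" read answers with whatever its node happens to have, while the "consistent" read refuses to answer until every replica has acknowledged the item.

```python
class Replica:
    """One node holding a copy of a shopping cart."""
    def __init__(self):
        self.cart = set()
        self.reachable = True      # pretend a network partition by flipping this

def write_item(replicas, item):
    """Gossip the new item to whichever replicas we can actually reach."""
    for r in replicas:
        if r.reachable:
            r.cart.add(item)

def read_available(replica):
    """AP flavour: always answer, even if this node missed some writes."""
    return replica.cart

def read_consistent(replicas, item):
    """CP flavour: only admit the item once every replica has it."""
    if not all(item in r.cart for r in replicas):
        raise RuntimeError("not confirmed everywhere yet; refusing to answer")
    return item

nodes = [Replica(), Replica(), Replica()]
nodes[2].reachable = False          # a partition happens
write_item(nodes, "socks")          # only 2 of 3 nodes hear about it

print(read_available(nodes[2]))     # answers, but the cart looks empty here
try:
    print(read_consistent(nodes, "socks"))
except RuntimeError as err:
    print("consistent read refused:", err)   # sacrifices availability instead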
i have a confession yeah and you can't you can't tell dave this okay right so this paper, it's arguing against efficiency.
You know, it's championing robustness.
However, me reading and learning from this paper
has actually helped me to write more efficient code.
Oh!
And, like, prioritise more efficiency sometimes,
which I feel like I'm betraying the robust first cause.
But let me explain. Right.
So I think before I discovered this whole new school of thought about thinking about this tradeoff, this other measure of robustness versus efficiency,
I always thought, you know, to write faster code, code that runs fast, you just need
to know, I don't know, the tricks, you need to be a better programmer, you know, if my code is slow,
because I suck or something. And through discovering this other measure of robustness,
and this trade off, it's just made it a lot easier for me to figure out how to make things
more efficient. Because I know how to make things more robust, you add redundancy, you lean into the error, you lean into the non-determinism.
So it's quite easy to just flip that. And when I want to make my code run faster now, I'm looking
for places to remove that and to make it less robust. And it feels wrong. you know i would never admit this to professor ackley but you know it's
it's changed the way i think about the code that i write i think it's good whether we're trying to
be efficient whether or robust or correct or whatever i think it's good to have this extra
measure of robustness so that you know what you're sacrificing or you know what you're gaining yeah like you're conscious that there's a trade-off now as opposed to it just being
invisible as it is to probably most programmers or at least unacknowledged and backwardsly
automatically intuitively just doing the efficiency thing not even knowing that you could if you
wanted to get even more efficiency by getting
rid of more robustness go don't go looking for opportunities to make things faster go look for
opportunities to make things less robust yeah exactly yeah yeah but i just have to cross my
fingers when i do it yeah i i think that this is i think that's a really interesting way of looking
at these things and i do think it's super helpful to realize where we have redundancy, where we're copying things we don't need to.
You know, these are often the causes of our slowdowns.
But I think it's I think there's a really interesting idea here that I wish was more explicitly argued for in this paper, but I think there is this kind of argument for it
that it is that sort of removing of redundancy
that leads to these software errors,
but that it's not a programmer's fault.
It's not that we write buggy software
or that users are stupid and do the wrong things
that cause these software problems. At the core is just the von Neumann model of computation,
that this is the ultimate culprit that caused all this problem. And I know that there's a paper
that's definitely on my list for later maybe to read,
which I can't remember who wrote it,
but Can Programming Be Rescued from the Von Neumann Model,
I think is the name of the paper.
Can Programming Be Liberated from the Von Neumann Style?
And that's a, I can't remember, who wrote that?
Do you remember, Ivan?
I did.
Okay.
You wrote it?
Yes, yes. That's a robustness first answer
yeah you can only you can only you know answer what you have local information
to quote ackley uh a mistaken person gives wrong directions even if you repeat the question
so yeah i i wrote this um yeah so john backus wrote this in 1977 it's a
acm turing award lecture um so luckily i have access to non-local information here um
but yeah i think this is really interesting and this is really you could see this research here as kind of taking that question and saying
yes but we don't quite know how yet like what we get here is not an ultimate answer to how can
what's the alternative right what how could we get rid of the cpu how could we get rid of ram
but we get some suggestive hints in this paper and I know you all have watched some of Ackley's stuff as well, so I'm going to rely on some of that.
And it's this, you know, the things, the demos I saw are kind of this cellular automata basis, and we see that even von Neumann was interested in this being the model of computing going forward. And so what you have instead of a central processing unit where you have access to this
memory and there's these caches that are all local to you and you can guarantee, quote
unquote, some certain deterministic output depending on all of these things.
Instead, you get these tiny little computers that are all interacting with each other
and can all scale and connect interchangeably together.
And so the computer itself is small.
It has very limited resources, and it can only see its little world,
and it has to accomplish its goals not by taking control,
not by becoming the leader,
but by cooperating with all of the bits around us.
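A toy to make that concrete, and nothing like Ackley's actual system: a line of little "machines" that each only ever look at their immediate neighbours, repeatedly nudging toward the neighbourhood average. No cell is in charge, no cell sees the whole array, and occasional corruption gets pulled back toward agreement by the surrounding cells.

```python
import random

def step(cells, corruption_rate=0.01):
    """Each cell only sees its immediate neighbours; no CPU, no global view."""
    nxt = cells[:]
    for i in range(len(cells)):
        left = cells[max(i - 1, 0)]
        right = cells[min(i + 1, len(cells) - 1)]
        nxt[i] = (left + cells[i] + right) / 3.0   # cooperate locally
        if random.random() < corruption_rate:       # a stray "cosmic ray"
            nxt[i] = random.uniform(0, 100)
    return nxt

cells = [random.uniform(0, 100) for _ in range(64)]
for _ in range(500):
    cells = step(cells)
# Cells were corrupted along the way, yet the neighbourhood keeps dragging
# stray values back toward the shared consensus.
print(min(cells), max(cells))
```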
I'm reminded if you've seen the talk from Gerald Sussman
on we don't know how to compute yet or something like that.
No, I haven't.
Okay, this is like 2011.
And Sussman here says, it's quoted in Brett Victor's
The Future of Programming.
Which I also haven't read.
It's the one Brett Victor thing I refuse to read because it's bad.
Really?
Wait, is this another robust first answer?
I don't know now.
No, no.
Well, okay.
So I have read it.
You really haven't watched?
Watched.
Read.
Watched.
Wait, which one are you talking about?
The talk that he gave
Where he dresses up in a suit and has an overhead projector
Oh, crap
Yeah, I have seen that one
I was thinking about the essay Learnable Programming
Yeah, yeah, yeah
They both have programming in the name and I just snapped a grid
Learnable Programming is bad
Oh, wow
I think it's there that he quotes
No, maybe it's not that one.
Anyways, maybe I'm wrong,
but someone quotes it and says,
I got this, or was it a hickey talk?
I don't remember.
Now I'm all confused.
Extraordinary claims without evidence
may be rejected without evidence.
Yeah, let's don't get on that tangent
because I'll just...
So anyways, Sussman says that eventually
we're going to be buying our compute by the bushel
and that these computers will just be
in the paint of the wall
and they'll add compute to our wall
by painting it on.
Right, this is the idea.
It was not that talk, but anyways.
Yeah, this is kind of Ackley's model
is something like that.
So I've got this idea that like robustness
and the trade-off in particular
come up frequently and prominently
in cases where you're doing distributed computing.
That was the thing I was building to
when talking about like,
oh yeah, these different examples that I found,
they all come up in cases of distributed computing, right?
The other ones I had were like AutoMerge and FoundationDB and Byzantine fault tolerance,
right?
Which Ackley even talks a little bit about and references some work about that.
That when you have lots of tiny computers and there's communication between them, just
because physics, communication is fallible.
If for no other reason than just like, like oh i lost my network connection in the middle of transmission or whatever or somebody unplugged an ethernet
cable and foundation db if anybody hasn't seen there'll be a link in the show notes to a talk
from strange loop about how they engineered foundation db by building the most unreliable
server rack possible and then only running their software on that rack as they
were developing it where it's like you could unplug the network cable or the power or smash
the hard drive or anything like that at any time and it had to be robust to that kind of failure
and then they what they set out as a goal was like let's guarantee perfect robustness against
that adversarial environment
and then make it fast and then make it go as fast as possible given a certain minimum threshold of
robustness which is a really cool approach as opposed to let's make it fast and then try and
bring robustness up a little bit which is like that's your yjs versus auto merge debate right
there yeah and and this is the robust first creed it's it's um i i'm
gonna try and google it right now okay but it begins with something like first be robust right
i'm just gonna google like listen to that keyboard clacking no you carry on for a second. You're doing great.
Oh, thanks.
Yeah.
Wow.
That feels really good to know that I'm actually doing a good job at podcasting.
All right.
I found it.
Too bad.
All right.
The robust first computing creed.
First be robust, then as correct as possible, then as efficient as necessary.
It's not saying don't be efficient
it's just saying be robust first then you're allowed to be efficient and the example uh you
just described just really reminds me of that so the thing that i also found i only thought of one
or two examples of things that are kind of about this robustness, correctness, efficiency trade-off that aren't
in the context of a distributed system. And one of them is constraint solver programming,
and the other one is John Carmack's approach to optimizing game performance. I can talk a bit
about what each of those are, but they're both cases where robustness
is not the concern but correctness versus efficiency is and so i'm curious if either
of you have any examples of cases where robustness is the concern but it's not a distributed system
because it's easy to think of like oh yeah distributed systems are fallible therefore we
we need robustness and from robustness you get correctness and from correctness you know
you get the desirable outcome of your program but like are there cases where we worry about
failures of robustness that aren't about distributed systems i mean in some sense
everything's a distributed system that's what makes this hard right except except for the like
the abstract model of the von Neumann architecture, right?
Well, even then, because the example I'm going to give
is persisting to disk.
There's all sorts of things you have to do.
You can find all sorts of things about how fsync
is not enough to persist to disk properly.
And you can look at all these tricks that databases
that are production grade do to make sure, you know,
you got the write-ahead log and you got like all of these things
to ensure that like, yes, this really is on disk.
And if something fails, I have it on disk.
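A very rough sketch of the write-ahead-log idea being referenced, assuming POSIX-ish file behaviour: append the intent to a log and fsync it before acknowledging, so a crash mid-write can be replayed later. Real production databases do far more than this; the file name and record format here are made up.

```python
import os

def durable_append(log_path, record):
    """Write the intent first, and force it to disk before acknowledging."""
    with open(log_path, "a") as log:
        log.write(record + "\n")
        log.flush()
        os.fsync(log.fileno())     # ask the OS to really put it on disk

def replay(log_path, data):
    """After a crash, replay the log to rebuild the state we already promised."""
    if not os.path.exists(log_path):
        return data
    with open(log_path) as log:
        for line in log:
            key, _, value = line.strip().partition("=")
            data[key] = value
    return data

durable_append("wal.log", "cart:42=socks")   # the promise survives a crash here
print(replay("wal.log", {}))
```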
And that's ignoring distributing it.
It's ignoring like needing it on multiple machines.
It's just a single node system.
And yet you have to do all this because
the storage is distributed in some sense yeah that's what i mean right yes like it's it's
because of cache hierarchies and because of ram versus disk what have you and that is in the that
all of that is the von neumann model i disagree because the von neumann model just assumes a single pool of storage and a single compute unit a la the Turing machine, right?
Von Neumann architecture, it's just talking about you have memory and you have compute and they're separate.
It's about the hierarchy.
That's the thing that von neumann so von neumann's architecture is like all right you
want for this much hard drive space you want approximately this much ram and you need more
hard drive space than the ram so that if you have to page out uh or swap and then it's like okay
have this much in your l3 cache hierarchy and this much in your l2 and your l1 and then this
much register space and then okay from from uh the wikipedia article
the document describes a design architecture for the electronic computer with these components a
cpu for stuff a control unit for instruction register and program counter memory for data
and instructions external mass storage and then input output devices all right so we're both wrong
so like it already has this built-in well So it already has this built-in hierarchy of the close thing,
which are the registers.
You have the processor registers,
and then you have the instruction registers, one level out.
And then you have the memory, the RAM,
and then you have the external mass storage.
It has that idea.
That's the big thing about von Neumann,
is the memory hierarchy
because you have to have this all right slower moving and faster moving stuff cooperating
together so let's let's ignore von neumann for a minute turing and church and gödel and
those other sort of abstract models of computation are not inherently distributed
and it makes me wonder if there is a if there are any like concrete computers like any not
abstract models but actual physical computers that aren't distributed because i'm really
interested in this question of like is there a case where robustness, specifically robustness, not correctness, matters other than in the case of distribution?
And I'm trying to think of any examples of that.
I think here's the thing. like we talked about in the myths paper, you know, what is intended to be and what is theoretically supposed to happen with
software it's often not what actually ends up happening you know we're humans we're we make
mistakes and even if it's not a distributed system they're probably intended to be helpful to a
person or another system in some way. You know, we're not just
making these things to exist in a vacuum, right? So the number of times I have, you know, built,
coded a new engine for a new creative coding thing or a little game, and I often start by
trying to eliminate redundancy in some way, because I'm trying to be clever
and I'm trying to make it run smartly.
You know, so like a classic example is,
you know, I don't need to re-render everything every frame
or I don't need to recalculate the position
of these certain things every frame.
I don't need to recalculate the order of,
and this was a real one that happened to me.
I had, I made a little
just simple, like, 2D game engine, it just drew sprites, and it needed to keep the order of these
sprites, right, on the screen, and I came up with a clever way, I will just sort of keep track of
the order, and just keep track of certain things, so I don't need to like sort them every frame you know because
sorting it's really fast but you know it's even faster to not do it you know which is kind of
stupid right but like you know I thought this is so clever this is so smart I'll sacrifice some
robustness and redundancy and I'll just do it once instead of on every frame. And I did that for about a week. And then I realized that
somewhere in the code, somewhere, I was making a mistake and I was putting them into the wrong
order. And I spent like a day trying to find the place I was doing it wrong. And then I realized,
wait a second, I could just sort them every frame. It's completely redundant.
It's because of a mistake I made somewhere else.
But you know what?
Just be robust, you know, just do it again and again and again.
And the only person using that was me.
It only works on one computer.
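A toy version of the choice Lou is describing, with hypothetical names: the "clever" path trusts a draw order that other code promised to maintain, so one missed update anywhere quietly corrupts every later frame; the robust path just re-sorts every frame and repairs whatever mistake happened elsewhere.

```python
class Sprite:
    """Hypothetical sprite whose z value other code may change at any time."""
    def __init__(self, name, z):
        self.name, self.z = name, z

sprites = [Sprite("sky", 0), Sprite("player", 2), Sprite("tree", 1)]

def draw_frame_robust(sprites):
    # Redundant work: sort every single frame, so any upstream mistake in the
    # z handling is fixed the very next frame instead of lingering forever.
    for s in sorted(sprites, key=lambda s: s.z):
        print("draw", s.name)

def draw_frame_clever(draw_order):
    # "Efficient" path: trust a draw_order list some other code keeps up to
    # date. One missed update and every later frame draws in the wrong order.
    for s in draw_order:
        print("draw", s.name)

draw_frame_robust(sprites)
```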
I think this robustness, it can happen on a small scale.
It can happen on a small scale. It can happen on a big scale. Like it can happen on a little one line,
like one functional helper you write for yourself
to make an RSS feed,
which I did in the last couple of weeks.
Or it could be like,
hey, we have these like data centers
that we need to all be working
and all be talking to each other.
I think it's like it's a general concept.
You know, those things can be efficient. They can, or not efficient. They can be robust or not robust. So I would say you can rate every program
on robustness. And it's a question of priority, I think.
Yeah, I'll put out one example that I think is fairly similar to what you were saying, Lou,
on a robustness trade-off that's not distributed.
And then I have another example that's very different that I think is also a good robustness example.
But I wrote an editor, and I needed color, like syntax highlighting.
And so what did I do?
I wrote a parser to parse it out, and I just parsed on every single frame.
And when I parsed on every single frame, it was always right.
In my new version of the editor, I have tried to have Rust Analyzer support.
It's like this LSP, and it has semantic highlighting with tokens.
And now I have something that when I edit it, I need to update those tokens
because I don't yet have from Rust Analyzer
the answer on what it's going to be.
And I had to deal with so many errors
that I didn't have to deal with
because I was just, yeah,
just reparse the whole entire text
every single time from scratch.
And I can do that at 1,000 frames per second
on a reasonable Rust file
without writing anything smart from my parser.
It was the dumbest parser.
Yes, it was robust to any of those things.
Once I got the parser working,
I never had to touch any of that code ever again,
whereas this token code I'm still just messing with.
In some ways, it's an efficiency trade-off
because I'm not doing it every single time and i'm
just getting deltas of tokens and that's the way that rust analyzer talks to me and but it also is
like so much worse in so many ways uh the other one though that i think is arguably if we ignore
like the fact that cpus and stuff are distributed blah blah is arguably a robustness without talking about
a distributed systems are like the control computers in uh spaceship space shuttles like
the space shuttle and like apollo and stuff they had multiple computers all running the same program
and they had to have a quorum before they would do this and the idea was because like oh actually having gamma rays and stuff hitting
your computer actually can happen right and so this was kind of like an ad hoc mechanism afterwards
instead of having something that would have been robust in the face of these kinds of things
we throw in redundancy to try to get that robustness trade-off now i've seen research
that suggests this doesn't...
One of the ways people have talked about writing this
is not running the same program,
but having four different groups of programmers
write the same program.
And this study claimed that that didn't work
because the errors people make
are almost always correlated around the hard parts.
And so the parts that are wrong
are more likely to be wrong in independent implementations.
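A sketch of the redundancy-and-vote trick being described, with made-up names, and of why the study result bites: running several copies and taking a majority only helps when their failures are independent, so copies that all go wrong on the same hard input can confidently outvote the right answer.

```python
from collections import Counter
import random

def run_redundant(compute, x, copies=3):
    """Run several copies of the same computation and take the majority vote."""
    answers = [compute(x) for _ in range(copies)]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes

def flaky_square(x, fault_rate=0.2):
    """Stand-in computation that sometimes gets flipped by a stray fault."""
    result = x * x
    if random.random() < fault_rate:
        result += random.choice([-1, 1])   # an uncorrelated, random fault
    return result

print(run_redundant(flaky_square, 7))      # the majority usually recovers 49

# The catch from the N-version study mentioned above: if independent
# implementations all tend to fail on the *same* hard inputs, their errors
# are correlated and the majority can vote for the same wrong answer.
```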
That's fascinating.
Yeah.
So you think you're,
and I don't know if I believe it or not,
but because part of what they talked about
is the programming language made it hard,
so you could maybe try it in different programming languages
and different parts are going to be hard, blah, blah, blah.
But it's interesting because here again
this feels like this ad hoc approach
where we're trying to build in redundancy into systems
that don't have it baked into the core
and I think this is what's so interesting about
let's throw out the von Neumann model
and let's have hardware that isn't reliable that's like one of
the suggestions here yeah and it reminds me of foundation db that talk from strange loop where
they did that and it was amazing it reminds you of that jimmy it also reminds me of haskell
and their choice to have laziness this was a choice not because they were like oh okay well um they
wanted purity in a language and they wanted laziness uh well really they chose laziness
to force them to make sure that the language was pure that's the idea that making a pure
strict language would have been easier but once you have this laziness, as soon as you have these
potential I.O.,
you get all sorts of problems.
So you have to have this purity.
I think of this same sort of thing.
I butchered that explanation.
But I think this is kind of this
forcing thing here where
if our hardware were
not reliable from the beginning,
how would we change the way we compute?
I wonder if you could get that without having to go to hardware.
Like Lou approaches the design of their spatial programming experiments
and various other things by starting with the virtual machine
and then building the interface, which is the backwards way to do it.
It works for them, I have been told it works great yeah agreed so could lou make an
unreliable vm and that be the forcing function that that makes them build a robust interface
on top of it now i want to try that yeah i i think that this is one of the things that's
actually asymmetrical here that i find interesting is that's very easy to do right it is so easy
given our deterministic correct ceo software to write something that isn't always going to
guarantee you those things it's very easy for me to throw in random faults.
It's very easy for me to throw out random bits.
It's very easy for me to corrupt random memory.
On the other side, if my hardware by default does that,
recovering these things is impossible.
Now, when I say impossible, I mean guaranteeing it,
making sure it's always the case.
Now, what we can do is we can start approaching it asymptotically.
We can start getting better and better, and we can have different models that can make sure that most likely these things happen.
I know in distributed systems with consensus, you can't gain common knowledge with a deterministic protocol
but if you have an indeterministic one you can gain this knowledge with a probability of one
things like that there's like results where you can like use the indeterminacy to help you gain
something uh but yeah i find that actually like it makes me stop and say i guess is robust first the right answer because one is more powerful than
the others no yeah yeah no i think i think the spaceship example although it's like a bit exotic
it is um a good way of thinking about the kind of errors that you want to happen so like say you
were on a spaceship, or like, even worse, you're on a
spacewalk in a spacesuit. And you have a little computer that tells you how much oxygen you've
got left. You know, you maybe you have 99% of your tank is full, or maybe you have 1% of your tank
is full. That is something that you would want to be pretty accurate you wouldn't want that to be
completely wrong you wouldn't want it to say you have 80 when you actually have two percent left
you wouldn't you wouldn't you know and vice versa and i think in robustness the idea is that you're
still gonna have mistakes there are still gonna be mistakes in code. Code goes wrong, things go wrong,
shit happens. But considering that mistakes do happen, what kind of error would you want? Now,
in CEO software, when something goes wrong, it goes completely wrong because we have certain
assumptions about everything being right all the time. If there's an error, it's not going to show
you 99%, it's going to show you 1%, or it's not going to show you 1%, it's going to show you not a number percent, you know?
Whereas with robust first computing, it's probably always going to be a bit wrong
because of this non-determinism, but it's only going to be a tiny bit wrong.
You know, so maybe, oh, so what?
It says 99% when you're on 98%,
but at least it doesn't say 99% when you're actually on 0%.
And I think that's the idea.
It's what kind of mistake do you want to happen?
That's similar to what you get out of systems
that use constraint solvers and gradient descent,
which is something I've been doing a little bit of recently oh yeah well okay i haven't been doing it i i have been on a team
uh with uh alex warth and alex warth has been doing it and uh doing a very good job of it
so i won't take credit except i will oh but using gradient descent the idea is you don't care about getting or at least
one of the things that you can do with gradient descent is say i don't care about getting an exact
like a precisely calculated answer because there are all kinds of different math where you can't
do that right at least not on a computer you might be able to do it with symbolic algebra
you might be able to do it by like manipulating terms in an abstract way. But if
you want to actually do calculation, there's a whole class of things you can do that aren't,
or that you want to do that aren't closed form and can't be like, you know, straightforwardly
calculated, but you can approximate them. And if you can model them using gradient descent and you
say like, you know, here's
my function and here's the curved shape that it has through space and i want to you know slide
down a hill and find a you know a local minimum of error and that's good enough that way of working
gives you that property where you won't get exactly the quote-unquote correct answer but you'll get an answer that's close enough
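A toy sketch of what's being described, not anyone's production solver: no closed-form answer required, just slide downhill on an error function until the answer is close enough. The error function, starting point, and step size here are all made up for illustration.

```python
def gradient_descent(error, x, step=0.1, iters=200):
    """Nudge x downhill on the error curve; settle for 'close enough'."""
    for _ in range(iters):
        # numerical slope of the error curve at x
        slope = (error(x + 1e-6) - error(x - 1e-6)) / 2e-6
        x -= step * slope
    return x

# Example: find x with x**2 ≈ 2, i.e. minimise (x*x - 2)**2.
# We never solve anything symbolically; we just land near sqrt(2),
# which is the whole point of accepting "close enough" answers.
print(gradient_descent(lambda x: (x * x - 2) ** 2, x=1.0))
```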
and if you build your system assuming that you're only ever going to have close enough answers
it's very enabling like it enables you to do a whole bunch of things that you couldn't otherwise
do if you were depending on correctness so to me this is interesting because it's like it's
not just a trade-off between you know the behavioral properties of your system it's like there are some systems
that are inexpressible if you require a certain notion of correctness that are expressible if
you relax that correctness requirement i i'm i struggle with this kind of this idea of robustness. Because when I hear things like,
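To make that "slide down a hill and accept close enough" idea concrete, here's a minimal sketch. It is not the system Ivan and Alex Warth are working on; it's just plain one-dimensional gradient descent that stops at an approximately correct answer, with the function, learning rate, and tolerance chosen arbitrarily for illustration.

```python
# A toy sketch of "close enough" computing via gradient descent.
# The error function has its true minimum at x = 3; we never demand exactness.

def f(x):
    return (x - 3.0) ** 2 + 1.0   # the error surface we want to slide down

def df(x):
    return 2.0 * (x - 3.0)        # its slope, telling us which way is downhill

x = 10.0                          # arbitrary starting guess
learning_rate = 0.1

for _ in range(1000):
    step = learning_rate * df(x)
    x -= step                     # take a small step downhill
    if abs(step) < 1e-4:          # stop once further steps barely matter
        break

print(x)  # roughly 3.0, not exactly 3.0 -- and that's the point
```

A system built around this mindset only ever promises an answer near the bottom of the hill, which is exactly the relaxation being described.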
I struggle with this idea of robustness. Because when I hear things like, you know, you only want it to be off by a little bit, that makes perfect sense to me, right? I can completely imagine systems where I've said, okay, I want to put in some safeguards here. So if I see something anomalous, maybe I ignore it, right? It might be right, but it could also be a phase transition in the system, and I don't want to go into that. I can imagine all sorts of ways you can reduce error and make sure it doesn't propagate through the whole system and cause all sorts of issues. But when I see things like the bubble sort example, it's slightly different than that. So in the bubble sort example, we get one run of doing a comparison with this 10% failure rate, and we see that in this one run, bubble sort did better than merge sort and quick sort. But that's not always the case. If we kind of abuse big-O notation here for a second: in the best case, bubble sort does O(n) comparisons, and that's when the list is already sorted. Well, we can imagine a list that's already sorted, and we get, yes, absolutely, improbably unlucky, but it technically can happen. It's one possible run where we hit that failure chance every single time. We have an already sorted list, we do bubble sort, and we hit that failure case every single time. And now we went from a sorted list to a completely unsorted list.
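For reference, here's a rough sketch of the kind of experiment being described, in the spirit of the paper rather than a reproduction of Ackley's code; the 10% failure rate and the "max displacement" error metric are just illustrative choices.

```python
# A rough re-creation, in the spirit of the paper's experiment (not Ackley's
# actual code): sort with a comparator that sometimes lies, then measure how
# far each element ends up from where it belongs.
import random

def flaky_less_than(a, b, failure_rate=0.10):
    correct = a < b
    return (not correct) if random.random() < failure_rate else correct

def bubble_sort(items, less_than):
    xs = list(items)
    n = len(xs)
    for i in range(n):
        for j in range(n - i - 1):
            if less_than(xs[j + 1], xs[j]):      # "out of order" -- maybe a lie
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def merge_sort(items, less_than):
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid], less_than), merge_sort(items[mid:], less_than)
    out = []
    while left and right:
        out.append(left.pop(0) if less_than(left[0], right[0]) else right.pop(0))
    return out + left + right

def max_displacement(xs):
    # How far is the most out-of-place element from its correct position?
    return max(abs(i - sorted(xs).index(v)) for i, v in enumerate(xs))

data = random.sample(range(100), 50)
print("bubble:", max_displacement(bubble_sort(data, flaky_less_than)))
print("merge: ", max_displacement(merge_sort(data, flaky_less_than)))
```

On most runs the bubble-sorted list tends to stay close to sorted while the merge-sorted one can leave elements far from home, though, as Jimmy points out next, nothing guarantees that for any particular run.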
Yeah. And in fact, it might be even worse, because it's going to constantly do all these comparisons. And actually, would it? I know I'm saying 10%, and now I'm saying we happen to hit that 10% a bunch of times in a row, but it wouldn't end, would it? It wouldn't terminate if we always swapped it around, right? If we crank that failure rate up to 100%, bubble sort will never end, for sure. I guess none of them will. But anyways, it'll never end, and it will be inefficient doing it. I guess what I'm saying is: you're minimizing the probability that you get a large error, but you're not actually saying it can't ever have a large error. There are runs of bubble sort where it gets a much larger error. It's just improbable that that happens. And I guess that's the thing that kind of confuses me about this robustness idea: there's no guarantee that we get an approximate answer. Hold on a sec, you're saying none of them will ever terminate? Well, that was a silly example, if we took it up to 100%.
No, no, no, but there's something interesting here. Let's say, instead of returning "is a less than b" when I'm sorting a list of numbers, it returns "a is greater than b" every time. That's one kind of hundred-percent failure. But there's a different kind of hundred-percent failure, which is that the comparator always just returns false, right? Because if it returns "a greater than b" instead of "a less than b", all three of these will terminate. They'll just sort the list backwards; they'll sort it descending instead of ascending, or whatever, right? But if it always returns false, I think merge sort and insertion sort will eventually stop. Or not merge sort and insertion sort, quick sort and merge sort will eventually stop, because they work on the assumption that after we've checked two things and we have our answer, we don't have to check them again. Whereas bubble sort is going to keep checking them again and again and again until it determines that it's true, right? Until it's checked every pair and gotten a true result. Which means quick sort and merge sort will terminate with a wrong answer, and bubble sort will never terminate, because it's still robust against giving you a wrong answer. Or another way of looking at it: quick sort and merge sort are more robust in the face of a mess-up in our comparator code that replaces it with a constant false, because they'll give you something closer to an answer than what bubble sort will give you, which is no answer.
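Whether that holds depends entirely on how each sort is written, which is part of why nobody on the call is sure. Here's a small sketch of one plausible pair of implementations with the constant-false comparator wired in: a top-down merge sort, and a bubble sort that keeps re-checking until every adjacent pair passes, the way Ivan describes it. Both implementations are assumptions for illustration, not anything from the paper.

```python
# One plausible pair of implementations, run against a comparator that always
# says "no, a is not less than or equal to b".

def always_false(a, b):
    return False

def merge_sort(xs, leq):
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = merge_sort(xs[:mid], leq), merge_sort(xs[mid:], leq)
    out = []
    while left and right:
        out.append(left.pop(0) if leq(left[0], right[0]) else right.pop(0))
    return out + left + right

def bubble_sort_until_verified(xs, leq, max_passes=10):
    # The version Ivan is describing: keep making passes until every adjacent
    # pair satisfies the comparator. With a constant-false comparator that
    # check can never succeed, so without max_passes this would spin forever.
    xs = list(xs)
    passes = 0
    while not all(leq(a, b) for a, b in zip(xs, xs[1:])):
        for j in range(len(xs) - 1):
            if not leq(xs[j], xs[j + 1]):
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
        passes += 1
        if passes >= max_passes:
            raise RuntimeError("never converges: the comparator never says yes")
    return xs

print(merge_sort([1, 2, 3, 4], always_false))     # terminates: [4, 3, 2, 1], wrong but an answer
try:
    print(bubble_sort_until_verified([1, 2, 3, 4], always_false))
except RuntimeError as err:
    print(err)                                    # no answer at all
```

With those particular choices, merge sort gives back a wrong but complete answer and the verify-until-true bubble sort never finishes, which is the behavior Ivan is gesturing at; a bubble sort written with the usual bounded nested loops behaves differently, as Lu's experiment later in the episode shows.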
I don't agree with that. Did you see that AI they got to play Tetris? They got it to try and learn how to win, and, you know, you can't win Tetris, you can only lose. So you know what it did, right? It just kept playing forever? It just never played. No, it just paused. It just paused the game. That's what bubble sort does if it figures out that it's rubbish, you know, if there's no winning. Yeah.
Have you seen this Demon Horde Sort thing? Demon Horde Sort, I believe it's called. I have not seen Demon Horde Sort. Yeah, I did look at this cellular automata sorting. All right, now it's just the sound of my keyboard clicking. Demon... But hold on, I want to pigeonhole on this. No, I want to pigeonhole! I want to pigeonhole. No, no, we can do both. Let's assume, which is true in the real world, that a lot of our lists are approximately sorted, right? If we just always return false, quick and merge sort will return this original list, and it'll be approximately sorted. Bubble sort would return nothing. Bubble sort wouldn't return. Yes, it would return nothing. It would never return. That's different than returning nothing, Hickey. Okay. It's not. Peyton Jones. It's not different than returning nothing. It returns nothing because it doesn't return.
But anyways, I think that you have to consider, and I think even Ackley in this paper would consider, quick sort and merge sort being more robust to handling that failure in that scenario. We should ask him. Yeah, yeah. I actually don't know. I actually don't know. Yeah, we should ask him. So there's a thing that I wanted to... Did you finish your pigeonhole, Jimmy? Yeah. Okay, that's fine. Okay, so there's a thing that I want to wormhole on, which is: I've got two examples, maybe 2.5 examples, of things that are not distributed systems where some of these trade-offs come up, that I want to meow meow about. One of them is John Carmack talking about how to do performance optimization. Oh my god, I completely forgot about this. I'm so sorry.
This is... this has gone somewhere else. You said this ages ago. You mean this podcast has a correct order that things are supposed to go in? Yeah, this is what always happens when the paper is either, one, not structured well, or two, too short to just walk through it. In this case, it is well structured, just too short to spend a whole episode walking through the paper. Yeah, I have some things to respond to from over an hour ago, but I'm gonna save them. We haven't even been recording for over an hour, I say, because I'm gonna edit it down. We're being robust in terms of our conversation; we can always come back. Nothing gets lost. So carry on, the Carmack thing. This podcast is never going to end. We're robust against a bad episode: if we never end the episode, it can't be a bad episode. That's true. That's true.
So, this Carmack thing. Most people, when they're optimizing code, what they do is try to eke out every little performance win that they can, right? They run some micro-benchmark, or they run a flame graph and look at a particularly typical frame or run, and they try to find: okay, where's the hot path through my code? How do I make that less hot? How do I find some case where I'm, you know, looping within a loop, n squared, and flatten it out, or what have you. And what tends to happen when you do that is you get much better average-case performance, but you still have these spikes of worst-case performance. Carmack's approach is that you should instead try to do the opposite: you want as little variability as possible, not, you know, a better average case. And if you take this mindset of preferring to minimize your variability, most of the typical performance optimizations no longer work, because they lead you to the kind of case where it's like: oh, I'm going to do my GC in a way where I don't run GC on every frame, because that's slow; instead, if I save up GC and only do it every once in a while, overall the performance of the system will be better, but I'll occasionally have these big pauses where I do GC. And that's bad if what you care about is very uniform execution of your program, which is an interesting thing that matters to Carmack because in games, you know, it's better to have a stable frame rate than a fast frame rate that occasionally has big hiccups in it. That just feels bad. But I think this is something that comes up in other cases also, and it's a lens that I'm fond of. And so I just wanted to suggest that that is a place where it's not trading off robustness versus efficiency, but it is trading off maybe correctness versus efficiency, or it's one of these places where you can trade off against efficiency that seems relevant. The other one...
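Here's a toy illustration of that variance-versus-average trade-off. It isn't Carmack's code or advice verbatim, just a made-up simulation of per-frame work with two garbage-collection policies: pay a little every frame, or batch the work and pay a big spike occasionally.

```python
# Toy simulation: the same GC work paid either a little every frame or batched
# up and paid every 60 frames. Each GC invocation also has a small fixed
# overhead, which is why batching wins on average. All numbers are made up.
base_frame_ms = 10.0
gc_work_per_frame_ms = 1.0
gc_invocation_overhead_ms = 0.5
frames = 600

steady, batched, pending = [], [], 0.0
for frame in range(frames):
    # Policy A: run a small GC every single frame.
    steady.append(base_frame_ms + gc_invocation_overhead_ms + gc_work_per_frame_ms)
    # Policy B: let the work pile up, run one big GC every 60 frames.
    pending += gc_work_per_frame_ms
    if (frame + 1) % 60 == 0:
        batched.append(base_frame_ms + gc_invocation_overhead_ms + pending)
        pending = 0.0
    else:
        batched.append(base_frame_ms)

for name, times in [("steady", steady), ("batched", batched)]:
    print(f"{name}: average {sum(times)/len(times):.2f} ms, worst {max(times):.1f} ms")
```

Batching wins on the average frame but loses badly on the worst frame; the variance-first mindset described here optimizes for the second number rather than the first.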
Bubble sort will terminate! It'll just sort the list backwards. Did you just implement that? I just googled a Python implementation of bubble sort and then I modified it. We were thinking of it as having some sort of while loop, but it just does a nested for-loop setup. Doesn't that depend on how you implement bubble sort? I mean, this is like canonical bubble sort. Canonical? Head-canon bubble sort. Yeah. All it does is: once it goes through some iteration, if it never swapped anything while it was walking down, then it can terminate early. Otherwise, it knows that if it does all the possible combinations, it will end up sorted. It knows. And so I just made it so it swaps every single time, and it always knows that it's swapped, and then it just sorts the list backwards. So it gets the exact wrong... Oh, now I would need to compare it to merge sort and quick sort. I would actually bet they probably do the same thing-ish, but it's hard to know.
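For the curious, here's a sketch of roughly what Lu seems to be describing: the usual nested-loop bubble sort with an early-exit flag, modified so the comparison always says "swap". Lu's actual modified snippet isn't in the show notes, so this is a reconstruction.

```python
# Canonical bubble sort with an early-exit flag, except the comparison has been
# replaced by a constant "yes, swap these" -- a 100%-wrong comparator.
def broken_bubble_sort(items):
    xs = list(items)
    n = len(xs)
    for i in range(n):
        swapped = False
        for j in range(n - i - 1):
            if True:                      # always claims the pair is out of order
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
                swapped = True
        if not swapped:                   # never triggers here, but the outer
            break                         # loop is bounded, so it still terminates
    return xs

print(broken_bubble_sort([1, 2, 3, 4, 5]))   # [5, 4, 3, 2, 1] -- reversed, as Lu found
```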
So what's the name of the sort that does the thing that I was talking about? I think Bogosort is the most robust sort. No. No. Am I the only person who finds Bogosort not funny? Yes. I would say it's... I don't find Bogosort funny. I find it dreadfully unfunny. How is it unfunny? It's like one of the most recommended things for Dreamberd. Yeah. It doesn't make the list. Sorry. Oh, you just... You throw your data up in the air, and if it lands in, you know, a Mona Lisa or a Shakespeare or whatever, then you can shoot the monkeys, right? What about pancake sort? Okay, what's pancake sort? It flips the list repeatedly, and if it lands in the right sort order... It's actually a legit one, it's not a fake one, you can actually use it, but it is based on the idea of flipping over stacks of pancakes. Yeah.
That's fine, right? Like, okay. So, Jonathan Blow has aphantasia. Oh god, does he? Yeah. So I was listening to this Jonathan Blow talk, and this is going to segue into my other rabbit hole. It's going to segue. Jonathan Blow was talking about how he doesn't visualize things in his head, and how that forces him to work with artists in a certain way, and I found this all very interesting and kind of unnerving. Forces him to make terrible mistakes, yeah, and to keep bringing up The Witness when he shouldn't, when he's, you know, sworn to stop bringing up The Witness. So he has this game that he made called Braid. And in Braid, there's this time travel that you can do, where you can be running around and jumping on enemies and doing all these things, and then you die. But when you die, you don't start over again. Time just pauses, and it's like: hold down this button to rewind time. And you rewind back. It's like playing the tape of your gameplay backwards. And whenever you want, you can stop rewinding and then go, okay, I'm gonna pick up from here and keep playing. And it does all sorts of wonderful things with this time travel mechanic.
And the way that Blow had to implement this time travel mechanic, you know, he took a couple of different attempts at it and settled on an idea that's very much like the way that H.264 and other video codecs encode their frames, where you occasionally have these full-state snapshots. I think in Blow's case it's every two seconds or something like that, every 60 frames if the game's running at 30 frames a second, and then in between those full-state snapshots, you have a series of diffs. The full-state snapshots are useful because if you're rewinding in the game, you can rewind at normal speed or you can rewind super-duper fast. And if you're rewinding super-duper fast, it's handy to be able to say: jump to this arbitrary position in the past, grab the full-state snapshot, and then, if you didn't mean to land exactly on the moment in time that that snapshot corresponds to, you can just apply some diffs onto that snapshot and get to the exact point in time you wanted. So if you're going somewhere in between two snapshots, you just use some diffs to get there. Which is also how video encoding works, right? You have full frames that are stored, I can't remember if they're the B-frames or the I-frames, I should have looked this up, but you have these full frames that are stored, and then in between storing those full frames, you store diffs that apply on top of each other.
And in the case of video encoding, those full-frame snapshots are sometimes unevenly spaced. Sometimes they only store one at the very beginning. Sometimes they store one whenever they detect that there's a big change, like, oh, we switched from an indoor night scene where it's very dark to an outdoor broad-daylight scene where it's very bright. Instead of packing in a gigantic diff that's going to handle that scene change, we'll just put a full-state snapshot there, and that's nicer.
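A tiny sketch of the snapshot-plus-diffs idea, for anyone who wants to see it in code. This isn't Braid's implementation or a real video codec, just the general pattern: keyframes at a fixed interval, diffs in between, and reconstruction by starting from the nearest earlier keyframe and replaying diffs. The flat-dict state and the keyframe interval are arbitrary choices.

```python
# Record game state as occasional full snapshots plus per-frame diffs, then
# reconstruct any past frame by replaying diffs on top of the nearest keyframe.
KEYFRAME_EVERY = 60

snapshots = {}   # frame number -> full copy of the state
diffs = {}       # frame number -> {key: new value} relative to the previous frame

def record(frame, state, previous_state):
    if frame % KEYFRAME_EVERY == 0:
        snapshots[frame] = dict(state)                    # full-state snapshot
    else:
        diffs[frame] = {k: v for k, v in state.items()
                        if previous_state.get(k) != v}    # only what changed

def state_at(frame):
    keyframe = (frame // KEYFRAME_EVERY) * KEYFRAME_EVERY
    state = dict(snapshots[keyframe])                     # jump to the snapshot...
    for f in range(keyframe + 1, frame + 1):
        state.update(diffs.get(f, {}))                    # ...then replay diffs forward
    return state

# Tiny usage example: the player drifts right for 200 frames.
prev = {}
for frame in range(200):
    current = {"x": frame, "y": 5}
    record(frame, current, prev)
    prev = current

print(state_at(130))   # {'x': 130, 'y': 5}, rebuilt from snapshot 120 plus ten diffs
```

Datamoshing, as described next, is what you get when diffs like these are applied on top of the wrong starting snapshot.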
The thing this video encoding gives you that is neat is datamoshing, which is the practice of applying only the diffs, but applying them to the wrong content. It's this technique that's emerged among people doing video collages and remixes and art, where they take this video encoding and apply the diffs on the wrong content, and you get these weird, surrealistic, melting, trippy, glitchy visuals out of it. It's a case of using the fact that diffs are not robust for an expressive quality. If the video encoding had only ever stored full snapshots, that would be the optimally robust thing, but it would be slow, right? That's a lot of extra data; you're storing every frame as a full frame, which is kind of like an unencoded video. Instead, most video frames are very similar to the previous frame, so they just store little tiny diffs, and that was an easy efficiency win. But that efficiency win gives you an opportunity to hack the matrix and come up with some cool visual result. And we don't often want that in our software, because that's like malware or corrupt data or whatever, but there are cases where you actually do want to give up robustness, because it gives you a generative, interesting result that you wouldn't otherwise get. So that's taking the opposite view of this trade-off and seeing what you can do with it. That's interesting. The Jonathan Blow time travel thing was just in there to irk Jimmy; it's not actually relevant to the example I wanted to give. And me! And me! Get out of here. It didn't irk me. Oh, so you're robust against those kinds of... No, I'm efficient against it. Yeah, I think that example, yeah, explains the difference. That's basically what happened with my little game engine ordering thing. I was trying to do the diffs thing, it didn't work, so I ended up with some datamoshing-esque thing, but unintentionally. So I just gave up on it and went back to full snapshots. Yep.
So I'm curious. I mean, there are little quotes in here, in this position paper, that we haven't talked about, but I think we've gotten the idea across: let's put robustness first, let's not care about correctness and efficiency only; if we have to make these trade-offs, it's important to do so, and these things are related, in that efficiency is going to come into tension with robustness. But I'm curious to hear y'all's thoughts:
what does a robust future of coding look like?
Like what do you see as something that we could be doing
where we started robust first?
And on that note, I wrote some things down,
some computing concepts slash things.
And I want to know if you think they're robust first or not.
Okay.
Yeah?
So maybe this could help inform our answers to Jimmy's questions.
I love this total non sequitur.
Right.
So first of all, is local-first computing robust? It desires robustness against losing your network connection, or against, you know, other things that can happen.
It's not just about unreliable networking, but it's also about like, you know, you don't surrender your data to somebody else.
So if somebody else doesn't want to give you your data, Google, then well, too bad because you still have your data.
So it's like robust against the evils of capitalism,
which is good.
So I would say yes.
Yeah, every time I try to think about this,
I think there's so many different dimensions
to be robust on that it's hard to give an answer.
And so much of it comes down to implementation details.
So like CRDTs: with the assumption that you always end up eventually getting those messages, they are robust in that they'll converge, right? We are not going to have non-convergent data at the end, assuming we get all of the data. But if we never get that data, we might be wildly off from what the final result is supposed to be, so that feels non-robust. And then there are things like, if you look at the details of implementing these: if you use a hybrid logical clock, for example, which is a popular mechanism for getting an ordering in these things, hybrid logical clocks are not robust against somebody just arbitrarily saying it is the year 10,000, because from then on your whole system thinks it is the year 10,000 and sends all its messages that way.
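That failure mode falls out of how a hybrid logical clock updates itself: on receive, it takes the maximum of its own time and the timestamp on the incoming message, so one absurd timestamp drags everyone forward. Here's a stripped-down sketch; real HLCs also carry a logical counter for tie-breaking, omitted here for brevity.

```python
import time

class SimplifiedHLC:
    """A stripped-down hybrid logical clock: physical time, pushed forward by
    any larger timestamp it ever sees. (Real HLCs add a counter for ties.)"""

    def __init__(self):
        self.latest = 0.0

    def now(self):
        # Local events: never go backwards, even if the OS clock does.
        self.latest = max(self.latest, time.time())
        return self.latest

    def receive(self, remote_timestamp):
        # Incoming messages: adopt the larger of our time and theirs.
        self.latest = max(self.latest, remote_timestamp, time.time())
        return self.latest

clock = SimplifiedHLC()
clock.now()
year_10000 = 253402300800.0    # roughly the year 10,000, in seconds since the epoch
clock.receive(year_10000)      # one broken or malicious peer...
print(clock.now())             # ...and every timestamp we issue from now on is in year ~10,000
```

Nothing in the protocol itself can tell a legitimately fast clock apart from a lie, which is the non-robustness being pointed at.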
Well, this brings me to my next question perfectly. Is permacomputing robust? I would like you to answer this question, Lou, because I don't know enough. Right. Well, I think permacomputing is a kind of computing that attempts to be sustainable, and it takes that extremely seriously. There's an emphasis on using old hardware, there's an emphasis on simplicity, there's an emphasis on being able to rebuild things from scratch. So I think in some respects it is very robust. It's robust from, like, capitalism. It's robust from the end of the world as we know it because of climate change. And it's robust against complexity, you know, against things getting lost in translation, stuff being forgotten. But I think it also goes at odds with robustness a little bit, because there's this huge emphasis on being energy efficient, whereas in robustness there's this assumption that we have energy, we can use redundancy, because, you know, we have electricity and computers are powerful. But permacomputing is more focused on low energy; it's more focused on slower computers. So there's a slight tension, but there is an element of robustness there. I think it would be good to try and make them compatible in some way.
Yeah, I'm glad you brought that up, because that's one of the few pink, as in bad, Ackley highlights that I have in this paper. Ackley says efficiency is key when demands exceed resources, yet many computers are usually idling; they could have been checking their work, but that's off the radar for CEO software. And just to refresh the listener's memory, CEO: correctness and efficiency only. And to that, oh, many computers are idling, they could be using that time to do useful work, to which I say: I turn them off. Yeah, switch them off. Like, why? I don't think that's a compelling argument in favor of robustness versus efficiency. I would rather say, and the permacomputing thing gets to this, that there are many things we do in computing that have a tremendous cost against both efficiency and robustness, and that identifying those things and getting rid of them is something the permacomputing culture seems really good at. Because of their relentless focus on simplicity and their desire to be usable on older computers, they've gotten rid of tremendous amounts of that stack, that Tower of Babel that is modern operating systems and software frameworks and all of that. You know the xkcd where the x86 CPU is screaming along, doing billions of calculations per second, and Servo and Gecko inside Firefox are doing this tremendous amount of work, to render what ends up being just a GIF of a cat jumping into a box? That gigantic tower from CPU up to animated GIF is a place where robustness and efficiency both go to die. And if you thin that stack out and start from much more primitive hardware with a very simple VM, with some carefully written software on it, you get a lot of robustness back and you get a ton of efficiency back. That's a place where people are having some cake and eating it too. And even if they want to sacrifice some efficiency for robustness, they're still way more efficient than the modern stack.
So I have my own creed that I just came up with, like, 30 seconds ago. Hot off the press. Yeah. I call it the Better Computing Creed, right? And I almost feel like I want to stick something even ahead of robustness. It's: first be sustainable, then be robust, then correct, then efficient, in that order. Sustainable meaning? Meaning permacomputing. Don't destroy the earth, then make sure your thing can't be destroyed, then don't destroy your data, and finally, do your damn job. Yeah, that's great. Anyway, I'm glad we settled that.
i have a i have one more thing to ask about in terms of is it robust?
And it's a weird one, right?
Is the Pfizer coronavirus vaccine robust?
Whoa.
This might be the first episode that gets censored
from a source other than Editor Ivan.
Okay. I have no idea if it is, okay? Because there's this insanely interesting blog post, it's an article, and it's called Reverse Engineering the Source Code of the... Fiverr... SARS-CoV-blah vaccine. The Fiverr COVID vaccine? Somebody went on Fiverr and was like, hey, can you gig workers come up with a COVID vaccine? Yeah, it's just like this horse stuff. It's, um, this horse... Ivermectin? No, it's not ivermectin. No, no, it's, um... ah, PCP. It's this horse tranquilizer, and you take that and you have this wicked hallucination and you listen to the Fight Club soundtrack, because it's really groovy. I wish that happened. Dust Brothers, man, am I right? They're so good. They've got these retro-sounding drum machines. They're like, tung, tung, tung, tung. It's crazy, though, right? This article is talking about this vaccine, which is made with the programming language DNA, right? And there is a hugely different set of priorities to this programming language than the ones we typically use. Like, let me read you this bit. It says:
at the end of the protein we find a stop codon, a stop instruction,
denoted here by a lowercase s.
This is a polite way of saying that the protein should end here.
The original virus uses the UAA stop codon,
but the vaccine uses two UGA stop codons,
perhaps just for good measure. Right? It gives the stop instruction twice. There's this redundancy, because if there's something you really don't want to f**k up, it's stuff that you inject into your body, right? Yes. There's something else, right? The very end of the vaccine just ends, quite humorously, with loads of A's. Right, it says: the very end of mRNA is... a word I can't pronounce... which is a fancy way of saying it ends with a lot of "aaah". Even mRNA has had enough of 2020, it appears. So it doesn't need to have all of this, but basically, every time this program, this DNA program, gets run, it loses some stuff at the end, just from wear and tear, right? So they just chuck loads of "aaah" on the end. And I just thought this was a really interesting parallel, because this is code that has been written in an editor on your computer, but with a hugely different set of priorities. It's interesting that it's not running on a von Neumann architecture, right?
Unless you want to get... right? Because we're already in metaphor land, right? Like, oh, DNA is code, right? So we're already in the realm of metaphor to some extent. And I guess it's equally metaphorical that C is code, right? C is not code. Get out of here. Computer code is not code? I'm sorry, what? No, what? So it's interesting that we're already in this realm of metaphor, and so we can probably say, yeah, the human body is not a von Neumann machine. It might be a von Neumann... unless you are von Neumann himself, right? Yeah, exactly. Von Neumann's body. I mean, it's not a machine.
Yeah, well, you know, he's more machine now than man. So, David Ackley has a definition for what a living system is. It's like a dynamic, persistent system. And he gives an example of little eddies in the water that are swirling around: they're living, yeah, in his definition, because they're dynamic and persistent. They only last for a little while, but they take energy from other places. Yes. And he wants these robust-first systems to be living systems in the same way. That makes me so happy. You know what, this paper is short, and it's kind of weird, it's not what we usually do, but I love David Ackley. This stuff makes me so happy. I'm glad that there's somebody like Ackley out there pushing for computing to be different in this way and not afraid to get freaky with it. We spend, I in particular spend, so much time on this podcast lamenting how everything is a monoculture and how different I want things to be, and here's somebody who's actually, genuinely doing it, but backing it up with the, like, look, I did this actual computer science work, analyzing these different algorithms and submitting a thing to the ACM. He's putting on the costume of a computer scientist to go out there and be like, actually, I just want to make life. We stan Dave Ackley. We stan. We stan a complex king.
You know, I think that's what I like.
Like, I don't even know how far I agree. You know, I haven't even sort of come to terms with, wait, am I fully robust pilled?
But it just makes my imagination go. It makes me think beyond my normal bounds when it comes to coding, and it makes me see things in a different way, and that's what I appreciate about it. And I see what he's saying, you know, it's like, I agree, but am I brave enough to truly dedicate my life to robust-first? Find out next time, on the Patreon exclusive. Anyway, patreon.com/futureofcoding. Can I read something from another one of his papers, which is about von Neumann? Von Neumann? For whatever... All right. Randy Newman. Randy Newman.
65 years ago, Randy Newman predicted that hardware determinism would soon be supplanted.
But with design lock-in and network effects, it remains virtually unchallenged today.
Randy Newman said that, right?
That's from Indefinite Scalability for Living Computation,
one of Dave Ackley's other papers. I think it's kind of funny, like, even Randy Newman was against Randy Newman architecture, right? Even Randy himself, you know? So if the creator himself is against it,
and maybe we should question it a little bit too.
Yeah.
Mic drop.
I have to say, I think since we're giving kind of like closing statements
or general opinion statements here on the thing,
I love that there's somebody focused on something very different. I love seeing
different avenues being explored for sure. But this paper left me wanting more of an argument.
I wanted to be convinced. I'm like, okay, robust first, but I don't know what we're aiming at. And it reminded me of a conversation I had with Kartik on the Slack, where he was arguing against dependencies, basically that we shouldn't depend on anything, even the operating system that you're using, or the hardware. Which feels like almost a similar kind of vibe here. Warms my heart. Not exactly the same, but a similar vibe. And I didn't get the argument until it was put not in practical terms but as more of a moral stance, right? Like, this is just how things ought to be; this is a better world, where we don't have to depend on all of these things.
And I guess I just don't know how to take this.
Is Ackley giving us a new way of living?
A new way of being that is kind of a moral life way?
Or is he saying, we're going to have better practical applications at the end if we go this route?
And if it's the latter, like this is going to be better and more practical,
I don't know that I'm convinced of that.
If this is instead more like I see things like permacomputing and these sorts of things,
right, like an ideologically driven way of doing this, I might be more on board with that.
Even if I don't follow it, I might see the logic more. And that's what this paper really lacked for me: how do I get into this mindset? How do I decide that robust-first is for me? Is it that I'm looking for a solution to a problem,
or is it looking for you know some vibe, some different way of being?
And when I listened to some of the other things from Ackley,
I definitely got more of that sense of like this is about vibe.
This is about changing our relationship to computing. It's about that sort of, and I wish this paper had more of that.
I just want to know, like what is the argument here?
Why should I buy into this?
And if it's just on practical considerations, it just seems to fall a little flat. I think it's got to be more than that. Ivan, are you robust-pilled? And by the way, yeah, I totally see where you're coming from there, Jimmy. I agree, but at the same time, I'm reminded of the Upton Sinclair quote: it's very hard to get Jimmy Miller to understand something when his Jimmy-Miller-ness depends on him not understanding it. And that's embodied in the quote from this paper: today's computing base is optimized for the perfect logician. That's Jimmy. Jimmy is the perfect logician. Like, have you seen him? He's perfect. He's also very logical. And yeah, that explains to me why he is not so easily robust-pilled as the two of us, you know, me and Lou, who are not very logical beings. Excuse me? No, it's probably true. Yeah, this is how we got here. I mean, I'll happily admit,
when I read that, I was like: but you can do all of this with logic too, right? That was my first reaction. It doesn't strike me as, oh, why give up those parts, right? Uh-huh, your argument's irrational, it's not comprehensive enough, I need more supporting data. Yeah, please explain your thinking in a more robust way. I will just say, I also recoil at the word robust, but I got over that for this one, because in industry, when people say robust, it just stands for everything I dislike, a.k.a. Java enterprisey stuff, right? I know that's not what we're talking about here, but at first I was like, robust? No. And then I realized what this means. I usually recoil at it when I see it too, because, you know, I think: no, but you're not really robust, are you? You can't slice your computer in half. What if I drop a nuke on you? Yeah, you know. When I presented Cellpond at SPLASH, one of the most common comments I got was: no, but it's not really non-deterministic, is it? It can't be. You know, how do you... but you had it doing multiple steps, you can't do that. You had it being Turing complete, you can't do that on a non-deterministic system. And the number of times I said, no, no, no, you can, look, let me show you. There's something that just doesn't sit well with some people, and with me at first as well. I think that's why it intrigued me so much. It doesn't sound like it can work, and there's an element of that that can put me off, and there's an element of that that makes me want to find out more. And through finding out more, I've, to my surprise, discovered: oh my god, it can work. Yeah.
It's this thing where Ackley here is doing what I often try to do with visual programming, which is to say: there's this way that people approach writing software and building things on computers where there are trade-offs and dimensions along which they can vary their approach that they're not even conscious of, and so they automatically do things in a certain way. And it is, first of all, valuable to point out that those things could be different, right? Like, instead of using text as the interface through which we express our intention to the computer, we could do it through something not text. Or instead of prioritizing efficiency and correctness, we could also consider robustness. That first step is establishing that there is a choice that has been made here at some point, and that you have the option to make a different choice. But then the next thing, which I think most people immediately write off when they're confronted with it: it's not a binary choice. It's not an either-or. It's that, now that you've identified there's this other way things could be, you can try to have the best of both. Why are we comparing bubble sort versus quick sort and merge sort? Well, because nobody's made robust sort yet, right? There might be, not the, you know, n-squared, billions-of-comparisons, slow, awful bubble sort thing, but something that is robust in the ways we care about and fast. Not as fast as the thing where you give up robustness, but fast enough for most practical purposes. And the same with visual programming versus textual programming: you don't have to completely get rid of textual programming. You can have a little bit of visual expression, maybe in your debugging tools, or in how you visualize live data, or in how you break up the problem domain
up the problem domain
and say, okay, my overall architecture
is going to be expressed visually.
And then my small, like, you know,
in the small programming is going to be done textually
or vice versa.
There's like lots of different things that you can do
once you're aware that this trade-off exists.
And so I appreciate this Ackley paper, and this idea of robustness, not as a, oh hey, I have to give up on one thing in order to have the other thing in totality, but rather as a dimension along which I can make a conscious choice. And I really like that. This is like Avengers: Endgame for me. You have to give up half of all programming.
No, it's like, you know, Smash Bros Ultimate
or Spy Kids 3D at the end when they get everyone to come.
It's because this is like this crazy crossover episode for me
because I'm a huge fan and follower of all of the robust first stuff.
And I listen to the future of coding podcast
and now it's like both in one. It's like the crossover episode, you know, in comic books, where, like, Spider-Man saves someone else. I don't know, I have not watched any Marvel movies. But yeah, it's Ivan Reese talking about robust-first computing.
What's going on?
What's going on?
It's crazy.
Well, it was nice having you on as a guest for these two episodes, Lou.
Oh yeah.
Where can people find you on the internet?
I know you're on Patreon and some other places.
Toadpond.com, baby.
Toadpond.com.
There's only one place now.
Toadpond.com.
Getting slippy.
Getting slippy.
Yeah.
It was good having you.
And you'll never guess who our surprise guest will be next episode.
You'll have to just tune back in and see who it is.
Who are you getting?
Rhymes with Onathan Rowe.
It rhymes with Jesus Christ. Jesus Christ! We can't wait. Yeah, Jesus Christ. Cheese and rice. I just like the fact that people's names rhyme with their own names, so you can just say their names and it rhymes, and you're still logically correct. Look at that, perfect logic, Luke. Either's fine. Rhymes with Luke. Any rhyme did the crime. Did the crime. Oh, this is scraping a very new low for us. The episode ended half an hour ago. Stop recording. Stop recording. Stop recording.