CoRecursive: Coding Stories - Story: Hateris - Obsession, Friendship, and World Records
Episode Date: March 3, 2025What if a simple game became a gateway to computational breakthroughs? David Freiberg and Felipe set out on a journey to conquer Hateris, a notoriously difficult JavaScript game. Their interest ignit...ed when a new world record was set, showing that surpassing the game's high score was possible. Their journey was full of challenges, from building an emulator in different programming languages to tackling complex algorithms. They pushed the boundaries of what's possible but the story didn't end there. Collaborating with fellow enthusiasts, including a Japanese Tetris expert, led to further breakthroughs. By sharing insights and building on each other's work, they set a records after records. Their story highlights the power of curiosity, collaboration, and the joy of discovery. Episode Page Support The Show Subscribe To The Podcast Join The Newsletter Â
Transcript
Discussion (0)
Hello, this is Co-Recursive and I'm Adam Gordon Bell.
Today, as always, we're diving into the people behind a piece of software.
Because sometimes what looks like a simple puzzle can unravel into a web of complexity.
Today, we're exploring the story of two people who got caught up in a JavaScript game
from 2010.
It seemed like just another Tetris clone, but it quickly became an obsession that would
challenge their skills, their wallets, even their relationships.
My name is David Freiberg.
I am a material science PhD, and I am very, very obsessed with a JavaScript game that
came out in 2010.
I'm Felipe. I'm a software engineer at HubSpot.
I'm not quite as passionate about this problem as Dave,
but I've gotten dragged kicking and screaming into it.
So I now guess I'm a domain expert on it.
The game is called Haters. If you've played Tetris,
you know that moment when you need just the right piece.
Like I've got it all lined up. If I can get that four in a row, that I, I can clear four
lines.
Now picture a version of Tetris where you never get the piece you want.
Where the game seems to have it out for you.
That's Haters.
It's a hard hard game and for years it had a seemingly unbeatable, very very low high
score.
And then these two friends, Dave and Felipe got obsessed.
They poured months and months of their lives and a bit too much money into toppling that
record.
And depending on when you listen to this episode, because the competition for this game is actually
fierce.
But as of right now, Dave and Felipe are the current record holders.
World record holders.
But before we get into all that, let's go back to 2010 and back to the great science fiction writer, Sam Hughes.
Back in 2010, Sam Hughes created Haters, just published it on his blog as a novelty.
About a month after the game was published, a Japanese slash-dot commenter named Diazuke
got 30 points, and it stayed there for years and years and years.
2015, I took a class in general game playing on MIT OpenCourseWare. And this was an interesting class.
Instead of writing a program to play chess, or to play checkers, or to play Connect 4,
you were writing a program that could play any game given the rules.
Instead of just being given a position, you were given a position and also all of the
relevant rules that went with it.
I started reading QNTM and I started looking at Hatress
and I messaged Felipe a couple of times asking him,
could we apply it to Hatress?
Could we apply this game playing?
And Felipe quite wisely pointed out
that my general purpose game player
could barely beat me at Connect 4
and it couldn't beat me at Checkers.
And Hatress is way, way, way
harder than pretty much any human game.
And yeah, Felipe was right about that, so we shelved it.
So what happens next is that in June of 2021, a Japanese computational Tetris expert, and
yes, there is such a person named New Jade, sets a new world record.
We thought it might genuinely be impossible
to get past 30.
This max score belief was based on simulations
that they had run.
Unlike Tetris, Haters is totally deterministic.
You always get the worst piece based on the well you have.
And even in narrow wells,
wells being the Tetris term for that place
where the pieces fall,
even in narrow wells,
the game stayed ballooned faster than you'd think.
So 30 points, 30 lines cleared, it felt like an absolute ceiling.
But then New Jade hit 32, so clearly they had missed something.
And Dave started working on an emulator and I was like, okay, we're doing this, I guess,
because Dave has started working on the emulator and I'm not going to be able to
gently guide him away from the idea of doing this.
That's what kicked off the project in my brain.
That's where it went from impractical idea we're discussing, which is 90% of our conversations,
80% of our conversations to idea we are doing. This is normal for these two, but before we get
deeper into how they tackle this challenge, you need to understand something about them.
Dave and Felipe, they're buddies from back in high school, but they are not casual in their approach
to these challenges. Dave's the instigator of projects. Felipe is the one who reigns them in with practical concerns.
But whatever they're working on, they're always knocking around ideas.
Dave will send me some idea like, I was thinking about these machine learning algorithms to solve
a game. And I will say, Dave, that is wildly impractical. How do we make your idea practical?
Because I like the core of it, right? I really want to like, maybe we can't solve every game,
maybe we can focus on one game. And we'll narrow
the idea, we'll refine it, we'll talk about it. And we'll come up with sort of a practical
approach. And that's really where the conversation will end, right? I'll be like, yes, we can
predict all stock markets forever. But you know, you could maybe focus on predicting
one stock and you can use all these other variants, factors to figure it out. And I
read this good paper, maybe go look at that. And then if Dave actually likes the idea, two idea two days later to a week later he will show up and say I wrote a Mathematica script for
this here is the data that I got from it here's the parts that don't work I didn't use a database
because I don't believe in those and all the data is sitting in text files in my hard drive right
now and it's full of these and I'll be like Dave I gave you like four practical ideas here why did
you do it this way and so honestly it's most most of this stuff goes as it comes as snippets of conversation had over Discord, right? It's
like Dave sends me a message that said something along the lines of like, I just tested the
algorithm sorter and I got these results. And I say something incisive like, well, those
results aren't very meaningful, Dave. What can we actually learn from these? And he's
like, well, let me pull up Mathematica. I, make a graph. And we look at the graph and make some
more commentary on it. We'd have sort of a discussion about it,
right. And the conversation would pick up on and off throughout
the day as we're available. So I'll be like, okay, I have to go
to make dinner and I'll go make dinner. I'll come back and
table and posted five more paragraphs in my discord and I
go read them and I make a bunch of commentary. And Dave's off
swimming because he's he's done a he's practicing for an Iron
Man. Were you rising for an Ironman at that time?
I think you were and so he's like I just got back from swimming 8,000 miles
Let me post some more thoughts and so like 8,000 meters come on Dave
Your magical exercise numbers are meaningless to me. I do a lot of biking and I do a lot of swimming
so a lot of this is I'll look at a whole bunch of data right before I leave and
Then as I'm biking to the river or when I'm in the river, I have nothing
else to think about or focus on.
So I'm just sort of mulling over the data in my head.
And when I get back, I, I send Felipe messages in the same way that cannonballs
are sent from cannons sometimes.
And so he'll be hit with a barrage of a couple thousand words based on
whatever I was mulling over.
Any reference to practical stuff like VS code is going to be Felipe.
I forced me to use things like Git and version control and backup copies.
Still haven't forgiven him entirely.
I mean, we only lost most of a project due to not keeping any backups and not
using version control, so you know, it's not like I, it's not like I've had us to get you there until we did that.
Yeah, this is very true.
So this is a thing that Dave and Felipe do.
They work on projects together.
They send around messages and then at night, depending if they're busy or not, they might
jump on a Discord voice call.
They might get a VS code live share running.
They might work on some coding.
Sometimes the projects get somewhere.
Sometimes they wouldn't. Sometimes at the end, they'd write up their work, share it on the blog.
Most of the time they wouldn't get around to that. But yeah, Haters was always one of these topics on the
back burner. It would come up now and again. Dave wanted to tackle it. Felipe pointed out it seemed so
out of reach. So other projects would get priority. But like many backburner ideas, Haters was quietly preparing to take center stage.
Because little did they know, once they truly committed to beating this very hateful game,
there'd be no going back.
Because when a project grabs your interest, especially when there's a group of you,
the effort you pour into it over countless evenings and weekends can be surprising.
It can add up.
So yeah, New Jade hit a new record, 32,
and Dave thought maybe they could beat it,
but first they needed to build an emulator.
So we built not one emulator, we built six total.
Started in Mathematica just to get something that worked
and to get an idea of how slow things were, in
Mathematica, a single move took about a tenth of a second to calculate. And
there's a fair amount of computation that goes into Hatris, because the
Hatris algorithm doesn't give you a piece at random. It looks at the well and it
calculates which piece is going to be worst according to a minimax algorithm.
It makes sense that it would take a little bit.
Our current one runs approximately 200,000 times faster. So currently we run about five microseconds per move per core. So even if even on a single thread that's still tens of thousands of times
improvement. The very basic emulator consists of just trying every possible spot
that a piece could go and doing the minimax algorithm from there. It was not very fast in
Mathematica. So what Felipe eventually convinced me to do was rewrite it from Mathematica to Python,
which was barely any faster, and then from Python to Rust. And this I will blame Felipe on. I will
blame this on you because I don't
think I would have gotten into Rust if you hadn't dragged me into it. I do take the blame
for that particular approach. As background, we had no other experience with the language.
We did six AOC problems and our seventh thing was re-implementing Google's AlphaZero algorithm complete with
GPU neural nets from scratch.
It was a bit of a leap.
Yeah, AlphaZero, that DeepMind project out of Google.
That was their inspiration.
I mean, there was a published paper with formulas and stuff in it.
So how hard could it be for the two of them to re-implement it all?
At the time when he brought up AlphaZero, I was playing a lot of Go,
and I had seen what AlphaGo could do.
And so I was like, oh, actually,
I do think this has a lot of potential.
Let's talk through the actual practical implications
of doing this.
And at that point is when I knew
this was the idea we were going to do, right?
We took the parts that we understood and implemented them,
and the key element being the parts that we understood,
because at some point I remember we're talking about this, and then the topic of Drishlet noise came up and I'm like I
don't know what that is Dave and Dave looks at me and says I also don't know why that is and I'm
like so are we using it? I'm like I don't think we actually implemented that anywhere. Does it matter?
And then we look and we're like it looks like it could be important. Yes but the problem is that
when we actually implemented it our results got worse because they were trained without the Dirichlet noise.
Yes, because we didn't understand what we were doing, Dave, because we didn't understand the paper.
Even if they didn't totally get all the details, they got an AlphaZero approach up and running.
And then they could run it and they could watch the logs and they could see if it was actually learning anything and getting better. So you would see a text file and it was a
text file because Felipe never felt like writing a proper log and so I did it and
what you see there is when we're in the AlphaZero phase you would see a
printout of a game and every new game that was played you would get a list of
lists of numbers where each list of numbers
represents the well written as binary. You would get logs of these and each
well would come with a score if it had gotten any lines, it would come with the
network's heuristic estimation, it would come with a couple other bits of
metadata like how many children it had and so forth. And you would see tens of
thousands of these. We would get one every few seconds and we would control F through to manually
look for the high scoring ones. Most of them were not terribly high scoring, of course.
I have a very fond memory of sitting with Dave on a voice call actually and just looking at
games that had been played through and like trying to figure out if there were commonalities
or traits and what the network was trying to think of
as we went through these things.
If I remember correctly, you made a little emulator
that let us look at the games in mathematics
and it was clicking through and seeing,
okay, what happened here?
Why did it do this?
And then looking at a movement being like,
that move is terrible.
Why would it do that?
Why can't the network figure out
that it shouldn't be doing this?
It was probably like one in the morning or something
as we're doing this.
But yeah, there was a problem with this AlphaZero approach.
What we needed was a data center's worth of TPUs
hooked up to a bunch of hard drives
and computers running all day
so that we could change meta parameters and test things.
So that was a big factor.
Our implementation was the implementation of two people
who read and half understood a paper and then skipped over some fuzzy details
that clearly didn't matter that
Turns out probably kind of mattered they were probably important and I think we just fully didn't
Grasp the scope of what needed to happen in order to make this an effective
Solution so we were kind of like stuck in that circle of tweak a thing it gets worse
But did it get worse because we tweak the thing or they get worse because we're in a local minima and don't know how to fix it?
And then we don't can't iterate on that because we don't have the hardware in order to iterate on it properly.
So it was just a thing where it was like that feeling you get when you know,
you're approaching a problem in the incorrect direction and that yes, you can keep tweaking things,
but nothing is going to get you out of this hole. It's I think the sentiment.
Imagine climbing a mountain
and then realizing that the peak in front of you
you've been trying to scale is not the top,
but just a false peak.
The real peak is behind it.
That's where they were.
Resources were running thin,
and each passing day it felt like a warning.
This problem was much bigger than they thought.
You get some obscure error
that is like some rust backtrace error that is a seg who meant
overflow somewhere.
You're like, what does this even mean?
Why is this seg faulting?
And we just look at it and either fix it or say, actually, we won't use this library anymore.
It's fine.
We don't need to use sparse matrices.
That won't help anyway.
But yeah, there was also moments of hope.
My personal record for playing Haters myself is 10 points. So
the moment when it first beat me, that was exciting. And there was a couple
weeks period where it slowly rose from single digit points to 22 points. And
that did feel like it was getting better. And then it just it just stalled. And no
matter what we do we couldn't improve it. But there were moments in there where we were really happy.
There were moments in there where we found ways of speeding up our emulator and getting
more games per second.
I'm not saying it was all bad.
It's just the bad is the part we remember because the bad, chronologically speaking,
was the majority of that time.
For months they kept pushing on this problem, working late nights, exchanging code.
But now, on the edge of a breakthrough, they just felt so much doubt.
This is kind of one of the moments that happens in any given project.
You reach the redirection point, right?
Where you're like, I'm still interested in the core idea behind this.
I still want to solve this problem.
But whatever it is I'm doing isn't working for whatever reason.
And honestly, I can pretend that this is a measured decision where you sit down
and you make a list of pros and cons, but really it's how annoyed am I at this?
And do I still want to keep fighting with this specific set of problems?
Right.
I think we had a conversation where I basically said, I don't think at the scale
we have, unless we're going to shell out a ton of money to Google, we can do this.
Is that when we applied for the TPU grant?
Because we looked at this, we looked at like, what would it cost us cloud-wise if we wanted to do this properly?
The answer was it would take about a month's worth of a thousand TPUs running full-time.
So we applied for a grant to get a month's worth of a thousand TPUs running full-time
and we got the grant and we never used it. Think about Merchant of Venice, the pound of flesh not including any blood.
That's more or less the scenario we encountered here, where we had the ability to use for free thousands of TPUs for about a month.
We did not have the ability to use for free any of the CPUs that we would need to coordinate those TPUs.
Or the hard drives. Granted, the TPUs were still doing the majority of the calculation, vast majority, but the
cost even of that, of the ancillary controls and logging and so on and so forth, would
have been well over our budget.
Our budget of zero dollars in a shoestring, to be clear.
So frustrated and realizing that this neural network approach they had been throwing themselves
at was just beyond the compute resources they could afford, they were ready to give up.
But then Nujet appeared on GitHub and started shaking things up.
He posted a gist that included his entire approach in Japanese.
And Google Translate was sufficient with pictures to figure out what was going on.
We'd been looking at Haters for quite a while at this point, so we, like, there was nothing in the explanation that we needed to have explained to us as far as, like, how does the game work? What mechanism am I using?
We were mostly interested in the broad strokes, like, what are you looking at?
And he made some really good graphics, actually, that really, like, captured the idea of the shapes of things that he was looking at.
And that was very helpful. And it was a pretty succinct explanation. It didn't get into a lot of like the technical implementation
details, which was exactly what we needed. So it was just a really good bright up job
on his part, even in Japanese, which we don't understand. We're like, none of this is that
hard compared to what we're trying to do. I bet we could come up with a better heuristic.
We're good at looking at problems. And I think that idea sort of overtakes everything else and we're like, okay, what does a heuristic look like if
we're starting to think about this problem? What are the things we should
consider that new Jade hasn't thought about looking at this problem? Because
sure, he's a computational Tetris expert that's been studying this problem
longer than we have and come up with a better solution. But obviously we can do
that, we can do better than that, we can come up with a better idea. And I think
that's what starts things. Is that right, Dave? Yep. We decided that good artists borrow and great artists steal.
And that is precisely what we did.
And then while they were re-implementing or stealing or whatever you want to call
it, new Jade starts posting new high scores.
I think Dave sent me a message that says something like the new record is what?
66 points.
And I was like that, come on, that can't be right, Dave.
The last record was 45 points. And we've speculated that the roof is at like 50 maybe 60 because
of course every time there's a new record right we raised the roof by like
10 points like there's no way it's humanly possible and at this point I say
something along the lines of famous last words oh the maximum score is probably
around 100 then I think that's makes sense and so where you say okay let's
just start by recreating the work see if we can get something, and then we can start tweaking the heuristic to make
it more suitable with our understand. And what we'll bring to the table is we'll
do a lot more data analysis and we'll have a better idea of a heuristic or
things we can add to it. From there, it took us about five months because we
made two crucial mistakes in our re-implementation, and each mistake cost
us literally months of runtime.
Re-implementing Nujade's solution isn't super easy, especially if you don't have a PhD in computational tetras or you're not familiar with computational game searches.
Imagine exploring multiple paths of a maze simultaneously. You go 10 different ways
through the maze, and when some of them dead end or look like they're not promising,
that's when you can branch off on some of the ones that do look promising, so that you always have
your best 10, your best N running in parallel forward through the game space. You drop the rest
and you focus on these winners, but you're always moving through multiple paths at once.
And how many of those you move through at once, that is your beam's width.
That's what makes a beam surge.
But simulating all of these demands
a lot of computing power.
The conversations with these things always go like this.
How long could this take?
I don't know, a day?
Sure, a day seems reasonable.
And then it's a day later, and we're like,
how far has it gotten?
Well, it finished the first beam.
Okay, so two weeks? We were very hopeful we were going to run for less than a month.
And then it was just much slower than we had expected. And so like, each run takes a month.
And it's a month of looking at the logs go, seeing them, seeing the score slightly increase. And being like,
Oh God, I hope we didn't have a bug in this portion of the code. Because if we did, we're gonna get two weeks in and have to start over.
And we're looking at log files, and we're discussing what's going on.. And obviously it's a lot of just refreshing log files and talking back and forth and
be like, well, the newest high score is this.
I remember I would sign off work on Discord and I'd see a message from Dave that would
be like, the current score is 30.
And then a message an hour later, the current score is 32.
And we'd hop off a working session, right?
And I'd hop back onto the console connect to look at the logs.
And I'd send a echo message to echo to anyone else on the server being like,
Dave, are you still online looking at the logs? I'd get a message back being like, yeah.
Like, yeah, okay. And so you did climb to 32. Well, the record was what at that point?
The record at this point was 66. And we eventually got our way up to 53 over the
best one of these searches.
Which was still very good. It was second best of the world. It was better than
anybody else except Nujade, but it was not quite good enough. And so that's when
Felipe got the idea to instead of using the single desktop computer that was at
this point already five or six years old.
Instead, let's actually purchase some computer time and just get this over with in a day.
This is where things got serious.
Instead of running this program at home and it's just something fun,
they were about to rent a 72 core machine in the cloud and the clock would be ticking.
This was on AWS and if anything went wrong, that money would be gone for good, leaving them with
nothing but an unfinished beam search.
Now this puzzle wasn't just about pride and wasn't just exploring their wallets
were now at stake as well.
Yeah.
So I've done a bunch of DevOps stuff, but AWS is my most familiar cloud providers.
I was like, AWS seemed like a reasonable choice for this.
And so I went out and I priced it and I looked at various solutions. I said, okay,
what we're looking for is a large number of threads. I forgot the instance type.
This one has a lot of threads. If we're gonna run this for 24 hours, this feels
like a reasonable price to pay to get this project done. And at this point we
have a shining brand new heuristic with better own new magical fields to it that
will totally make everything better. We've gotten 53 points so we know that
it can be done.
And so I think the math we did was like four cores takes a month, divide the month in,
and we need this amount of cores.
And it was like, I forget how many it was, a few hundred.
Do you remember, Dave?
I think it was like 72.
I think it was 72.
Yeah.
So we said 72 cores, a lot more RAM, we can make this run quickly.
And we did some back of the map good math and came up with a budget. And I said, this is how much it's going to cost, Dave. And Dave was like, RAM, we can make this run quickly. And we did some back of the map good math
and came up with a budget.
And then I said, this is how much it's gonna cost Dave.
And Dave was like, okay, we can go have these on it.
That feels reasonable.
Never trust my budgeting decisions ever.
I have always, every single time,
wildly underestimated the cost of doing the thing.
And so we get it set up by set up the AWS account.
None of this is like very challenging
because it's all just set up to be click and go,
set up the instance, the SSH into the instance, instance copy your code over because now we're using version control so
we can just get clone the stuff over you're welcome Dave.
Install the dependencies get everything going and kick it off right and honestly it's blazingly
fast but you're like wow this is going very quickly exactly as we predicted.
By midnight they were glued to the log file, hoping each new entry would push them past 66 points.
Each beam would make progress, and one would pass the world record.
So then problems start to come up. We actually shoot past 66 points,
and then we start to realize, well wait a minute, if the game lasts longer
than we predicted, we're gonna have to pay more than we predicted.
This was at the 48 hour mark.
This is twice the expected runtime,
but their code still has to finish.
They have to let it run because it's cleared 66 lines,
but until they lose the game,
until they finish the search,
the program won't finalize.
It won't print out the score.
So you'd hit a new world record, but you couldn't see it.
You couldn't log it as a world record.
You just knew that if there was no bugs in your program,
it existed.
So the better your program does,
as you're sitting there watching it,
the worst case your bill gets.
And then there's Amdahl's law.
Because we forgot multi-threading only works for the actual beam.
When you're saving everything,
when you're saving the progress from one time step to another,
the file read and write, that's single threaded.
That's not sped up at all.
That was a negligible amount of time when we only have four cores,
but when we have 70, that's now the majority of the time.
And so our cloud computing budget sort of retroactively increased until finally the
game finished just before midnight and we got 86 points.
What?
20 point improvement on the record.
Huge, huge, huge milestone.
This is a Friday night, right?
My wife is like, are we going to do anything?
I'm like, I have to stare at this screen and see if we're getting a record. It's like, okay,
we have plans tomorrow. You remember? I'm like, yeah, this is fine. David, I will be done by this
by like two in the morning at the latest, right? And so now it's midnight. We have a world record.
We can see the world record, right? And we're like very confident there are no bugs. Yes, so the issue is at the end of the game we now
have a position and that position is the final position of the winning game.
That's not an entire game. We need every position that every piece has gone in
for the entire history of the game. To do that we then reference all those save
time steps we made earlier, which is also a single threaded process,
which is also something we didn't anticipate.
This is even slower than going forward.
We thought it was gonna take at most an hour,
and an hour later it was barely started.
We did some back of the envelope calculations
and we predicted that it would finish
by the time we woke up in the morning.
It did not. By the time we woke up in the morning, it turns out our back-of-the-
envelope calculations was based on the smaller time steps at the end because
the beam is running out of options. The number of wells stored at each time step
is smaller. When we woke up, it was going to take us another 12 hours to finish.
My dude, we're looking at our budget like, I guess we can't stop it at this point, right?
Cause it's all gone, right?
If like you have no way of putting it together.
If you stop.
Correct.
If we stop at any point before it finishes, then there was no point to the entire run.
We need every position of every well or else we just can't get there.
By do this is peak software design, right?
You should always design your processes so that if stopping them at any point,
it destroys all your work.
That is how you were supposed to design software.
No backups.
It keeps you on your toes, I think.
But finally we had the record.
It was 86 points.
We immediately downloaded it.
We got the files transferred off AWS.
We stopped the money spigot going into AWS.
But they're not done yet.
Their wallets are taking a break,
but the simulator outputs its internal state for each move,
which doesn't match the format
that the high scoreboard needs.
That requires a move-by-move replay.
We took our list of numbers,
had a Mathematica script that turns it
into a list of pictures, and each picture representsica script that turns it into a list of pictures.
And each picture represents where the piece is going to go.
And then we sit there on a screen share for an hour,
manually hitting arrow keys to move the piece into the appropriate position.
And we vowed to automate it before the next record,
and then we didn't, and the next record took even longer.
Fortunately, Haters has an undo button, so it was possible to fix mistakes.
It also has a replay code button where you can enter the replay code of a game.
And that's how very much preferable to doing it by hand.
Because it's a deterministic replay code so you can generate it by just knowing what the moves are.
We just never wrote the script for it, so we were like, how long could this take? Finally though, we had it. And by this
point we had found New Jade. We found him hanging out on a computational Tetris Discord server,
and we were able to get in touch with him that way. He sent us his congratulations. He mentioned
offhand that he had his eyes set on a new approach, but that that approach was going to take a backseat for now. They post their world
record of 86 on the Haters site. It's a big jump and they're finally landing a
world record. They also tweet at Sam Hughes to share the news. Yeah, he
congratulated it. He confirmed that the record was legitimate. He was extremely
impressed at the level of effort we'd apparently gone through, because
we mentioned at the time that it had taken us 11 months to get this.
And yeah, from there, we write up a blog post.
And what reasonable human beings would do at this point is stop thinking about this
problem you have already solved and gotten the world record for.
Of course, reasonable is an interesting word to use when you're talking about
dedicating the resources of a small engineering team
basically to an obscure game from 2010's leaderboard.
Reasonable is not the point.
So they in fact did not move on.
In fact, they planned to give a talk about all the work they had done.
So we get invited to do the DC rest talk.
We spent like three weeks planning this talk and making slides, to give a talk about all the work they had done. So we get invited to do the DC Rust talk.
We spend like three weeks planning this talk and making slides and our talk is way too
long initially and we cut it down and there's a whole conversation of does our audience
want to see this fancy infographic I made?
Maybe not.
Maybe we just cut that.
And all this time we're talking about this talk and about our project and you look back
at the project and you look at your code base and you're like, well, we could have done
that slightly better.
Well, that's a tweak we could have made.
And now that we're looking at it, because Dave says, well, if we're going to give this talk,
it would make sense to open source our code so people can go look at our mistakes.
And I'm like, well, I don't want to do that without a readme.
And Dave says, and we should probably clean up this horrible spaghetti mess over here
where we have one giant method that does everything.
We could add some unit tests maybe.
So just so people don't think we don't ever write unit tests, we don't ever write unit tests.
Well, pondering all this, right, back steeped in their hatred work, Dave gets an idea for a new heuristic,
where you just look a little bit more ahead, see if you can clear a line in the next couple moves.
And that gives you much more grounding for your heuristic about whether you're in a good place.
He calls it the burden hand heuristic, I believe, at the time. We have a whole conversation about how when, if this gets published,
we can name it after him because that would make a lot of sense.
And then I get a message like two hours later that says,
I explained the idea to my dad, he said, oh, that's a quiescent, look ahead.
Yeah, that's used in chess engines all the time.
I'm like, okay, that fits.
So your dad is a chess engine person? Is that like?
He's a programmer, amateur mathematician, That fits. So your dad is a chess engine person? Is that like?
Uh, he's a programmer, amateur mathematician, and the world's foremost Petri Net enthusiast.
I do not say world's foremost Petri Net expert.
There are people who know more than him.
There is nobody who loves Petri Nets more than he does, though.
I grew up listening to cautionary tales about how if you're training a genetic algorithm,
you've got to remember to include mutations or else you'll get stuck in a local maximum.
You learn a lot through osmosis even if you're not trying.
And so I still frequently call him up and talk about this sort of thing.
And he also attended our Rusty C talk.
He's always interested in new ongoing programming projects.
That's wild. It's funny to me that you could say like,
oh yeah, and then I remember what my dad told me when I was 12.
Like, consider a beam search before you go into a...
Yeah, but that is actually the kind of advice you would have given me when I
was 12. We, for many years, did project
Euler problems together. We're both in the top 0.1% of
project Euler solvers, and Felipe can attest to the
getting dragged into that as well. He and I have
actually solved the hardest problem on the entire site. Took us about a year on and off. But what
happens next is maybe as exciting or at least as surprising as solving a hard Euler problem.
This was basically the first and maybe only really fist in the air moment where you're just,
you can't believe how well something's
working.
Our huge beam search, the one that took days to run, that was a width of 25 million.
So we were looking at 25 million positions in parallel.
So they test their new heuristic.
They try it with a width of 100,000.
That's 250 times less games in parallel than their big AWS run.
But they have this better heuristic, right?
So they're searching through this tree of possible games,
but they're navigating towards these longer runs.
They're simulating better games.
And so even though it's 250 times less games,
they get a score of 82 points, nearly their world record.
And to be clear, I did not believe this number.
Dave told me this number, I was like,
Dave, we have a bug somewhere, that can't be right.
It's impossible that this heuristic got that good,
that fast.
And I disagreed until we ran it for a million.
And for a million, it gave us 148 points.
Now, I am also convinced that there's a bug
in the emulator somewhere. Because after Rust DC, we did a bunch of optimizations to make it faster.
And I am convinced some optimization we did caused there to be a bug, and that bug caused the game to be easier.
And it wasn't helped by the fact that we actually did have a bug earlier on in the implementation that we didn't catch for a while.
And it had been giving us false games.
So the thing to do once again, they need to put together the gameplay.
They need to verify the record.
We hopped on a call.
We said, we really, really should have written that script last time.
Why didn't we do that?
And Dave very sensibly said, because we thought we were done and we weren't going to do any
additional work and never ever again visit this project again.
And I said, yeah, okay, let's make sure we write it next time.
So you just got a screenshot, you just got an image that shows where the piece has
to go, the shape of the well essentially.
But it doesn't show how to get the piece there.
There's a bunch that require complex rotations to get past tight corners and
all that sort of thing.
And the R script did not include any of that because if we'd done that, we would have just
written an automatic replay code generator, which we, after this, we actually sat down
and did because that was too much.
It takes us about two hours to actually input the whole thing.
And mind you, every time we make a move, I'm like, this is clearly where we're going to
find the bug, right?
Like we're just joking about that every few minutes where it's like, obviously this move
is illegal and that's, this is not going to work.
And so old record was 86 and this was 148, which was the largest increase in Tetris history
either in absolute terms or proportionally. There is no way anybody will top that. That's
beyond any of our predictions for the maximum possible score for this game.
It's over. There's no way.
Felipe, what happens a week later?
Someone gets an even higher score.
And this one's kind of on us.
Dave has a friend that he does Advent of Code with sometimes.
Dave hasn't said this, but he's often on the Advent of Code leaderboard because he's very good at that type of programming.
I blame his Mathematica skills and his desire to be up at midnight solving a problem.
Um, and Dave is having a chat with his friend and did you show him the Rusty Sea Talk?
Is that right?
Yes.
He saw the Rusty Sea Talk independently and then he messaged me.
Yeah.
So this is Tim.
They're talking.
They, this says, Hey, I saw your
crusty seat talking I actually know how to write rust really well. So I made my
own version of the emulator and then he gets 157 points after Dave tells another
burden hand heuristic because that's just the type of people we are. It feels
like your dad shouldn't have the recommendation for this if you're in a
competitive coding contest don't tell the other people on the keyboard your
tricks.
I can't endorse that because every single time we've worked on a project and we've been
doing something really hard and we've talked to someone in the same field, they've been
like overwhelmingly happy to share whatever it is they're working with in amounts of detail
that I consider like actual work.
Like it was like, here's what I'm doing, here's how I'm doing it, here's what I'm thinking,
let me know if you want me to look at your code type stuff.
So we take the same attitude.
If someone asks us what we're doing, we're going to tell them in
as much detail as we can because yeah, it's a competition, but the goal is not
to win the goal is defeat Hatress at its core, right?
And like the break the will of the game because that's what we're all after.
Dave and I now spend the week sitting on various, um, discord chats as
170 gets posted and then 232 gets posted and we're like we really need to come up with a better
way of doing this. Are these all your friend or? This is all Tim. Yep and Tim
has a better computer than we do I will I will give him that part. He's also
better at Rust than we are. He used to ask Dave hey I see you guys did what did we
do? Used a priority queue but you don't need to do that. You can just do this.
And we're like, oh yeah,
I guess we could have done it that way.
That would be much faster.
Yes, thank you, Tim.
And so he's tweaking his heuristic.
He's changing it.
He's posting more scores and Dave and I are,
seething is not the right term,
but we're definitely sitting in Zoom chats.
We're like, okay, what do we do to like,
we can't beat them on like technical skill
or on like hardware.
We need to do something to change things up.
And we got nothing. Like our conversations are like maybe we can do an analysis and do
a thing. So do you remember the day that we get was 10 posts um 232 points and we're like
oh gosh okay well I don't know if we can break the 200 point barrier and 12 hours later New
Jade shows up on Twitter and posts a score
of 289 points.
We talked to him, we shared our heuristic and he did something radically new, right?
He used an entirely new set of techniques, but basically he posts, all right guys, this
is it, this is the definitive score and we say, yeah, okay, I guess we're done working
on Haters.
So, we started with neural networks and we abandoned it for New okay, I guess we're done working on Haters. So we started with neural networks
and we abandoned it for New Jade's heuristic approach.
And Tim also used the heuristic approach.
He was just better at it than we were.
While we were doing that, New Jade was busy
abandoning his heuristic approach
in favor of a neural net approach.
But the neural net approach was very different.
So AlphaZero is designed by Google. It's
designed by people who have millions of dollars of hardware. There is another
and objectively better algorithm for neural net learning of chess that was
developed by a Japanese chess Shogi enthusiast because Shogi does not have
anywhere near the resources that chess and Go do. And so this is called NNUE. It uses sparse neural networks to
dramatically accelerate how much effective compute you have. If you have a
huge neural network but 99% of the inputs to that neural network are zero,
your network is effectively running a hundred times faster than it otherwise would be. And he
used this to make a world record-setting Shogi program. And the Shogi program has
a name that comes right out of anime. It's called the end of Genesis TNK
evolution turbo type D. That's the actual name of a program. We had never heard of it. This
came out in 2018. We had never heard that this was a possibility. So yeah, Nujet
documented this approach for the record, which is how Dave and Felipe learned
about it. He incorporated David's bird-in-the-hand strategy with his new
neural network approach. And this all led to the discovery of the very first loop.
And we'll explore loops shortly,
but by incorporating NUJAC's insights back into their code,
David and Felipe quickly hit an even higher record,
366, taking the record back from him.
And then around this time,
I pinged them about maybe doing a podcast episode.
So in the same way that preparing for the Rusty Seatalk got us talking about the problem
again, when you messaged us, that also got us talking about the problem again.
And so we started looking not necessarily a new approach, but a new way of analyzing
the data we already have.
And we actually have a graph of what are called loops.
Now, in theory, if you can have a position in Hatris
and then get back to that same position,
you could run forever.
And to prevent that, the Hatris algorithm
has a built-in loop prevention rule,
which if it sees that you're gonna get to a well
that you've seen before, it'll give you some other piece so you can't get there. New Jade's 289 point
record was also the very first record to ever discover a loop, and soon after Tim
discovered his own loop, and soon after we discovered some of ours, and these
were moves which if that loop prevention rule wasn't there, would get infinite
points. You could just play forever.
And so then the question that we've been asking ourselves
for the last couple of weeks
and what we've been investigating is,
how many loops are there?
And what's the structure of these loops?
What kind of well leads you to a loop?
Because obviously if you can figure out that,
you are a very good way
towards understanding the game entirely. Because obviously if you can figure out that, you are a very good way towards
understanding the game entirely.
This shifts Haters from just being a machine learning,
a game search problem,
to something that David and Felipe are quite good at,
pattern analysis.
You can only take each loop once,
but if multiple loops start at the same spot,
if you can force Haters to send you on a second loop, the game changes.
It's exciting to see these ideas sparking off one another.
Sharing leads to learning and everyone's growing from each other's insights. Each published idea is building on the last.
Our current approach holding the world record is based on New Jade's approach. New Jade's is based on on Tim's. Tim's is based on ours again. Ours is based on
New Jade's again. New Jade's approached for 32 points all the way at the
beginning of our involvement. He actually did not come up with the idea of using a
beam search for Hatress. That was another programmer by the name of 3pipes. 3pipes in 2019 for a class
project decided to write a Hatress solver and he made an attempt at it. He
extracted the Hatress emulator from the original JavaScript rather than writing
his own, which very interesting approach. He made a Beam search implementation and
it didn't get very far. He got a score of 16, 17 points.
He finished it for his class.
He never touched it again.
He moved on with his life.
But he wrote down what he did and he published it online and New Jade found it in 2021.
And that's what gave him the idea to start the Beam search that led to his world records
and that got this
whole thing started off.
It's not all about flashy thousand word blog posts.
It's just about coming up with some idea, implementing the idea, and then telling people
if it worked or not.
Haters does not matter.
In a very real sense, this record is completely irrelevant.
Nobody but us cares.
The concept of keeping a continual chain of knowledge going, of reading what other people
have done, and then writing down what you've done differently and what you've done the
same and whether it worked, and passing that on for the next person, that's something that's
universal I think.
David and Felipe are impressive.
It's amazing how they tackled a specific game problem,
pushing the boundaries and making unique discoveries.
And yet the thing that they wanna make clear
is they are not in fact,
specially qualified to take on tasks like this.
It boils down to, if you're working on a project
and you are not the expert on the project,
there is a large community of people somewhere,
or a small community of people somewhere who's probably thought about this problem before.
They probably put time and energy into this problem before.
They're delighted that someone cares.
So if you show up and you send them a nice message,
you say, hey, I saw your post about,
oh, let's say Trackmania.
I see you tried this on Trackmania,
and we're trying a similar thing and I was curious about your approach. They will reply to you and if
they haven't touched this project in three years and tell you as much as they
can about it. And those types of like interpersonal connections with other
people across communities working on things ties into the idea Dave just gave
but I think it's even more fundamental the fact that like people who care about
projects are happy to share their knowledge and if you just seek them out
they will share it with you freely, happily, without any particular complaints.
And so, yeah, this is the story.
This is the story of how two friends who couldn't understand the AlphaZero paper, but thought, hey,
we'll implement it anyways, got to the frontier, got to the coal face of game search trees, or whatever
this field is called.
They just kept pushing it, even if occasionally Felipe had to check in with his wife. of game search trees or whatever this field is called,
they just kept pushing it.
Even if occasionally Felipe had to check in with his wife.
Oh, she was, she indulges my,
I am doing a project with Dave
because he doesn't want to hear anymore
about Haters at the Dinner Table.
So it tends to be a lot of, are you doing stuff with Dave?
Yes, okay, I will leave you to it.
Yeah, I don't need the details.
On your side, Dave, is there, I don't know what your situation's like, is there
any relationship strife ever caused by these projects? No, I can't say
that there is. I expect it could happen at some point because I can't see
giving up doing these projects at 3 in the morning anytime soon. But I'll cross that bridge when I come to it.
Where do you work David? Like I missed it.
Oh currently nowhere. I am looking for work. Yeah the most recent place I interviewed it with,
I had five interviews over the course of two months. If you have five interviews including
one where you get a full site tour and it's a three hour process and everything, you really
are getting your hopes up even though you're telling yourself not to get your hopes up. And
they said basically, hey within a week we will get back to you one way or
another on the final offer. And that was three weeks ago. I've emailed them back
just never heard anything. So that's why we do this sort of thing to distract us.
At least that's why I do it. I can't talk for Felipe.
Beyond hitting 366, Haters offered David and Felipe a rare escape.
Right?
It was a place to bond and experiment and swap stories late into the night.
I feel like there's something deeper here.
The reason we chase these seemingly trivial puzzles, I've been known to do it myself,
maybe not quite as far.
But the reason we can pour months of our lives
into side projects, into a JavaScript game from 2010,
is because there's something fundamentally human
about this process of striving.
It's about curiosity and the joy of discovering
and about the thrill of pushing up against the boundaries
of what seems possible.
And sometimes it's simply about finding a good excuse to crack open discord late at
night and tackle something impossible side by side with a very close friend.
That was the show.
Thank you Felipe and David.
I hope your quest continues.
After the interview, I got to hang out with them for a bit and I was shown some of their
unpublished work on loops.
You know, David showed me Mathematica map of these loops and it looks like some demented
subway map, but they're working on something that uses that. A new way to break ground.
I'm sure if they do, they'll post the results on their blog.
And by the time you hear it, it might already be up.
So check out all of Impossible Dreams dot org to see their latest projects.
Sometimes it's hatred, sometimes it's the dead Internet theory.
Sometimes it's Superman fan fiction.
There's always something they're sharing and something they're working on and in my short time with them I saw how unique they are that they're both very different people
But in a way that makes them greater than the sum of their parts
And if you're listening this far I'd be remiss to not mention that you should join as a supporter of this podcast.
I know there's somebody out there working on a side project who needs to hear the story of these two.
Maybe it's you. But stories can be powerful and stories can be something that sticks with you and allows you to learn from somebody else's experience. So that's what I'm trying to do here.
If you enjoy it, support me, or just listen, that's why I always end by saying,
until next time, thank you so much for listening.