CoRecursive: Coding Stories - Story: Hateris - Obsession, Friendship, and World Records

Starting point is 00:00:00 Hello, this is Co-Recursive and I'm Adam Gordon Bell. Today, as always, we're diving into the people behind a piece of software. Because sometimes what looks like a simple puzzle can unravel into a web of complexity. Today, we're exploring the story of two people who got caught up in a JavaScript game from 2010. It seemed like just another Tetris clone, but it quickly became an obsession that would challenge their skills, their wallets, even their relationships. My name is David Freiberg.

Starting point is 00:00:35 I am a material science PhD, and I am very, very obsessed with a JavaScript game that came out in 2010. I'm Felipe. I'm a software engineer at HubSpot. I'm not quite as passionate about this problem as Dave, but I've gotten dragged kicking and screaming into it. So I now guess I'm a domain expert on it. The game is called Haters. If you've played Tetris, you know that moment when you need just the right piece.

Starting point is 00:01:05 Like I've got it all lined up. If I can get that four in a row, that I, I can clear four lines. Now picture a version of Tetris where you never get the piece you want. Where the game seems to have it out for you. That's Haters. It's a hard hard game and for years it had a seemingly unbeatable, very very low high score. And then these two friends, Dave and Felipe got obsessed.

Starting point is 00:01:28 They poured months and months of their lives and a bit too much money into toppling that record. And depending on when you listen to this episode, because the competition for this game is actually fierce. But as of right now, Dave and Felipe are the current record holders. World record holders. But before we get into all that, let's go back to 2010 and back to the great science fiction writer, Sam Hughes. Back in 2010, Sam Hughes created Haters, just published it on his blog as a novelty.

Starting point is 00:02:05 About a month after the game was published, a Japanese slash-dot commenter named Diazuke got 30 points, and it stayed there for years and years and years. 2015, I took a class in general game playing on MIT OpenCourseWare. And this was an interesting class. Instead of writing a program to play chess, or to play checkers, or to play Connect 4, you were writing a program that could play any game given the rules. Instead of just being given a position, you were given a position and also all of the relevant rules that went with it. I started reading QNTM and I started looking at Hatress

Starting point is 00:02:47 and I messaged Felipe a couple of times asking him, could we apply it to Hatress? Could we apply this game playing? And Felipe quite wisely pointed out that my general purpose game player could barely beat me at Connect 4 and it couldn't beat me at Checkers. And Hatress is way, way, way

Starting point is 00:03:05 harder than pretty much any human game. And yeah, Felipe was right about that, so we shelved it. So what happens next is that in June of 2021, a Japanese computational Tetris expert, and yes, there is such a person named New Jade, sets a new world record. We thought it might genuinely be impossible to get past 30. This max score belief was based on simulations that they had run.

Starting point is 00:03:30 Unlike Tetris, Haters is totally deterministic. You always get the worst piece based on the well you have. And even in narrow wells, wells being the Tetris term for that place where the pieces fall, even in narrow wells, the game stayed ballooned faster than you'd think. So 30 points, 30 lines cleared, it felt like an absolute ceiling.

Starting point is 00:03:49 But then New Jade hit 32, so clearly they had missed something. And Dave started working on an emulator and I was like, okay, we're doing this, I guess, because Dave has started working on the emulator and I'm not going to be able to gently guide him away from the idea of doing this. That's what kicked off the project in my brain. That's where it went from impractical idea we're discussing, which is 90% of our conversations, 80% of our conversations to idea we are doing. This is normal for these two, but before we get deeper into how they tackle this challenge, you need to understand something about them.

Starting point is 00:04:18 Dave and Felipe, they're buddies from back in high school, but they are not casual in their approach to these challenges. Dave's the instigator of projects. Felipe is the one who reigns them in with practical concerns. But whatever they're working on, they're always knocking around ideas. Dave will send me some idea like, I was thinking about these machine learning algorithms to solve a game. And I will say, Dave, that is wildly impractical. How do we make your idea practical? Because I like the core of it, right? I really want to like, maybe we can't solve every game, maybe we can focus on one game. And we'll narrow the idea, we'll refine it, we'll talk about it. And we'll come up with sort of a practical

Starting point is 00:04:50 approach. And that's really where the conversation will end, right? I'll be like, yes, we can predict all stock markets forever. But you know, you could maybe focus on predicting one stock and you can use all these other variants, factors to figure it out. And I read this good paper, maybe go look at that. And then if Dave actually likes the idea, two idea two days later to a week later he will show up and say I wrote a Mathematica script for this here is the data that I got from it here's the parts that don't work I didn't use a database because I don't believe in those and all the data is sitting in text files in my hard drive right now and it's full of these and I'll be like Dave I gave you like four practical ideas here why did you do it this way and so honestly it's most most of this stuff goes as it comes as snippets of conversation had over Discord, right? It's

Starting point is 00:05:28 like Dave sends me a message that said something along the lines of like, I just tested the algorithm sorter and I got these results. And I say something incisive like, well, those results aren't very meaningful, Dave. What can we actually learn from these? And he's like, well, let me pull up Mathematica. I, make a graph. And we look at the graph and make some more commentary on it. We'd have sort of a discussion about it, right. And the conversation would pick up on and off throughout the day as we're available. So I'll be like, okay, I have to go to make dinner and I'll go make dinner. I'll come back and

Starting point is 00:05:55 table and posted five more paragraphs in my discord and I go read them and I make a bunch of commentary. And Dave's off swimming because he's he's done a he's practicing for an Iron Man. Were you rising for an Ironman at that time? I think you were and so he's like I just got back from swimming 8,000 miles Let me post some more thoughts and so like 8,000 meters come on Dave Your magical exercise numbers are meaningless to me. I do a lot of biking and I do a lot of swimming so a lot of this is I'll look at a whole bunch of data right before I leave and

Starting point is 00:06:22 Then as I'm biking to the river or when I'm in the river, I have nothing else to think about or focus on. So I'm just sort of mulling over the data in my head. And when I get back, I, I send Felipe messages in the same way that cannonballs are sent from cannons sometimes. And so he'll be hit with a barrage of a couple thousand words based on whatever I was mulling over. Any reference to practical stuff like VS code is going to be Felipe.

Starting point is 00:06:50 I forced me to use things like Git and version control and backup copies. Still haven't forgiven him entirely. I mean, we only lost most of a project due to not keeping any backups and not using version control, so you know, it's not like I, it's not like I've had us to get you there until we did that. Yeah, this is very true. So this is a thing that Dave and Felipe do. They work on projects together. They send around messages and then at night, depending if they're busy or not, they might

Starting point is 00:07:16 jump on a Discord voice call. They might get a VS code live share running. They might work on some coding. Sometimes the projects get somewhere. Sometimes they wouldn't. Sometimes at the end, they'd write up their work, share it on the blog. Most of the time they wouldn't get around to that. But yeah, Haters was always one of these topics on the back burner. It would come up now and again. Dave wanted to tackle it. Felipe pointed out it seemed so out of reach. So other projects would get priority. But like many backburner ideas, Haters was quietly preparing to take center stage.

Starting point is 00:07:47 Because little did they know, once they truly committed to beating this very hateful game, there'd be no going back. Because when a project grabs your interest, especially when there's a group of you, the effort you pour into it over countless evenings and weekends can be surprising. It can add up. So yeah, New Jade hit a new record, 32, and Dave thought maybe they could beat it, but first they needed to build an emulator.

Starting point is 00:08:12 So we built not one emulator, we built six total. Started in Mathematica just to get something that worked and to get an idea of how slow things were, in Mathematica, a single move took about a tenth of a second to calculate. And there's a fair amount of computation that goes into Hatris, because the Hatris algorithm doesn't give you a piece at random. It looks at the well and it calculates which piece is going to be worst according to a minimax algorithm. It makes sense that it would take a little bit.

Starting point is 00:08:54 Our current one runs approximately 200,000 times faster. So currently we run about five microseconds per move per core. So even if even on a single thread that's still tens of thousands of times improvement. The very basic emulator consists of just trying every possible spot that a piece could go and doing the minimax algorithm from there. It was not very fast in Mathematica. So what Felipe eventually convinced me to do was rewrite it from Mathematica to Python, which was barely any faster, and then from Python to Rust. And this I will blame Felipe on. I will blame this on you because I don't think I would have gotten into Rust if you hadn't dragged me into it. I do take the blame for that particular approach. As background, we had no other experience with the language.

Starting point is 00:09:37 We did six AOC problems and our seventh thing was re-implementing Google's AlphaZero algorithm complete with GPU neural nets from scratch. It was a bit of a leap. Yeah, AlphaZero, that DeepMind project out of Google. That was their inspiration. I mean, there was a published paper with formulas and stuff in it. So how hard could it be for the two of them to re-implement it all? At the time when he brought up AlphaZero, I was playing a lot of Go,

Starting point is 00:10:06 and I had seen what AlphaGo could do. And so I was like, oh, actually, I do think this has a lot of potential. Let's talk through the actual practical implications of doing this. And at that point is when I knew this was the idea we were going to do, right? We took the parts that we understood and implemented them,

Starting point is 00:10:20 and the key element being the parts that we understood, because at some point I remember we're talking about this, and then the topic of Drishlet noise came up and I'm like I don't know what that is Dave and Dave looks at me and says I also don't know why that is and I'm like so are we using it? I'm like I don't think we actually implemented that anywhere. Does it matter? And then we look and we're like it looks like it could be important. Yes but the problem is that when we actually implemented it our results got worse because they were trained without the Dirichlet noise. Yes, because we didn't understand what we were doing, Dave, because we didn't understand the paper. Even if they didn't totally get all the details, they got an AlphaZero approach up and running.

Starting point is 00:10:56 And then they could run it and they could watch the logs and they could see if it was actually learning anything and getting better. So you would see a text file and it was a text file because Felipe never felt like writing a proper log and so I did it and what you see there is when we're in the AlphaZero phase you would see a printout of a game and every new game that was played you would get a list of lists of numbers where each list of numbers represents the well written as binary. You would get logs of these and each well would come with a score if it had gotten any lines, it would come with the network's heuristic estimation, it would come with a couple other bits of

Starting point is 00:11:40 metadata like how many children it had and so forth. And you would see tens of thousands of these. We would get one every few seconds and we would control F through to manually look for the high scoring ones. Most of them were not terribly high scoring, of course. I have a very fond memory of sitting with Dave on a voice call actually and just looking at games that had been played through and like trying to figure out if there were commonalities or traits and what the network was trying to think of as we went through these things. If I remember correctly, you made a little emulator

Starting point is 00:12:10 that let us look at the games in mathematics and it was clicking through and seeing, okay, what happened here? Why did it do this? And then looking at a movement being like, that move is terrible. Why would it do that? Why can't the network figure out

Starting point is 00:12:20 that it shouldn't be doing this? It was probably like one in the morning or something as we're doing this. But yeah, there was a problem with this AlphaZero approach. What we needed was a data center's worth of TPUs hooked up to a bunch of hard drives and computers running all day so that we could change meta parameters and test things.

Starting point is 00:12:39 So that was a big factor. Our implementation was the implementation of two people who read and half understood a paper and then skipped over some fuzzy details that clearly didn't matter that Turns out probably kind of mattered they were probably important and I think we just fully didn't Grasp the scope of what needed to happen in order to make this an effective Solution so we were kind of like stuck in that circle of tweak a thing it gets worse But did it get worse because we tweak the thing or they get worse because we're in a local minima and don't know how to fix it?

Starting point is 00:13:08 And then we don't can't iterate on that because we don't have the hardware in order to iterate on it properly. So it was just a thing where it was like that feeling you get when you know, you're approaching a problem in the incorrect direction and that yes, you can keep tweaking things, but nothing is going to get you out of this hole. It's I think the sentiment. Imagine climbing a mountain and then realizing that the peak in front of you you've been trying to scale is not the top, but just a false peak.

Starting point is 00:13:31 The real peak is behind it. That's where they were. Resources were running thin, and each passing day it felt like a warning. This problem was much bigger than they thought. You get some obscure error that is like some rust backtrace error that is a seg who meant overflow somewhere.

Starting point is 00:13:48 You're like, what does this even mean? Why is this seg faulting? And we just look at it and either fix it or say, actually, we won't use this library anymore. It's fine. We don't need to use sparse matrices. That won't help anyway. But yeah, there was also moments of hope. My personal record for playing Haters myself is 10 points. So

Starting point is 00:14:05 the moment when it first beat me, that was exciting. And there was a couple weeks period where it slowly rose from single digit points to 22 points. And that did feel like it was getting better. And then it just it just stalled. And no matter what we do we couldn't improve it. But there were moments in there where we were really happy. There were moments in there where we found ways of speeding up our emulator and getting more games per second. I'm not saying it was all bad. It's just the bad is the part we remember because the bad, chronologically speaking,

Starting point is 00:14:41 was the majority of that time. For months they kept pushing on this problem, working late nights, exchanging code. But now, on the edge of a breakthrough, they just felt so much doubt. This is kind of one of the moments that happens in any given project. You reach the redirection point, right? Where you're like, I'm still interested in the core idea behind this. I still want to solve this problem. But whatever it is I'm doing isn't working for whatever reason.

Starting point is 00:15:03 And honestly, I can pretend that this is a measured decision where you sit down and you make a list of pros and cons, but really it's how annoyed am I at this? And do I still want to keep fighting with this specific set of problems? Right. I think we had a conversation where I basically said, I don't think at the scale we have, unless we're going to shell out a ton of money to Google, we can do this. Is that when we applied for the TPU grant? Because we looked at this, we looked at like, what would it cost us cloud-wise if we wanted to do this properly?

Starting point is 00:15:28 The answer was it would take about a month's worth of a thousand TPUs running full-time. So we applied for a grant to get a month's worth of a thousand TPUs running full-time and we got the grant and we never used it. Think about Merchant of Venice, the pound of flesh not including any blood. That's more or less the scenario we encountered here, where we had the ability to use for free thousands of TPUs for about a month. We did not have the ability to use for free any of the CPUs that we would need to coordinate those TPUs. Or the hard drives. Granted, the TPUs were still doing the majority of the calculation, vast majority, but the cost even of that, of the ancillary controls and logging and so on and so forth, would have been well over our budget.

Starting point is 00:16:19 Our budget of zero dollars in a shoestring, to be clear. So frustrated and realizing that this neural network approach they had been throwing themselves at was just beyond the compute resources they could afford, they were ready to give up. But then Nujet appeared on GitHub and started shaking things up. He posted a gist that included his entire approach in Japanese. And Google Translate was sufficient with pictures to figure out what was going on. We'd been looking at Haters for quite a while at this point, so we, like, there was nothing in the explanation that we needed to have explained to us as far as, like, how does the game work? What mechanism am I using? We were mostly interested in the broad strokes, like, what are you looking at?

Starting point is 00:16:56 And he made some really good graphics, actually, that really, like, captured the idea of the shapes of things that he was looking at. And that was very helpful. And it was a pretty succinct explanation. It didn't get into a lot of like the technical implementation details, which was exactly what we needed. So it was just a really good bright up job on his part, even in Japanese, which we don't understand. We're like, none of this is that hard compared to what we're trying to do. I bet we could come up with a better heuristic. We're good at looking at problems. And I think that idea sort of overtakes everything else and we're like, okay, what does a heuristic look like if we're starting to think about this problem? What are the things we should consider that new Jade hasn't thought about looking at this problem? Because

Starting point is 00:17:33 sure, he's a computational Tetris expert that's been studying this problem longer than we have and come up with a better solution. But obviously we can do that, we can do better than that, we can come up with a better idea. And I think that's what starts things. Is that right, Dave? Yep. We decided that good artists borrow and great artists steal. And that is precisely what we did. And then while they were re-implementing or stealing or whatever you want to call it, new Jade starts posting new high scores. I think Dave sent me a message that says something like the new record is what?

Starting point is 00:17:59 66 points. And I was like that, come on, that can't be right, Dave. The last record was 45 points. And we've speculated that the roof is at like 50 maybe 60 because of course every time there's a new record right we raised the roof by like 10 points like there's no way it's humanly possible and at this point I say something along the lines of famous last words oh the maximum score is probably around 100 then I think that's makes sense and so where you say okay let's just start by recreating the work see if we can get something, and then we can start tweaking the heuristic to make

Starting point is 00:18:26 it more suitable with our understand. And what we'll bring to the table is we'll do a lot more data analysis and we'll have a better idea of a heuristic or things we can add to it. From there, it took us about five months because we made two crucial mistakes in our re-implementation, and each mistake cost us literally months of runtime. Re-implementing Nujade's solution isn't super easy, especially if you don't have a PhD in computational tetras or you're not familiar with computational game searches. Imagine exploring multiple paths of a maze simultaneously. You go 10 different ways through the maze, and when some of them dead end or look like they're not promising,

Starting point is 00:19:04 that's when you can branch off on some of the ones that do look promising, so that you always have your best 10, your best N running in parallel forward through the game space. You drop the rest and you focus on these winners, but you're always moving through multiple paths at once. And how many of those you move through at once, that is your beam's width. That's what makes a beam surge. But simulating all of these demands a lot of computing power. The conversations with these things always go like this.

Starting point is 00:19:33 How long could this take? I don't know, a day? Sure, a day seems reasonable. And then it's a day later, and we're like, how far has it gotten? Well, it finished the first beam. Okay, so two weeks? We were very hopeful we were going to run for less than a month. And then it was just much slower than we had expected. And so like, each run takes a month.

Starting point is 00:19:51 And it's a month of looking at the logs go, seeing them, seeing the score slightly increase. And being like, Oh God, I hope we didn't have a bug in this portion of the code. Because if we did, we're gonna get two weeks in and have to start over. And we're looking at log files, and we're discussing what's going on.. And obviously it's a lot of just refreshing log files and talking back and forth and be like, well, the newest high score is this. I remember I would sign off work on Discord and I'd see a message from Dave that would be like, the current score is 30. And then a message an hour later, the current score is 32. And we'd hop off a working session, right?

Starting point is 00:20:21 And I'd hop back onto the console connect to look at the logs. And I'd send a echo message to echo to anyone else on the server being like, Dave, are you still online looking at the logs? I'd get a message back being like, yeah. Like, yeah, okay. And so you did climb to 32. Well, the record was what at that point? The record at this point was 66. And we eventually got our way up to 53 over the best one of these searches. Which was still very good. It was second best of the world. It was better than anybody else except Nujade, but it was not quite good enough. And so that's when

Starting point is 00:20:56 Felipe got the idea to instead of using the single desktop computer that was at this point already five or six years old. Instead, let's actually purchase some computer time and just get this over with in a day. This is where things got serious. Instead of running this program at home and it's just something fun, they were about to rent a 72 core machine in the cloud and the clock would be ticking. This was on AWS and if anything went wrong, that money would be gone for good, leaving them with nothing but an unfinished beam search.

Starting point is 00:21:30 Now this puzzle wasn't just about pride and wasn't just exploring their wallets were now at stake as well. Yeah. So I've done a bunch of DevOps stuff, but AWS is my most familiar cloud providers. I was like, AWS seemed like a reasonable choice for this. And so I went out and I priced it and I looked at various solutions. I said, okay, what we're looking for is a large number of threads. I forgot the instance type. This one has a lot of threads. If we're gonna run this for 24 hours, this feels

Starting point is 00:21:53 like a reasonable price to pay to get this project done. And at this point we have a shining brand new heuristic with better own new magical fields to it that will totally make everything better. We've gotten 53 points so we know that it can be done. And so I think the math we did was like four cores takes a month, divide the month in, and we need this amount of cores. And it was like, I forget how many it was, a few hundred. Do you remember, Dave?

Starting point is 00:22:17 I think it was like 72. I think it was 72. Yeah. So we said 72 cores, a lot more RAM, we can make this run quickly. And we did some back of the map good math and came up with a budget. And I said, this is how much it's going to cost, Dave. And Dave was like, RAM, we can make this run quickly. And we did some back of the map good math and came up with a budget. And then I said, this is how much it's gonna cost Dave. And Dave was like, okay, we can go have these on it.

Starting point is 00:22:29 That feels reasonable. Never trust my budgeting decisions ever. I have always, every single time, wildly underestimated the cost of doing the thing. And so we get it set up by set up the AWS account. None of this is like very challenging because it's all just set up to be click and go, set up the instance, the SSH into the instance, instance copy your code over because now we're using version control so

Starting point is 00:22:48 we can just get clone the stuff over you're welcome Dave. Install the dependencies get everything going and kick it off right and honestly it's blazingly fast but you're like wow this is going very quickly exactly as we predicted. By midnight they were glued to the log file, hoping each new entry would push them past 66 points. Each beam would make progress, and one would pass the world record. So then problems start to come up. We actually shoot past 66 points, and then we start to realize, well wait a minute, if the game lasts longer than we predicted, we're gonna have to pay more than we predicted.

Starting point is 00:23:24 This was at the 48 hour mark. This is twice the expected runtime, but their code still has to finish. They have to let it run because it's cleared 66 lines, but until they lose the game, until they finish the search, the program won't finalize. It won't print out the score.

Starting point is 00:23:43 So you'd hit a new world record, but you couldn't see it. You couldn't log it as a world record. You just knew that if there was no bugs in your program, it existed. So the better your program does, as you're sitting there watching it, the worst case your bill gets. And then there's Amdahl's law.

Starting point is 00:24:02 Because we forgot multi-threading only works for the actual beam. When you're saving everything, when you're saving the progress from one time step to another, the file read and write, that's single threaded. That's not sped up at all. That was a negligible amount of time when we only have four cores, but when we have 70, that's now the majority of the time. And so our cloud computing budget sort of retroactively increased until finally the

Starting point is 00:24:31 game finished just before midnight and we got 86 points. What? 20 point improvement on the record. Huge, huge, huge milestone. This is a Friday night, right? My wife is like, are we going to do anything? I'm like, I have to stare at this screen and see if we're getting a record. It's like, okay, we have plans tomorrow. You remember? I'm like, yeah, this is fine. David, I will be done by this

Starting point is 00:24:52 by like two in the morning at the latest, right? And so now it's midnight. We have a world record. We can see the world record, right? And we're like very confident there are no bugs. Yes, so the issue is at the end of the game we now have a position and that position is the final position of the winning game. That's not an entire game. We need every position that every piece has gone in for the entire history of the game. To do that we then reference all those save time steps we made earlier, which is also a single threaded process, which is also something we didn't anticipate. This is even slower than going forward.

Starting point is 00:25:32 We thought it was gonna take at most an hour, and an hour later it was barely started. We did some back of the envelope calculations and we predicted that it would finish by the time we woke up in the morning. It did not. By the time we woke up in the morning, it turns out our back-of-the- envelope calculations was based on the smaller time steps at the end because the beam is running out of options. The number of wells stored at each time step

Starting point is 00:25:57 is smaller. When we woke up, it was going to take us another 12 hours to finish. My dude, we're looking at our budget like, I guess we can't stop it at this point, right? Cause it's all gone, right? If like you have no way of putting it together. If you stop. Correct. If we stop at any point before it finishes, then there was no point to the entire run. We need every position of every well or else we just can't get there.

Starting point is 00:26:21 By do this is peak software design, right? You should always design your processes so that if stopping them at any point, it destroys all your work. That is how you were supposed to design software. No backups. It keeps you on your toes, I think. But finally we had the record. It was 86 points.

Starting point is 00:26:36 We immediately downloaded it. We got the files transferred off AWS. We stopped the money spigot going into AWS. But they're not done yet. Their wallets are taking a break, but the simulator outputs its internal state for each move, which doesn't match the format that the high scoreboard needs.

Starting point is 00:26:55 That requires a move-by-move replay. We took our list of numbers, had a Mathematica script that turns it into a list of pictures, and each picture representsica script that turns it into a list of pictures. And each picture represents where the piece is going to go. And then we sit there on a screen share for an hour, manually hitting arrow keys to move the piece into the appropriate position. And we vowed to automate it before the next record,

Starting point is 00:27:22 and then we didn't, and the next record took even longer. Fortunately, Haters has an undo button, so it was possible to fix mistakes. It also has a replay code button where you can enter the replay code of a game. And that's how very much preferable to doing it by hand. Because it's a deterministic replay code so you can generate it by just knowing what the moves are. We just never wrote the script for it, so we were like, how long could this take? Finally though, we had it. And by this point we had found New Jade. We found him hanging out on a computational Tetris Discord server, and we were able to get in touch with him that way. He sent us his congratulations. He mentioned

Starting point is 00:28:01 offhand that he had his eyes set on a new approach, but that that approach was going to take a backseat for now. They post their world record of 86 on the Haters site. It's a big jump and they're finally landing a world record. They also tweet at Sam Hughes to share the news. Yeah, he congratulated it. He confirmed that the record was legitimate. He was extremely impressed at the level of effort we'd apparently gone through, because we mentioned at the time that it had taken us 11 months to get this. And yeah, from there, we write up a blog post. And what reasonable human beings would do at this point is stop thinking about this

Starting point is 00:28:40 problem you have already solved and gotten the world record for. Of course, reasonable is an interesting word to use when you're talking about dedicating the resources of a small engineering team basically to an obscure game from 2010's leaderboard. Reasonable is not the point. So they in fact did not move on. In fact, they planned to give a talk about all the work they had done. So we get invited to do the DC rest talk.

Starting point is 00:29:04 We spent like three weeks planning this talk and making slides, to give a talk about all the work they had done. So we get invited to do the DC Rust talk. We spend like three weeks planning this talk and making slides and our talk is way too long initially and we cut it down and there's a whole conversation of does our audience want to see this fancy infographic I made? Maybe not. Maybe we just cut that. And all this time we're talking about this talk and about our project and you look back at the project and you look at your code base and you're like, well, we could have done

Starting point is 00:29:23 that slightly better. Well, that's a tweak we could have made. And now that we're looking at it, because Dave says, well, if we're going to give this talk, it would make sense to open source our code so people can go look at our mistakes. And I'm like, well, I don't want to do that without a readme. And Dave says, and we should probably clean up this horrible spaghetti mess over here where we have one giant method that does everything. We could add some unit tests maybe.

Starting point is 00:29:41 So just so people don't think we don't ever write unit tests, we don't ever write unit tests. Well, pondering all this, right, back steeped in their hatred work, Dave gets an idea for a new heuristic, where you just look a little bit more ahead, see if you can clear a line in the next couple moves. And that gives you much more grounding for your heuristic about whether you're in a good place. He calls it the burden hand heuristic, I believe, at the time. We have a whole conversation about how when, if this gets published, we can name it after him because that would make a lot of sense. And then I get a message like two hours later that says, I explained the idea to my dad, he said, oh, that's a quiescent, look ahead.

Starting point is 00:30:14 Yeah, that's used in chess engines all the time. I'm like, okay, that fits. So your dad is a chess engine person? Is that like? He's a programmer, amateur mathematician, That fits. So your dad is a chess engine person? Is that like? Uh, he's a programmer, amateur mathematician, and the world's foremost Petri Net enthusiast. I do not say world's foremost Petri Net expert. There are people who know more than him. There is nobody who loves Petri Nets more than he does, though.

Starting point is 00:30:39 I grew up listening to cautionary tales about how if you're training a genetic algorithm, you've got to remember to include mutations or else you'll get stuck in a local maximum. You learn a lot through osmosis even if you're not trying. And so I still frequently call him up and talk about this sort of thing. And he also attended our Rusty C talk. He's always interested in new ongoing programming projects. That's wild. It's funny to me that you could say like, oh yeah, and then I remember what my dad told me when I was 12.

Starting point is 00:31:07 Like, consider a beam search before you go into a... Yeah, but that is actually the kind of advice you would have given me when I was 12. We, for many years, did project Euler problems together. We're both in the top 0.1% of project Euler solvers, and Felipe can attest to the getting dragged into that as well. He and I have actually solved the hardest problem on the entire site. Took us about a year on and off. But what happens next is maybe as exciting or at least as surprising as solving a hard Euler problem.

Starting point is 00:31:36 This was basically the first and maybe only really fist in the air moment where you're just, you can't believe how well something's working. Our huge beam search, the one that took days to run, that was a width of 25 million. So we were looking at 25 million positions in parallel. So they test their new heuristic. They try it with a width of 100,000. That's 250 times less games in parallel than their big AWS run.

Starting point is 00:32:06 But they have this better heuristic, right? So they're searching through this tree of possible games, but they're navigating towards these longer runs. They're simulating better games. And so even though it's 250 times less games, they get a score of 82 points, nearly their world record. And to be clear, I did not believe this number. Dave told me this number, I was like,

Starting point is 00:32:26 Dave, we have a bug somewhere, that can't be right. It's impossible that this heuristic got that good, that fast. And I disagreed until we ran it for a million. And for a million, it gave us 148 points. Now, I am also convinced that there's a bug in the emulator somewhere. Because after Rust DC, we did a bunch of optimizations to make it faster. And I am convinced some optimization we did caused there to be a bug, and that bug caused the game to be easier.

Starting point is 00:32:56 And it wasn't helped by the fact that we actually did have a bug earlier on in the implementation that we didn't catch for a while. And it had been giving us false games. So the thing to do once again, they need to put together the gameplay. They need to verify the record. We hopped on a call. We said, we really, really should have written that script last time. Why didn't we do that? And Dave very sensibly said, because we thought we were done and we weren't going to do any

Starting point is 00:33:20 additional work and never ever again visit this project again. And I said, yeah, okay, let's make sure we write it next time. So you just got a screenshot, you just got an image that shows where the piece has to go, the shape of the well essentially. But it doesn't show how to get the piece there. There's a bunch that require complex rotations to get past tight corners and all that sort of thing. And the R script did not include any of that because if we'd done that, we would have just

Starting point is 00:33:47 written an automatic replay code generator, which we, after this, we actually sat down and did because that was too much. It takes us about two hours to actually input the whole thing. And mind you, every time we make a move, I'm like, this is clearly where we're going to find the bug, right? Like we're just joking about that every few minutes where it's like, obviously this move is illegal and that's, this is not going to work. And so old record was 86 and this was 148, which was the largest increase in Tetris history

Starting point is 00:34:14 either in absolute terms or proportionally. There is no way anybody will top that. That's beyond any of our predictions for the maximum possible score for this game. It's over. There's no way. Felipe, what happens a week later? Someone gets an even higher score. And this one's kind of on us. Dave has a friend that he does Advent of Code with sometimes. Dave hasn't said this, but he's often on the Advent of Code leaderboard because he's very good at that type of programming.

Starting point is 00:34:44 I blame his Mathematica skills and his desire to be up at midnight solving a problem. Um, and Dave is having a chat with his friend and did you show him the Rusty Sea Talk? Is that right? Yes. He saw the Rusty Sea Talk independently and then he messaged me. Yeah. So this is Tim. They're talking.

Starting point is 00:35:03 They, this says, Hey, I saw your crusty seat talking I actually know how to write rust really well. So I made my own version of the emulator and then he gets 157 points after Dave tells another burden hand heuristic because that's just the type of people we are. It feels like your dad shouldn't have the recommendation for this if you're in a competitive coding contest don't tell the other people on the keyboard your tricks. I can't endorse that because every single time we've worked on a project and we've been

Starting point is 00:35:28 doing something really hard and we've talked to someone in the same field, they've been like overwhelmingly happy to share whatever it is they're working with in amounts of detail that I consider like actual work. Like it was like, here's what I'm doing, here's how I'm doing it, here's what I'm thinking, let me know if you want me to look at your code type stuff. So we take the same attitude. If someone asks us what we're doing, we're going to tell them in as much detail as we can because yeah, it's a competition, but the goal is not

Starting point is 00:35:53 to win the goal is defeat Hatress at its core, right? And like the break the will of the game because that's what we're all after. Dave and I now spend the week sitting on various, um, discord chats as 170 gets posted and then 232 gets posted and we're like we really need to come up with a better way of doing this. Are these all your friend or? This is all Tim. Yep and Tim has a better computer than we do I will I will give him that part. He's also better at Rust than we are. He used to ask Dave hey I see you guys did what did we do? Used a priority queue but you don't need to do that. You can just do this.

Starting point is 00:36:25 And we're like, oh yeah, I guess we could have done it that way. That would be much faster. Yes, thank you, Tim. And so he's tweaking his heuristic. He's changing it. He's posting more scores and Dave and I are, seething is not the right term,

Starting point is 00:36:36 but we're definitely sitting in Zoom chats. We're like, okay, what do we do to like, we can't beat them on like technical skill or on like hardware. We need to do something to change things up. And we got nothing. Like our conversations are like maybe we can do an analysis and do a thing. So do you remember the day that we get was 10 posts um 232 points and we're like oh gosh okay well I don't know if we can break the 200 point barrier and 12 hours later New

Starting point is 00:37:03 Jade shows up on Twitter and posts a score of 289 points. We talked to him, we shared our heuristic and he did something radically new, right? He used an entirely new set of techniques, but basically he posts, all right guys, this is it, this is the definitive score and we say, yeah, okay, I guess we're done working on Haters. So, we started with neural networks and we abandoned it for New okay, I guess we're done working on Haters. So we started with neural networks and we abandoned it for New Jade's heuristic approach.

Starting point is 00:37:29 And Tim also used the heuristic approach. He was just better at it than we were. While we were doing that, New Jade was busy abandoning his heuristic approach in favor of a neural net approach. But the neural net approach was very different. So AlphaZero is designed by Google. It's designed by people who have millions of dollars of hardware. There is another

Starting point is 00:37:49 and objectively better algorithm for neural net learning of chess that was developed by a Japanese chess Shogi enthusiast because Shogi does not have anywhere near the resources that chess and Go do. And so this is called NNUE. It uses sparse neural networks to dramatically accelerate how much effective compute you have. If you have a huge neural network but 99% of the inputs to that neural network are zero, your network is effectively running a hundred times faster than it otherwise would be. And he used this to make a world record-setting Shogi program. And the Shogi program has a name that comes right out of anime. It's called the end of Genesis TNK

Starting point is 00:38:39 evolution turbo type D. That's the actual name of a program. We had never heard of it. This came out in 2018. We had never heard that this was a possibility. So yeah, Nujet documented this approach for the record, which is how Dave and Felipe learned about it. He incorporated David's bird-in-the-hand strategy with his new neural network approach. And this all led to the discovery of the very first loop. And we'll explore loops shortly, but by incorporating NUJAC's insights back into their code, David and Felipe quickly hit an even higher record,

Starting point is 00:39:13 366, taking the record back from him. And then around this time, I pinged them about maybe doing a podcast episode. So in the same way that preparing for the Rusty Seatalk got us talking about the problem again, when you messaged us, that also got us talking about the problem again. And so we started looking not necessarily a new approach, but a new way of analyzing the data we already have. And we actually have a graph of what are called loops.

Starting point is 00:39:46 Now, in theory, if you can have a position in Hatris and then get back to that same position, you could run forever. And to prevent that, the Hatris algorithm has a built-in loop prevention rule, which if it sees that you're gonna get to a well that you've seen before, it'll give you some other piece so you can't get there. New Jade's 289 point record was also the very first record to ever discover a loop, and soon after Tim

Starting point is 00:40:14 discovered his own loop, and soon after we discovered some of ours, and these were moves which if that loop prevention rule wasn't there, would get infinite points. You could just play forever. And so then the question that we've been asking ourselves for the last couple of weeks and what we've been investigating is, how many loops are there? And what's the structure of these loops?

Starting point is 00:40:36 What kind of well leads you to a loop? Because obviously if you can figure out that, you are a very good way towards understanding the game entirely. Because obviously if you can figure out that, you are a very good way towards understanding the game entirely. This shifts Haters from just being a machine learning, a game search problem, to something that David and Felipe are quite good at,

Starting point is 00:40:56 pattern analysis. You can only take each loop once, but if multiple loops start at the same spot, if you can force Haters to send you on a second loop, the game changes. It's exciting to see these ideas sparking off one another. Sharing leads to learning and everyone's growing from each other's insights. Each published idea is building on the last. Our current approach holding the world record is based on New Jade's approach. New Jade's is based on on Tim's. Tim's is based on ours again. Ours is based on New Jade's again. New Jade's approached for 32 points all the way at the

Starting point is 00:41:34 beginning of our involvement. He actually did not come up with the idea of using a beam search for Hatress. That was another programmer by the name of 3pipes. 3pipes in 2019 for a class project decided to write a Hatress solver and he made an attempt at it. He extracted the Hatress emulator from the original JavaScript rather than writing his own, which very interesting approach. He made a Beam search implementation and it didn't get very far. He got a score of 16, 17 points. He finished it for his class. He never touched it again.

Starting point is 00:42:09 He moved on with his life. But he wrote down what he did and he published it online and New Jade found it in 2021. And that's what gave him the idea to start the Beam search that led to his world records and that got this whole thing started off. It's not all about flashy thousand word blog posts. It's just about coming up with some idea, implementing the idea, and then telling people if it worked or not.

Starting point is 00:42:39 Haters does not matter. In a very real sense, this record is completely irrelevant. Nobody but us cares. The concept of keeping a continual chain of knowledge going, of reading what other people have done, and then writing down what you've done differently and what you've done the same and whether it worked, and passing that on for the next person, that's something that's universal I think. David and Felipe are impressive.

Starting point is 00:43:05 It's amazing how they tackled a specific game problem, pushing the boundaries and making unique discoveries. And yet the thing that they wanna make clear is they are not in fact, specially qualified to take on tasks like this. It boils down to, if you're working on a project and you are not the expert on the project, there is a large community of people somewhere,

Starting point is 00:43:28 or a small community of people somewhere who's probably thought about this problem before. They probably put time and energy into this problem before. They're delighted that someone cares. So if you show up and you send them a nice message, you say, hey, I saw your post about, oh, let's say Trackmania. I see you tried this on Trackmania, and we're trying a similar thing and I was curious about your approach. They will reply to you and if

Starting point is 00:43:46 they haven't touched this project in three years and tell you as much as they can about it. And those types of like interpersonal connections with other people across communities working on things ties into the idea Dave just gave but I think it's even more fundamental the fact that like people who care about projects are happy to share their knowledge and if you just seek them out they will share it with you freely, happily, without any particular complaints. And so, yeah, this is the story. This is the story of how two friends who couldn't understand the AlphaZero paper, but thought, hey,

Starting point is 00:44:16 we'll implement it anyways, got to the frontier, got to the coal face of game search trees, or whatever this field is called. They just kept pushing it, even if occasionally Felipe had to check in with his wife. of game search trees or whatever this field is called, they just kept pushing it. Even if occasionally Felipe had to check in with his wife. Oh, she was, she indulges my, I am doing a project with Dave because he doesn't want to hear anymore

Starting point is 00:44:36 about Haters at the Dinner Table. So it tends to be a lot of, are you doing stuff with Dave? Yes, okay, I will leave you to it. Yeah, I don't need the details. On your side, Dave, is there, I don't know what your situation's like, is there any relationship strife ever caused by these projects? No, I can't say that there is. I expect it could happen at some point because I can't see giving up doing these projects at 3 in the morning anytime soon. But I'll cross that bridge when I come to it.

Starting point is 00:45:07 Where do you work David? Like I missed it. Oh currently nowhere. I am looking for work. Yeah the most recent place I interviewed it with, I had five interviews over the course of two months. If you have five interviews including one where you get a full site tour and it's a three hour process and everything, you really are getting your hopes up even though you're telling yourself not to get your hopes up. And they said basically, hey within a week we will get back to you one way or another on the final offer. And that was three weeks ago. I've emailed them back just never heard anything. So that's why we do this sort of thing to distract us.

Starting point is 00:45:42 At least that's why I do it. I can't talk for Felipe. Beyond hitting 366, Haters offered David and Felipe a rare escape. Right? It was a place to bond and experiment and swap stories late into the night. I feel like there's something deeper here. The reason we chase these seemingly trivial puzzles, I've been known to do it myself, maybe not quite as far. But the reason we can pour months of our lives

Starting point is 00:46:08 into side projects, into a JavaScript game from 2010, is because there's something fundamentally human about this process of striving. It's about curiosity and the joy of discovering and about the thrill of pushing up against the boundaries of what seems possible. And sometimes it's simply about finding a good excuse to crack open discord late at night and tackle something impossible side by side with a very close friend.

Starting point is 00:46:44 That was the show. Thank you Felipe and David. I hope your quest continues. After the interview, I got to hang out with them for a bit and I was shown some of their unpublished work on loops. You know, David showed me Mathematica map of these loops and it looks like some demented subway map, but they're working on something that uses that. A new way to break ground. I'm sure if they do, they'll post the results on their blog.

Starting point is 00:47:09 And by the time you hear it, it might already be up. So check out all of Impossible Dreams dot org to see their latest projects. Sometimes it's hatred, sometimes it's the dead Internet theory. Sometimes it's Superman fan fiction. There's always something they're sharing and something they're working on and in my short time with them I saw how unique they are that they're both very different people But in a way that makes them greater than the sum of their parts And if you're listening this far I'd be remiss to not mention that you should join as a supporter of this podcast. I know there's somebody out there working on a side project who needs to hear the story of these two.

Starting point is 00:47:53 Maybe it's you. But stories can be powerful and stories can be something that sticks with you and allows you to learn from somebody else's experience. So that's what I'm trying to do here. If you enjoy it, support me, or just listen, that's why I always end by saying, until next time, thank you so much for listening.

Your Ad Here

CoRecursive: Coding Stories - Story: Hateris - Obsession, Friendship, and World Records

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.