Algorithms + Data Structures = Programs - Episode 214: Advent of Code in BQN (vs Python)

Starting point is 00:00:00 I solved that with recursion, and you solved it in BQN with generating the permutation of the functions plus and multiply, and then applying that to the list of numbers. Inverse of indices. That is a BQN idiom for basically giving you the histogram of a list of numbers. Welcome to ADSP the podcast episode 214 recorded on December 16th, 2024. My name is Connor and today with my co-host Ben, we discuss the different algorithmic approaches to solving advent of code problems in different programming languages. Anyways, how's it going? We're getting close to the holiday season. Yeah, Christmas is really coming up fast. Yeah, it's incredible that we're already halfway through december um oh my god nine days away yeah i didn't yeah i didn't even realize it's really coming up i did i did all

Starting point is 00:01:10 my present wrapping yesterday so that was done uh i put up my tree did christmas decorations oh yeah you i guess yeah you're living the life of uh someone who owns a place and can fit a tree yeah i suppose so. I haven't hit that stage of my life yet where there's technically a tiny Christmas tree sitting on the dining table. Cool. Well, your background looks great. Well, yes, thanks to Google and their virtual, I don't know, technically, this is an augmented reality.

Starting point is 00:01:41 What would you call this? It's just a virtual background. Yeah, I suppose. Yeah. Anyways, how have things been? It's been a couple of weeks since we chatted. Good. Yeah.

Starting point is 00:01:52 Nothing much to report. I wanted to spend this time to ask you some questions about APL, about AOC. Yeah. Because I thought that might be interesting i think that's we were we were initially that was the topic that we had slotted for the astute listener i think we referenced it at the end of the two-part board games episode that we were intending to uh have a conversation about apl where you as a c++ dev asked me some questions we we failed to talk about that but now we're back with uh you know we haven't been derailed yet and now that we're in

Starting point is 00:02:31 the middle of december we might tie in some advent of code uh stuff as it goes so yeah so i've i've watched your advent of code uh code report videos uh i think all of them up to this point um i've reached our what we're up to day 16 or 16 today and you have probably 10 videos i think maybe 9 10 something like that something like that i have finished all of 14 days i read day 15 and just I've been super busy the last couple days and I've made I think it was day one through six or seven then I skipped to day 10 and once I have some more you know there's a couple holidays coming off so I plan to make some more videos but there is yeah close to upwards of 10 I think and I'm doing them all in bqn which is apl 2.0 if you want right right um so i've been doing a few of them i've done the first three event of codes and days and uh i did day 7 and i did day 11

Starting point is 00:03:36 and uh i've been noticing and also from previous years of you you know, Advent of Code, the first days are generally easy. And I think the first days lend themselves to 10-minute videos, right? And in particular with Advent of Code, some of the early ones are fairly algorithmically explainable in 10 minutes, right? And your BQN videos reflect that somewhat. So I wanted to ask you like i so i noticed um a couple of the days um on day on day 10 you use recursion on day you know like

Starting point is 00:04:17 as we get later on in advent of code what what i think of as BQN or APL, I think of it as a language that is really good at highly mathematical things like arrays of any dimension, just filtering, reducing, all that kind of stuff. And reading through some of these problems in Advent of Code, a couple of them I solved in my my approach to event of code i have in previous years done it to ostensibly to learn languages like when you're i did it in haskell um when you're i did it in orc these days i think the best approach to event of code is to for each day find the language that fits the find the tool that fits the job. Right. So there are a couple of days, like one of the early days was like, Oh, totally.

Starting point is 00:05:08 I did it on the command line in, in a one liner. I think I sent you that. It was like three or so. Um, one of the later days, uh, which one was,

Starting point is 00:05:17 Oh, the, the, the day where you had to, uh, it was day seven, I think, um,

Starting point is 00:05:23 where you had a number and, and a bunch of numbers that you could put plus or multiply in between them to make and and the question was for every line of numbers can you can you make the sum oh yeah by combining the others with plus and multiply in any of the places i solved that with recursion and you solved it in BQN with generating the permutation of the functions plus and multiply, and then applying that to the list of numbers. That was fine. I'm interested to know what you did for day 11,

Starting point is 00:05:59 because day 11, I used Python, and the key to day 11 was dynamic programming, was memorizing one of the functions. And in Python, that is super simple. You just put a decorator up, which is why I thought Python was a good tool for that day. So I'm interested if you've tackled day 11 in BQN and what you did for that. So let me refresh my memory luckily i have the vs code instance so i can just type day 11 so this was oh yeah this was the one where it up as well here it's kind of one of those like

Starting point is 00:06:39 it's similar to is it the collapse conjecture i might be totally off where it's like if it's even you divide by two if it's odd you multiply by three and add one or something like that uh i can't yeah well it's a sort of a finite automata in one dimension right it's a finite automaton where you look at the value in the cell and you do something and one of the things you can do is split the cell in two and so it becomes this exponentially growing thing yes yeah and then parts a and b and so i think the rules and i'm doing this just by looking at the code is that if your num so you're get you you're given an initial list of numbers and there's i think I think, three different rules. And I'm just reading. I'm translating for BQN code.

Starting point is 00:07:26 So this is a test of how easy is it for me to read BQN code. Yeah. And I can actually, I'm not sure if, I can share my screen, and we will hopefully not completely destroy the listener's ability. But, you know. Okay. If I get to. So I am very much like probably one of our regular listeners when it comes to reading bqn and apl i i've read iverson's paper on it um but you know connor here has

Starting point is 00:07:55 written a master's thesis about about combinators and uh the last time i have never used APL or BQN even really to solve an Advent of Code problem. I've read Raymond Smolian's book and that's it. So please explain. Yes. So we'll do our best to not lose both Ben and the listener here. So I think hopefully we'll be successful on the first one because you can just stop me at any point. But if we lose the listener we do apologize so i mean there's three rules that we're going to walk through in a sec which is all in

Starting point is 00:08:30 this blink function i typically skip the first three or four paragraphs of every problem because it's it's it's unessential inessential to solving the problem but it had to do with like every time you blink this list of numbers is going to morph. And part A is just, I think, is it what's the sum of the numbers after 25 blinks? And the second one is after 75 blinks. Right. And we'll come to why it becomes tricky because basically the size of your list is going to explode. Yeah. In typical Advent of Code style, we've both seen Eric's keynote.

Starting point is 00:09:07 He gave a keynote at CP North last year, and I think he's given that same keynote in a couple of conferences where he talks about Advent of Code. Very worth a watch. And one of the things he says is Advent of Code tends to contain these exponential-style problems because one of the interesting things is you don't want to make a problem where someone can solve it in C++ and not in Python, for instance, right? So the language doesn't, you know, the cost of doing these exponential part twos is too big,

Starting point is 00:09:40 even if you're using a compiled language without finding, well, I mean, these days you might have like a whole farm, a whole cloud, you could pay lots of money to do it. But it does, I'm going to say for this problem, even in BQN, like you can't brute force it. And if you think you're brute forcing it, probably that BQN internally has something which is working to your advantage, right? The reason I say sometimes is I think, and I've been rather surprised with this year's Avenue of Code, which is 2024, in case you're listening to this years in the future. I think only one or two of the problems, I actually had to do something. And this was actually one of the problems where my first solution, I couldn't just increase

Starting point is 00:10:22 the number or change the thing. Like literally, I can't remember which day it was. I think it was the day where you had to find the number, the trail heads from zero to nine increasing by one. That was a single character difference in my solution. Right. Because it was just, can you get to it and how many different ways? And I was already deduplicating.

Starting point is 00:10:43 I was already calculating all the paths and then just like looking at the position and so the solution was just delete the deduplication and count how many ways did you get there and and deduplicate as a single character in apl so i just deleted it was either adding it or deleting it and i was shocked because usually you expect like oh i hope i did this it was removing it yes yeah in preparation for talking to you, Dad, I watched your code report videos last night. So, yes, it was. Maybe we can get to feedback afterwards. Well, you have a bug in one of them, which we'll come to.

Starting point is 00:11:17 Oh, well, it's a bug that, well, we'll save that for later. So, anyways, we're on day 11, part A, part B, 25 iterations versus 75 iterations. And you're given a list of numbers, if I can get this tooltip to disappear. And here are the rules. So this blink function basically is going to be what performs this action. So it says here if it's equal to zero. And then you're just, this is basically a ternary expression. Yeah.

Starting point is 00:11:44 So you've got a, you've got a one, just to set up, you've just this is basically a ternary expression yeah so you've got a you've got a one just to set up you've got a one dimensional array of numbers right and for each number each cell in that array you are doing this operation it's like a finite automaton that so the the array changes with every blink and each cell of the array does something to change yeah and so if it's equal to zero you change your number to one that's the first rule if it i and so what this reads is if mod two equals zero so it's divisible by two uh you just want to uh what does this say you want to take or no so it's if the length i missed out the first part of this if the length of the system function repper which means convert

Starting point is 00:12:25 this to a string is divisible by zero so if you have an even number of digits if it's decimal representation has an even number of digits yeah then what you want to do this is the second uh operation is you want to what i have written in my code is uh i'm i'm doing sort of some redundant work here i'm converting it to a string twice So technically I could have assigned this to a local, but for the purpose of this, it didn't really matter. So you're converting it to a string and then you're cutting it. And what does a cut mean?

Starting point is 00:12:54 A cut means to basically slice it in two. So I'm making use of a very beautiful thing in BQN where you give it a shape, which is a two. And then I'm not actually filling in the final dimension and this is just saying give me a two row matrix with the correct number of columns that will split that so you're going from a rank one array aka a vector to a matrix of two rows which is going to basically have the two digits in each row and then I'm calling a function called parse float, which takes a string and then converts that to an integer.

Starting point is 00:13:28 And I'm doing that on a row wise basis. So I'm basically taking this matrix of characters and treating each row as a string and then converting it back into an integer. So this is just a very cute way of converting a number with an even number of digits into two, uh, two numbers. So in, in English, you're, you're cutting the number in half textually.

Starting point is 00:13:53 So 1,234 becomes 12 and 34. Yeah. For example. And it's quite nice. And in fact, actually, you can do this a little bit slower with the system BQN function, which just takes any arbitrary bqn expression and evaluates it but i can't remember if it was this day but one of the days i ended up running into like uh runtime problems and i discovered by looking at other folks solutions that if you know your expression is is just an integer or a float literal it's way faster to use this parse float because that's going to be uh much simpler simpler than evaluating just any arbitrary bqn expression anyway so condition one is zero

Starting point is 00:14:32 change it to one condition two even number of digits split it in half and condition number three which is just the else case so there's nothing to check for you multiply by 2024 which is just okay so there are three things you can do for those three cases. And it's the splitting one. It's the cutting the... It's the one cell becoming two that, as you can appreciate over time, makes the array grow and grow exponentially. Yeah.

Starting point is 00:14:56 And for part A, I don't actually think I even needed this step function. So I have this extra step function, which uses blink. But for part A, I basically just called blink with the repeat operator which is uh i don't know if it's what do you call it morally spiritually equivalent but it's it's kind of like a for loop without the indices you just pass it a function and you perform that function that many times right Right. So it's like Haskell, is this like Haskell's iterate? Or it's the general, I think if it is, so it's like the scan of it, if you like, the repeat operator is like an accumulation.

Starting point is 00:15:38 But if that were expanded out to a scan, it would be, you'd be getting X, F of X, F of F of X, F of F of F of of x just so happens that the repetition is giving you the end of that sequence so you're you're getting f of x apply f applied to x 25 times iteratively yes there is definitely a asymmetry between scan and repeat except i would compare this more to a reduction because scan outputs the intermediate state the key here is that there's only one input to your function um i think of it as the generalization of iota iota is repeat with the function being plus one right basically you have a start state and a unary function and it applies that unary function

Starting point is 00:16:26 n number of times in this case 25 right uh whereas like reductions and scans they technically take new not state but like they take a new value and have a binary operation where they're taking the state that it's carrying along plus the new information typically in the form of a list or something but yeah very similar to the haskell i can never remember there's iterate and then there's another one that are both somewhat similar but yeah so anyway you iterate this function 25 times you repeat it yeah that gives you the array after 25 blinks yeah and you'll you'll see that there's a bunch of what we'll call a ceremony around this, like this tack reduce. This is a two-character way of spelling last. And that's because of the modifications to this solution that we had to do for part B.

Starting point is 00:17:19 So part A was much simpler. It just had a blink function, repeat 25 times add those up you're done it didn't have this extra tack reduction which is uh like i said a two character way of spelling last because right so because tack is take the thing on the right exactly yeah and so if you reduce that over an array you get the thing on the far right which is the last thing for it and for uh in this case we just have a two-element array. It's a painful way of spelling like second, S-N-D in Haskell, or even dot second in C++ on your pair.

Starting point is 00:17:55 And it's actually one of my largest complaints. If I had to make a list of the top ten things about array languages in BQN, I don't actually think any of them have a last primitive. They all have first, but they don't have last. And I think that there's a bunch of other primitives that have been added that are used way less frequently than last. And you always end up, especially when you're building up forks, which are things that are beautiful when they're symmetric, you have a single glyph for first,

Starting point is 00:18:22 and then a two-character gly glyph not even glyph a two character expression that is really the application like this is kind of a trick like the tack reduction the first time you see this it's like what what is that because it's it's not the intuitive way to get the last thing you might want to reverse and take the first thing of that that was how i used to spell it when i started programming in APL. Is there a performance implication here? Is that like, because first is, you know, constant time. TAC reduction sounds like linear time. But would it be possible to have a primitive that returns last in constant time?

Starting point is 00:18:56 Or does something happen in under the hood to make it? Yeah, most, if not all, the array implementations that I know of have an idiom recognition for this. So it's a constant time operation. But if it didn't have idiom recognition, it would be linear. And it still would be preferable to the first on reverse because that's actually going to... Because that's moving things around as well as, yeah. I think actually I should say that CAP is lazy. It has lazy evaluation built into it.

Starting point is 00:19:26 So in that case, if you do a first reduce, it's actually going to be the fastest because it's going to perform some kind of view transformation similar to C++ 2023 views. Reverse it and then just it needs the first and it's going to short circuit right there. So arguably CAP actually has the constant time without hitting recognition. But to get to part B, the modification that's made, and I actually borrowed this. You can see that I have this heavily inspired by J.H. Franklin is the GitHub username. Because I wasn't actually sure what to do here. And it has this very cute trick that uses one of my favorite bqn idioms that basically it does a deduplication and stores your current list in the form of a the array equivalent of a hash map or the counter collection

Starting point is 00:20:11 in python where you keep track of the number of values and the number of times that value shows up and so every single time it's like a histogram operator exactly exactly and and so this the the two characters uh because i skated over it without explaining for the listener uh if you have the inverse of indices that is a bqn idiom for basically giving you the histogram of a list of numbers so indices if you're given basically in the simple case, zeros and ones, it'll give you the indices that correspond to the ones. So for instance, if I have the sequence and BQN is a zero index language. So if you have one zero one zero one, and you perform indices, which is just a slash on that, it'll give you back zero, two, and four, because

Starting point is 00:21:05 those are the indices that correspond to ones. However, that's the subset of what indices is. Technically, it will give you two of every index or three of every index, or in general, n of every index, where n is the number that corresponds to that index so if i changed my sequence from 10101 to 10203 i'd get 022444 if that makes sense okay okay so some people refer to indices actually as copy because you you can think of instead of taking the indices that correspond to ones you're copying the indexes n times for the n that corresponds to that index right so the oak so this is now making me think of the the disk defragmenting day for event of code which was like the input was like an rle string sounds very similar to to kind of what indexes might might do so in fact i can't remember the name of that but almost uh surely that is what i did i think i used a indices or replicator copier whatever you want to call it

Starting point is 00:22:22 to create that if you're if you can basically take that it'll auto generate um i think it's a little bit trickier because i solved that problem incorrectly i i didn't realize you could have more than a single digit number and that totally breaks right like you can you can have a two digit number which means you can't just look at the individual things um and so now that i've explained what indices does, indices inverse, or technically undo is the operator in BQN, its technical name, it will take a list of numbers. It technically does a little trick for this idiom, and it sorts it first. And then it gives you the quote unquote mask that would give you that list of numbers so if if i have the list of numbers zero two two four four four and imagine that those could be rearranged so it's the inverse of

Starting point is 00:23:16 indices so it gives you one zero two zero three exactly which is yeah like right there that is probably we lost a few listeners but like this is two characters on screen and inside basically indices like we have i like to think of it 90 of the time as just the boolean mask version it's it's it's your equivalent of a filter and actually technically i think i'm i've totally messed up my explanation of this. The monadic version is the one that we've been talking about where you just apply it to a list of numbers, and then you can invert it to get the histogram, which is basically how this problem starts out. So this is all building up to the fact that whereas for when I was just solving part A, you can just work on yourarchitect the solution a little bit, add this step function that is this boilerplate of every single time you perform the blink function and on the initial input. You basically need to, like, histogramize each iteration. And then that's why for part A and part B now you have this last because you're needing to get the counts of these. Actually, wait a second.

Starting point is 00:24:46 How does that? That actually doesn't make sense, right? Because you need to multiply them in order to get the total sums at the end of the day. Is that what? Did part B, what did it actually ask for? What it's asking for. Yeah. This is day 11.

Starting point is 00:24:58 Part B. Or rather. As Ben and I are both reading the. Oh, it's asking for the length of the sequence. Okay. Yeah, yeah. So that's why it's getting the last because i was like i was going to say if this was supposed to be the sum of each of the numbers this would be wrong uh but what it's doing is it's getting the counts which is technically the value in these key value pairs and then just summing up the counts uh which then it is correct so uh anyways i'm not sure and then also too like

Starting point is 00:25:28 this is i'm not going to walk through this but in the step function i tacitized aka like i made the whole expression tacit probably to the it harmed the readability of this because there's a there's a lot a little bit i don't want say line noise, but to someone who's not used to BQN, this is not readable. I can see a few things in there. I can see a reduction. I can see some text. I can see some ads.

Starting point is 00:25:56 So, you know, it's definitely recognizable, but it's typical as to what the functions end up like in your videos, I think we could say. Well, I mean, sometimes I'll make something tacit and then after the fact be like, ah, this is not more readable. But I went through the process of making it tacit, so I'll just leave it there. Sure, yeah.

Starting point is 00:26:20 I mean, it's advent of code, not production system stuff, right? So I'm not sure. What are your thoughts after having been walked through? Is this... It's interesting, you know, because it's very different. So part A, you know, Avent of Code problems are set up like this. So that in part A, you can implement the naive solution, right? And the naive solution is just to apply the automaton function to every cell

Starting point is 00:26:53 and just expand the array, and 25 steps later, you count how many you've got. And that, I think, is originally what you did and what I did in Python and what probably everyone did. But then part b requires you and you know there's a sort of thing here where if if part b is just a one-liner we perhaps we should be afraid yes because because it's like i did this i did not set myself up for success for part b yeah and part b you know you'd be waiting years if not centuries with a naive solution um even if you had the amount of memory required so your approach to part b in bqn was very different to my approach to part b and i don't want i don't know what Eric was going for, but I suspect the approach Eric was hoping to lead people towards was the dynamic programming approach, the memoization approach. And it's interesting to see that this one was attackable by BQN in this different way, by changing the representation of the output right i mean this it would have been

Starting point is 00:28:07 really easy to do this in python as well because the counter collection is like a six character way to spell a lot of what was built out here okay okay well how i did it in python was to add uh a memorization decorator yeah the function that generates the stone expansion. And curiously, that was actually my initial thought as well, is that caching and memorization is the solution to Part B. And I actually went and there is a BQN leaderboard of sorts that folks that want then can submit their you know github repo and it tracks who's doing them so you can go and take a look very easily at other folks

Starting point is 00:28:51 solutions uh for inspiration and i would say it was 80 20 uh 80 being folks that were making use of memoization using something i had no idea existed There's a system hash map that you can reach for if you need it. It's very, I would say, un-array idiomatic in that you're using, updating like a hash map and methods on this kind of object. And so you can basically do what you did painfully. Like there's no decorator you can call in P qn like oh right it memoizes you have to go and manually do the oh i've got a a hash map of already pre-calculated stuff and whenever i see that something's in there there's no automatic memoization in c++ either i think i saw some folks like um sy or tristan have been have been have been putting out their solutions on Blue Sky.

Starting point is 00:29:46 And for the days like day seven, where memorization was important, or at least a sort of obvious way to do day two, we might say. Yeah. Yes, they've had to pull in things like hash maps and things. Yeah, exactly. Or unordered map, whatever it might be. So it's interesting. You said like some of these days require you to do things in BQN which are tricky.

Starting point is 00:30:16 And also in C++. Which is why I've come to the, like we said at the beginning of the conversation, which is why I've come to the idea that to begin with if you're doing aoc at all there is there is no there is no cheating there is no like you know like any route to the solution is valid there's no there's no gatekeeping here there's no you must be a purist do it in this language this language whatever in my view any route to a solution is valid i mean you're not learning if you're looking up someone else's solution except that you might be learning even in that case right like if you if you just can't think of a way how to solve like a part b

Starting point is 00:30:55 and you look up someone else's solution you learn from it that's valid too when you sit down and do bqn do you feel there are other days where you did there's no there was another day where you did a there was a loop you actually wrote a loop in bqn a while loop and uh i'm looking at that code and i'm thinking that doesn't look like idiomatic bqn that's not the kind of thing i normally see on connor's videos so you know and that's that we could we're not picking on bqn here because it's the same i think for every language like like dealing with regular expressions in C++ is painful too. Or, you know, we could pick random days and say this is really easy in this language, this is hard in this other language. So the question behind this observation is like, what is BQN for, in your opinion? What is it good at?

Starting point is 00:31:44 What does it solve well? And what is its typical application? Be sure to check these show notes either in your podcast app or at ADSPthepodcast.com for links to anything we mentioned in today's episode, as well as a link to a GitHub discussion

Starting point is 00:31:55 where you can leave thoughts, comments, and questions. Thanks for listening. We hope you enjoyed and have a great day. I am the anti-brace.

Algorithms + Data Structures = Programs - Episode 214: Advent of Code in BQN (vs Python)

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.