CoRecursive: Coding Stories - Story: From Competitive Programming to APL

Episode Date: June 2, 2021

Today on the show, we have solving algorithmic programming problems. You know when you interview for a job to write CSS and they ask you to reverse a binary tree on the whiteboard using C and in constant memory space? It's that kind of thing. These problems have their roots in algorithmic programming contests. And our guest, Conor Hoekstra, is a former competitor.

Transcript
Starting point is 00:00:00 Hello, and welcome to Co-Recursive. I'm Adam Gordon-Bell, the host. Each episode, someone shares the fascinating story behind some piece of software being built. And today on the show, we have solving algorithmic programming problems. You know, when you interview for a job to write CSS and they ask you to reverse a binary tree on the whiteboard using C and in constant memory space, it's that kind of thing. These problems have their roots in algorithmic programming contests, and our guest is a former competitor. My name's Connor Hoekstra, and I currently am a senior library software engineer
Starting point is 00:00:34 working for NVIDIA on the RAPIDS cuDF library. Connor's story starts in university, when he was competing in programming contests. But it's interesting how if you follow a single idea long and hard enough, you end up in interesting places, like writing software for GPUs. For Connor, that idea is finding the perfect way to express the solution to a programming problem. His journey will take him into LeetCode, into getting his dream job, and all the way to APL, which is one of the more unusual and obscure programming languages. But I think I'm getting ahead of myself. This story starts in 2012, when Connor was still in
Starting point is 00:01:10 university. I wasn't a computer science student at the time, but I was taking some computer science classes. And I just happened to walk by the computer science lab. And there was a poster on the wall that said competitive programming contest, free pizza, you know, meet Saturday at lunch. And I like competing in things. And it sounded fun. I had really no idea what it was about. So I showed up. So I studied at Simon Fraser University, which is in Vancouver, BC. It was we had a bunch of computer labs there. So it took place there. If I recall, there's probably about 20 to 30 people that were competing. And you basically just sit down at a Linux workstation.
Starting point is 00:01:49 They give you a couple instructions. You code in any of a list of languages, but most of the popular ones, C++, Java, Python, et cetera, are all usable. You code up your solution from start to finish. This was a qualifier for a contest called ICPC, the Intercollegiate Programming Contest. You have to code like a whole text file from start to scratch. So if you're doing it in C++, you have to include all the include statements. You have to have a main function. And typically they give you a specification as your input will be one integer that represents T, the number of test cases, and then, you know then five lines per test case where each line represents the data
Starting point is 00:02:26 that you need to solve it. So you have to be responsible for reading in that data and then outputting to whatever is standard out to solve it correctly. Then you submit the file. Like it's a single file, you submit it, and then they compile it and they hit it with test cases through standard in.
Starting point is 00:02:42 Yeah, so typically they'll give you two or three test cases. So they'll give So typically, typically they'll give you two or three test cases. So they'll, they'll give the problem statement. They'll give you two or three simple ones that you can test to see whether you've gotten the basic cases, right. And then you submit a dot CPP for C plus plus, or a dot P Y for Python. And then they're going to, it's going to run against a bunch of tests and you might get back a failure. So if you've missed some corner case, maybe it was an empty list or an empty string that you didn't take care of. It's not going to tell you what test case you failed on. It's just going to tell you that you failed and you have to go and stare at your code and figure out, okay, what's the corner case I missed and then resubmit once you think you found it. And
Starting point is 00:03:18 then finally you'll get like a pass and then you move on to the next problem. And you basically, the way competitive programming works is you have a set amount of time. For this contest, it was two hours. You have a set of problems. I think there was six or eight of them. And the goal is to solve as many as possible, as fast as possible. I, in my first attempt at this, didn't know how to actually submit the problem. So I solved, I think, three or four of them and didn't understand that there was a time penalty, like the longer you waited. So when there was 10 minutes left, I went up to the individual, the prof that was running the contest and said, Hey, I'm, I, I'm being unsuccessful with
Starting point is 00:03:52 submitting these. He sort of looked at me strangely, like, that's, that's the whole aim of this. Like, what are you talking about? So you haven't like, he said, how many of you submitted? And I said, well, I've solved four, but I haven't submitted any. And so he just sort of shook his head, came over. I ended up submitting a couple of them, but then ran out of time. But luckily, there was a second qualifier. And so I sort of went back to my dorm room, figured out how to actually compete in these. And then in the second one, I did well enough to get on to one of the university teams. It was like the B or the C tier team.
Starting point is 00:04:23 So you joined the team. Like, were you excited? the university teams. It was like the B or the C tier team. So you, you joined the team, like, were you, were you excited? Did you have to like meet the team or what's the process after you made it onto the team? So yeah, once, once we joined the team, um, it was three members and we basically had practice contests once every couple of weeks. Um, and this was just over a semester. And then we, we ended up all three teams, like the A team, the B team and the C team ended up going to the regionals. And we didn't end up doing that well, because we were we were competing with people that have been studying, you know, their whole university careers at this just practicing at competitive coding, because it's one thing to
Starting point is 00:05:02 get past the beginner level, it's just solving simple algorithmic things the intermediate level you need to know basically like general classes of algorithms but to get to the super advanced level you basically are like memorizing all these custom different like searching uh algorithms and so you'll read a problem and know like oh that's the the floyd insert name insert name specific algorithm that you need to use uh and that just takes time and energy. And you're actually allowed to bring binders of pre-written code into these contests. So as long as you can recognize what algorithm is needed to solve it,
Starting point is 00:05:34 a lot of times it's just time and energy of studying these esoteric algorithms that otherwise you probably wouldn't end up needing to know. So you'll have a binder with all these tabs on it, you'll be like flip to that and start just typing that in. Yeah, yeah. And typically I see or not typically the way that ICPC works. Once you're in the regionals, you have teams of three people and one computer. So the way that it's typically working is that you've got one person coding up their solution that they've already handwritten and solved sort of on paper. And so two people are handwriting their code on paper and then you swap out to type. So you basically need to be able to solve your problem without a keyboard. And a lot of the times when
Starting point is 00:06:18 you're doing that, if you can just pull a piece of paper out and say, oh, this is the algorithm that I need, you can just scribble some notes around it for reading in and reading out or writing out your answer and you're good to go. The algorithm can't just be correct. It also has to be performant because every test case has to be handled in two seconds of CPU execution time. And for simple problems, it doesn't matter
Starting point is 00:06:37 because the input size is not that large. But when you get to the harder problems, one of the first things you do is you look at what's the size. So it's a 10,000 array. You know that basically you only have one times 10 to the eight operations. So if you've got one times 10 to the five elements in your array, a quadratic algorithm, you know looking at the size of the input and automatically like ruling out some brute force quadratic algorithm, which is sort of a neat part of the problem, too, is some people say, oh, how come you don't just do this? And it's like, well, we know that that's not going to work. And that leads to, I think, 90% plus of folks are using C++ because in those problems, like the difference between using Python and C++ is going to be an accepted solution. So I guess it would give you a neat insight into complexity of solving problems because you have practice that sort of cutting things out just based on knowing that it won't meet
Starting point is 00:07:34 this barrier. Yeah, it's interesting because that doesn't typically come up in the real world where you know you have two seconds to solve something. So then, oh, the algorithm I'm choosing or the code that I'm writing is going to be completely driven by time complexity. Typically, you write the code and then you optimize it later if you find that something's too slow. You're not starting with time complexity and then working backwards. But it definitely is a good way to learn the difference between linear, quadratic, NLogN, stuff like that.
Starting point is 00:08:05 Connor's team went on to compete at the Pacific Northwest Regionals. The regionals take place in a large computer lab. Everyone is huddled around computers. And what they do is they have colored balloons for each problem. So like problem A will have a red balloon. Problem B will have an orange balloon. And every time a team solves a problem, they come and bring these helium-filled balloons and tie it to your computer.
Starting point is 00:08:27 So you can look around the lab where the live contest is happening and see just by how many balloons are hanging above someone's computer, basically who's winning. So that whole part of it, I thought was super, super cool. But then, yeah, so after that one regional contest, we definitely, I don't think any of the SFU teams,
Starting point is 00:08:44 I think probably UBC and Stanford, they're typically the two teams that do the best. Sometimes Berkeley pops up there. And after that, yeah, that was basically sort of the end of my university competitive programming career. Five years later, competitive coding long behind him, Conor got invited to a job interview at Bloomberg for a software developer role. And at the time I wasn't a software engineer per se. I was a programmer that used code at my day job as an actuary. So I worked for an insurance firm. Bloomberg ended up reaching out to me and I said, sure, I'd love to interview, even if it's just, you know, just to learn what it's like. And they said, okay, we're going to
Starting point is 00:09:24 do our interview on a site called HackerRank. They've got this sort of side site called CodePair, which enables sort of this live streaming, sort of like a Google Doc, but specifically with an editor. And you can you can actually test your code live. So you write a function and then you can test it on the sample input. And they said, if you're not familiar, you know, check out HackerRank. Here's a couple of problems. And that was the first time I had heard of HackerRank. So I ended up solving a few of the problems, thought it was awesome. And I was like, oh, yeah, this is just like what I used to do five years ago and then ended up just absolutely bombing that interview.
Starting point is 00:09:56 It was so bad. Definitely one of the worst interviews I've ever had in my life. What happened? We were doing CodePair. I don't remember the exact question. I'm pretty sure the answer to it was using std partition, which is just a single liner from the standard algorithm library, which basically partition means something different in every language. But in C++, you'd give partition basically a container, a sequence of elements, and a unary predicate.
Starting point is 00:10:25 And it puts all of the elements that return true for the unary predicate at the beginning, and all the ones that return false at the end. So if you have the numbers 1 to 10, and you do a partition with is even, the even numbers 2, 4, 6, 8 will all show up at the beginning, followed by 1, 3, 5, 7. So I'm pretty sure that was what the answer to the question was. I didn't even know what a partition was at that point in time. So I wasn't going to be writing that down. But it got to this point where I needed to know how to, like, I think swap two iterators or something like that. And I asked if there was a built-in function for swapping two iterators. There is. It's called iter underscore swap. But at the time, I didn't know about it. And the interviewer said, yes,
Starting point is 00:11:11 there is. And I said, do you mind telling me what it is? And he said, no. And I'm pretty sure this was just the warm-up question. And I stumbled my way through it, but that was sort of the end of the interview. And just, so it was just a lack of my knowledge of the standard library. It was just sort of eye-opening too, because I was not used to the classic big tech company where they just ask you like algorithms and data structures questions. Like every single company, Google, Facebook, you name it, that are getting thousands and thousands of applicants. They basically have a standard format where they phone screen you.
Starting point is 00:11:49 If you do well enough, you go to on-site. And then on the on-site, you're going to have four out of five of the interviews purely being on data structures and algorithms. And they don't care what language you do it in. But yeah, you have to get good at answering these sort of questions that aren't going to ever show up in your day job. You're going to be being hired to write some front-end code that doesn't make use of this stuff, but that's how they test you. So I basically went on a journey of over a year just trying to get really, really good at what they call leak code problems, or aka competitive programming problems.
Starting point is 00:12:22 I had a similar experience at some point. I interviewed, I don't know how long this was. I suspect I'm older than you, but I interviewed at Stack Overflow, you know, the like the website where they tell you answers to questions. And I wasn't expecting, yeah, like one of these algorithmic questions.
Starting point is 00:12:37 And I just bombed out. And the guy who was doing it was, he was weary of the world. Like he probably had like four of these a day and was barely paying attention, right? He was the first guy. They probably have this giant funnel where most people never make it past that first stage.
Starting point is 00:12:55 And he was like, he barely had effort to tell me, yeah, this isn't going well. We're just going to hang on here. I'm going to say no and then I'm on to the next meeting. But I found it like super painful to be rejected like that. Cause I was, I felt like I was a good software developer. Just not at that. I mean, did you have that kind of response? I honestly can't recall exactly how I felt. I think probably in the moment for me, it was a bit demoralizing
Starting point is 00:13:25 because you're so excited about this opportunity. At the same time though, like my perspective on myself was I wasn't really a software engineer. Like I didn't even, I didn't even think that I qualified for this interview. I was just taking it because I thought, ah, what the hell, what's the worst that could happen? And then, you know, probably the worst that could happen did happen. And then, you know, probably the worst that could happen did happen. Um, um, but more of my thought was that I had gone through a few of the hacker rank questions and I knew that like, that's something that I could be extremely good at. And I, I saw the interview format and so like, it sucks to bomb, but, um, I knew it's just a matter of time and energy. Like those types of questions compared to what you have to do in actuarial science
Starting point is 00:14:08 and these crazy five-hour exams that they make you take, that seemed a lot easier to do. It's just time and energy that you need to put in. And there's a lot of people that will tell you that it's just disheartening that like it has nothing to do with your day job and it takes time and energy and people a lot of the times are like,
Starting point is 00:14:24 well, why do I have to go and learn this sort of extra skill that is passing these interview questions in order to do something that I already know how to do? I completely agree with that. Like it's a broken model, but I feel like it's the same quote about like democracy is the best, worst form of government or whatever that we have. Like, how do you, how do you do a detailed interview process that is not prejudiced against, you know, certain demographics of people? And that sort of, it's quasi-standardized, but also looking at more than just your ability. Like, it's a hard problem to solve. And this is the best solution that these big tech companies have at the moment. Do you know, like, the concept of signaling, like from economics?
Starting point is 00:15:05 I do not, I don't think. Economists sometimes talk about whether university, whether it's actually valuable or whether it's just a form of signaling. For an education to work as signaling, it could work as this. I want to hire people, but I want them to be in the top 20% of smartness.
Starting point is 00:15:21 So I just find some bar, like getting a physics degree that has nothing to do with the job that I'm prepared to do, but it effectively acts as a screen. In a pure signaling world, the person going into the physics program leaves no smarter and no better to do the job that I'd like to hire them from. But by me hiring from that pool, all those people are pre-screened. The person who goes into physics, they, you know, they learn nothing in the program, but they're buying the signal that says I was able to pass this program. So there can be actually no educational value in a pure signaling world, but it can still be like an effective method. It's probably not good in the greater sense, but like both parties in an economic sense could be motivated to do something like that.
Starting point is 00:16:09 I think a coding challenge, you could view it that way and say, this won't have anything to do with your job, but we know that the people who pass these seem to be able to do their jobs. Yeah, there's probably a lot of that behind the process, even, even for like so many of the credentials behind certain professions. You'll hear people talk about like, well, this test is, it's not something that I'm actually learning skills to do my, my, whatever the domain is to, you know, to do on my day job. It's just a hoop. But they make folks jump through this hoop because of that reason. Like the folks that can jump through the hoop, typically it's got some correlation with being able to execute well on whatever it is that that profession is responsible for doing. Now Connor knows what the hoop is.
Starting point is 00:16:51 He just has to jump through it. And because of his competitive coding experience, he actually has a great strategy for getting where he needs to be to get through these job interviews. Basically, LeetCode has a contest every Saturday, depending on what time zone you're in. It's typically in the evening on the Eastern Standard Time Zone. And I would do that contest
Starting point is 00:17:11 every week. And then all the other websites, HackerRank, CodeForces, TopCoder, HackerEarth, there's a couple others. They have contests going on, you know, sometimes they're monthly, sometimes they're weekly. So there's no shortage of contests that you can actually take part in, which is, I think, the best for like simulating an interview experience. But there's also just thousands of questions in the backlog. You can go look at all these past contests and just solve whichever ones you want. I slowly, slowly but surely just started competing and practicing. And whenever I would fail to get a question, go back and look at other folks' answers to the questions. And slowly but surely, you just, you start to improve. So when you were competing on these Saturdays, like, did you get your computer set up? You
Starting point is 00:17:53 brew yourself a coffee and sit there? Do you have your binder? Like, I don't know, what's the process? Definitely not coffee because they typically are at like 9.30 or 10.30 at night. So if you're having coffee that late, you're not going to sleep anytime soon. But yeah, so the LeetCode contest is four questions, 90 minutes. And it's crazy. The top people finish all four questions in like 20 minutes, all of them. And they get the first one within like 30 seconds. Like I don't even know how they read them in that amount of time.
Starting point is 00:18:20 And so yeah, you typically start writing on a given contest. When I was sort of at the height, I would usually get either three out of four or four out of time. And so, yeah, you typically start writing on a given contest. When I was sort of at the height, I would usually get either three out of four or four to four. Obviously, the contests where you are able to solve them all, it's super awesome. Probably like the highlights was every once in a while, I would get the first one or two super quickly within like a few minutes. And then they show you a leaderboard of the current people that are in the lead. So a couple of times I managed to be in the top 10 for the first couple of problems. I think the highest that I ever ranked on a contest was like 60th or 70th out of a few
Starting point is 00:18:54 thousand people. So I'm, I'm definitely not like an expert level competitive programmer. And so, yeah, you, you solve the first two problems rather quickly. I mean, this past contest that just happened two days ago, the first problem I can't actually recall because it was super easy. And the second problem was called max ice cream. And it was basically, you're given a integer value and then a list of numbers.
Starting point is 00:19:15 And the list of numbers represent prices of different ice creams. And then the integer value you get is a total amount of money. So you say you're given $6 and then you've got ice cream prices one, two, three and four. But those might it doesn't need to be sorted. It can be in any order. So the question is, what's the maximum number of ice cream cones you can buy with the money that you have?
Starting point is 00:19:38 So in an imperative language, you typically the way that I saw being solved was you sort all the numbers and then you have a for loop that keeps track of, you know, how many ice creams you've bought and the total money that you have left. And so then you just, you keep buying the cheapest ice cream that you can until you run out of money and then you return the integer. Um, but there's a different ways you can solve that functionally. Like there's a super nice way you can solve that in an array programming language and note that LeetCode doesn't actually have any array. So they only give you a list of, I think, 10 or 15 languages you can solve from. So if the website doesn't support it, you can't code in it. But all the popular languages, Python, C++, Java, Rust. I saw they added Racket the other day, which is a Lisp, which is not as popular, I would say. So they are starting to add more and more languages. And yeah, the 90 minutes
Starting point is 00:20:22 are up. And then you see where you're ranked at the end of the day. And then the really cool thing about all these sites that I haven't mentioned is that they keep track of your rating, which is for me probably was what like drove me the most. Like obviously I wanted to do well in the interviews and stuff, but seeing a graph of your rating over time,
Starting point is 00:20:38 which hopefully is going up, is like super motivating. And it shows you like what percentile you're in. So I think I'm in like the 98th percentile or something. But when I started out, I obviously wasn't. And so each time that you compete, if you finish higher than what your average, you know, finish has been up to that point, like your rating is going to go up.
Starting point is 00:20:57 And so over time, your rating slowly goes up. And I don't know, for me, that's a very encouraging and motivating thing is to see that you're doing better. And, you know, oh, it's Saturday. You will I will my rating go up? We'll see. What was your goal here? Like, did you imagine yourself like, I'm gonna, I'm gonna hit number one on the leaderboard? And then what were you picturing? I definitely had zero goals to be number one, not even I think top 100. If you can get that,
Starting point is 00:21:23 it's like amazing. It, it's sort of like entering a, a marathon or a half marathon or a 10 K like your goal can't be to be number one. Like that's, there's just, there's just people that are going to beat you. Um, but you know, having some goal of being in the top 5% or something like that's, that's an achievable goal depending on where you're starting out from. So, you know, over time, obviously you want to see your rating go up, but ultimately the reason I was spending all this time practicing these problems was to get a job at a technology first company. So that's the thing is while I was working at my first career as an actuary, I was falling in love with the software engineering
Starting point is 00:21:57 side of things. And at a certain point I decided what I wanted to do was to switch to a company that was sort of technology first, like big tech companies like Google and Facebook, you know, they are technology companies first. I have friends that have sort of done the same thing and their strategy has been going back to school. I had talked to folks and they just said, if you can just learn how to get good at these interviews, you can probably land a job. And so over the period of a year, I just kept practicing. And then like the like the plus side of it is what you just mentioned there, right? Is that it does make it an open field for people
Starting point is 00:22:29 to apply for these jobs. It's not like you need to have a computer science degree from a certain place. Like there is a clear path that people can take, which can be empowering. On one hand, yes, there is no quote unquote discrimination based on credentials. At the same time, though, it did take me like a year of practicing this stuff. There are definitely arguments that say that depending on where you are at in your life, you don't have the ability on top of a full time job to then go and sort of train yourself in these sort of interview type questions. But I do agree that, you know, in some sense, it is very meritocratic, given that you have the time to go and sort of train yourself on how to
Starting point is 00:23:10 answer these types of questions. Connor didn't just train himself, he started a YouTube channel where he would explain solutions to others. And eventually, he got an interview at a technology first company, Amazon. It's funny, because at that point, you know, I had progressively been getting better and better at the interviews, but I went into the Amazon interview. And if I recall, I think like three out of the five interviews, I thought I absolutely crushed, but then one of them, I thought I didn't do as well in the fifth interview that wasn't algorithms and data structures was a behavioral question. And in the Amazon interviews, they ask you not to sort of
Starting point is 00:23:44 retell the same, like you get behavioral questions, like small behavioral questions in each interview. And Amazon has these 14 leadership principles that you're supposed to weave into your answers, like customer obsession, you know, think big and stuff like that. So you're supposed to, you know, if you've done your research, you're supposed to reverse engineer, what's the leadership principle they're trying to figure out here and then weave that exact leadership principle into your answer. And so the last interview I thought went awfully. I'd answer the question and then she told me that I didn't answer the question.
Starting point is 00:24:17 And I said, well, I have another example that I think is way better, but I sort of talked about it in a previous interview. She's like, OK, just go with that one. And then after I answered it that way, she's like, why didn't you just say that from the beginning? And I was like, well, I didn't want to say the same thing. And she's like, you need to learn how to market yourself better. So she started talking to me about how I wasn't approaching answering these questions the right way. And she's like, that was a fantastic answer. And if you hadn't just started with it, anyways, it was a very, very odd experience. And at the end of it, Anyways, it was a very, very odd experience. And at the end of it, I thought that it went well, but not amazing.
Starting point is 00:24:49 But then I ended up getting the offer. Connor had reached his goal, but he was sort of hooked. He kept his YouTube channel going and people kept leaving comments. You should show Python. You should show Java. But I was a C++ developer. So I was focusing on C++. And then at some point, in order to appease all the folks
Starting point is 00:25:06 leaving comments, I added Java and Python. So I was solving solutions and I was solving problems in all three of those languages. And then at some point, probably it sort of started while I was at Amazon, we started a study group, principal engineer there wanted to learn more about lambdas and functional programming. So we started a small study group with like five or six people. And we started working through a Coursera course that was taught by Eric Meyer. Is this Eric Meyer with like the tie dyed shirt? It is. Yes, it is. Yeah. I think the course is called introduction of functional programming. We ended up using their textbook, and then to a real-world Haskell, which we only got halfway through. But yeah, it was probably the highlight of being at Amazon was this lunch study group where we would meet once a week.
Starting point is 00:25:54 Because yeah, it was basically like a C++ dev, the principal engineer. He was sort of a polyglot, but mostly C, C++, Java, that kind of stuff. And then other folks were Java developers. So it was mostly just like a bunch of people with experience in imperative or procedural languages, like none with any sort of functional experience. And stumbling through this textbook, having our eyes open was very, very cool.
Starting point is 00:26:18 Was there a moment you recall where it was like, this is really cool, this is something different? Yeah, I can't remember what chapter it was, but when they start to introduce like function composition with a dot operator in Haskell, like in C++20 now we're getting ranges, which are sort of more composable algorithms, but pre C++20, which it was 2018 at the time,
Starting point is 00:26:39 you know, I didn't know about any of that. And so seeing that, you know, like I said, if you want to, you know, sum the even numbers in a list in Haskell, that's just some dot filter even like, and it's technically a point free solution because you're not mentioning your arguments and like the equivalent solution in C++, even if you are making use of algorithms is it's going to be two like different algorithm calls. And so it's just a lot more noisy in C++. It might be a way more performant,
Starting point is 00:27:07 but it's a lot more noisy. And then just seeing it's not, it's less than a single line. It's three functions. It's sum, dot, filter, even. Even is a built-in predicate. So you don't have to build up a custom Lambda. It's just, at least for me,
Starting point is 00:27:21 it blew my mind, like how simple function composition and like having functions as first class in your language, how that can change the simplicity of the solutions you come up with. Now Connor is hooked not just on leet code, but on finding these elegant mind expanding solutions to leet code problems. And that leads him in an unexpected direction. So I heard about APL five different times between the year 2010 and 2019. I like that you kept the count. Well, I wasn't explicitly keeping track. I just, I found it curious that once every couple of years, this random language that I'd heard about first in
Starting point is 00:28:01 university was sort of popping up. So the very first time was in my second year of university, 2010. And I was in an actuarial mathematics course taught by the best professor I've ever had, Dr. Gary Parker. He was the, at the time, the chair of the program. And he just made a small remark during one of the lectures. And he said, oh, you know, back in the day, all actuaries, we all use this language called APL. And it was beautiful. It was just this super elegant language that it was like built for actuaries and scientists and mathematicians. The problem was, though, is that you could write in a single line what in another language might take literally 100. The problem, though, is that when you went to go back and read that, you know, or someone else was trying to read what you did, they could never understand it.
Starting point is 00:28:48 And so it sort of, it got this reputation of being a right only language. And so then it ended up sort of dying. That was the first time that I ever heard of APL and nothing more was said about it. And then fast forward four years to 2014, my older sister is an engineer and at the time was working at an engineering consulting company. And she just randomly was complaining about this software program called MetSim, which is short for metallurgical simulation. So she was an engineer that specialized in metallurgy. And they were using this simulation program for, you know, fluid dynamics. And apparently, like the scripting language or the language that you use to build up these little programs in it was APL. And I was like, huh,
Starting point is 00:29:30 that's interesting. I've heard about APL about four years ago. And she was like, yeah, it's the worst. I can't, it doesn't make any sense. And it was like the worst part of her job was this Metsim and having to figure out what APL is doing. She says it looked like hieroglyphics. And so I just, I thought it was interesting the second time I heard about it. And then two years later, my youngest sister, she's a quant that works at a private asset management firm in Canada called Connor Clark and Lund. She was remarking that she was using this database called KDB plus and a language with a single letter called Q. And I, at that time I was, you know, into programming and on my way to becoming a, you know, software engineer. And I looked up Q and I was like, that's interesting. I've never
Starting point is 00:30:09 heard of Q. Like it's, I don't know all the languages, but I, you know, she's, it seemed super esoteric. And then on the Q Wikipedia page, it mentioned that Q was basically a wordified version of K and K is a derivative of APL. And I was like, what the heck? So now two of my sisters are doing something that's APL adjacent. And then fast forward a couple more years, there was three podcasts on Functional Geekery, which next to Co-Recursive is one of my favorite podcasts. And they, episode 64, 65, and I think 76, all had guests talking about APL. And it was at this point in time, oh, actually I skipped the fourth one. There was a really funny story in January, 2019 called IOTA shaming,
Starting point is 00:30:52 where basically this Twitter battle erupted between C++ developers because someone had used IOTA in a piece of example code. And IOTA is an algorithm that generates a sequence of numbers that increase by one. So if you go IOTA is an algorithm that generates a sequence of numbers that increase by one. So if you go IOTA zero to 10 or whatever you get, zero, one, two, three, four, five, six, seven, eight, nine, 10. And IOTA comes from, the name IOTA comes from APL. And so this big Twitter
Starting point is 00:31:16 war erupted when someone said, ooh, look at how smart I am for knowing what IOTA is and was trying to sort of snarkily saying that we shouldn't be, you know, borrowing these really awful names from old dead languages. And then another individual came out and said, well, it's actually a historical name and it has meaning. Anyways, that was the fourth instance. The fifth instance was the podcast. And at that point, I was actually on a trip to Yosemite at the time. And my partner was asleep in the car and I would listen to podcasts while driving into Yosemite because we were staying just outside the park. And I'm listening to these people just gush about APL and how it's this beautiful quasi-functional language.
Starting point is 00:31:57 And then I finally decided to check it out. I went to a site called tryapl.org, where you can just sort of mess around with the language. And then at that point, immediately fell in love with the language, like within 10 minutes of using it. Yeah, that was a long winded story. Well, what what made you fall in love with it? So the context is that this is June 2019. And I had just finished giving a talk my very first conference talk called algorithm intuition, which sort of gave me this reputation of being an algorithm expert or someone that knew a lot about algorithms. And this was in May of 2019. So literally a month later, I'm back from Yosemite on this website.
Starting point is 00:32:40 And one of the first things that I do is a plus reduction, which is just summing up the numbers in a list. And then there's a very, and a reduction in APL is just a slash. So you do, you type plus slash and then a few numbers and it adds them up. And there's another algorithm in APL called a scan, which is just the other slash. So I'm not sure which one's forward or backward, but one slash is a reduction and the other slash is a scan. And what a scan does is it's basically a reduction, but it outputs all the incremental results as well. So if you have, you know, five numbers, one, one, one, one, one, and you do a plus reduction, the resulting answer is going to be five. If you do a plus scan on one, one, one, one, one, you get one, two, three, four, five. And so one of the properties of a scan is that the last element
Starting point is 00:33:29 is always equal to the equivalent reduction. So it's an algorithm that actually exists in C++ by the name of partial sum. So our reduction in C++ is called a std accumulate, and our scan is called a partial sum. And because accumulate and partial sum seemingly from a sort of naming point of view have nothing to do with each other, I never made the connection that basically they are like sibling algorithms. One just gives you the last element. The other one gives you all the incremental results. So like a scan and a reduction are extremely closely tied together. And I had never made the connection. Like, and so I supposedly just gave this talk where I'm talking about
Starting point is 00:34:13 algorithms and saying how they're so awesome. And then a month later, I'm at this website that is got all these crazy looking characters. And I know nothing about the language, but I made this observation just based on what the characters look like. They're basically vertically symmetric. The one's the opposite of the other one. And it was just like, so it happened in like 10 minutes. And my, my mind sort of was just blown. I was like, how did I not make this observation before? Like I've spent at this point, two years learning about algorithms, you know, practicing these competitive programming problems. How have I never made this observation? And I actually went in like searched in all the different languages, like what do languages
Starting point is 00:34:52 call these? And there's only two languages where there's any symmetry between the scan and the reduce names. One of them is a closure, which causes their reduction reduce, and they call their scan reductions. So you can sort of see that like, oh, it's actually just saying you're just doing incremental reductions. And then in D, which is sort of a competitor with C++, they call theirs, I believe, fold
Starting point is 00:35:17 and cumulative fold. Every other language, there's absolutely zero symmetry between the naming. Haskell calls them fold L and fold R and then scan L and scan R. And scan and fold are really good names for both of those algorithms. But unless if you sort of are able to see that relationship between the two, it's there to be discovered. You're not going to think about it just because of the names of these algorithms. And that was like, because it happened so quickly, I realized that like, whoever designed this language clearly put
Starting point is 00:35:49 a lot of thought into what these, what these characters visually mean, like that there's a relationship between if a character looks similar to another character, it's probably for a reason. And so people think about this language as an unreadable language, but that's, it's really not fair. I think, you know, we, we chat about this before, but I like to say that it's the same thing that people say about Chinese. If you don't read or speak Chinese, no one, you don't say that, oh, Chinese is unreadable. You just acknowledge the fact that you don't read or speak Chinese. For someone that does read or speak Chinese or any Asian language that doesn't have sort of, you know, the A to Z alphabet, those languages
Starting point is 00:36:26 are readable if you read them. And that's the same thing with APL. The problem is that every other programming language typically looks similar to each other and uses, you know, ASCII characters to represent words and variable names. So APL looks so different. So we have this like gut reaction to say, oh, that's the most unreadable thing ever. But if you take the time to learn how to read APL, it's honestly, in my opinion, the most beautiful language that I've ever seen. And the things that you can express where every Unicode symbol represents
Starting point is 00:36:52 an algorithm, it's just absolutely beautiful. Part of this beauty, Connor says, is the simplicity. In APL, everything is an array. So like a one rank array is just like on a list or a vector, a list of numbers. A two rank array is a matrix. A three rank array is, you know, a cube of numbers, if you will. And it keeps going on like that. Even like a single number is technically like a single element. There's no such thing as other data structures. Like, you know, you don't have like a hash map that you can build up. Like we, we were going to represent a hash map as some sort of two column array where the first column are the keys and the second column is the values. Another part of the beauty is the symmetry between what an operator looks like and the action it performs. So imagine
Starting point is 00:37:36 you have a matrix of numbers, a five by five grid, and you want to reverse the elements in the matrix. So you have five rows of one, two, 3, 4, 5, and you want 5, 4, 3, 2, 1. You do that with the reverse operator. So reverse is a circle with a straight line through it. If you can, you can visualize reversing the reverse symbol because it is vertically symmetric. If you flip it 180 degrees, if you picture it in like a 3D land, it's the same thing after you flip it 180 degrees along that sort of vertical line. And this circle with a vertical line through it is actually a visual description of the operation. If you imagine in your mind's eye, making a vertical line through the center column of our five by five matrix, every element in the
Starting point is 00:38:23 middle will stay there when we reverse it, but everything else rotates around it 180 degrees so that the first element becomes the last and the last element becomes the first. Transpose is the exact same thing, except instead of a straight line, it has a line at like a 45 degree angle. And a transpose is flipping a matrix
Starting point is 00:38:42 basically along that line. And so the way you can think of it is the top right number is going to get swapped with the bottom left number. And every other number sort of is going to follow that pattern. So if you're on the diagonal from the top left to the bottom right, those numbers all stay in the same place because that's the line that defines the 45 degree angle. If you have a matrix, it's like a square and you're folding it diagonally at 45 and taking the top part to the bottom and the bottom to the top.
Starting point is 00:39:13 Yeah, exactly. And that's what the transpose glyph looks like. So it's a circle with that same line, but instead of it being vertical, it's now along that diagonal. And so the way you can think about the reverse and the transpose algorithms or symbols is that they are basically visual representations of the axis that you're quote unquote rotating for lack of a better word around. And what's crazy is that I didn't see this for like months. I just was like, oh, that's transpose, that's reverse. And then at some point I was reading a blog article where it pointed out that like, oh yeah,
Starting point is 00:39:48 once you see that the reverse and the transpose, and there's also another algorithm called reverse first, which has a circle with a horizontal line through it, which is basically doing a reverse, but instead of from left to right, it's doing it from top to bottom. And the blog said, once you see it, you can never unsee it. And I was like, see what? And then I looked at it and I was like, yeah, oh my goodness. Like, how did I not, how did I not see that? Like, that is, it's amazing. Like, and that's the thing is so many of the APL symbols have stuff like that. Like, so there's an article by Roger Hui, who was the primary implementer of the J language, which is a daughter language or a derivative language of APL.
Starting point is 00:40:29 He has a blog called My Favorite Character. And his favorite character is a character called Log, which is for logarithms. And he says it's his favorite for multiple reasons. But one of them is it's because it's a circle with a star inscribed in it. So there's a star in APL, which is for exponents. So if you want to go one or 10 to the power of two, you go 10 star two. But for logarithms, it's a circle with that same star inside of it. And he points out that if you actually think about cutting a log sort of via profile, what you're really going to end up with is like
Starting point is 00:41:06 a circle and then circles for the rings inside of it. But that circle with a star in it is sort of an approximation for that profile view of a log. And once again, it's like something that I never thought of in a million years. You mentioned Chinese, like, is there some sort of similarity between this idea of like ideograms or something? Like it's visually. Yeah, yeah, absolutely. Yeah. The reason I know this is I lived in China for a year and at one point was not fluent, but I could speak, you know, colloquially and get around day to day.
Starting point is 00:41:36 And I had to study, you know, written Chinese as well, is the radicals typically have meanings. So like the character for tree basically looks like a tree. Think of a T where underneath each of the arms of the T has like two more lines that sort of form a triangle. So it looks like it has a sort of relationship to what a tree looks like. And an easier example is co, which is K-O-U in pinyon. That is the word, and it's just a square. That is the word for both mouth and door, I believe. So it's a square that sort of represents what a door might look like.
Starting point is 00:42:11 You know, if you look at a newspaper, it all just seems like overwhelming. Like, well, I need to go and learn 2,000 characters. How am I ever going to learn how to read it? But if you just look at a single character at a time and you break apart the radicals, you know, I believe it's Mu or Shu is the word for wood or tree. And that radical that looks like a tree shows up in a lot of like wooden things. So if you look at what the character is for chair or for table or things that are made of wood, it contains that wood radical. So like Chinese is actually very, very beautiful
Starting point is 00:42:43 in certain respects because it has that same kind of like visual meaning. Even if you have no idea what a certain character is, if you break it apart and you know what each of the radicals are, you can basically play a game of like, you know, what is wood plus something plus something, and then take a guess at what this character actually represents. And you probably will be wrong, but you might be close to actually what it is, which is really cool. Although there are some similarities between Chinese characters and APL, APL was created by a Canadian, Kenneth Iverson, in a book called A Programming Language, which is where APL gets its name from. And at that
Starting point is 00:43:15 time, APL, or as it was called at the time, Iverson Notation, it was a tool for teaching. It had nothing to do with programming languages. It was a tool for expressing algorithms. And that's how it was used in a programming language, the 1962 text. Fast forward a few years to 1966, and IBM had decided that they wanted to actually implement the notation as a language. And the IBM 360 is actually implemented in APL 360. And so 1966 was the first implementation. And then to shorten the story, you know, fast forward to 1980, Ken Iverson ended up going to a company called IP Sharp that was headquartered in Toronto. And it could have history sort of could have played out so much differently. In 1982, Bill Gates was doing like a whirlwind tour of North America and swung by IP Sharp and Associates in Toronto. And there was a meeting between a few of the IP Sharp folks and Bill Gates, and Bill Gates was getting ready to make the PC.
Starting point is 00:44:15 Amongst APLers and array programmers, there's this anecdote of sort of Bill Gates for a long time was a big fan of APL and had an APL sort of handbook in his desk. And right before the PC, there was a computer called the IBM 5100 that actually had a switch on it for both basic mode and APL mode. So you could just toggle between the two. So there's a time in history, right at the late 70s, where APL was just as prominent as basic. Unfortunately, the folks at IP Sharp and Associates thought that mainframes were just here to stay because that was their business, was mainframe computing, and didn't really see what was coming with the personal computer. And so the PC that Bill Gates ended up creating came with BASIC, not with APL, and never ended up shipping an APL. And because of that, we live
Starting point is 00:45:02 in a world where a lot of languages, I think, look closer to sort of the imperative style basics. But there is an alternate universe where Bill Gates decided to put APL on the PC instead of BASIC. And then we'd be living in a way different world today. So Connor started doing his leet code videos using APL. And he also reached out to this interesting character, an Iverson protege named Arthur Whitney. Arthur was involved in creating A, K, and J, all APL-inspired programming languages. The finance press refers to him as the reclusive genius of banking IT. I learned a lot about Arthur because of all my research. And then I randomly reached out to him and just asked if he would have time to chat. And then he responded.
Starting point is 00:45:46 This was in early 2019 before COVID really showed up. And he said, well, do you ever do you ever come down to New York? And I totally lied and said, yeah, all the time. And then he said, well, if you ever find yourself down here, let me know and we can have dinner and chat a bit. And so I booked a plane ticket and booked a hotel room and spent two or three days just around the corner from where he works. And then, yeah, I got to have dinner with him and chat with him over those couple of days. And he's
Starting point is 00:46:14 just an absolute genius. So I had heard of him, I think through Brian Cantrell, who's interviewed him. I think he went to his house and interviewed him. I don't know if it was Brian or somebody else who was saying that he made a deal with the devil that he would be the best programmer of all time, but the trade-off was that nobody would be able to read his code. I found online the original J interpreter or something like that, and it's written in C, and it's just like,
Starting point is 00:46:42 I don't know what it is. I started a project a couple months ago i'm slowly porting that code base the j source code to c++ 20 and yeah it's i think there should be a name for these types of code bases because it's not c it's like a macro variant of c where it's a cdsl 80% of your library, quote unquote, is macros. I did a search. There's 10,000 macros in the source code. And those macros are used like functions.
Starting point is 00:47:16 Yeah. It seems like there's a vague relationship to what you were saying before about the competitive coders using a whole bunch of shorthands to do their coding. It's actually funny that you mentioned that because that is like literally what the best competitive programmers do. The first thing that they do if they're not in actual ICPC and they're just competing on topcoder or codeforces is they copy and paste in a template which includes all the headers, includes the main function, but then also has a ton of macros defined, a bunch of which are actually like macros for for loops, because you can reduce like, you know, 10 or 15 characters that you have to type out with a simple do macro.
Starting point is 00:47:55 And I never actually noticed that that relationship that the some of the exact same macros that exist in the J code base are exactly what competitive programmers use. It's not just competitive coding that has a similarity to Arthur's array programming style, but data science is also very APL-like in some ways. And through a strange coincidence, while Connor was deep into APL land, he left Amazon for NVIDIA to work on a GPU-accelerated data science library. It's crazy that this happened, that now I work on a open source library that is inspired by pandas. And pandas was written by an individual named Wes McKinney, who used to work at a firm called AQR, which is a quant firm. And he was directly influenced by J,
Starting point is 00:48:40 which is the daughter language of APL. At NVIDIA, Connor's ability to solve something in an APL style is coming in very handy for writing code that will distribute across the thousands of cores of a modern GPU. For example, we explained reductions in scans earlier. A reduction is something that's basically you're taking an element
Starting point is 00:48:58 and applying a binary operation to it. You can do that in serial, just taking one element at a time. And, you know, if you've got the five numbers, you know, one, one, you can just go through and 1 plus 1 is 2, 2 plus 1 is 3, etc. That's a serial algorithm. But you can also do something called a tree reduction. You can basically construct like a tree, add two elements at a time, and then on the next pass, take the results of two of the first operations and add those. And then you end up at the end of the day getting the exact same answer, but you didn't do things in order. So obviously, if you only have a five element array, you're not going to see much of a difference. But if you've got like a 5 billion element array, then putting that on a GPU, it's basically going to chunk that up. It's going to sum up all the individual chunks, and then it's going to come and do some sort of tree reduction, and you're going to get the same result. And because it's doing that over thousands of cores, it's going
Starting point is 00:49:43 to be way, way, way faster than if you're doing it serially on a GPU where you can only take two numbers at a time in sequence. Why does an array language, why is that helpful for expressing these? Because branching doesn't really fit super well into the GPU model and array programming inherently avoids that.
Starting point is 00:50:03 And so many of the primitives in APL are like trivially parallelizable. So reductions and scans are two of the most library and C++. They basically have a stood reduce or a thrust reduce and a thrust inclusive and exclusive scan, which are just two different versions of scans. And those are like bread and butter parallel algorithms. So, so many of the primitives in APL are just inherently parallelizable. So you can map basically the primitives directly to a parallel algorithm that runs on the GPU. Whereas in other languages, the way you solve things, if you're doing branches, they don't trivially map to the GPU. Whereas array programming languages, it's basically the way they're designed is operating on these huge arrays or huge matrices. It makes it like the
Starting point is 00:51:00 match made in heaven from my point of view. It's not actually a fluke. It's like this language was designed by Iverson to be able to take in these arrays and matrices. And because of that, it ends up having this form where it ends up being data parallel. And then like GPUs exist. And oh, isn't that convenient? GPUs are great if you can do things data parallel. Exactly. Yeah.
Starting point is 00:51:24 Everything's an array. So it just, it works out perfectly. There's a cliche about different programming paradigms changing the way you think. I think sometimes when I hear it, I roll my eyes a little bit, but there's definitely truth to it. And it's hard to put your finger on exactly where the truth lies. To me, it's a little bit like living abroad. I was an exchange student in university. I did a small part of my undergrad degree in Australia. And when you travel, you eventually go home. But for me, what changed based on my travels was how I view my homeland. Just living somewhere else for long enough to experience life there, you kind of start to question little assumptions about how you lived before. Connor is still working in C++, but when he's doing that, sometimes he's thinking about how
Starting point is 00:52:04 he would solve a problem in APL. And I think that, that different perspective that you bring home to your primary programming language, that's the power of learning how to solve problems in a different paradigm. So that was the show. I actually talked to Connor for quite a while and I made him walk me through the ice cream problem
Starting point is 00:52:21 from LeetCode that we talked about using APL. I'm gonna try to figure out how to put a video of that up on the episode page, which you can find in the show notes in your podcast player. Also, Connor and Bob Theriault and several others started a podcast all about array programming languages. I'll put a link to that on the episode page as well. Until next time, thank you so much for listening. Is that right? No. Or actually, that was right, but I should have said shift four. Back tech shift four. There we go.
Starting point is 00:52:50 And then looking at a weird arrow, the point it's, what is it? Is this triangle? And then this is a triangle with a line through it. Yeah. Line through. Okay.
Starting point is 00:52:58 And then type X again and right bracket. A bracket. And then bracket and then hit enter you can just use up arrow actually or yeah up arrow might work as well and then if you put 6 at the beginning of the expression and then back tick
Starting point is 00:53:18 shift 6 that's the transpose so yeah back tick shift 6. Or maybe I've got this wrong. Oh no, sorry, without the shift. Backtick 6. Oops, backtick 6.
Starting point is 00:53:34 There we go. And so if you hit enter on this now, we're going to get our Boolean array. Oh, okay. Okay.
