CoRecursive: Coding Stories - Story: From Competitive Programming to APL
Episode Date: June 2, 2021
Today on the show, we have solving algorithmic programming problems. You know when you interview for a job to write CSS and they ask you to reverse a binary tree on the whiteboard using C and in constant memory space? It's that kind of thing. These problems have their roots in algorithmic programming contests. And our guest, Conor Hoekstra, is a former competitor.
Episode Page
Transcript
Hello, and welcome to CoRecursive. I'm Adam Gordon Bell, the host.
Each episode, someone shares the fascinating story behind some piece of software being built.
And today on the show, we have solving algorithmic programming problems.
You know, when you interview for a job to write CSS and they ask you to reverse a binary tree on the whiteboard
using C and in constant memory space, it's that kind of thing.
These problems have their roots in algorithmic programming contests, and our guest is a former competitor.
My name's Conor Hoekstra,
and I currently am a senior library software engineer
working for NVIDIA on the RAPIDS cuDF library.
Conor's story starts in university,
when he was competing in programming contests.
But it's interesting how if you follow a single idea long and hard enough, you end up in interesting places,
like writing software for GPUs. For Conor, that idea is finding the perfect way to express the
solution to a programming problem. His journey will take him into LeetCode, into getting his
dream job, and all the way to APL, which is one of the more unusual and obscure programming languages.
But I think I'm getting ahead of myself. This story starts in 2012, when Conor was still in
university. I wasn't a computer science student at the time, but I was taking some computer science
classes. And I just happened to walk by the computer science lab. And there was a poster
on the wall that said competitive programming contest, free pizza, you know, meet Saturday at lunch.
And I like competing in things. And it sounded fun. I had really no idea what it was about.
So I showed up. I studied at Simon Fraser University, which is in Vancouver, BC.
We had a bunch of computer labs there, so it took place there. If I recall, there's probably about
20 to 30 people that were competing.
And you basically just sit down at a Linux workstation.
They give you a couple instructions.
You code in any of a list of languages, but most of the popular ones, C++, Java, Python,
et cetera, are all usable.
You code up your solution from start to finish.
This was a qualifier for a contest called ICPC, the International Collegiate Programming Contest. You have to code, like, a whole text file from scratch. So if
you're doing it in C++, you have to include all the include statements. You have to have a main
function. And typically they give you a specification as your input will be one integer
that represents T, the number of test cases, and then, you know, five lines per test case where each line represents the data
that you need to solve it.
So you have to be responsible for reading in that data
and then outputting to whatever is standard out
to solve it correctly.
Then you submit the file.
Like it's a single file, you submit it,
and then they compile it
and they hit it with test cases through standard in.
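For readers who haven't seen this format, here is a minimal sketch of the kind of single-file submission Conor is describing. The problem itself is invented (sum five numbers per test case); the point is the shape: T test cases arrive on standard in, one answer per case goes to standard out.

```cpp
// Hypothetical single-file contest submission. The "sum five numbers"
// problem is made up purely to illustrate the I/O pattern described above.
#include <iostream>

int main() {
    int t;
    std::cin >> t;                    // first line: number of test cases
    while (t--) {
        long long sum = 0;
        for (int i = 0; i < 5; ++i) { // say each case is five numbers
            long long x;
            std::cin >> x;
            sum += x;
        }
        std::cout << sum << '\n';     // answer for this case goes to standard out
    }
}
```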
Yeah, so typically they'll give you two or three test cases. They'll give the problem statement, and they'll give you two or three
simple ones that you can test to see whether you've gotten the basic cases right. And then
you submit a .cpp for C++, or a .py for Python. And then it's
going to run against a bunch of tests and you might get back a failure. So if you've missed
some corner case, maybe it was an empty list or an empty string that you didn't take care of. It's not going to tell you what test case you failed on.
It's just going to tell you that you failed and you have to go and stare at your code and figure
out, okay, what's the corner case I missed and then resubmit once you think you found it. And
then finally you'll get like a pass and then you move on to the next problem. And you basically,
the way competitive programming works is you have
a set amount of time. For this contest, it was two hours. You have a set of problems. I think
there was six or eight of them. And the goal is to solve as many as possible, as fast as possible.
I, in my first attempt at this, didn't know how to actually submit the problem. So I solved,
I think, three or four of them and didn't understand that there was a time penalty,
like the longer you waited. So when there was 10 minutes left, I went up to the
individual, the prof that was running the contest, and said, hey, I'm being unsuccessful with
submitting these. He sort of looked at me strangely, like, that's the whole aim of
this. Like, what are you talking about? He said, how many have you submitted?
And I said, well, I've solved four, but I haven't submitted any. And so he just sort of shook his head, came over.
I ended up submitting a couple of them, but then ran out of time.
But luckily, there was a second qualifier.
And so I sort of went back to my dorm room, figured out how to actually compete in these.
And then in the second one, I did well enough to get on to one of the university teams.
It was like the B or the C tier team.
So you joined the team.
Like, were you excited? Did you have to, like, meet the team, or what's the process after you made it onto the team?
So yeah, once we joined the team, it was three members and
we basically had practice contests once every couple of weeks. Um, and this was just over a
semester. And then we, we ended up all three teams, like the A team,
the B team and the C team ended up going to the regionals. And we didn't end up doing that well,
because we were competing with people that had been studying, you know, their whole
university careers at this, just practicing at competitive coding. Because it's one thing to
get past the beginner level, which is just solving simple algorithmic things. At the intermediate level you need to know basically, like, general classes of
algorithms. But to get to the super advanced level, you basically are, like, memorizing all these custom,
different, like, searching algorithms. And so you'll read a problem and know, like, oh, that's the
Floyd-[insert name] specific algorithm that you need to use, and that just
takes time and energy.
And you're actually allowed to bring binders of pre-written code
into these contests.
So as long as you can recognize what algorithm is needed to solve it,
a lot of times it's just time and energy of studying these esoteric algorithms
that otherwise you probably wouldn't end up needing to know.
So you'll have a binder with all these tabs on it, you'll be like flip to that and start just typing that in.
Yeah, yeah. And the way that ICPC works,
once you're in the regionals, is you have teams of three people and one computer.
So the way that it's typically working is that you've got one person coding up their solution that they've already handwritten and solved sort of on paper.
And so two people are handwriting their code on paper and then you swap out to type. So you
basically need to be able to solve your problem without a keyboard. And a lot of the times when
you're doing that, if you can just pull a piece of paper out and say, oh, this is the algorithm
that I need, you can just scribble some notes around it for reading in and reading out
or writing out your answer and you're good to go.
The algorithm can't just be correct.
It also has to be performant
because every test case has to be handled
in two seconds of CPU execution time.
And for simple problems, it doesn't matter
because the input size is not that large.
But when you get to the harder problems,
one of the first things you do
is you look at what's the size.
So say it's a 10,000-element array. You know that basically you only have one times 10 to the eight operations.
So if you've got one times 10 to the five elements in your array, a quadratic algorithm is 10 to the ten operations, which is way over that budget.
So you're looking at the size of the input and automatically, like, ruling out some brute force quadratic algorithm, which is sort of a neat part of the problem, too. Some people say, oh, how come you don't just do this?
And it's like, well, we know that that's not going to work.
And that leads to, I think, 90-plus percent of folks using C++, because in those problems the difference between using Python and C++ is going to be the difference between an accepted solution and timing out. So I guess it would give you a neat insight into the complexity of solving problems because
you have practice that sort of cutting things out just based on knowing that it won't meet
this barrier.
Yeah, it's interesting because that doesn't typically come up in the real world where
you know you have two seconds to solve something.
So then, oh, the algorithm I'm
choosing or the code that I'm writing is going to be completely driven by time complexity.
Typically, you write the code and then you optimize it later if you find that something's too slow.
You're not starting with time complexity and then working backwards. But it definitely is a good
way to learn the difference between linear, quadratic, n log n, stuff like that.
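As a rough illustration of that "work backwards from the time limit" arithmetic, here is a sketch. The roughly 10^8 operations budget is the rule of thumb Conor mentions, not an exact judge limit.

```cpp
// Back-of-the-envelope feasibility check: assume roughly 1e8 simple
// operations fit in a typical time limit (a rule of thumb, not a spec).
#include <cstdio>

int main() {
    const long long n = 100000;          // 1e5 elements in the input
    const long long budget = 100000000;  // ~1e8 operations
    const long long quadratic = n * n;   // brute force: 1e10 ops, way over budget
    const long long n_log_n = n * 17;    // n * log2(1e5) is about 1.7e6 ops, easily fine
    std::printf("quadratic: %lld ops -> %s\n", quadratic, quadratic <= budget ? "ok" : "too slow");
    std::printf("n log n:   %lld ops -> %s\n", n_log_n, n_log_n <= budget ? "ok" : "too slow");
}
```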
Conor's team went on to compete at the Pacific Northwest Regionals.
The regionals take place in a large computer lab.
Everyone is huddled around computers.
And what they do is they have colored balloons for each problem.
So like problem A will have a red balloon.
Problem B will have an orange balloon.
And every time a team solves a problem, they come and bring these helium-filled balloons
and tie it to your computer.
So you can look around the lab
where the live contest is happening
and see just by how many balloons
are hanging above someone's computer,
basically who's winning.
So that whole part of it, I thought was super, super cool.
But then, yeah, after that one regional contest,
we definitely, I don't think any of the SFU teams did that well.
I think probably UBC and Stanford are typically the two teams that do the best.
Sometimes Berkeley pops up there.
And after that, yeah, that was basically sort of the end of my university competitive programming
career.
Five years later, competitive coding long behind him, Conor got invited to a job interview at Bloomberg for a software developer role. And at the time I wasn't a software engineer
per se. I was a programmer that used code at my day job as an actuary. So I worked for an
insurance firm. Bloomberg ended up reaching out to me and I said, sure, I'd love to interview,
even if it's just, you know, just to learn what it's like. And they said, okay, we're going to
do our interview on a site called HackerRank.
They've got this sort of side site called CodePair, which enables sort of this live streaming, sort of like a Google Doc, but specifically with an editor.
And you can you can actually test your code live.
So you write a function and then you can test it on the sample input.
And they said, if you're not familiar, you know, check out HackerRank.
Here's a couple of problems. And that was the first time I had heard of HackerRank. So I ended
up solving a few of the problems, thought it was awesome. And I was like, oh, yeah, this is just
like what I used to do five years ago and then ended up just absolutely bombing that interview.
It was so bad. Definitely one of the worst interviews I've ever had in my life.
What happened?
We were doing CodePair. I don't remember the exact question.
I'm pretty sure the answer to it was using std::partition,
which is just a one-liner from the standard algorithm library.
Partition means something different in every language,
but in C++, you'd give partition basically a container,
a sequence of elements, and a unary predicate.
And it puts all of the elements that return true for the unary predicate at the beginning,
and all the ones that return false at the end. So if you have the numbers 1 to 10,
and you do a partition with is even, the even numbers 2, 4, 6, 8 will all show up at the
beginning, followed by 1, 3, 5, 7. So I'm pretty sure that was what the answer to the question was. I didn't even know what a
partition was at that point in time. So I wasn't going to be writing that down. But it got to this
point where I needed to know how to, like, I think swap two iterators or something like that.
And I asked if there was a built-in function for swapping two iterators. There is. It's called
iter_swap. But at the time, I didn't know about it. And the interviewer said, yes,
there is. And I said, do you mind telling me what it is? And he said, no.
And I'm pretty sure this was just the warm-up question. And I stumbled my way through it, but that was sort of the end of the interview.
And just, so it was just a lack of my knowledge of the standard library.
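For context, here is a small sketch of the two standard-library pieces that came up in that interview, std::partition with an "is even" predicate and std::iter_swap. The surrounding example is mine, not the actual interview question.

```cpp
#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> v(10);
    std::iota(v.begin(), v.end(), 1);                  // 1 2 3 ... 10

    // Evens moved to the front, odds to the back (order within each
    // group is unspecified for std::partition).
    std::partition(v.begin(), v.end(),
                   [](int x) { return x % 2 == 0; });

    // Swap the elements that two iterators point at.
    std::iter_swap(v.begin(), v.end() - 1);

    for (int x : v) std::cout << x << ' ';
    std::cout << '\n';
}
```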
It was just sort of eye-opening too, because I was not used to the classic big tech company
where they just ask you like algorithms and data structures questions.
Like every single company, Google, Facebook, you name it, that are getting thousands and
thousands of applicants.
They basically have a standard format where they phone screen you.
If you do well enough, you go to on-site.
And then on the on-site, you're going to have four out of five of the interviews
purely being on data structures and algorithms.
And they don't care what language you do it in.
But yeah, you have to get good at answering these sort of questions
that aren't going to ever show up in your day job.
You're going to be hired to write some front-end code that doesn't make use of this stuff, but that's how they test you.
So I basically went on a journey of over a year just trying to get really, really good at what they call LeetCode problems, aka competitive programming problems.
I had a similar experience at some point.
I interviewed, I don't know how long this was.
I suspect I'm older than you,
but I interviewed at Stack Overflow,
you know, the like the website
where they tell you answers to questions.
And I wasn't expecting, yeah,
like one of these algorithmic questions.
And I just bombed out.
And the guy who was doing it was,
he was weary of the world.
Like he probably had like four of these a day and was
barely paying attention, right?
He was the first guy.
They probably have this giant funnel where
most people never make it past that first stage.
And he was like, he barely had
effort to tell me, yeah, this isn't
going well. We're just going to hang up here.
I'm going to say no and then I'm on to the next
meeting. But I found
it like super painful to be rejected like that. Cause I was, I felt like I was a good software
developer. Just not at that. I mean, did you have that kind of response? I honestly can't recall
exactly how I felt. I think probably in the moment for me, it was a bit demoralizing
because you're so excited about this opportunity. At the same time though, like my perspective on
myself was I wasn't really a software engineer. Like I didn't even, I didn't even think that I
qualified for this interview. I was just taking it because I thought, ah, what the hell, what's
the worst that could happen? And then, you know, probably the worst that could happen did happen. But more
of my thought was that I had gone through a few of the hacker rank questions and I knew that like,
that's something that I could be extremely good at. And I, I saw the interview format and so like,
it sucks to bomb, but, um, I knew it's just a matter of time and energy. Like those types of questions
compared to what you have to do in actuarial science
and these crazy five-hour exams that they make you take,
that seemed a lot easier to do.
It's just time and energy that you need to put in.
And there's a lot of people that will tell you
that it's just disheartening
that like it has nothing to do with your day job
and it takes time and energy
and people a lot of the times are like,
well, why do I have to go and learn this sort of extra skill that is passing these interview questions in order to do
something that I already know how to do? I completely agree with that. Like it's a broken
model, but I feel like it's the same quote about like democracy is the best, worst form of government
or whatever that we have. Like, how do you, how do you do a detailed interview process that is not prejudiced against, you know, certain demographics of people?
And that sort of, it's quasi-standardized, but also looking at more than just your ability.
Like, it's a hard problem to solve.
And this is the best solution that these big tech companies have at the moment.
Do you know, like, the concept of signaling, like from economics?
I do not, I don't think.
Economists sometimes talk about whether university,
whether it's actually valuable
or whether it's just a form of signaling.
For an education to work as signaling,
it would work like this.
I want to hire people,
but I want them to be in the top 20% of smartness.
So I just find some bar,
like getting a physics degree that has nothing to do with the job that I'm prepared to do, but it effectively acts as a screen. In a pure signaling
world, the person going into the physics program leaves no smarter and no better to do the job
that I'd like to hire them from. But by me hiring from that pool, all those people are pre-screened. The person who
goes into physics, they, you know, they learn nothing in the program, but they're buying the
signal that says I was able to pass this program. So there can be actually no educational value
in a pure signaling world, but it can still be like an effective method. It's probably not good
in the greater sense, but like both parties in an economic sense could be motivated to do something like that.
I think a coding challenge, you could view it that way and say, this won't have anything to do with your job, but we know that the people who pass these seem to be able to do their jobs.
Yeah, there's probably a lot of that behind the process, even, even for like so many of the credentials behind certain professions.
You'll hear people talk about, like, well, this test, it's not something where I'm actually learning skills for whatever the domain is, you know, to do on my day job.
It's just a hoop. But they make folks jump through this hoop because of that reason. Like the folks
that can jump through the hoop, typically it's got some correlation with being able to execute well
on whatever it is that that profession
is responsible for doing.
Now Conor knows what the hoop is.
He just has to jump through it.
And because of his competitive coding experience,
he actually has a great strategy
for getting where he needs to be
to get through these job interviews.
Basically, LeetCode has a contest every Saturday,
depending on what time zone you're
in. It's typically in the evening on the Eastern Standard Time Zone. And I would do that contest
every week. And then all the other websites, HackerRank, CodeForces, TopCoder, HackerEarth,
there's a couple others. They have contests going on, you know, sometimes they're monthly,
sometimes they're weekly. So there's no shortage of contests that you can actually take part in, which is, I think, the best for like simulating an interview experience.
But there's also just thousands of questions in the backlog.
You can go look at all these past contests and just solve whichever ones you want.
I slowly, slowly but surely just started competing and practicing.
And whenever I would fail to get a question, go back and look at other folks' answers to the questions. And slowly but surely, you just, you start to improve.
So when you were competing on these Saturdays, like, did you get your computer set up? You
brew yourself a coffee and sit there? Do you have your binder? Like, I don't know,
what's the process? Definitely not coffee because they typically are at like 9:30 or 10:30 at night.
So if you're having coffee that late, you're not going to sleep anytime soon.
But yeah, so the LeetCode contest is four questions, 90 minutes.
And it's crazy.
The top people finish all four questions in like 20 minutes, all of them.
And they get the first one within like 30 seconds.
Like I don't even know how they read them in that amount of time.
And so yeah, you typically start writing on a given contest. When I was sort of at the
height, I would usually get either three out of four or four out of four. Obviously, the contests
where you are able to solve them all, it's super awesome. Probably like the highlights
was every once in a while, I would get the first one or two super quickly within like a few minutes.
And then they show you a leaderboard of the current people that are in
the lead. So a couple of times I managed to be in the top 10 for the first couple of problems.
I think the highest that I ever ranked on a contest was like 60th or 70th out of a few
thousand people. So I'm, I'm definitely not like an expert level competitive programmer. And so,
yeah, you, you solve the first two problems rather quickly. I mean, this past contest that just
happened two days ago,
the first problem I can't actually recall
because it was super easy.
And the second problem was called max ice cream.
And it was basically, you're given an integer value
and then a list of numbers.
And the list of numbers represent
prices of different ice creams.
And then the integer value you get
is a total amount of money.
So you say you're given $6
and then you've got ice cream prices one, two, three and four.
But it doesn't need to be sorted. It can be in any order.
So the question is, what's the maximum number of ice cream cones you can buy with the money that you have?
So in an imperative language, typically the way that I saw it being solved was you sort all the numbers and then you have a for loop that keeps track of, you know, how many ice creams you've bought and the total money that
you have left. And so then you just, you keep buying the cheapest ice cream that you can until
you run out of money and then you return the integer. Um, but there's a different ways you
can solve that functionally. Like there's a super nice way you can solve that in an array programming
language. And note that LeetCode doesn't actually have any array languages. They only give you a list of, I think, 10 or 15 languages you can solve in.
So if the website doesn't support it, you can't code in it. But all the popular languages,
Python, C++, Java, Rust. I saw they added Racket the other day, which is a Lisp, which is not as
popular, I would say. So they are starting to add more and more languages. And yeah, the 90 minutes
are up. And then you see where you're ranked at the end of the day.
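Here is a sketch of the imperative, sort-then-greedy approach Conor describes for the max ice cream problem. The function name and signature mirror the LeetCode style but should be treated as illustrative.

```cpp
#include <algorithm>
#include <vector>

// Sort the prices, then buy the cheapest cones until the money runs out.
int maxIceCream(std::vector<int> costs, int coins) {
    std::sort(costs.begin(), costs.end());
    int bought = 0;
    for (int price : costs) {
        if (coins < price) break;   // can't afford the next cheapest cone
        coins -= price;
        ++bought;
    }
    return bought;
}

// With $6 and prices {1, 2, 3, 4}: buy 1 + 2 + 3 = 6, so the answer is 3.
```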
And then the really cool thing about all these sites
that I haven't mentioned
is that they keep track of your rating,
which for me was probably what, like, drove me the most.
Like obviously I wanted to do well
in the interviews and stuff,
but seeing a graph of your rating over time,
which hopefully is going up, is like super motivating.
And it shows you like what percentile you're in.
So I think I'm in like the 98th percentile or something.
But when I started out, I obviously wasn't.
And so each time that you compete,
if you finish higher than what your average,
you know, finish has been up to that point,
like your rating is going to go up.
And so over time, your rating slowly goes up.
And I don't know, for me,
that's a very encouraging and motivating thing
is to see that you're doing better.
And, you know, oh, it's Saturday. Will my rating go up? We'll see.
What was your goal here? Like, did you imagine yourself like, I'm gonna,
I'm gonna hit number one on the leaderboard? And then what were you picturing?
I definitely had zero goals to be number one, not even I think top 100. If you can get that,
it's like amazing. It, it's sort of like
entering a marathon or a half marathon or a 10K. Like, your goal can't be to be number one.
There's just people that are going to beat you. But, you know,
having some goal of being in the top 5% or something like that, that's an achievable goal
depending on where you're starting out from. So, you know, over time, obviously you want to see
your rating go up, but ultimately the reason I was spending all this time practicing these
problems was to get a job at a technology first company. So that's the thing is while I was
working at my first career as an actuary, I was falling in love with the software engineering
side of things. And at a certain point I decided what I wanted to do was to switch to a company
that was sort of technology first,
like big tech companies like Google and Facebook, you know, they are technology companies first.
I have friends that have sort of done the same thing and their strategy has been going back to
school. I had talked to folks and they just said, if you can just learn how to get good at these
interviews, you can probably land a job. And so over the period of a year, I just kept practicing.
And then, like, the plus side of
it is what you just mentioned there, right? Is that it does make it an open field for people
to apply for these jobs. It's not like you need to have a computer science degree from a certain
place. Like there is a clear path that people can take, which can be empowering. On one hand,
yes, there is no quote unquote discrimination based on credentials. At the
same time, though, it did take me like a year of practicing this stuff. There are definitely
arguments that say that depending on where you are at in your life, you don't have the ability
on top of a full time job to then go and sort of train yourself in these sort of interview type
questions. But I do agree that, you know, in some sense,
it is very meritocratic, given that you have the time to go and sort of train yourself on how to
answer these types of questions. Conor didn't just train himself,
he started a YouTube channel where he would explain solutions to others. And eventually,
he got an interview at a technology first company, Amazon.
It's funny, because at that point, you know, I had progressively been getting better
and better at the interviews, but I went into the Amazon interview. And if I recall, I think like
three out of the five interviews, I thought I absolutely crushed, but one of them
I thought I didn't do as well in. The fifth interview, the one that wasn't algorithms and data
structures, was a behavioral interview. And in the Amazon interviews, they ask you not to sort of
retell the same story; you get behavioral questions, like small behavioral questions, in each interview.
And Amazon has these 14 leadership principles that you're supposed to weave into your answers,
like customer obsession, you know, think big and stuff like that. So you're supposed to,
you know, if you've done your research, you're supposed to reverse engineer,
what's the leadership principle they're trying to figure out here and then weave that exact leadership principle into your answer.
And so the last interview I thought went awfully.
I'd answer the question and then she told me that I didn't answer the question.
And I said, well, I have another example that I think is way better, but I sort of talked about it in a previous interview.
She's like, OK, just go with that one. And then after I answered it that way, she's like,
why didn't you just say that from the beginning? And I was like, well, I didn't want to say the
same thing. And she's like, you need to learn how to market yourself better. So she started talking
to me about how I wasn't approaching answering these questions the right way. And she's like,
that was a fantastic answer, and if you hadn't just started with it... Anyways, it was a very,
very odd experience.
And at the end of it, I thought that it went well, but not amazing.
But then I ended up getting the offer.
Conor had reached his goal, but he was sort of hooked.
He kept his YouTube channel going and people kept leaving comments.
You should show Python.
You should show Java.
But I was a C++ developer.
So I was focusing on C++.
And then at some point, in order to appease all the folks
leaving comments, I added Java and Python. So I was solving problems
in all three of those languages. And then at some point, probably it sort of started while I was at
Amazon, we started a study group. A principal engineer there wanted to learn more about
lambdas and functional programming. So we started a small study group with like five or six people. And we started
working through a Coursera course that was taught by Erik Meijer. Is this Erik Meijer with, like, the
tie-dyed shirt? It is. Yes, it is. Yeah. I think the course is called Introduction to Functional
Programming. We ended up using their textbook, and then moved on to Real World Haskell, which we only got halfway through.
But yeah, it was probably the highlight of being at Amazon was this lunch study group where we would meet once a week.
Because yeah, it was basically like a C++ dev, the principal engineer.
He was sort of a polyglot, but mostly C, C++, Java, that kind of stuff.
And then other folks were Java developers.
So it was mostly just like a bunch of people
with experience in imperative or procedural languages,
like none with any sort of functional experience.
And stumbling through this textbook,
having our eyes open was very, very cool.
Was there a moment you recall where it was like,
this is really cool, this is something different?
Yeah, I can't remember what chapter it was,
but when they start to introduce like function composition
with a dot operator in Haskell,
like in C++20 now we're getting ranges,
which are sort of more composable algorithms,
but pre C++20, which it was 2018 at the time,
you know, I didn't know about any of that.
And so seeing that, you know, like I said,
if you want to, you know,
sum the even numbers in a list in Haskell, that's just sum . filter even
like, and it's technically a point free solution because you're not mentioning your arguments and
like the equivalent solution in C++, even if you are making use of algorithms, is going to be
two like different algorithm calls. And so it's just a lot more noisy in C++.
It might be way more performant,
but it's a lot more noisy.
And then just seeing it's not,
it's less than a single line.
It's three functions.
It's sum, dot, filter, even.
Even is a built-in predicate.
So you don't have to build up a custom Lambda.
It's just, at least for me,
it blew my mind,
like how simple function composition
and like having functions as first class in your language, how that can change the simplicity of the solutions you come up with.
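To make that comparison concrete, here is a sketch of the same idea in C++20 ranges next to the Haskell one-liner Conor mentions; the C++ version is one reasonable translation, not the only one.

```cpp
// Haskell:  sumEvens = sum . filter even
// C++20 ranges: filter, then fold the remaining elements.
#include <iostream>
#include <numeric>
#include <ranges>
#include <vector>

int main() {
    std::vector<int> xs{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    auto evens = xs | std::views::filter([](int x) { return x % 2 == 0; });
    int total = std::accumulate(evens.begin(), evens.end(), 0);
    std::cout << total << '\n';   // 30
}
```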
Now Conor is hooked not just on LeetCode, but on finding these elegant, mind-expanding solutions to LeetCode problems.
And that leads him in an unexpected direction.
So I heard about APL five different times between the year 2010 and 2019.
I like that you kept the count. Well, I wasn't explicitly keeping track. I just, I found it
curious that once every couple of years, this random language that I'd heard about first in
university was sort of popping up. So the very first time was in my second year of university, 2010. And I was in an actuarial mathematics course taught by
the best professor I've ever had, Dr. Gary Parker. He was the, at the time, the chair of the program.
And he just made a small remark during one of the lectures. And he said, oh, you know,
back in the day, all actuaries, we all use this language called APL.
And it was beautiful.
It was just this super elegant language that it was like built for actuaries and scientists and mathematicians.
The problem was, though, is that you could write in a single line what in another language might take literally 100.
The problem, though, is that when you went to go back and read that, you know, or someone else was trying to read what you did, they could never understand it.
And so it sort of, it got this reputation of being a write-only language. And so then it ended up
sort of dying. That was the first time that I ever heard of APL and nothing more was said about it.
And then fast forward four years to 2014, my older sister is an engineer and at the time was working
at an engineering consulting company. And she just randomly was complaining about this software
program called MetSim, which is short for metallurgical simulation. So she was an engineer
that specialized in metallurgy. And they were using this simulation program for, you know,
fluid dynamics. And apparently, like the scripting language or the
language that you use to build up these little programs in it was APL. And I was like, huh,
that's interesting. I've heard about APL about four years ago. And she was like, yeah, it's the
worst. I can't, it doesn't make any sense. And it was like the worst part of her job was this
Metsim and having to figure out what APL is doing. She says it looked like hieroglyphics.
And so I just, I thought it was interesting the second time I heard about it. And then two years later, my youngest sister,
she's a quant that works at a private asset management firm in Canada called Connor, Clark
&amp; Lunn. She was remarking that she was using this database called kdb+ and a language with
a single letter called Q. And I, at that time I was, you know, into programming and on my way to becoming a,
you know, software engineer. And I looked up Q and I was like, that's interesting. I've never
heard of Q. Like it's, I don't know all the languages, but I, you know, she's, it seemed
super esoteric. And then on the Q Wikipedia page, it mentioned that Q was basically a wordified
version of K and K is a derivative of APL. And I was like, what the heck? So now two of my sisters
are doing something that's APL adjacent. And then fast forward a couple more years, there were three
podcasts on Functional Geekery, which next to CoRecursive is one of my favorite podcasts.
And they, episode 64, 65, and I think 76, all had guests talking about APL. And it was at this point in time, oh, actually I skipped the fourth one.
There was a really funny story in January, 2019
called IOTA shaming,
where basically this Twitter battle erupted
between C++ developers
because someone had used IOTA
in a piece of example code.
And IOTA is an algorithm
that generates a sequence of numbers that increase by one. So if you go IOTA is an algorithm that generates a sequence of numbers that increase
by one. So if you go IOTA zero to 10 or whatever you get, zero, one, two, three, four, five, six,
seven, eight, nine, 10. And IOTA comes from, the name IOTA comes from APL. And so this big Twitter
war erupted when someone said, ooh, look at how smart I am for knowing what IOTA is and was trying
to sort of snarkily saying that we shouldn't be,
you know, borrowing these really awful names from old dead languages. And then another individual
came out and said, well, it's actually a historical name and it has meaning. Anyways,
that was the fourth instance. The fifth instance was the podcast. And at that point,
I was actually on a trip to Yosemite at the time. And my partner was asleep in the car and I would listen to podcasts
while driving into Yosemite because we were staying just outside the park. And I'm listening
to these people just gush about APL and how it's this beautiful quasi-functional language.
And then I finally decided to check it out. I went to a site called tryapl.org,
where you can just sort of mess around with the language. And then
at that point, immediately fell in love with the language, like within 10 minutes of using it.
Yeah, that was a long winded story. Well, what what made you fall in love with it?
So the context is that this is June 2019. And I had just finished giving a talk, my very first
conference talk, called Algorithm Intuition, which sort of gave me this reputation of being an algorithm expert or someone that knew a lot about algorithms.
And this was in May of 2019.
So literally a month later, I'm back from Yosemite on this website.
And one of the first things that I do is a plus reduction, which is just summing up the numbers
in a list. And a reduction in APL is just a slash. So you type plus slash and then a few numbers and it adds them up. And there's another algorithm
you type plus slash and then a few numbers and it adds them up. And there's another algorithm
in APL called a scan, which is just the other slash. So I'm not sure which one's forward or backward, but
one slash is a reduction and the other slash is a scan. And what a scan does is it's basically
a reduction, but it outputs all the incremental results as well. So if you have, you know,
five numbers, one, one, one, one, one, and you do a plus reduction, the resulting answer is going to
be five. If you do a plus scan on one, one, one, one, one, you get one, two, three, four, five. And so one of the properties of a scan is that the last element
is always equal to the equivalent reduction. So it's an algorithm that actually exists in C++
by the name of partial_sum. So our reduction in C++ is called std::accumulate, and our scan is called std::partial_sum.
And because accumulate and partial sum seemingly from a sort of naming point of view have nothing
to do with each other, I never made the connection that basically they are like sibling algorithms.
One just gives you the last element.
The other one gives you all the incremental results.
So like a scan and a reduction are extremely closely tied together. And I had never
made the connection. Like, and so I supposedly just gave this talk where I'm talking about
algorithms and saying how they're so awesome. And then a month later, I'm at this website that is
got all these crazy looking characters. And I know nothing about the language, but I made this
observation just based on what the characters look like. They're basically vertically symmetric. The one's
the opposite of the other one. And it was just like, so it happened in like 10 minutes. And my,
my mind sort of was just blown. I was like, how did I not make this observation before? Like I've
spent at this point, two years learning about algorithms, you know, practicing these competitive programming problems.
How have I never made this observation?
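In C++ terms, the observation looks like this. A small sketch using the two standard algorithms named above:

```cpp
#include <iostream>
#include <iterator>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> ones{1, 1, 1, 1, 1};

    // The "reduction": collapse everything down to a single value.
    int total = std::accumulate(ones.begin(), ones.end(), 0);                  // 5

    // The "scan": keep every incremental result along the way.
    std::vector<int> running;
    std::partial_sum(ones.begin(), ones.end(), std::back_inserter(running));   // 1 2 3 4 5

    // The last element of the scan equals the reduction.
    std::cout << total << " == " << running.back() << '\n';
}
```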
And I actually went in like searched in all the different languages, like what do languages
call these?
And there's only two languages where there's any symmetry between the scan and the reduce
names.
One of them is Clojure, which calls their reduction reduce, and they call their scan
reductions.
So you can sort of see that like, oh, it's actually just saying you're just doing incremental
reductions.
And then in D, which is sort of a competitor with C++, they call theirs, I believe, fold
and cumulative fold.
Every other language, there's absolutely zero symmetry between the naming.
Haskell calls them foldl and foldr and then scanl and scanr.
And scan and fold are really good names for both of those algorithms.
But unless you're sort of able to see that relationship between the two, it's there
to be discovered.
You're not going to think about it just because of the names of these algorithms.
And that was like, because it happened so quickly, I realized that like, whoever designed this language clearly put
a lot of thought into what these, what these characters visually mean, like that there's a
relationship between if a character looks similar to another character, it's probably for a reason.
And so people think about this language as an unreadable language, but that's, it's really not fair. I think,
you know, we, we chat about this before, but I like to say that it's the same thing that people
say about Chinese. If you don't read or speak Chinese, no one, you don't say that, oh, Chinese
is unreadable. You just acknowledge the fact that you don't read or speak Chinese. For someone that
does read or speak Chinese or any Asian language that doesn't have sort of, you know, the A to Z
alphabet, those languages
are readable if you read them.
And that's the same thing with APL.
The problem is that every other programming language typically looks similar to each other
and uses, you know, ASCII characters to represent words and variable names.
So APL looks so different.
So we have this like gut reaction to say, oh, that's the most unreadable thing ever.
But if you take the time to learn how to read APL, it's honestly, in my opinion, the most beautiful language that
I've ever seen. And the things that you can express where every Unicode symbol represents
an algorithm, it's just absolutely beautiful. Part of this beauty, Conor says, is the simplicity.
In APL, everything is an array. So like a one rank array is just like a list or a vector, a list of numbers. A two
rank array is a matrix. A three rank array is, you know, a cube of numbers, if you will. And it
keeps going on like that. Even like a single number is technically like a single element.
There's no such thing as other data structures. Like, you know, you don't have like a hash map
that you can build up. Like, if we were going to represent a hash map, it would be as some sort of two-column
array where the first column is the keys and the second column is the values. Another part of the
beauty is the symmetry between what an operator looks like and the action it performs. So imagine
you have a matrix of numbers, a five by five grid, and you want to reverse the elements in the matrix.
So you have five rows of 1, 2, 3, 4, 5, and you want 5,
4, 3, 2, 1. You do that with the reverse operator. So reverse is a circle with a straight line
through it. If you can, you can visualize reversing the reverse symbol because it is vertically
symmetric. If you flip it 180 degrees, if you picture it in like a 3D land, it's the same thing after you
flip it 180 degrees along that sort of vertical line. And this circle with a vertical line through
it is actually a visual description of the operation. If you imagine in your mind's eye,
making a vertical line through the center column of our five by five matrix, every element in the
middle will stay there when we reverse it,
but everything else rotates around it 180 degrees
so that the first element becomes the last
and the last element becomes the first.
Transpose is the exact same thing,
except instead of a straight line,
it has a line at like a 45 degree angle.
And a transpose is flipping a matrix
basically along that line.
And so the way you can think of it is the top right number is going to get swapped with
the bottom left number.
And every other number sort of is going to follow that pattern.
So if you're on the diagonal from the top left to the bottom right, those numbers all
stay in the same place because that's the line that defines the 45 degree angle.
If you have a matrix, it's like a square and you're folding it diagonally at 45 and taking
the top part to the bottom and the bottom to the top.
Yeah, exactly.
And that's what the transpose glyph looks like.
So it's a circle with that same line, but instead of it being vertical, it's now along
that diagonal. And so the way you can think
about the reverse and the transpose algorithms or symbols is that they are basically visual
representations of the axis that you're quote unquote rotating for lack of a better word
around. And what's crazy is that I didn't see this for like months. I just was like, oh,
that's transpose, that's reverse. And then at some point I was reading a blog article where it pointed out that like, oh yeah,
once you see that the reverse and the transpose, and there's also another algorithm called reverse
first, which has a circle with a horizontal line through it, which is basically doing a reverse,
but instead of from left to right, it's doing it from top to bottom. And the blog said, once you
see it, you can never unsee it. And I was like, see what? And then I looked at it and I was like,
yeah, oh my goodness. Like, how did I not, how did I not see that? Like, that is,
it's amazing. Like, and that's the thing is so many of the APL symbols have stuff like that.
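In array terms, the three operations described above map to something like the following sketch in plain C++ on a small fixed-size matrix; the APL glyph names in the comments are the ones discussed in the conversation.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

using Matrix = std::array<std::array<int, 5>, 5>;

// "Reverse" (circle with a vertical line): reverse along the last axis,
// i.e. reverse each row.
Matrix reverse_last_axis(Matrix m) {
    for (auto& row : m) std::reverse(row.begin(), row.end());
    return m;
}

// "Reverse first" (circle with a horizontal line): reverse along the
// first axis, i.e. flip the matrix top to bottom.
Matrix reverse_first_axis(Matrix m) {
    std::reverse(m.begin(), m.end());
    return m;
}

// "Transpose" (circle with a diagonal line): flip across the main
// diagonal, so element (i, j) moves to (j, i).
Matrix transpose(const Matrix& m) {
    Matrix t{};
    for (std::size_t i = 0; i < 5; ++i)
        for (std::size_t j = 0; j < 5; ++j)
            t[j][i] = m[i][j];
    return t;
}
```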
Like, so there's an article by Roger Hui, who was the primary implementer of the J language,
which is a daughter language or a derivative language of APL.
He has a blog called My Favorite Character.
And his favorite character is a character called Log, which is for logarithms.
And he says it's his favorite for multiple reasons.
But one of them is it's because it's a circle with a star inscribed in it.
So there's a star in APL, which
is for exponents. So if you want to go, say, 10 to the power of two, you go 10 star two.
But for logarithms, it's a circle with that same star inside of it. And he points out that if you
actually think about cutting a log sort of via profile, what you're really going to end up with is like
a circle and then circles for the rings inside of it. But that circle with a star in it is sort
of an approximation for that profile view of a log. And once again, it's like something that I
never thought of in a million years. You mentioned Chinese, like, is there some sort of similarity
between this idea of like ideograms or something? Like it's visually.
Yeah, yeah, absolutely.
Yeah.
The reason I know this is I lived in China for a year and at one point was not fluent,
but I could speak, you know, colloquially and get around day to day.
And I had to study, you know, written Chinese as well. The thing is, the radicals typically have meanings.
So like the character for tree basically looks like a tree.
Think of a T where underneath each of the arms of the T has like two more lines that sort of form
a triangle. So it looks like it has a sort of relationship to what a tree looks like. And an
easier example is kou, which is K-O-U in pinyin. That is the word, and it's just a square. That
is the word for both mouth
and door, I believe.
So it's a square that sort of represents what a door might look like.
You know, if you look at a newspaper, it all just seems like overwhelming.
Like, well, I need to go and learn 2,000 characters.
How am I ever going to learn how to read it?
But if you just look at a single character at a time and you break apart the radicals,
you know, I believe it's mu
or shu that is the word for wood or tree. And that radical that looks like a tree shows up in a lot
of like wooden things. So if you look at what the character is for chair or for table or things that
are made of wood, it contains that wood radical. So like Chinese is actually very, very beautiful
in certain respects because it has that
same kind of like visual meaning. Even if you have no idea what a certain character is, if you break
it apart and you know what each of the radicals are, you can basically play a game of like, you
know, what is wood plus something plus something, and then take a guess at what this character
actually represents. And you probably will be wrong, but you might be close to actually what
it is, which is really cool. Although there are
some similarities between Chinese characters and APL, APL was created by a Canadian, Kenneth
Iverson, in a book called A Programming Language, which is where APL gets its name from. And at that
time, APL, or as it was called at the time, Iverson Notation, it was a tool for teaching. It had
nothing to do with programming languages. It was a tool for expressing algorithms. And that's how it was used in A Programming Language, the 1962 text. Fast forward
a few years to 1966, and IBM had decided that they wanted to actually implement the notation
as a language. And on the IBM 360, it was actually implemented as APL\360. And so 1966 was the first implementation. And then to shorten the story,
you know, fast forward to 1980, Ken Iverson ended up going to a company called IP Sharp that was
headquartered in Toronto. And it could have history sort of could have played out so much
differently. In 1982, Bill Gates was doing like a whirlwind tour of North America and swung by IP Sharp and Associates in Toronto.
And there was a meeting between a few of the IP Sharp folks and Bill Gates, and Bill Gates was getting ready to make the PC.
Amongst APLers and array programmers, there's this anecdote of sort of Bill Gates for a long time was a big fan of APL and had an APL sort of handbook in his desk. And right
before the PC, there was a computer called the IBM 5100 that actually had a switch on it for both
basic mode and APL mode. So you could just toggle between the two. So there's a time in history,
right at the late 70s, where APL was just as prominent as basic. Unfortunately, the folks at
IP Sharp and Associates thought that mainframes were just
here to stay because that was their business, was mainframe computing, and didn't really see
what was coming with the personal computer. And so the PC that Bill Gates ended up creating came
with BASIC, not with APL, and never ended up shipping an APL. And because of that, we live
in a world where a lot of languages, I think, look closer to sort of the imperative-style BASICs.
But there is an alternate universe where Bill Gates decided to put APL on the PC instead of BASIC.
And then we'd be living in a way different world today.
So Conor started doing his LeetCode videos using APL.
And he also reached out to this interesting character, an Iverson protege named Arthur Whitney.
Arthur was involved in creating A, K, and J, all APL-inspired programming languages.
The finance press refers to him as the reclusive genius of banking IT.
I learned a lot about Arthur because of all my research. And then I randomly reached out to him and just asked if he would have time to chat. And then he responded.
This was in early 2019 before COVID really showed up.
And he said, well, do you ever do you ever come down to New York?
And I totally lied and said, yeah, all the time.
And then he said, well, if you ever find yourself down here, let me know and we can have dinner
and chat a bit.
And so I booked a plane ticket and booked
a hotel room and spent two or three days just around the corner from where he works. And then,
yeah, I got to have dinner with him and chat with him over those couple of days. And he's
just an absolute genius. So I had heard of him, I think through Bryan Cantrill,
who's interviewed him. I think he went to his house and interviewed him. I don't know if it
was Brian or somebody else
who was saying that he made a deal with the devil
that he would be the best programmer of all time,
but the trade-off was that nobody would be able to read his code.
I found online the original J interpreter or something like that,
and it's written in C, and it's just like,
I don't know what it is.
I started a
project a couple months ago. I'm slowly porting that code base, the J source code, to C++20. And
yeah, I think there should be a name for these types of code bases, because it's not C,
it's like a macro variant of C, where it's a C DSL. 80% of your library, quote unquote, is macros.
I did a search.
There's 10,000 macros in the source code.
And those macros are used like functions.
Yeah.
It seems like there's a vague relationship to what you were saying before about the competitive coders using a whole bunch of shorthands to do their coding.
It's actually funny that you mentioned that because that is like literally what the best
competitive programmers do. The first thing that they do if they're not in actual ICPC and they're
just competing on topcoder or codeforces is they copy and paste in a template which includes all
the headers, includes the main function, but then also has a
ton of macros defined, a bunch of which are actually like macros for for loops, because
you can reduce like, you know, 10 or 15 characters that you have to type out with a simple do macro.
And I never actually noticed that relationship, that some of the exact
same macros that exist in the J code base are exactly what competitive programmers use.
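For a flavor of what those contest templates look like, here is a hypothetical sketch; these particular macros are made up for illustration and are not taken from the J source or any specific competitor's template.

```cpp
// A hypothetical competitive-programming template: macros that shave
// keystrokes off the boilerplate of a typical submission.
#include <bits/stdc++.h>   // the catch-all GCC header contest code tends to use
using namespace std;

#define FOR(i, n) for (int i = 0; i < (int)(n); ++i)
#define ALL(c) (c).begin(), (c).end()

int main() {
    int n;
    cin >> n;
    vector<int> a(n);
    FOR(i, n) cin >> a[i];
    sort(ALL(a));
    FOR(i, n) cout << a[i] << (i + 1 == n ? '\n' : ' ');
}
```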
It's not just competitive coding that has a similarity to Arthur's array programming style,
but data science is also very APL-like in some ways. And through a strange coincidence,
while Conor was deep into APL land, he left Amazon for NVIDIA to work on a GPU-accelerated
data science library. It's crazy that this happened, that now I work on an open source
library that is inspired by pandas. And pandas was written by an individual named Wes McKinney,
who used to work at a firm called AQR, which is a quant firm. And he was directly influenced by J,
which is the daughter language of APL. At NVIDIA, Conor's ability to solve something
in an APL style
is coming in very handy for writing code
that will distribute across the thousands of cores
of a modern GPU.
For example, we explained reductions and scans earlier.
A reduction is something where basically
you're taking a list of elements
and applying a binary operation across it.
You can do that in serial,
just taking one element at a time.
And, you know, if you've got the five numbers, you know, one, one, one, one, one, you can just go through and 1 plus 1 is 2, 2 plus 1 is 3, etc. That's a serial algorithm.
But you can also do something called a tree reduction. You can basically construct like a tree: add two elements at a time, and then on the next pass, take the results of two of the first operations and add those.
And then you end up at the end of the day getting the exact same answer, but you didn't do things in order. So obviously, if you only have a five element array,
you're not going to see much of a difference. But if you've got like a 5 billion element array,
then putting that on a GPU, it's basically going to chunk that up. It's going to sum up all the
individual chunks, and then it's going to come and do some sort of tree reduction, and you're
going to get the same result. And because it's doing that over thousands of cores, it's going
to be way, way, way faster
than if you're doing it serially on a CPU
where you can only take two numbers at a time in sequence.
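Here is a rough sketch of that tree-shaped reduction in ordinary C++; a GPU library such as Thrust does something in this spirit across thousands of threads, and this only shows the pairwise idea.

```cpp
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>

// Repeatedly combine adjacent pairs; each pass halves the number of values,
// which is the part a GPU can do across many cores at once.
long long tree_reduce(std::vector<long long> xs) {
    if (xs.empty()) return 0;
    while (xs.size() > 1) {
        std::vector<long long> next;
        for (std::size_t i = 0; i + 1 < xs.size(); i += 2)
            next.push_back(xs[i] + xs[i + 1]);
        if (xs.size() % 2 == 1) next.push_back(xs.back());  // odd element carries over
        xs = std::move(next);
    }
    return xs.front();
}

int main() {
    std::cout << tree_reduce({1, 1, 1, 1, 1}) << '\n';  // 5, same as a serial sum
}
```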
Why does an array language,
why is that helpful for expressing these?
Because branching doesn't really fit super well
into the GPU model
and array programming inherently avoids that.
And so many of the primitives in APL are, like, trivially parallelizable. So reductions and scans are two of the most fundamental ones, and in C++ there's the Thrust library. They basically have a std::reduce or
a thrust::reduce and a Thrust inclusive and exclusive scan, which are just two different
versions of scans. And those are like bread and butter parallel algorithms. So, so many of the
primitives in APL are just inherently parallelizable. So you can map basically the primitives
directly to a parallel algorithm that runs on the GPU.
Whereas in other languages, the way you solve things, if you're doing branches,
they don't trivially map to the GPU. Whereas array programming languages, it's basically the
way they're designed is operating on these huge arrays or huge matrices. It makes it like a
match made in heaven from my point of view. It's not actually a fluke. It's like this language was designed by Iverson
to be able to take in these arrays and matrices.
And because of that, it ends up having this form
where it ends up being data parallel.
And then like GPUs exist.
And oh, isn't that convenient?
GPUs are great if you can do things data parallel.
Exactly. Yeah.
Everything's an array.
So it just,
it works out perfectly. There's a cliche about different programming paradigms changing the way you think. I think sometimes when I hear it, I roll my eyes a little bit, but there's definitely
truth to it. And it's hard to put your finger on exactly where the truth lies. To me, it's a little
bit like living abroad. I was an exchange student in university. I did a small part of my undergrad degree in Australia. And when you travel, you eventually go home. But for me, what changed
based on my travels was how I view my homeland. Just living somewhere else for long enough to
experience life there, you kind of start to question little assumptions about how you lived
before. Conor is still working in C++, but when he's doing that, sometimes he's thinking about how
he would solve a problem in APL.
And I think that, that different perspective
that you bring home to your primary programming language,
that's the power of learning how to solve problems
in a different paradigm.
So that was the show.
I actually talked to Conor for quite a while
and I made him walk me through the ice cream problem
from LeetCode that we talked about using APL.
I'm gonna try to figure out how to put a video of that up on the episode page, which you can find in the show notes
in your podcast player. Also, Conor and Bob Therriault and several others started a podcast
all about array programming languages. I'll put a link to that on the episode page as well.
Until next time, thank you so much for listening. Is that right? No.
Or actually, that was right, but I should have said shift 4.
Backtick shift 4.
There we go.
And then looking at a weird arrow,
the point it's,
what is it?
Is this triangle?
And then this is a triangle with a line through it.
Yeah.
Line through.
Okay.
And then type X again and right bracket.
A bracket. And then bracket and then
hit enter
you can just use up arrow actually
or yeah up arrow might work as well
and then if you put 6 at the beginning
of the expression
and then back tick
shift 6
that's the transpose
so yeah
back tick shift 6.
Or maybe I've got this wrong.
Oh no, sorry, without the shift.
Backtick 6.
Oops, backtick 6.
There we go.
And so if you hit enter on this now,
we're going to get our Boolean array.
Oh, okay.
Okay.