Algorithms + Data Structures = Programs - Episode 230: Hoogle Translate
Episode Date: April 18, 2025In this episode, Conor and Ben chat about www.hoogletranslate.com.Link to Episode 230 on WebsiteDiscuss this episode, leave a comment, or ask a question (on GitHub)SocialsADSP: The Podcast: TwitterCon...or Hoekstra: Twitter | BlueSky | MastodonBen Deane: Twitter | BlueSkyShow NotesDate Generated: 2025-04-09Date Released: 2025-04-18Hoogle Translatewww.plrank.comHaskell initCommon Lisp butlastHoogle Translate for Common Lisp butLastHoogle Translate for partitionHoogle Translate for q priorHoogle Translate for Clojure frequenciesHoogle Translate for Swift/Clojure reductionsC++ std::map::mergeC++ std::list::spliceIntro Song InfoMiss You by Sarah Jansen https://soundcloud.com/sarahjansenmusicCreative Commons — Attribution 3.0 Unported — CC BY 3.0Free Download / Stream: http://bit.ly/l-miss-youMusic promoted by Audio Library https://youtu.be/iYYxnasvfx8
Transcript
Discussion (0)
I mean, it kind of hurts my soul a little bit, though, when you when you see the
the rainbow end after like seven languages and then everything else is white
because there's zero agreement on the naming of one of them is is prior
adjacent transform, which we now have in C++ 23.
It's called prior in Q
zip with next in scholar Kotlin.
I can't remember which one.
And but like every language has a different name for that. Welcome to ADSP the podcast episode 230 recorded on April 9th, 2025.
My name is Connor and today with my co-host Ben we chat about a new website, Hoogl Translate.
Although I probably I'm not sure also because you don't spend much time on the socials anymore or maybe this can be
Part two now. We're now on episode 230. I have spun up my long-awaited
Hoogl Translate website. Oh, right. In fact, maybe I'll I mean
We try not to share screens here,
but maybe I'll do it for a sec.
And honestly, say what you will about vibe coding folks.
Vibe coding is, I think the number one best thing
about these AI assisted stuff, it is a barrier remover.
You know, the reason you actually commented,
I think it back on Twitter years ago, when I had, I think you commented something about like, oh, this like a graphic that I had posted is kind of like a Hoogle Translate, referencing the Haskell website.
Potentially.
And then I replied to that. It was so many years ago, yeah, you probably, it sounds vaguely familiar.
That may have been the first time you'd heard of Ho go is that the case I don't know it had not I think I had heard of it before
definitely I'd heard of it before because I had I you said who will
translate and I replied I've owned the domain for like 12 days who will
translate calm okay and so like I was like floored when you had suggested I
guess it's it's a pretty common like if you know of who goal this is basically that's why I called it that right it's a pretty common, like if you know of Hoogle, this is basically,
that's why I called it that, right?
It's a Google Translate for Hoogle.
You type in one thing and you get it cross-language.
And this was a site that I built in ClosureScript,
like in 2023, but the reason I didn't publish it
was because I already had a website spun up,
which was my program language rankings, plrank.com.
But the way that I would deploy that,
I would build the ClosureScript website.
It generates a main.js,
and then I would always manually push that main.js,
and that's what basically the website was,
which is, you should be able to, with CI,
recognize that there's some change, and then get like a CI to rebuild that JavaScript file and then push it for you.
I spent half a day or a day or a couple days at one point trying to figure that out, but I'm not a CI engineer and I'm also not like a front end web developer.
So like I'm already out of my depths dealing with JavaScript and stuff.
and stuff. And so I just gave up and I put it on my list of things to do.
And I wanted to get the automatic deployment
of this main.js working before I built my second ClosureScript
website so that I didn't have two different, you know,
bad, poorly deployed websites.
It's funny when you say, and I totally echo the sentiment,
like, you know, you're out of your depth with JavaScript.
You know, my wife is a front end developer. And, you know, she thinks what I do is arcane
magic and the same the other way around, right? It's like,
And that's the thing is it's, it's true, like, how like how you know how you can be an expert in one thing and then you move just slightly to the right.
You know, it's just a different programming language.
It's just a different stack.
But you don't know the idioms, you don't know the libraries to reach for.
And you may know how to spell it algorithmically in your head.
But like potentially those algorithms.
You need that primer to translate.
You need to Google Translate. Yes, exactly. But like right potentially those algorithms you need that primer to translate you need who will translate
Yes, exactly and
Long story long is that I I vibe coded using Claude 3.7 got like
Did like all of the plumbing of that first website got the github actions
You know automatically deploying and then I did that like one day and then the next day
I was like, well,
now that I have this,
I can just go spin up who will translate,
even though it's been sitting there for two years.
And initially like it was just this, you know,
bare-bones website.
And now I've got like a bunch of settings
and light mode, et cetera.
Let me now interview you about this website
for our listeners.
So this is huggletranslate.com,
H-O-O-G-L-E, translate.com.
And what does it do, Connor?
So the front page here looks very much
like just the sort of Google front page
where you have just a search bar and a few settings off
to the side.
But we're expected to type something in.
What do we type in here?
And I'll start off by, I've got a lot of comments because I released a YouTube video on this saying it's a terrible name, change the name.
And then a couple of people being like, you realize there's already a tool out there called Hoogle. And I'm like, yes, that's where I borrowed the name from.
There is an existing website. If you Google Haskell, Hoogle, it'll come up. And Hoogle is is a type signature, aka function declaration search engine.
You type in the type signature of a function and it will return you all of the functions
across Haskell libraries that match that type signature.
So it's a great way for finding a function that you don't know the name of.
For instance, I did it one time when trying to find the Python in function.
You're trying to check is there an element that exists in a list of elements.
That's called in in Python, it's called contains in many functions.
In fact, if I now type in here, I'll get the Python in.
And if I click on the in, it's called LM in Haskell, right?
One version of it is called LM.
And now you get a list across languages of the different names for algorithms
that share the semantics. So behind the scenes there's a couple different searching methods.
When you just type in an algorithm name that's called the by name search. But then when you click
on a specific languages implementation that switches to a by algo ID which is basically
an ID that maps to a specific semantics of an algorithm.
So sometimes there's algorithms across languages that share the same name, but have different semantics.
A great example of that is partition.
So partition, when you type it in, has, I don't know what this 15 languages, and there's four different colors.
And the colors in this by algo ID or like by semantic search means that
partition across languages has four different implementations. Some of the times it's the
C++ partition where you are basically given a unary predicate and you're putting the ones
that pass it at the front and the ones that fail it at the end and the two other languages
that share the same color are D and then CUDA, which makes sense because CUDA is just basically a parallel version of the C++1.
And everything else is basically different implementation.
So I think partition for Haskell and friends,
I mean, I actually have to go to the docs,
but you can see that when I click on partition for Haskell,
I now get partition copy and split with an elixir.
And my guess is that you're writing to two separate sequences here.
So you're returning a tuple of two sequences.
Yes, exactly.
Instead of like mutating in place.
And right now I play a little bit fast and loose.
I'm going to be adding something called signature search, which is the exact same as who go
where you type in the signature.
And like the worst offender right now
is if I type in fold and then click on one of these,
I get all the reduces, folds, inserts, aggregates.
And the semantics of some of these are different
and the best way to see that
is if you compare the type signatures.
A lot of these, there's a difference
between one that takes an initial value
and one that doesn't take an initial value.
And that's typically like you'll see Kotlin shows up both for reduce and fold.
Rust is the same thing.
Many languages are like this.
And the difference is whether it takes an initial value or not.
This is going to get split.
And there's currently an issue open to say, you know, build in signature search and then while you do that, be more clear about what do I mean by, you know, same semantics.
I've been playing a little bit fast and loose because technically the partition copy in C++, it doesn't return a tuple of lists, right?
It writes the two different sequences and there's always gonna be...
It is the only standard algorithm that does that in fact it is indeed
yeah, and
There's some other cool things that like if you this was added after the fact someone requested it if you click on a language
It'll give you all the
Algorithm so if we go to C++ which there are a lot of languages
represented here
Many as as we might expect,
of different paradigms, you know, we've got the C style,
the Algo family, we might say, C sharp, C, C plus plus,
go, Python, Rust, we've got the array languages, APL,
we were, we've got Excel at languages APL we were
We've got Excel at the bottom
Excel even yeah, and we've got functional languages. There was one language. I didn't know how to pronounce in there
I didn't recognize it's the Crescent
Yeah, this is I think I'm gonna I'm gonna change this to moon
You can see that half of these characters aren't recognized. This was a
Watcher of one of the live streams. I think their personal language.
This is an array language I take it. It's a I'm not sure if it's an array language. It is a symbolic
glyph language that it does have a lot of the same primitives transposed so I wouldn't be surprised
if it's an array language but okay the thing about this site is is this whole site operates basically on a markdown file that is just a table.
And then thanks to Claude 3.7 again, I basically built up a CI pipeline that whenever the markdown
file gets changed, it kicks off a GitHub action that runs a Python script, parses the markdown
file, and then generates a Clojure script dictionary.
And that's what basically runs the site.
So anyone that wants to can basically just go and open a PR and add any language or any missing algorithm.
And even it's it's now that if you want to add like third party like technically range V3 is here but that's hidden by default.
And then also in array languages a lot of the times they don't have explicit glyphs because it's just a combination of two or three to get that. So you can click show expressions and for certain array languages it'll show
you that you know like any is just an and reduction in all the array languages.
Currently right now we don't actually have those added but in the future
you'll see stuff like that. And one of the coolest things that I've added
because one of the like the motivation for this is one it's a useful tool for
finding algorithms
You know who goes very useful a who will translate is is also super useful
I've gotten a bunch of comments saying like I was working in Python and I I knew this thing should exist
But didn't know to check in more inner tools, which is like a inner tool library extension
But the other motivation is I I love putting these graphics
Especially if I'm at a non C++ conference, in talks where
I'm talking about Python and I'll say, I know this as this in C++, but it's also known,
if I can just show this thing, but manually creating these is a pain.
It takes 30 minutes to go and handle this.
So now that it's online, people can submit PRs, but I've also wanted to be able to show
the differences
across multiple languages in like a nice table. And so my favorite one these days is if you
type the name of a language, and then you can type a bunch of, you know, multiple of
these. So swap is the the C combinator known as flipping Haskell, which I guess has should
be here as well. But then if you continue to type, it'll build up.
Oh, a multi-column table showing each algorithm
in a different column.
Exactly, and adds a little sad face.
Sad face for when the algorithm's missing
in that language.
Sometimes this is just because it hasn't been added,
but specifically for what we're looking at,
we're looking at the combinators in that match
to BQN's swap over before, I can I can add after and then it's
still and I guess we haven't explained this is that I have I don't actually
have a name for the algorithm it's it's kind of basically it creates a
frequency map enclosure it's actually an algorithm called frequencies where it's
the equivalent of the counter data structure in fact if we copy and paste
this and we type in frequencies, you'll see
that there's two languages that have a function called frequencies, racket and closure. And
when we click on this, this is this is histogram. Exactly. This is creating a histogram. Yeah.
And in Python, the idiomatic way to do this is actually not to use an algorithm. It's
to use a container from collections called counter, which is why there's a little asterisk beside this because once again, yeah, it's it's it's playing a little fast and loose,
but it uses frequencies to basically figure out what's the most common.
And then I've stolen the colors from Excel 2003, which has got one of the best
color palettes out there, folks.
And then it describes purple to the most common name.
And then I guess blue is the next most common and
it goes down the rainbow and so across this you can see that you know over aka
the psi combinator is the most consistently named across these array
languages but then some of them like every single language chooses a
different name for it and some of them don't actually have the glyphs and so
you can do this for anything.
You can go C++, any, all.
And then technically, it should be any of and all of.
But then this shows that by a landslide,
the most common names for this are any and all.
And if you're actually using this to build up
graphics for a presentation, I've
added the feature that you can control and click on any row. So if you don't want, you know, C sharp, Elm, Julia, common list, whatever,
get rid of once you don't want to put in your graphic. You can build the graphic that you
want. So anyways, it's, there's still lots of stuff that I'll be adding over time, but it's a
it's a neat tool. Now the obvious thing for me to ask is for you to type in but last.
tool. Now the obvious thing for me to ask is for you to type in but last. Oh yes that's why we came here. So I don't think it'll be here but when I was doing my masters I was doing a lot of small
talk work and I know that small talk the ferro implementation specifically does have the
equivalent of but last and so I guess init would be the one that so and it's not
even in here from Haskell and but last won't show up either but I shall after
the fact go and get these and if we go to ferro small tap talk but last
algorithm it's very hard to search fair and actually I don't why are we doing
this let's go to chat GBT. Hello, good friend.
I don't even know, I got into the trap of Googling
when I would never do that if I wasn't on this podcast.
You were about to say something though.
Well, I was gonna say we want the right answer.
So if chat GPT will give us the right answer.
Oh, that's the thing is if you give it a lot of context,
if you say, I know in Pharaoh small talk there is an algorithm identical to
Haskell's a knit and is it common Lisp?
Yeah.
Common Lisp calls it but last.
Do you know the name?
It nails stuff like this and nails stuff like this.
All but last.
All but last implemented on sequenceable collection. OK.
And they even give us a little snippet using the hash paren for a list.
And so, yes, I should add this.
And you can note that, you know, on the Hoogl Translate, the ferro, there's only four
algorithms from ferro small talk that I have here and that's because
the stuff that mostly or half the stuff that exists in here was what I needed when I was
interested in looking into something like two or two and a half years ago and I think that was
like roughly 300 but we've I think since a week or week and a half ago it's doubled to like 600
algorithms across languages because people have been open in pr's
and so I imagine at some point it'll it'll get to like a thousand or two thousand but yes init
butlast all but last so what is the correct name for well in common list but last can take an
optional integral parameter so you can do but last n right but last is is but last one.
But if you give it an integer you can actually chop off the last end.
Right.
And this is this is something that will also be built in because right now for the most part things just map to one algorithm ID.
But over time things with optional parameters or default parameters like closure a lot of the times their partition algorithm I think technically maps to a couple different algorithms and with the type signature
work you'll be able to basically map a single algorithm to multiple semantics so it'll show
up in every place.
Common Lisp also somewhat confuses you confusingly. Well, it has n but last because it uses n in the name of an algorithm to mean a destructive algorithm.
N reduce is another one.
I think n reduce or n, I can't remember, but n append, I think, n concat.
These are algorithms that change their data structures in place rather than returning a new one so but last return
to a new list with everything but the last element and but last does not as
you might think have anything to do with selecting n elements but it changes the
list to chop off the last element interesting that's what the prefix n
stands for in common list that is, that is the conventional, that is the convention, yes.
And I think n-concat is another example.
So you give it two lists.
Concat just concatenates the list, returns the concatenation.
n-concat sets the tail of the first list to be the second list.
So it effectively destroys the first list or changes it in place.
Do you know why what the history but behind choosing prefix N?
I don't know why N is used in that way.
Certainly those variations exist for efficiency.
Right. Right. Yeah, that totally makes sense.
And that is, I mean, that is the
part of the issue of a site like this
is that you want there to be like a
clear, you know, almost strict
how semantic, how you group algorithms by semantics.
But when you get to a language like C++ where
we have a smattering of in-place
versus non-in-place algorithms that are inconsistently
named with the underscore copy suffix is at a certain level, semantically they are the
same.
But, and it's also like, I've been thinking, I haven't added, I think someone actually
the other day added reverse and reversed from Python for the reverse algorithm.
But those are an in place versus one
that's returning a copy. And you know, how do you how do you deal with that? Do you split
that because the type signature is different, right? One returns void and one returns a
new copy list, the new collection. Yeah. And or do you come up with some way of I type
sort and it shows you two different columns of like in place or out of place.
I haven't really figured that out yet. But I mean, that's one of my, I mean, I've complained
about that in the past too, is that, you know, sort is in place and transform is out of place.
But like, why did we not consistently like have a naming convention where it was either underscore
copy for anything. So it should really be transform copy.
And we're kind of just inconsistent with that,
because you might think that like, oh, right,
this one doesn't have a copy.
Therefore, it must be in place.
And that's not always the case, right?
Yeah, interesting, though.
Commonless calls it n.
And array languages don't care.
I mean, I think sometimes there's some internal optimizations
based on they can reason that
there's only one thing happening.
So they show everything in place, but that's not something the user necessarily necessarily
is aware of.
Yeah, and concat is the interesting case, or an interesting case in common list, because
so regular concat.
So both and concat and concat don't have to change the second argument, right?
But in the case of concat, you need to create as many cons cells as are in the first list.
You basically create a new first list with the second list as its tail.
The second list can be the tail of something and
not know about it. So the second list, because a list is really just a pointer to a con cell,
which is designated ahead of a list, right? And the rest just flow after it. So the non-destructive
version of concat allocates N new concels for the first list and then
appends the second list just with a single operation by sending it to the tail.
But the n concat doesn't need to allocate.
It steps down the first list and when it hits the end it just sets the tail to the second
list.
So it turns the first list into the longer list while still
leaving the second list alone.
And just to clarify, because concat can mean different things. I mean, sure enough, I'm
sure if I type concat here, you can see there's multiple colors. It sounded like concat is
just for two different lists?
It's appending two lists.
Gotcha. Okay.
And returning a third list.
Yeah. Whereas my guess is across these languages,
one of them is like a flatten where it can take any list of lists and flatten them all.
Interesting. Yeah. So that makes complete sense. Yeah. Can cat might be able to take
variadic arguments in some lists. Well, you know, for know for simplicity sake let's just consider the two argument version. Yeah
Interesting and I'm trying to think like in C++ basically
Well, there is no really equivalent. I mean in ranges land they do have join now, but if you're pre ranges
It might be that right Like if you have like a vector effect, like that
stuff is statically typed, right? So you can't turn that into like a one dimensional vector.
So you're always going to be doing some kind of copy, out of place copy.
Well, it's like a splice. Splice is in unordered map or map.
I'm going to say one of those.
There's an operation or is it called merge?
I've heard of merge. I definitely haven't heard of splice.
Oh, splice is an operation on std list.
Yeah. Okay. So splice is the list version.
Transfer elements from one list to another.
No elements are copied or moved. Only the internal pointers of the list nodes. Transfer elements from one list to another. No elements are copied or moved.
Only the internal pointers of the list nodes are repointed.
Yeah, that's the same.
OK.
Which makes sense, right?
Because that's common LISPs.
I assume implementation is some kind of pointer-chasing thing.
Interesting.
All right.
And merge actually also exists.
And I think merge exists on the list and also on unordered
set, unordered map, forward list.
So it's a very similar sort of idea.
And there's a freestanding, I think, merge algorithm as well, I think.
Well, yeah, but that's something different.
Right, the merge algorithm is taking two sorted sequences and outputting them and zipping
them together into a third.
Again, zip is an overloaded term.
But yeah.
Right.
Anyway, so yeah, this I guess this spun out of what to call the for each but last and
then that made me think, oh, I have a tool for this.
Not fully not fully built out because it doesn't have any of the names.
You have a tool that doesn't quite know this yet, but they will.
Yeah, yeah, over time.
And that's the thing is that my hope over time
is that languages that I have zero familiarity with that
potentially have extents of libraries,
like D is one of those languages that has a lot of interesting names.
And Swift, actually, the other day, I think I knew this
and I had forgotten it, that Clojure calls their scan,
which is overwhelmingly the name chosen
for the incremental fold that's opening all the running
results, Cotton calls it running reduce and running fold.
Clojure calls it reduce for the reduction
and reductions for the scan.
Oh, okay. Okay.
Which is kind of nice. It's one of the few algorithms or few languages that shows that
they're related in the naming of it. And Swift also, I don't know if they borrowed it from
Closure, but Swift in their algorithm library, which is an extension of what's provided in the standard library,
they call it reductions as well,
which I thought closure was the only one.
But yeah, it's neat when you stumble on different names
but that are arguably better names.
I mean, it's-
Right, right.
I'm reminded of Tails.
I think in Haskell it's called Tails.
It might also be called tails in Lisp,
although maybe not.
Cause that's like-
Which is like prefixes.
Yeah, they have a knit and a knit,
and tail and tails.
Yeah.
That's like-
Yeah.
I forget exactly what it is,
but it's something like all of the,
so the tail is the Lisp minus the head,
and the, or the cuder in lisp and the cdr and the
cdr.
Yeah.
The lisp folks understand what you just said.
Or anybody that's read SICP probably understood as well.
Yeah, I think actually BQN, you know, my favorite language of the day, they have two versions
of that.
And I believe they're called prefixes and suffixes.
Yeah, prefixes and suffixes,
which arguably is a kind of nicer name in my head
because scan, a lot of the times,
they're known as prefix scans,
or they involve the word prefix because it is,
different prefixes that you're computing the reductions of the prefixes yeah yeah I
mean it is very cool I love when people add stuff to the site because then it
shows that you know sometimes like filter and even arguably filter you know
there's there's been conversations about are you filtering in are you filtering
out is that the the best like but conversations about are you filtering in, are you filtering out?
Is that the best? But some of them, you know, filter, map are almost unanimously, unfortunately not for C++, but almost unanimously are agreed upon. And then there's other ones where it's like
there's maybe three languages and two of them are like C++ and CUDA C++, which really should count
as one. And then every language has a different name for it, which is this is like the vocabulary
that we use to write.
It's a great, the site is good for naming, you know, as well as, as well as exploring
languages and by exploring languages, we get the idea of what, what is the common name
if there is one or what are some of the names with some of the ideas around this you know also because things school different names have slightly different signatures different languages right so you get to explore the design space around the algorithm.
Yeah I mean it's kind of hurts my soul a little bit though when you when you see the.
when you when you see the the rainbow end after like seven languages and then everything else is white because there's zero agreement on the naming of it.
One of them is is prior adjacent transform which we now have in C++23.
It's called prior in Q, zip with next in scholar or Kotlin I can't remember which one.
But like every language has a different name for that. And to me, it's a fundamental algorithm.
I don't know how many times, I think in your talks,
in Ivan Kuchik's, I'm going to pronounce that wrong.
It's been a while since I've met him.
But anyways, multiple talks, I've seen the zip-tail trick.
I think I've seen it in Odin Holmes, maybe a given talk
as well.
That's this pattern is a, you know, adjacent transform.
And it's a real shame in my opinion that it's not as like popular as map and filter and
reduce like it should be up there. Anyways, we'll wrap up. Who will translate?
Be sure to check the show notes either in your podcast app or at adspthepodcast.com
for links to anything we mentioned in today's episode, as well as a link to a GitHub discussion
where you can leave thoughts, comments, and questions.
Thanks for listening, we hope you enjoyed,
and have a great day.
I am the Antiprase.
Um.
Um.
Um.
Um.