Algorithms + Data Structures = Programs - Episode 233: AI! Live from Sunnyvale!
Episode Date: May 9, 2025

In this episode, Conor talks about his recent experience with Cursor, Claude 3.7, Gemini 2.5 Pro, and several C++ unit testing frameworks!

Link to Episode 233 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)

Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon

Show Notes
Date Generated: 2025-05-07
Date Released: 2025-05-09
GoogleTest
boost-ext/ut
MinUnit
DocTest

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Transcript
Vibe coding is a thing, rage coding is a thing, but there's this sweet spot right in the middle.
I call it guide coding.
It's where you're basically, you're pair programming with an AI, you're not driving, the AI is
driving and you just hold its hand.
Sometimes it's just hitting home run after home run after home run and then every once
in a while it'll do something stupid.
I want to use a unit testing framework that isn't going to light up my clang-tidy like a Christmas tree. So I asked it what the alternatives were, and it recommended a few, but really I wanted to try out Kris Jusiak's boost-ext UT. And sure enough, five minutes or so later, it made changes to my CMakeLists.txt file and then, one by one, just went and changed all 1,700 lines of code.
Welcome to ADSP: The Podcast, episode 233, recorded on May 7th, 2025. My name is Conor, and today I record solo, live from Las Palmas Park in Sunnyvale. I chat about Cursor, Claude 3.7, Gemini 2.5 Pro, and a recent amazing success that I had with these tools. We, or should I say I, are recording today from Las Palmas Park
because when I arrived here a couple of days ago, I was able to get a lifer, which in birder parlance is what you call a bird you've seen for the first time. And it was the... I actually don't know how to pronounce this; I looked it up and couldn't find a video of anyone pronouncing this bird's name either: the Mitred (or Mitered?) Parakeet, or Conure. And it said it was a rare bird, and they were up in the palm trees in this park, but they are no longer here. And I was hoping to get a very nice background noise, because they were making the noises, the calls, that Mitred Parakeets make.
But alas, we are back on my last full day in Sunnyvale and this is a solo episode.
First ever solo episode, I think?
I could be wrong about that.
I believe it's episode 233. We're gonna title this one "AI! Live from Sunnyvale!", which is a callback to episode 232, which I recorded with Bryce now almost a month ago (although you listened to it last week) and which was entitled "Algorithms! Live from New York!". We might not get any parakeets in the background, but you probably have heard the American Crows. Nothing special about those birds,
although they are very intelligent. Anyways, I've got my Merlin app
on, and if any bird sounds pop up,
I will report. But in the meantime, I am going to give a short solo episode on my
massive successes with AI, GenAI specifically.
I am pretty sure that I've mentioned this on the podcast, but if I haven't: I've been using Cursor for several months now, but have been relying on it significantly
in the last month or two.
And I would say my productivity has 10x to 100x.
Yes, there are some things that AI is not good at, but there are other things that it's
absolutely phenomenal at.
And I've told this anecdotal story to several folks, including Bryce, over the last week or so.
And I thought I'd share it with you here.
Why am I not interviewing someone?
I did mean to reach out to Ben last week to set something up,
but waited a little bit too long. And also, I'm down here in the States from Monday until tomorrow, Thursday, and I have to edit this basically before Thursday.
And I could have grabbed someone from NVIDIA to chat with, but it's basically been an internal conference, and I am absolutely exhausted, and what I really wanted to do was just come back to this park, get the parakeets in the background, and chat about AI. But like I said, no parakeets, only AI and crows, I guess. And so I've been using Cursor. For those of you that don't know, Cursor is one of the AI-assisted IDEs. It's a fork of VS Code. Alternatives to this are Windsurf.
I think technically you can hook up Copilot to VS Code as well.
And it's got a couple different modes, Agentic mode, Ask mode. There's one other mode that I never use.
And you can select the LLM that you want to use.
So I only switch between Claude 3.7 and Gemini 2.5 Pro.
I find that Claude 3.7 is the best in general, but for some larger tasks, sometimes Claude 3.7 fails. And the task that I'm going to tell you about, it did fail at, but Gemini 2.5 Pro absolutely crushed it.
And the task was basically switching my test framework. So, the library that I'm working on, I was using Google Test with. I had roughly 1,700 lines of code, roughly 150 tests, and within those tests, 350-plus asserts. So not, you know, 10,000 lines of unit tests, but still, you know, close to 2,000 lines of code. Switching test frameworks is not something I would ever even really consider doing. The reason that I wanted to switch from Google Test is that I used Cursor and Claude 3.7 basically to set up clang-tidy and slowly turn on basically all of the linting rules.
And Google Test does not work well with these linting rules. And if you ask the AI what you should do, it basically recommends that you set up two different .clang-tidy files with two different sets of linting rules.
And you have basically a less strict set of linting rules
for your unit tests.
And I didn't want to do that.
The alternative is commenting NOLINT on every single test that has an error, which is basically all of them. And I didn't want to do that either.
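(For the show notes, here's a hypothetical minimal sketch of what that rejected NOLINT approach looks like with GoogleTest; the test name is made up, and the exact checks that fire depend on your .clang-tidy configuration.)

```cpp
// Hypothetical example of silencing clang-tidy per test with GoogleTest.
// The checks named here are ones commonly reported against the code the
// TEST macro generates; your .clang-tidy configuration determines the real set.
#include <gtest/gtest.h>  // link against gtest_main for the runner's main()

// NOLINTNEXTLINE(cert-err58-cpp, cppcoreguidelines-owning-memory)
TEST(MyLibTest, AddsTwoNumbers) {
  EXPECT_EQ(2 + 2, 4);
}
```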
I want to use a unit testing framework
that isn't gonna light up my ClangTidy
like a Christmas tree.
So I asked it what the alternatives were and it recommended a few, but really I wanted to try out Kris Jusiak's Boost Extended UT (boost-ext/ut), because I had seen it either in a lightning talk or a C++ talk from a number of years ago, and I knew it was super modern, and I was pretty sure that it had all the fancy bells and whistles and played nicely with clang-format, clang-tidy, et cetera.
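(Again for the show notes, a minimal sketch of the kind of translation involved; the test itself is a hypothetical example, not one from my library.)

```cpp
// In GoogleTest the test would look something like:
//
//   TEST(MathTest, Addition) { EXPECT_EQ(2 + 2, 4); }
//
// In Kris Jusiak's boost-ext/ut, which is macro-free, it becomes:
#include <boost/ut.hpp>

int main() {
  using namespace boost::ut;

  "addition"_test = [] {
    expect(2 + 2 == 4_i);
  };
}
```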
And I attempted, basically, to tell Claude 3.7: I want you to update my CMakeLists.txt from Google Test to Boost Extended UT, and then just port all my tests. And Claude 3.7 would churn for a little bit, and then it would convert, you know, 10% of my test file and discard the other 90%. So, epic failure. But like I said, when Claude 3.7 fails, or when any of the LLMs fail, you just go try another one. Gemini 2.5 Pro, it would spin from anywhere from three to five minutes, to the point where it was actually asking me if I wanted to force quit Cursor because it thought it had frozen.
But I just said wait a couple times,
and sure enough, five minutes or so later,
it made changes to my CMakeLists.txt file and then, one by one, just went and changed all 1,700 lines of code. But then I ran into a problem with the UT library, because I used thrust::pair in this library, and for some reason it was saying that the ostream << operator wasn't defined for thrust::pair, and even when I defined it, it was still whining about a bunch of stuff.
So I spent 45 minutes trying to fix this and then gave up.
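(For the show notes, this is roughly the shape of the streaming shim I was fighting with; it's a sketch under the assumption that the framework streams comparison operands when printing failures, and not necessarily what the AI actually generated.)

```cpp
// Sketch of a streaming operator for thrust::pair so a test framework can
// print the operands of a failed comparison.
#include <ostream>
#include <thrust/pair.h>

namespace thrust {

// Defined inside namespace thrust so argument-dependent lookup can find it
// from wherever the test framework formats its failure messages.
template <class T1, class T2>
std::ostream& operator<<(std::ostream& os, const pair<T1, T2>& p) {
  return os << '(' << p.first << ", " << p.second << ')';
}

}  // namespace thrust
```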
And so then I went and tried two different libraries. And actually, before I tried Kris Jusiak's UT library, I accidentally tried
another one, which was called something like MinUnit, a C unit testing framework. It unfortunately uses the UT name that Kris Jusiak's library uses as well.
So that was a mistake.
I wasted 20 minutes testing out, you know, a different unit testing framework library because I wasn't paying close enough attention, which is bound to happen. We do have a bird here that is flying closer to me. It's not making any noise. Could be an American Robin. Look at that, we got a Canada Goose at the other side of this pool. And... might have been an American Robin.
Anyways, I accidentally tested the C unit testing framework, discarded that, went back to Google Test, then tried out the one I wanted to, Boost Extended UT.
That one didn't work.
And then I went to Doctest.
And Doctest got it up and running in like five to 10 minutes.
All the tests converted and it worked great.
There was, I think, one lint that it was whining about, which was some nodiscard thing. And that was only because of something I had cleared out, so I don't even think that was an issue with Doctest. But anyways, I had to NOLINT a couple of extra things.
It was working perfectly.
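(And for the show notes, a hypothetical minimal example of the doctest style the tests ended up in, not one of my real tests.)

```cpp
// Minimal doctest example. DOCTEST_CONFIG_IMPLEMENT_WITH_MAIN generates a
// default main() for the test runner.
#define DOCTEST_CONFIG_IMPLEMENT_WITH_MAIN
#include <doctest/doctest.h>

TEST_CASE("addition") {
  CHECK(2 + 2 == 4);
  REQUIRE(1 + 1 == 2);
}
```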
And basically, in less than three hours or so,
I had tested out three different C++/C unit testing frameworks.
And if your mind is not blown by this... this is not something that I would have ever considered trying to do, because it's just too much work to port a couple thousand lines of code one by one. It's such mundane work, and figuring out the mapping from the macros in one library to another library is just a headache, and you're not going to do it. And now you have this tool, and it's able to do these kinds of things for you.
Anyways, absolutely thrilled.
I'm now using Doctest.
I think it actually compiles a little bit faster than Google
Test as well.
And it's phenomenal.
So, Cursor: I highly recommend you try it out.
Like I said, I hear a lot of folks, they say, oh, I tried this thing and it works terribly.
I've run into cases where, you know, I put Cursor and Claude 3.7 in agentic mode and give it the ability to, you know, run commands so I don't have to approve everything, and I'll say, you know, run the tests, look at building.md on how to run tests and how to build, and, you know, make changes; if you hit a compilation error, fix it, and just basically spin until you get this to work.
And after a couple minutes, it'll decide that the best way to solve this problem is by deleting
the tests and the corresponding code.
So it is far from perfect.
Sometimes you've got to smash the AI a little bit over the head.
And I meant to mention this at the beginning.
People talk about vibe coding, which is where you don't check the code.
There's another version called rage coding in my opinion,
which is when the AI starts gaslighting you
or doing something really stupid.
The worst case is where it'll have failing tests
and just gaslight you and tell you that,
oh perfect, the tests are passing now, problem solved.
And then you'll tell it, what are you talking about?
You know that the tests are failing.
I know that the tests are failing.
Why are you telling me that they're not failing?
And then it'll run them again.
It shows you, you can see in, like, the little embedded terminal in Cursor that it fails, and it'll just say,
perfect, they're passing.
And it is immensely frustrating.
It's more frustrating than any other thing I've experienced programming.
So you know, vibe coding is a thing, rage coding is a thing, but there's this sweet
spot right in the middle.
I call it guide coding.
It's where you're basically, you're pair programming with an AI, you're not driving, the AI is
driving and you just hold its hand.
Sometimes it's just hitting home run after home run after home run and then every once in a while
it'll do something stupid.
Or maybe it's not even stupid,
it just misunderstands what you want
and you need to hold its hand, you need to guide it,
point it in the right direction and send it off.
And sometimes that phase where you're guiding it,
it'll last for 30 minutes, 40 minutes
in order for you to guide it in the right direction.
But in the case where porting, you know, 1,700 lines of unit tests is going to take like days, if not like a week, it's worth it to spend that 30 minutes. And in this case it didn't take 30 minutes, but there have been other cases where I've been trying to do something and it really doesn't understand.
But then once I do figure out the magic incantation to get it to do exactly what I want, you know,
take that incantation, throw it in a cursor rule, and then the next time you want to do
something like that, you just invoke the cursor rule and say, hey, do this, follow the rule,
and you're good to go.
So that's my solo podcast, folks.
It's probably, what, how long have I been recording for?
10 minutes, 15 minutes?
It is so sunny out. It says 13 minutes, so once this gets cleaned up and a little intro added, it'll probably be around 15 minutes or so. And we do have a House Finch that has been recognized, a poorly named bird in my opinion, because they're beautiful. They got pink heads; they're a beautiful bird, but they got a very boring name of House Finch, so they don't get the respect they deserve. If my fiancée is listening to this, she'll be chuckling now, because I always tell her that, you know, the House Sparrow and the House Finch, they got bad names. House Sparrow: one of the most resilient birds in the world. It's everywhere, every country I've ever gone to.
There's the House Sparrow kicking around.
Anyways, there's vibe coding, there's rage coding,
and there's guide coding.
I implore you to give guide coding a try.
10x your productivity, 100x your productivity.
This is the future, this is the future
for software developers.
For some of you, you're gonna be doing some
gnarly
template metaprogramming stuff, or constexpr
metaprogramming stuff, or cutting edge reflection
metaprogramming stuff.
And guide coding isn't going to get you very far.
But there is a huge set of tasks that guide coding with these LLMs is just amazing for.
And it's only going to get better from here.
So that's my pitch.
We say goodbye to the American Crows, to the Canada Geese, and to the House Finches.
We did not get an appearance from the Mitred Parakeet.
That's very unfortunate.
And with that, I will say I hope you have a fantastic day and weekend
if you're listening to this on Friday or Saturday or Sunday.
Expect to hear from Ben and I next week where we will be chatting about
a number of topics. Maybe we'll talk about the cost of commuting.
One of my favorite combinators, the C combinator, the Cardinal, aka flip in Haskell, aka commute in the array languages.
I posted that tweet I don't know how many days or weeks
ago at this point.
It was a work of art called the cost of commuting.
Maybe we'll chat about that.
Maybe we'll chat about something else.
Until then, have a good one.