Algorithms + Data Structures = Programs - Episode 233: AI! Live from Sunnyvale!

Episode Date: May 9, 2025

In this episode, Conor talks about his recent experience with Cursor, Claude 3.7, Gemini 2.5 Pro and several C++ unit testing frameworks!

Link to Episode 233 on Website
Discuss this episode, leave a com...ment, or ask a question (on GitHub)

Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon

Show Notes
Date Generated: 2025-05-07
Date Released: 2025-05-09
GoogleTest
boost/ext-ut
MinUnit
DocTest

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Transcript
Starting point is 00:00:00 Vibe coding is a thing, rage coding is a thing, but there's this sweet spot right in the middle. I call it guide coding. It's where you're basically pair programming with an AI: you're not driving, the AI is driving, and you just hold its hand. Sometimes it's just hitting home run after home run after home run, and then every once in a while it'll do something stupid. I want to use a unit testing framework that isn't going to light up my clang-tidy like a Christmas tree. So I asked it what the alternatives were, and it recommended a few. But really I wanted to try out Kris Jusiak's Boost
Starting point is 00:00:34 Extended UT. And sure enough, five minutes or so later, it made changes to my CMakeLists.txt file and then one by one just went and changed all 1700 lines of code. Welcome to ADSP: The Podcast, episode 233, recorded on May 7th, 2025. My name is Conor, and today I record solo, live from Las Palmas Park in Sunnyvale. I chat about Cursor, Claude 3.7, Gemini 2.5 Pro and a recent amazing success that I had with these tools. We, or should I say I, are recording today from Las Palmas Park because when I arrived here a couple days ago I was able to get a lifer, which in birder parlance is what you call a bird
Starting point is 00:02:08 you've seen for the first time, and it was the... I actually don't know how to pronounce this. I looked it up and couldn't find a video of anyone pronouncing this bird's name: either "MY-tred" or "MIT-red" Parakeet, or Conure. It said it was a rare bird, and they were up in the palm trees in this park, but they are no longer here. I was hoping to get a very nice background noise, because they were making the noises that, we'll call them the "MY-tred" parakeets, make. But alas, we are back on my last full day in Sunnyvale, and this is a solo episode.
Starting point is 00:02:58 First ever solo episode, I think? I could be wrong about that. I believe it's episode 233. We're gonna title this one "AI Live from Sunnyvale", which is a callback to episode 232, which I recorded with Bryce now almost a month ago, although you listened to it last week, and which was entitled "Algorithms Live from New York". We might not get any parakeets in the background, but you probably have heard the American Crows. Nothing special about those birds,
Starting point is 00:03:32 although they are very intelligent. Anyways, I've got my Merlin app on, and if any bird sounds pop up, I will report. But in the meantime, I am going to give a short solo episode on my massive successes with AI, GenAI specifically. I am pretty sure that I've mentioned it on the podcast, but if I haven't, I've been using Cursor for several months now, and have been relying on it significantly in the last month or two. And I would say my productivity has gone up 10x to 100x.
Starting point is 00:04:16 Yes, there are some things that AI is not good at, but there are other things that it's absolutely phenomenal at. And I've told this anecdote to several folks, including Bryce, over the last week or so. And I thought I'd share it with you here. Why am I not interviewing someone? I did mean to reach out to Ben last week to set something up, but waited a little bit too long. And also, I'm down here in the States from Monday until tomorrow, Thursday, and I have to edit this basically before Thursday.
Starting point is 00:04:56 And I could have grabbed someone from NVIDIA to chat with, but it's basically been an internal conference and I am absolutely exhausted, and what I really wanted to do was come back to this park and get the parakeets in the background and chat about AI. But like I said, no parakeets, only AI and crows, I guess. And so I've been using Cursor. For those of you that don't know, Cursor is one of the AI-assisted IDEs. It's a fork of VS Code. Alternatives to this include Windsurf. I think technically you can hook up Copilot to VS Code as well. And it's got a couple different modes: agentic mode, ask mode. There's one other mode that I never use.
Starting point is 00:05:41 And you can select the LLM that you want to use. So I only switch between Claude 3.7 and Gemini 2.5 Pro. I find that Claude 3.7 is the best in general, but for some larger tasks Claude 3.7 sometimes fails, and the task that I'm going to tell you about is one it did fail at, but Gemini 2.5 Pro absolutely crushed it. And the task was basically switching my test framework. For the library that I'm working on, I was using GoogleTest; I had roughly 1700 lines of code, roughly 150 tests, and within those tests 350-plus asserts.
Starting point is 00:06:29 So not, you know, 10,000 lines of unit tests, but still, switching test frameworks across close to 2,000 lines of code is not something I would ever even really consider doing. The reason that I wanted to switch from GoogleTest is that I used Cursor and Claude 3.7 basically to set up clang-tidy and slowly turn on basically all of the linting rules. And GoogleTest does not work well with these linting rules. And if you ask the AI what you should do, it basically recommends that you set up two different .clang-tidy files with two different sets
Starting point is 00:07:18 of linting rules. And you have basically a less strict set of linting rules for your unit tests. And I didn't want to do that. The alternative is adding a NOLINT comment on every single test that has an error, which is basically all of them. But I basically didn't want to do that. I want to use a unit testing framework
Starting point is 00:07:40 that isn't gonna light up my clang-tidy like a Christmas tree. So I asked it what the alternatives were and it recommended a few, but really I wanted to try out Kris Jusiak's Boost Extended UT, because I had seen it either in a lightning talk or a C++ talk from a number of years ago, and I knew it was super modern, and I was pretty sure that it had all the fancy bells and whistles and clang-format, clang-tidy, et cetera. And I attempted basically to tell Claude 3.7, I want you to update my CMakeLists.txt from GoogleTest
Starting point is 00:08:23 to Boost Extended UT, and then just port all my tests. And Claude 3.7 would churn for a little bit, and then it would convert, you know, 10% of my test file and discard the other 90%. So, epic failure. But like I said, when Claude 3.7 fails, or when any of the LLMs fail, you just go try another one. Gemini 2.5 would spin for anywhere from three to five minutes, to the point where it was actually asking me if I wanted to force quit Cursor because it thought it had frozen.
Starting point is 00:08:55 But I just said wait a couple times, and sure enough, five minutes or so later, it made changes to my CMakeLists.txt file and then one by one just went and changed all 1700 lines of code. [...] library because I used thrust::pair in this library, and for some reason it was saying that the ostream operator<< wasn't defined for thrust::pair, and even when I defined it, it was still whining about a bunch of stuff. So I spent 45 minutes trying to fix this and then gave up. And so then I went and tried two different libraries.
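To make the missing-`operator<<` complaint concrete, here is a minimal sketch of how a stream operator for a pair-like type is usually provided so that a test framework can print values in failure messages. It does not use Thrust: `demo::pair` is a hypothetical stand-in for `thrust::pair`, and `stringify` is a hypothetical helper that only approximates what a framework does internally. The sketch relies on the general C++ rule that an operator declared in the same namespace as the type can be found by argument-dependent lookup; whether lookup was the actual cause of the errors described in the episode is not known.

```cpp
#include <ostream>
#include <sstream>
#include <string>

namespace demo {  // hypothetical stand-in for the namespace that owns the pair type

template <typename T1, typename T2>
struct pair {
    T1 first;
    T2 second;
};

// Declared in the same namespace as the type, so argument-dependent lookup
// can find it from inside templates that live in other namespaces.
template <typename T1, typename T2>
std::ostream& operator<<(std::ostream& os, const pair<T1, T2>& p) {
    return os << '(' << p.first << ", " << p.second << ')';
}

}  // namespace demo

// Hypothetical helper: roughly what a test framework does to render a value
// when it reports a failed comparison.
template <typename T>
std::string stringify(const T& value) {
    std::ostringstream oss;
    oss << value;  // requires a visible operator<< for T
    return oss.str();
}
```

With this in place, `stringify(demo::pair<int, int>{1, 2})` produces the text `(1, 2)`. For a type owned by a third-party namespace, the same operator defined in an unrelated namespace may not be found from inside a framework's templates, which is one common source of "operator not defined" errors.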
Starting point is 00:09:42 And actually, before I tried Kris Jusiak's UT library, I accidentally tried another one, which was called like minunittest, which is a C unit testing framework. It unfortunately uses the UT name that Kris Jusiak's library uses as well. So that was a mistake. I wasted 20 minutes testing out, you know, a different unit testing framework library because I wasn't paying close enough attention, which is bound to happen. We do have a bird here that is flying
Starting point is 00:10:15 closer to me. It's not making any noise. Could be an American Robin. Look at that, we got a Canada Goose at the other side of this pool, and that might have been an American Robin. Anyways, I accidentally tested the C unit testing framework, discarded that, went back to GoogleTest, then tried out the one I wanted to, Boost Extended UT. That one didn't work. And then I went to doctest. And with doctest, it was up and running in like five to 10 minutes.
Starting point is 00:10:46 All the tests converted and it worked great. There was, I think, one lint that it was whining about, which was some nodiscard. And that was only because basically I had cleared out... and so I don't even think that was an issue with doctest. But anyways, I had to NOLINT a couple things, and within a couple extra minutes it was working perfectly.
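For readers who have not used clang-tidy suppressions, a minimal sketch of what "NOLINT-ing a couple things" can look like follows. The episode does not say which checks fired, so the check names in the comment are real clang-tidy checks picked purely for illustration, and `checked_sum` and `sum_of_sample` are hypothetical names.

```cpp
#include <vector>

// [[nodiscard]] asks the compiler to warn when a caller ignores the result.
[[nodiscard]] inline int checked_sum(const std::vector<int>& values) {
    int total = 0;
    for (const int v : values) {
        total += v;
    }
    return total;
}

inline int sum_of_sample() {
    // Suppress the named checks for the next line only (illustrative choice of checks).
    // NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers, readability-magic-numbers)
    const std::vector<int> sample{1, 2, 39};
    return checked_sum(sample);  // result is used, so no nodiscard warning
}
```

A `// NOLINTNEXTLINE(...)` comment scopes the suppression to one line and to the listed checks, which keeps the rest of the file under the full rule set.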
Starting point is 00:11:05 And basically, in less than three hours or so, I had tested out three different C++/C unit testing frameworks. And if your mind is not blown by this: this is not something that I would have ever considered trying to do, because it's just too much work to port a couple thousand lines of code one by one, and it's such mundane work, and figuring out the mapping from the macros in one library
Starting point is 00:11:35 to another library, it's just a headache and you're not going to do it. And now you have this tool and it's able to do these kinds of things for you. Anyways, absolutely thrilled. I'm now using doctest. I think it actually compiles a little bit faster than GoogleTest as well. And it's phenomenal. So Cursor, I highly recommend you try it out.
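The macro mapping mentioned above can be sketched as a small lookup table. This is only an illustration of the mechanical part of such a port, not how the LLM or the author did it: `kMacroMap` and `port_line` are hypothetical names, and the table covers just a handful of assertions, using the convention that GoogleTest's non-fatal `EXPECT_*` macros correspond to doctest's `CHECK*` family and the fatal `ASSERT_*` macros to `REQUIRE*`.

```cpp
#include <array>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Illustrative subset of a GoogleTest -> doctest assertion mapping.
struct MacroRule {
    std::string_view from;  // GoogleTest macro name
    std::string_view to;    // doctest macro name
};

constexpr std::array<MacroRule, 6> kMacroMap{{
    {"EXPECT_EQ", "CHECK_EQ"},
    {"ASSERT_EQ", "REQUIRE_EQ"},
    {"EXPECT_TRUE", "CHECK"},
    {"ASSERT_TRUE", "REQUIRE"},
    {"EXPECT_FALSE", "CHECK_FALSE"},
    {"ASSERT_FALSE", "REQUIRE_FALSE"},
}};

inline bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Hypothetical helper: rewrites known macro names in one line of test source.
// A name is replaced only when it starts a token and is followed by '('.
inline std::string port_line(std::string line) {
    for (const auto& [from, to] : kMacroMap) {
        std::size_t pos = 0;
        while ((pos = line.find(from, pos)) != std::string::npos) {
            const std::size_t end = pos + from.size();
            const bool starts_token = pos == 0 || !is_ident_char(line[pos - 1]);
            const bool is_call = end < line.size() && line[end] == '(';
            if (starts_token && is_call) {
                line.replace(pos, from.size(), to);
                pos += to.size();
            } else {
                pos = end;
            }
        }
    }
    return line;
}
```

For example, `port_line("EXPECT_EQ(add(1, 2), 3);")` yields `CHECK_EQ(add(1, 2), 3);`. A real port also has to handle test declarations, fixtures, includes and the build files, which is where most of the tedium described above comes from.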
Starting point is 00:12:03 Like I said, I hear a lot of folks say, oh, I tried this thing and it works terribly. I've run into cases where, you know, I put Cursor and Claude 3.7 in agentic mode and give it the ability to, you know, run commands so I don't have to approve everything, and I'll say, you know, run the tests, look at building.md on how to run tests and how to build, and, you know, make changes; if you hit a compilation error, fix it, and just basically spin until you get this to work.
Starting point is 00:12:29 And after a couple minutes, it'll decide that the best way to solve this problem is by deleting the tests and the corresponding code. So it is far from perfect. Sometimes you've got to smash the AI a little bit over the head. And I meant to mention this at the beginning. People talk about vibe coding, which is where you don't check the code. There's another version called rage coding in my opinion, which is when the AI starts gaslighting you
Starting point is 00:12:51 or doing something really stupid. The worst case is where it'll have failing tests and just gaslight you and tell you that, oh perfect, the tests are passing now, problem solved. And then you'll tell it, what are you talking about? You know that the tests are failing. I know that the tests are failing. Why are you telling me that they're not failing?
Starting point is 00:13:11 And then it'll run them again. It shows you, you can see in like the little embedded terminal in Cursor that it fails, and it'll just say, perfect, they're passing. And it is immensely frustrating. It's more frustrating than any other thing I've experienced programming. So you know, vibe coding is a thing, rage coding is a thing, but there's this sweet spot right in the middle.
Starting point is 00:13:32 I call it guide coding. It's where you're basically, you're pair programming with an AI, you're not driving, the AI is driving and you just hold its hand. Sometimes it's just hitting home run after home run after home run and then every once in a while it'll do something stupid. Or maybe it's not even stupid, it just misunderstands what you want and you need to hold its hand, you need to guide it,
Starting point is 00:13:52 point it in the right direction and send it off. And sometimes that phase where you're guiding it, it'll last for 30 minutes, 40 minutes in order for you to guide it in the direction. But in the case where porting, you know, 1700 lines of unit tests is going to take like days, if not like a week, it's worth it to spend that 30 minutes. And in this case it didn't take 30 minutes, but there's been other cases where I've been trying to do something and it really doesn't understand.
Starting point is 00:14:19 But then once I do figure out the magic incantation to get it to do exactly what I want, you know, take that incantation, throw it in a Cursor rule, and then the next time you want to do something like that, you just invoke the Cursor rule and say, hey, do this, follow the rule, and you're good to go. So that's my solo podcast, folks. It's probably, what, how long have I been recording for? 10 minutes, 15 minutes? It is so sunny out.
Starting point is 00:14:49 It says 13 minutes, so once this gets cleaned up and a little intro is added, it'll probably be around 15 minutes or so. And we do have a... a House Finch has been recognized. A poorly named bird, in my opinion, because they're beautiful. They got pink heads and they're a beautiful bird, but they got a very boring name of House Finch, so they don't get the respect they deserve. If my fiancée is listening to this she'll be chuckling now, because I always tell her that, you know, the House Sparrow and the House Finch, they got bad names. House Sparrow, one of the most resilient birds in the world. It's everywhere, every country
Starting point is 00:15:24 I've ever gone to. There's the House Sparrow kicking around. Anyways, there's vibe coding, there's rage coding, and there's guide coding. I implore you to give guide coding a try. 10x your productivity, 100x your productivity. This is the future, this is the future for software developers.
Starting point is 00:15:42 For some of you, you're gonna be doing some gnarly template metaprogramming stuff, or constexpr metaprogramming stuff, or cutting-edge reflection metaprogramming stuff. And guide coding isn't going to get you very far. But there is a huge set of tasks that guide coding with these LLMs is just amazing for.
Starting point is 00:16:04 And it's only going to get better from here. So that's my pitch. We say goodbye to the American Crows, to the Canada Geese, and to the House Finches. We did not get an appearance from the Mitred Parakeet. That's very unfortunate. And with that, I will say I hope you have a fantastic day and weekend if you're listening to this on Friday or Saturday or Sunday. Expect to hear from Ben and I next week where we will be chatting about
Starting point is 00:16:36 a number of topics. Maybe we'll talk about the cost of commuting. One of my favorite combinators, the C combinator, the Cardinal, aka flip in Haskell, aka commute in the array languages. I posted that tweet I don't know how many days or weeks ago at this point. It was a work of art called the cost of commuting. Maybe we'll chat about that. Maybe we'll chat about something else.
Starting point is 00:16:58 Until then, have a good one.
