The Changelog: Software Development, Open Source - Never. Let. AI. Write. Your. Tests. (News)

Episode Date: June 9, 2025

Diwank explains why you should never let AI write your tests, Apple redesigns all of their software platforms, AI has brought about the rise of judgement over technical skills, Peter Steinberger says... Claude Code is now his computer, and the curious case of Memvid.

Transcript
Starting point is 00:00:00 What's up my nerds? I'm Jared and this is Changelog News for the week of Monday, June 9th, 2025. Just days before their much anticipated WWDC keynote, Apple Research published a paper on the strengths and limitations of large reasoning models, which I can't help but interpret as, seriously guys, there's good reasons why our Apple Intelligence rollout has been a dumpster fire. You'll see.
Starting point is 00:00:34 Okay, let's get into the news. Never let AI write your tests. Developer Diwank's field guide to a new way of building software starts off as a pretty typical "here's how to be productive coding with AI," but then he says something near the end, and emphatically so, that I haven't heard anybody say. Quote, now we come to the most important principle in AI-assisted development. It's so important that I'm going to repeat it in multiple ways until it's burned into your memory. Never let AI write your tests.
Starting point is 00:01:09 Tests are not just code that verifies other code works. Tests are executable specifications. They encode your actual intentions, your edge cases, your understanding of the problem domain. High performers excel at both speed and stability. There's no trade-offs. Tests are how you achieve both." Diwank says AI can help with test planning, suggest test scenarios, debug, and analyze test features, but that it should never touch test files, write test code, or modify test
Starting point is 00:01:41 expectations. Your tests are your specification. They're your safety net. They're the encoded wisdom of every bug you fixed and every edge case you've discovered. Guard them zealously. End quote. I'm not sure if I agree or not.
Starting point is 00:01:56 I don't think I have enough experience yet to weigh in with more than a hunch. What do you think? Does this ring true to you? Or does it sound overly cautious? Apple redesigns it all. The headliner announcement from Apple's WWDC keynote was a complete redesign of all major software platforms.
Starting point is 00:02:15 Quote, announced simultaneously for iOS, iPadOS, macOS, watchOS, tvOS, visionOS, and CarPlay, Liquid Glass forms a new, universal design language for the first time. At its WWDC keynote address, Apple's software chief Craig Federighi said Apple Silicon has become dramatically more powerful, enabling software, materials, and experiences we once could only dream of. Inspired by visionOS, Liquid Glass is layered throughout the system and features rounded
Starting point is 00:02:45 corners that have been matched to the curved screens of the devices. It behaves just like glass in the real world and morphs when you need more options or move between views." I'm not gonna lie, it's giving me Windows Aero vibes. It'll probably grow on me, but I can't say I'm super excited about this change. The "return of texture, depth, and expressiveness in UI" trend I featured last week, coming on the heels of Airbnb's redesign, is much more interesting to this guy. The Rise of Judgment over Technical Skill. Ever since ChatGPT launched our current AI
Starting point is 00:03:18 madness, developers have been asking ourselves, and each other, what it all means in the long term. We still don't have that answer yet, but I can confidently say that at least in the medium term, it means we must move up the value chain because the once cherished technical skills we've acquired are being commoditized at a blistering pace. There's nothing new under the sun. Quote, in 1995, musician and producer Brian Eno made a profound observation about computer
Starting point is 00:03:48 sequencers that has become increasingly relevant in our AI powered world. Quote, the great benefit of computer sequencers is that they remove the issue of skill and replace it with the issue of judgment. With Cubase or Photoshop, anybody can actually do anything. And you can make stuff that sounds very much like stuff you'd hear on the radio or looks very much like anything you see in magazines. So the question becomes not whether you can do it or not
Starting point is 00:04:13 because any drudge can do it if they're prepared to sit in front of the computer for a few days. The question then is, of all things you can do now, which do you choose to do? End quote. You know what? Adam and I had a similar conversation about digital photography while on a photo walk in New York City years ago.
Starting point is 00:04:30 It was my contention then that the skills required to take great pictures were trending towards zero and when we get to that point, which we're pretty close to now, the only thing that would matter is taste, which is just another form of judgment. In other words, it's a way of answering the question: of all the perspectives you can now capture, which do you choose to capture? In one sense, Changelog News is me trying to climb my way up the value chain. Sure, I write some prose too, but not notably well.
Starting point is 00:04:58 And I read them aloud to you, but not all that well. What I really do is repeatedly answer the question of all the things you can feature, which do you choose to feature? It's now time for sponsored news. Our best customers are now robots. Kurt Mackey and our friends at Fly have had quite the experience, quote, but a funny thing has happened
Starting point is 00:05:20 over the last six months or so. If you look at the numbers, DX, developer experience, might not matter that much. That's because the users driving the most growth on the platform aren't people at all, they're robots. End quote. We've talked about LLM SEO a few times on the pod,
Starting point is 00:05:37 and this is why. Because you don't have to attract humans when coding agents make tool selections at massive scale. Kurt and his team are now focusing on the latter. Quote, if you try to think like a robot, you can predict other things they might want. Since robot money spends just the same as people money, I guess we ought to start doing that.
Starting point is 00:05:56 For instance, it should be easy to MCP our API. The robots can then make their own infrastructure decisions. End quote. Lots to glean from this post. Thanks to Fly.io for sharing so candidly and for sponsoring Changelog News. Claude Code is my computer. Here's Peter Steinberger: "I run Claude Code in no-prompt mode. It saves me an hour a day and hasn't broken my Mac in two months. The $200 per month Max plan pays for itself."
Starting point is 00:06:26 This echoes the sentiment that Steve Yegge impressed upon us on last week's show. After recording that, I took Steve's advice and gave Claude Code the ol' college try at writing a few scripts that I'd procrastinated because they were just too much work for their perceived ROI. Color me impressed. The first script Claude wrote was delivered so well on my specs that I decided to vibe code the second one and didn't even look at the code itself. Worked great.
Starting point is 00:06:51 Peter says this about Claude Code. Quote, Claude Code shines because it was built command line first, not bolted onto an IDE as an afterthought. The agent has full access to my file system (if you're bold enough), can execute commands, read output, and iterate based on results. End quote. I think that's right. I like Claude Code more than I like Claude inside of Zed. It's even more natural in my terminal
Starting point is 00:07:15 than it is in my editor for some reason. More to come on this front, I'm just getting started. But yeah, up the value chain we go. The curious case of MemVid. Okay, I'm feeling way too AI bullish in this episode, so here's a nice balancing story. A graduate student created a software project that got a lot of attention online, like a lot of attention.
Starting point is 00:07:35 Its pitch, quote, MemVid revolutionizes AI memory management by encoding text data into videos, enabling lightning fast semantic search across millions of text chunks with sub-second retrieval times. Unlike traditional vector databases that consume massive amounts of RAM and storage,
Starting point is 00:07:54 MemVid compresses your knowledge base into compact video files while maintaining instant access to any piece of information." End quote. Now on its face, that sounds amazing, but it also sounds kinda weird. Why would encoding text into video use less disk space
Starting point is 00:08:11 or make anything faster? Well, turns out it doesn't. Quote, testing shows this library's performance is the opposite of what the readme claims. Your text will take 100x more disk space. Searches will be 5x slower. Setup will take hours, not minutes. This library will cause serious problems
Starting point is 00:08:28 at production scale. The readme's performance claims are backwards." That was posted as an issue on the repo. On the heels of this discovery came a new contribution, a proposal for Memvid 1.0, the universal, streamable, self-contained AI memory format. Does that sound ambitious? Does it sound sloppy?
Starting point is 00:08:49 One commenter sure thinks so. Quote, GitHub is now infested with AI slop. AI generated repo with obvious overhead and no practical usages. People that has AI replaced brains, giving stars to this, and AI generated issues. Perfect. End quote.
Starting point is 00:09:05 I guess the AI slopping will continue until morale improves. That's the news for now, but go and subscribe to the Changelog newsletter for the full scoop of links worth clicking on, such as the new HTTP QUERY method, containerized environments for coding agents, and Markdown with superpowers. Get in on the newsletter at changelog.news.
Starting point is 00:09:31 In case you missed it, last week Steve Yegge shared with us his adventures in babysitting coding agents and Amanda Silver, CVP of the Developer Division at Microsoft, explained why we're all builders now. And coming up this week, on Wednesday, Richard Feldman tells me all about his Roc programming language and on Friday, Justin Searls is back to help us digest all the WWDC announcements. Have a great week, like, subscribe, and leave us a 5-star review if you dig the show, and I'll talk to you again real soon.
