The Standup with ThePrimeagen - How BAD is Test Driven Development?
Episode Date: May 14, 2025
The gang discusses Test Driven Development and laughs a lot. As usual ;) Don’t forget, ssh terminal.shop...
Transcript
So today's stand-up, we are going to talk about TDD.
Okay, we're not going to have some topical topic.
We're going to go after the big stuff, okay?
TDD, is it good?
Is it bad?
Do you use it?
TJ's just smiling, looking like an absolute champ in that freeze frame.
Dude, somehow TJ's the only human I know that gets freeze-framed and it's a good picture.
Yep.
Yeah, look at TJ.
I can't handle this right now.
I can not handle this right now.
All right, Casey, you probably have the strongest opinion potentially out of all of us.
Okay.
I doubt that, but okay.
All right, because I also have a strong opinion, but I'm willing to...
It's probably you then.
Yeah, I think Prime.
You got to lead up the strong opinions here.
I'll lead up with the strong one, which is, so I recently just tried another round of TDD.
Prime, Prime.
You have to say, hey, everyone, welcome to the stand-up.
We got a great show for you tonight.
I'm your host, ThePrimeagen.
With me as always are TJ.
Hello.
Trash.
Hey, everyone.
Casey Muratori.
Great to be here, Prime.
Thanks.
Today's topic is TDD.
Have you ever heard a podcast before?
Thanks, everybody, for joining. You're watching the hottest podcast on the greatest
software engineering topics.
The Standup. Today we have with us TJ, who's in the basement fixing something.
I know.
Is that a Home Depot ad?
Look, he's showing us his router.
We have with us Trash Dev, who broke a toe winning a judo competition, still living off that high,
and Casey Muratori, the actual good programmer among us.
Oh, all right, I got a lot to live up to.
Yeah, you got, don't worry, it's not hard to live up to, at least in this crowd.
I'll kind of start us off on this one.
So we've been developing a game called the Towers of Mordoria.
We came up with the name from an AI; an AI generated that name for us.
And effectively, I decided at one point
I have this kind of really black box experience.
I have a deck of cards that I need to be able to draw cards out of.
When they're played, they either need to go to the discard pile
or the exhaust pile.
When there are no more cards, or not enough to be able to draw cards,
I need to be able to take all the discard cards
(TJ's just flexing on us, touching grass right now),
put all the discard cards back, shuffle everything up,
and then draw new cards back out, replenish the draw pile.
Right? This sounds like a TDD pinnacle problem. There's a black box, there's a few operations I'm going to
be able to do on it. And so I just did this like a week and a half ago on stream. It was amazing. I did the
whole thing: designed the entire API in the test, then set up all the tests, set it all up to fail.
Then it failed, then I made it work, then it worked, and it was great and everything
was perfect. And I was like, there we go, we just did it. I just TDDed and it was a good experience.
So then I took what I made and I went to integrate it
into the game and realized I made an interface that's super good for testing and not actually
good for the actual thing I was making, and I had to redo it.
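For readers following along, the deck described above can be sketched roughly like this. It is a minimal illustration, not the code written on stream: the `Deck` class, its field names, and the `draw` and `play` methods are all hypothetical, and cards are plain strings for simplicity.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Deck:
    """Draw, discard, and exhaust piles for a card game (hypothetical API)."""
    draw_pile: list[str]
    hand: list[str] = field(default_factory=list)
    discard_pile: list[str] = field(default_factory=list)
    exhaust_pile: list[str] = field(default_factory=list)
    rng: random.Random = field(default_factory=random.Random)

    def draw(self, count: int) -> list[str]:
        """Draw up to `count` cards, reshuffling discards when the draw pile is empty."""
        drawn = []
        for _ in range(count):
            if not self.draw_pile:
                if not self.discard_pile:
                    break  # nothing left anywhere, so draw fewer cards
                # Replenish the draw pile from the discard pile and shuffle it.
                self.draw_pile = self.discard_pile
                self.discard_pile = []
                self.rng.shuffle(self.draw_pile)
            drawn.append(self.draw_pile.pop())
        self.hand.extend(drawn)
        return drawn

    def play(self, card: str, exhaust: bool = False) -> None:
        """Move a card from the hand to the discard pile, or to the exhaust pile."""
        self.hand.remove(card)
        (self.exhaust_pile if exhaust else self.discard_pile).append(card)
```

A black box like this is easy to test in isolation, which is the point being made: whether `draw` and `play` are the calls the game actually wants to make is a separate question that the tests do not answer.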
And so it always, you know, that's kind of been a lot of my experience with TDD is that either
the thing is simple enough and a super black box enough that I can make it and both the testing
interface and the usage interface is the same or I create an interface that's really great
for testing and it's actually not that great for the practical use case.
And I know people talk about how it's a great way to develop your, you know, your architecture and all this kind of stuff.
I've just yet to really buy it. But I will say the thing that I do like doing, on problems that are hard to develop and have many moving constraints:
I do like to co-create tests with it, meaning that I come up with an idea of how I want to use it, I program it out, then I go create a test, which is backwards from TDD, and go, okay, this is what I want to see.
Am I even correct on this? Can I get a fast iteration loop being like, no, no, no.
No, no, yes.
Okay, got it.
Next thing.
I want to make sure it does this.
No, no, no, no, yes.
Okay, got it.
And so I do like the fast iteration cycle of getting tests in early,
but few problems feel like I can actually have that.
And I feel like with this idea that all things need to be tested immediately and tests need to be written first,
you just write dumb interfaces.
Everything becomes inversion of control.
Everything's an interface.
And you just have this abstract mess at the end of it
because testing was the requirement,
not, hey, use it when it's good.
So that's my TDD experience,
and I feel fairly strong about that.
And if you say TDD is good,
I feel like I'm going to jump kick you.
I can respond to that or Casey,
if you have strong feelings.
No, I think Trash. Let's go to Trash's take first.
Yeah, no blockers on my end.
First, you couldn't, like, deter me from doing any task
than to tell me I need to, like, start off by,
writing tests first off.
Second, I kind of had a very similar experience
recently building a little CLI. Hold on. Can you rewind
for a second? What did you just say for your
first point? I said
you couldn't deter me from doing
work if you led off with me
having to write tests first.
I don't even know what that means.
You mean you couldn't deter me any more
like that's the most deterring thing?
Man, am I
stupid? You are.
Can you try using a different word to describe what
you feel. Okay, let's reword this. Okay, let's
reword this.
Flip, please take this out.
Flip, leave it. I don't like test-driven development. However, however, however, I just built
something recently at work, which was very kind of similar to what Prime was saying.
I ended up doing a little bit of test-driven development. But you wouldn't notice this,
but I'm actually really kind of passionate about testing. So I'm going to kind of like,
when I test, I would say 2% of the time I actually lead off with
test-driven development. And the rest is usually post-implementation because, one, requirements
and all these things change afterwards, right? So it's like, why write tests for things that are
potentially going to change when you don't really know what's going to happen? But I do write tests
for very granular things that I think are going to be used all the time. So for instance, if I
wrote a parser for my CLI, the parser's going to be reused by multiple commands and is probably never
going to change. So I typically will write that. And I'll also write tests for things that
take multiple steps to test. So, you know, people argue for end-to-end tests because you test just like
a user. So that's kind of how I like to develop. I like to use it like I would use it for anybody
else. But if I've got to do like three steps to get to the point I want to test, I'm like,
screw this. I'm just going to write a test specifically for this instance. And then I save
time. The times I arrive at test-driven development are specifically in those scenarios.
Otherwise, it's purely just, you know, what do you call it, integration testing,
stuff like that. Because, like Prime said, when you write tests that way you
write everything so granular, and everything works so modular, that you just don't even know what it's like
when you actually put them together. So that's kind of the way I approach things.
That being said, people that kind of live and die by test-driven development, I don't,
I don't get it. But I was hoping someone would be pro-TDD in this group so I could argue with them.
that's all I got for right now
I'm pretty much right there
with trash on this one
my feeling on TDD is pretty much
the same as my feeling on OOP.
Right, the problem is not
if you end up with an object
or something that looks like an object in your code.
In OOP that's fine.
The problem is the object-oriented part.
The, like, thinking in terms of that
wastes a lot of your time and leads you to bad architecture,
I think.
TDD is the same way. I think testing is very good, but test-driven development, meaning forcing the programmer to think in terms of tests during development,
like, that is what your primary thing is, I'm making tests to drive the development, the DD part.
That's actually the problem.
I think the reason for that's pretty straightforward.
And it's kind of weird because usually the people who advocate for TDD are the same people who would poo-poo any similar suggestion.
Like, any suggestion that, like, oh, we should measure performance during development.
They'll be like, that's ridiculous.
You're wasting so much programmer time.
We're like, what about the 8,000 tests that no one ever needed because we ended up deleting that system anyway?
And they're like, well, you know, test driven development is very good.
So I think, like, the problem with all of those sorts of things, any of them, is just putting emphasis on something instead of talking about it like what it really is, which is a tradeoff.
If you're spending time making tests or you're orienting things towards making tests, that is necessarily programming
time that isn't being spent doing something else.
Like programming is zero sum.
If you spend time doing one thing, you're not spending time doing another thing.
And so, for example, if you end up spending a lot of your time making the interface
revolve around tests, you weren't spending a lot of your time making the interface revolve
around actual use cases because actual use cases don't usually look like tests.
Therefore, sometimes you get the exact situation that Prime was talking about, where you do
test-driven development and you end up with a completely unusable mess, which is actually
very common in my experience, right?
I wouldn't call it a mess.
I write pretty good code.
It was, it was like pretty easy.
I just want to say.
No, the API.
The API is a mess.
Not the code, right?
Like the code has been tested and it probably works.
But the API is a mess, right?
I mean, that's what happens.
Because it wasn't designed specifically for the kinds of use cases it was meant to tackle.
And so you end up with that problem.
And again, it was shifting the developer's focus was all it took to make that happen.
Good APIs are hard to make in the first place, right?
So if you're
adding more complexity to the programmer's workflow when they're trying to make an API that's good for the actual use case,
you're just going to get a worse result. Again, zero sum.
So what I would say is, testing is better as something that you do as the thing takes shape.
When you're like, okay, we now have figured out like how this system works properly.
We're really happy with it.
The API is working nicely.
We checked the performance and we think
we've got a reasonable path towards having this perform really well.
Now let's start seeing what the parts are where we can add some little substrate to test, the parts that we think are going to have bugs that will be hard to find. That's usually the way I look at it: what are the parts of this system whose bugs will be either very hard to find or really catastrophic? Like, they'll cost us a lot of money, or they'll crash a user's computer, or whatever. And then you put the tests in for that. That's how I've always approached it, and I think that works pretty well. And you
do want to do that step, but I don't think you want test-driven development.
It just seems like a really bad idea.
And to be honest, when I use software that comes from people who are like TDD,
this organization, CD, whatever, it's full of bugs all the time.
Like I'm not getting these pristine software things.
I mean, I imagine that the people at YouTube do a lot of testing and code reviews.
And yet for the past eight years, the play button cannot properly represent whether something
is playing or not.
eight years, right? Sometimes it's paused and it has the play thing and sometimes it's playing
and it has the play thing. And sometimes both of those times it shows the pause thing. It's just,
it doesn't seem like the testing is working, guys. So again, I'd say, you know, I think you have to
kind of rethink how you're allocating that time. That's what I would say. Can we get TJ in here?
Can we can playboy T.J. We don't know this one. Can we? Am I coming in loud and clear?
Yeah. Your voice sounds good. Your voice sounds good.
I went outside like trash said.
Thank you, thank you.
So I would say, I'm, I think I agree mostly with what everybody is saying,
although I would say for a few kinds of problems.
The problem is I think people, when you say TDD, they just mean 10,000 different things.
So this is like one of the problems, right?
There's like people who are like TDD is only if you do red, green cycle before you write the
code, you have to write the tests.
And like you can't write any code until you've written a failing test and blah, blah, blah.
which like I get
that maybe they're the people who invented the word
so they can define it that way
but like I find that
not very useful for some stuff
but there are like testing
or like projects where I think
I test like basically everything
and there's just ways that you can do it
that are like more effective versus less effective
like my favorite kind of testing
is if you can find a way to do like snapshot
or like golden testing
whatever, you know, people have different names for it,
but where you can basically just write a test and then say,
assert that the output is the same as it used to be, right?
Like, so, and usually I've seen those called snapshot tests or expect tests.
I find that those are like super...
Yeah, yeah, or like people will use them in regression,
but like the kinds that I've done it before, like,
maybe I'm building up a complex data structure to, like,
represent the state of like a bunch of users,
or maybe it's the parse tree of something,
or maybe it's the type system,
like whatever it is,
if you can find a way to represent that
in like a human readable way,
usually just by like printing it in tests that I've done,
you can just like compare the diff
between this one and before you made the changes.
And they should be fast at least.
If they're not fast,
then that's a separate problem.
Like your tests have to be fast for people to run them,
otherwise they don't run them.
Then those kinds of things tend to be,
I find, like, very effective and give you a lot of confidence that you haven't, like, scuffed
anything throughout the system with your changes, right? Because you're sort of like asserting
the entire state of something. And if something changes, it's really easy to update because you just
say, oh, snapshot update this. Cool. Done. One second, all your tests are updated. You walk through
them and change them. You get some other side benefits, too. Like in code review, you literally see
the diff of the snapshot. And you can be like, um, that looks wrong.
that used to say false and now says true;
those things shouldn't have been flipped.
So there are like certain problems where I've employed like nearly TDD
in this snapshot style in the sense that the snapshot always fails on the first time you run it
because there's nothing to expect.
And then you accept that snapshot if it looks good or keep changing the code until it does look good.
And then you like move on to the next thing.
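The snapshot workflow described here can be sketched in a few lines. This is an illustration under stated assumptions, not any particular framework's API: `render`, `check_snapshot`, the `update` flag, and the file name are all made up for the example, and real snapshot tools add diff output and review tooling on top.

```python
import json
import tempfile
from pathlib import Path


def render(state: dict) -> str:
    """Turn a data structure into stable, human-readable text that diffs well."""
    return json.dumps(state, indent=2, sort_keys=True) + "\n"


def check_snapshot(path: Path, state: dict, update: bool = False) -> bool:
    """Return True when `state` matches the stored snapshot.

    A missing snapshot counts as a failure, so the first run always fails
    until someone looks at the output and accepts it with update=True.
    """
    actual = render(state)
    if update:
        path.write_text(actual)
        return True
    return path.exists() and path.read_text() == actual


# Hypothetical usage: fail first, accept, then catch a flipped flag.
snapshot = Path(tempfile.mkdtemp()) / "users.snap"
state = {"users": [{"name": "tj", "active": False}]}
first_run = check_snapshot(snapshot, state)            # nothing to expect yet
accepted = check_snapshot(snapshot, state, update=True)
state["users"][0]["active"] = True
after_change = check_snapshot(snapshot, state)         # false became true
```

Because the snapshot is plain text checked into the repository, the false-to-true flip would also show up as a one-line diff in code review.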
So but that,
but it's kind of like a unique case.
Like, I wouldn't want to snapshot test a, like,
HTML page, right?
It would just, like, there's no way that they're reproducible,
it feels like anyways.
But, like, so there's, there's,
it's only for like certain subsets of problems,
which have been more effective, like, for me,
because I've made a lot of dev tools in my career and stuff like that.
So you want to do, like, verify that all of the references of something look like
XYZ.
Okay, very easy to do with snapshots.
Um,
So that's kind of the main, the one thing that I wanted to throw out there is like that style of testing I find really powerful.
And I've used it a lot.
Like across projects you wouldn't really expect as well.
I've always had the opposite experience with snapshot tests.
I feel like it's okay maybe if like it's you and maybe some other people that actually know the expected output of a snapshot.
My experience is it changes and they're like, oh, I'll just update it.
And then the test passed again.
and then you're kind of just left with the snapshot.
And you're just like, all right.
So it's to the point where it's like, all right, let's just get rid of them
because they're just changing every time requirements change
and no one actually knows what the expected output is.
One more point.
I want to make.
Someone said something about, like, tests help refactoring.
I think that's true after a product is actually mature. Before that, you're just going to be,
it's kind of like a snapshot test, where you're just going to keep updating your tests
every time something changes, to the point where it's kind of pointless.
So I always find when tests are,
like that, I usually just delete them immediately.
Oh yeah. Go ahead, Trash.
All right. I called you trash.
I'm like, yeah, no, you call me trash.
Your comment filled Prime with so much joy.
It translated into physical up and down motion.
Yeah.
I was literally watching him as I was talking and I was like, let's just see how
hype we can get Prime.
I wish I could see you guys.
I see nothing.
Yeah.
Yeah, we see you.
We see you guys.
Yeah.
We get to you, TJ, enjoy grass.
If you see a strange mushroom in the grass, do not eat it.
Oh, yeah, don't eat it.
Don't eat the strange mushroom, TJ.
When you're mowing, there's a strange mushroom.
Guys, do I, I don't have my AI glasses.
Do I eat this or not?
Do I eat it or not, guys?
No, don't eat the mushroom.
Eat the mushroom.
That's all I heard.
Okay, I'm going for it.
Prime, say your thing.
Go on.
All right.
All right, so the reason why I
actually agree with Trash a lot on this one is that I've done a lot of snapshot testing, and I think
there probably does exist a world where this makes a lot of sense. It's just, all the times I've ever
done it, whenever a requirement changes, or I have to make a change that warrants changing the snapshot,
I've also had to make some non-minor change to some logic throughout the program.
So when my snapshot breaks, I also don't know if I got it right. So now I'm trying to hand-eyeball
through this thing, like one end of the line, like, okay, is this really
what I meant for it to be in this moment?
How do I know I should change this from true to false?
Like, I have to re-reason about everything,
and I find that true snapshot testing on big things
feels very difficult,
though I love it.
Like, I want, like, ultimately that's what I always want.
I fall into the "hey, let's just make a golden" trap constantly.
Like, that is like my default go-to.
It's just like, it's golden time.
And then I always get sad.
As far as, like, the whole unit test breaking and mature product thing,
I just find that testing generally, I do not think testing helps for refactoring unless it's a true end-to-end test.
Like you use the product from the very tippy top and you only have the very bottom of the output.
You have nothing else you look at.
But often I don't, like, that's really hard to test because I'll write a function that I know is very difficult to get right.
And I just want to test those five things.
So I'm going to make a test that's highly coupled to that piece.
and if you decide to refactor it, yeah, the tests all break.
Like, that's just a part of it.
I wrote it so that this one thing works this one way,
and I wrote a test that runs that one thing to make sure you didn't screw it up
if you had to make a minor bug fix to it.
I don't try to do that whole.
I just really think this whole idea of unit testing and, oh,
it will prevent you from having headaches while refactoring, has just been a lie from the beginning.
And then what you'll hear is people do the exact same thing every time.
Oh, that's because you're testing at the wrong level.
It's just like, I'm just testing the things that I
want to test. I don't test all the things. And what happens if you're testing on the
right level just simply means that you broke a whole bunch of tests, but then some of your
tests didn't break. And that gave you the confidence that you did the right thing. And I just don't,
I just am not going to test the universe. I'm going to have a few end to end. I'm going to have
the hard parts. And that's that. No more. I also think like, I mean, when I'm thinking back
to times when I've done testing, like a lot of times when I will break out testing, it's for things
that I know aren't really all that refactorable or changeable anyway.
It's like, okay, I made this memory allocator,
and we pretty much know what it does, right?
You have these like two ways you can allocate memory from it,
and you have one way to free or something like that.
And then I just have these tests to make sure that, you know,
it doesn't end up in bad states or whatever it is.
And that's a really useful thing, right?
I've definitely done that.
I do that for math libraries.
Like, we kind of all know what the sine of something
should return. So if I'm making a math library that implements the sine function,
I don't have to think too hard about refactoring sine, because math has been
around for like, you know, hundreds of years and isn't going to change tomorrow. Like, hell,
bad news, guys, like React updated and now the mathematical definition of sine is
different today. So I'm so happy that's not even a possibility, that React does stuff with math, because
we need. You know, guys, there's still time. There's still time.
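As an illustration of why a sine function is such a stable test target, here is a rough sketch. The `my_sin` implementation (a Taylor series after range reduction) and the chosen tolerances are assumptions made for the example, not anything from a real library; the point is only that the expected values come from math and will not change under you.

```python
import math


def my_sin(x: float, terms: int = 12) -> float:
    """Hypothetical library sine: Taylor series after reducing x to [-pi, pi]."""
    x = math.remainder(x, 2 * math.pi)
    total, term = 0.0, x
    for n in range(terms):
        total += term
        # Next odd-power term: multiply by -x^2 / ((2n+2)(2n+3)).
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total


def test_sine() -> None:
    # Values everyone agrees on, independent of how my_sin is written.
    known = {0.0: 0.0, math.pi / 6: 0.5, math.pi / 2: 1.0, math.pi: 0.0}
    for x, expected in known.items():
        assert math.isclose(my_sin(x), expected, abs_tol=1e-9)
    for x in (-3.0, -0.5, 0.25, 2.0, 10.0):
        assert math.isclose(my_sin(-x), -my_sin(x), abs_tol=1e-9)  # sine is odd
        assert math.isclose(my_sin(x), math.sin(x), abs_tol=1e-9)  # reference check
```

The implementation can be swapped for something faster later, and these tests stay exactly as they are, which is the opposite of the "tests break on every refactor" situation discussed earlier.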
Yeah. So I do like them for that. And actually, I can think of a specific time when I was shipping game libraries when we had a really bad bug that caused a problem for one particular customer and took forever to find. And it was because it was in a thing that I had added to the library just for that customer as, like, a back door. Right? They were like, we need this one thing really quick. And I was like, okay. And I
put it in there. And it had a bug in it. And of course, it wasn't in any of my standard
testing or anything. It wasn't even really in anyone's use case, right? And so,
you know, to me, I was like, okay, that's pretty good evidence
that the testing that I was doing before was good and wasn't a waste of time, because
the one time it didn't happen, it caused a significant problem, right? And so, you know,
presumably I would have had more of those
with actual other customers
had all the stuff that normally
ships in the library not been tested
in the way that I thought it should have been.
And from then on I was like, you know, never do a one-off add.
Like if I'm going to add something for a customer,
I'll add it like the right way and actually put in the testing for it
or I'll just tell them I'm sorry I can't do that right now
because, you know, I'm too busy or whatever.
And so I do think like, yeah,
when you can actually talk about this is
isolated, this is what this thing does.
I'm pretty sure it's not going to change.
I think the tests are great, but before then, yeah, it's just, and also,
this was just kind of going off on a random anecdote, but just to circle back,
the thing that Prime was saying is absolutely true, it's like refactor testing,
not really.
Like, that's only, you can only really get, like, a very high-level confidence from that.
You can't really know, there could be all sorts of things that aren't really working
because whole system testing like that is going to miss,
it's going to miss, like, weird edge cases that can overlap.
And the only way you catch those with test-driven development is knowing about those edge cases and making tests to kind of target them.
And you can't do that if you just refactored everything, because now the edge cases are different, right?
So, yeah, I think TDD people overstate the degree to which that really works.
That's why I'm a huge fan of PDD: prod-driven development.
I think that every time you release and you have a bunch of people that use your product, you should do a slow release.
you should measure the outcome because it's just going to test all the weird things you're never going to be able to think of or set up.
And if it starts, if it starts breaking, you know that you screwed up.
Previous version worked better than the new version.
You can't release this new version.
You got to back it up.
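The slow-release idea above can be sketched as two small pieces: deciding which users get the new version, and deciding when to back it out. Everything here is an assumption for illustration, including the function names, the 10,000-bucket scheme, and the 1.2x tolerance; a real rollout system would also look at sample sizes and more than one metric.

```python
import hashlib


def in_rollout(user_id: str, percent: float) -> bool:
    """Deterministically place a user in the first `percent` of 10,000 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percent * 100


def should_roll_back(old_errors: int, old_requests: int,
                     new_errors: int, new_requests: int,
                     tolerance: float = 1.2) -> bool:
    """Roll back when the new error rate exceeds the old one by more than `tolerance`x."""
    if new_requests == 0:
        return False  # no data from the new version yet, keep waiting
    old_rate = old_errors / max(old_requests, 1)
    new_rate = new_errors / new_requests
    return new_rate > old_rate * tolerance
```

Hashing the user ID keeps each user on the same side of the split between requests, so the comparison between the previous version and the new version is not muddied by people flipping back and forth.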
Not P Diddy.
PDD.
Okay.
Stop safe.
There's no baby oil in this situation.
Okay.
But I mean, to me, shots fired at Sonos, or Sonas, or whatever that company is called.
I don't know what they did.
I don't know what they did.
I don't know what that.
Yeah.
So you don't, you, you, you guys are supposed to be the web dev people.
What the hell are you talking about, Casey?
Sonos, the company that makes the speakers. Nobody?
I know, Bose.
Are you kidding me?
You guys are the web devs.
Everything got.
Beats by Dre.
No, this is, this is like legendary.
Like, everyone knew about this.
So there's this company called Sonos.
They make like the most popular internet powered speakers in the world.
Like everyone has these things.
I mean, not me because I'll,
there's no way in hell I'm connecting speakers to the internet.
That's just a way to make them not work,
which Sonos went ahead and demonstrated for me.
They had a product that everyone liked.
It was an extremely popular speaker.
They then did a big software update where they were like,
we're rolling out our new system.
Like, you know, kind of like how Google does every couple years.
They're like, hey, guess what?
Gmail's going to suck way harder than before today.
Like you'll get five months of being able to switch between them
and then you're on to the new bad one.
They did that, only there was no five
months. They just basically said, like, here's the new rollout. And it was awful. Like, tons of feature
regressions. It didn't really work. Customers couldn't do most of the things they used to. And it, like,
tanked the company. Like, it was just, like, immediately everyone hated it. And nobody wanted to
buy their stuff. And they had to do this walkback. And they couldn't roll back because they had
redeployed all their servers. And I guess they didn't know how to, like, undeploy them. So they
couldn't, you couldn't go back to the old app. It was, go read about this.
I can't believe no one knows about this but me.
It was a massive failure.
I know it.
I know it.
Never heard of them.
I heard it.
Okay.
Casey, I don't know if you guys can hear me, but I heard that.
Okay.
We can hear you.
We can't really see you.
You have very few expressions.
That's fine.
But I do.
Okay.
Your point is that hardware is definitely different than software.
You know, if Twitter has a small breaking thing because they made something,
Twitter, all these major sites, they break all the time, right?
I have much more forgiveness for that.
This is, you know, it's a, you know, it's a small breaking thing.
that's obviously different than, you know,
you've got to play the field a little bit different when I say break production.
If I'm building a game that only, you know, so many people play,
you don't want it just crashing nonstop while you're testing features.
There's probably some levels.
But if you're Twitter and all of a sudden you can't post for five minutes
and only 0.001% of the people couldn't post for five minutes,
it's just not going to make a huge splash and it's probably not going to deter them from actually using the products again.
Well, I was actually, no, I was agreeing with you with the Sonos thing.
What I was saying was they should have done a,
exactly what you're talking about. They should have taken
and deployed a small server bank for what they
were doing. Oh, I see. Given it to 1%
of their users and their users would have been like, this is
utter garbage and I do not want it. And they would have been like, okay,
pull that on back. Nobody saw it.
We're fine, right? So exactly what you're
saying, I was totally agreeing with you. I'm like
Sonos, if they had done that, they would have
totally saved their company literally like
six months of hell. Had they
just given a slice of users this thing for a
while they would have been, you know, some significant percentage above beta test, right?
Like so, like you said, you know, five, 10 percent of the people got this.
They would have all been like, nope, nope, nope, nope, nope, nope.
And then they would have, you know, been like, okay, okay.
So that's all.
I do want to, can I say one thing just about the testing part, though?
Is that okay from before?
Maybe.
Sure, that's what the podcast is about.
Exactly.
I like this.
I like this.
Go for it, TJ.
My favorite project that I've ever worked on, NeoVim, by the way,
way. Can you go back to the grass? You were much more like better in the grass.
Really? Your house is blocking the tower. I got a notification that said your phone is overheating
because I was in the sunshine. I was actually going to bring that up in the beginning.
Like it literally said your phone is overheating. Make sure your body's the one covering.
Okay. All right. Sit over. Create a shadow, TJ. So they can't. You guys are in the shadows now.
You got a plank. Plank over your phone. Hey, what's up boys? Hey, what's up boys?
This is like getting into, like, TJ profile pic territory, where it's like on the dating app.
Can you tell us some of your pet peeves, T.J. Do you like long walks on the beach?
I'm a, I'm a long walks enjoyer. I'm really not a fan of short conversations.
I want to really find out about the real you. I don't know what else.
I don't even know if any of my words are coming through.
TJ is actually just describing himself right now. That's all TJ is doing.
I love laughing. I love having fun.
I like to make you laugh.
This is just me to prime.
This is my pitch for getting on the podcast.
It's true.
I'll make fun of trash.
Yo, if there's a guy named trash on the podcast,
I will make sure to make fun of him every episode.
He uses one password.
Are you joking?
All right.
All right.
All right.
TJ, what's your take?
My take is my favorite project that I've worked on.
And, like, definitely the one with the most weird edge cases
and probably users, NeoVim. Like, the test suite for NeoVim is amazing.
We have tons of, we have different, like, styles of tests.
We have different ways to run them.
We have all this different stuff.
Tons of work went into making it.
And maybe that's just partially because NeoVim is, like,
trying to maintain compatibility with VIM on some stuff, like, literally forever, right?
So there's some other requirements that are different.
But, man, every time I did a PR and a random test
broke, it was my fault. It was not the other test. It was not the snapshot tests. It was nothing else.
And like random stuff. I mean, maybe it's just because NeoVim's also like a really old C project.
So you like change something and then a random global is destroyed. I don't know. It's like whatever.
It's maybe not always the best. But, like, in my experience where people, like, actually
cared about it and like cared to maintain snapshots and make those happen. Like no one's even
getting paid to work on NeoVim.
You know what I'm saying?
And like the snapshots were valuable.
The unit tests were valuable.
The functional tests were valuable.
And I used them in a ton of different PRs and across like really large refactors,
adding Lua to autocommands, like doing lots of different stuff where I had to change tons and tons of the code.
I was just wrong every time until the tests went green.
I think that probably goes to show that tests are as valuable as the people who wrote them.
I was literally going to say that.
Yeah, that's what I've been waiting to say, too, is I just wanted to say, but since I've been lagging, I didn't want to yell.
I just wanted to yell skill issue to everything trash has been saying.
I would disagree, actually.
I will defend trash, actually.
Casey's smarter than you.
Because here's the thing.
That's true.
Dang it.
I suspect that some of what you're seeing, though, is not so much the skill of the people who did it, although they may have been very skilled.
It's more just about when the tests were added.
So at least for me, a lot of times what I'll do is I'll go, okay, let's say I have a bug that I actually work on for a day.
Because most bugs are like, oh, yeah, I know what it is and you just fix it, right?
But you get one and you're just like, what the hell is happening?
And it's like that once every few months where you're just like, okay, this is really weird and I'm not sure what it is.
When I find those, I typically try to add a test or assertions or whatever I think would have caught it, right?
Because I'm like, okay, this one's nasty and I wasn't expecting it. What can I add that will trip this kind of thing up?
And I'll add it. And I suspect that much of what gets seen in like mature products like Neovim
is just like, hey, if you've got some post-test discipline, like after we made the product,
when we're finding bugs, we add the test that would have caught the bugs, those are often very good because they come out of
what are the things that are likely to break when we do stuff, and we put in a net there to catch them.
And so those are like very good.
And they're the opposite of TDD.
Or they're like, oh, they're TDD: they're test-driven debugging,
maybe you could say, or something like that.
So I do think there's something to that that's a lot more.
I had a couple other hot takes that I'll get to in a second,
but I just wanted to leave it.
I wanted to say that one based on what TJ said.
I don't know if that's true about Neovim,
but I might suspect it, right?
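A minimal sketch of that "add the test that would have caught it" habit, in Python. Everything here is hypothetical: the `parse_range` helper and the bug it supposedly had are invented for illustration and are not taken from Neovim. The idea is only that once a nasty bug has been found and fixed, a test pinned to the exact input that broke goes in as a net.

```python
def parse_range(text: str) -> tuple[int, int]:
    """Parse 'start-end' into an inclusive (start, end) pair.

    Hypothetical helper. The imagined bug: input with no dash, such as
    '7', used to blow up because the code assumed a dash was always there.
    """
    start, sep, end = text.partition("-")
    if not sep:
        # The fix: a single number is treated as a one-element range.
        return int(start), int(start)
    return int(start), int(end)


def test_single_number_is_a_one_element_range():
    # Regression test added after the bug was found, pinned to the
    # exact input that took a day to track down.
    assert parse_range("7") == (7, 7)


def test_normal_range_still_works():
    assert parse_range("3-10") == (3, 10)
```

The test names describe the failure rather than the function, so a later red run says what broke.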
It definitely is.
The thing, though, that, like, I was more pushing back.
I was like, I find the snapshot tests in Neovim invaluable.
Like, they're amazing.
I find a bunch of these good end-to-end tests, truly end-to-end,
like literally it runs a test and does user, I don't know if I just dropped.
It does, like, user input and then asserts the outputs, right?
I found those super valuable in Neovim.
Probably just dropped.
I also, no, you didn't drop.
We heard your point.
I think also to put on top of it is think about the horizon of Neovim versus you starting
up a new dev tool at work.
And maybe there's something to be said about when the goldens are introduced, like during
the exploration phase?
Or like, Neovim's just not going to be changing.
My assumption is that a lot of Neovim doesn't change super hard.
It's not like, hey, guys.
Let's do a whole new data model today.
Wrong.
That's all the ones.
That's all the ones, everything you guys have pooed on, that I found the most valuable.
When I changed literally the entire data model and how all autocommands work all inside of Neovim, it was the most useful to have the snapshot tests then.
I mean, your testing is going to depend greatly on what you're building 100%.
Sure.
Yeah, yeah.
But I'm just saying, like, we were fundamentally changing the data models.
Yeah, I'm just saying.
That's what happened.
I thought it was a pretty smart statement.
Come on!
You're really smart, Trash.
I wouldn't poo that.
I wouldn't poo that at all.
You've noticed I have not pooed snapshot testing.
True.
Good job, Casey.
I love you.
You're smarter than Trash.
I love that I'm not getting dugged on it.
Can we start rearranging the windows based on who
TJ thinks is smartest at any given time?
And I'm guessing that it has a very high correlation
to who is agreeing with him.
It's like I'm getting that sense.
No, I'll just always put Trash at the end.
It's fine, Casey.
You don't have to worry.
Today I'm going to tweet that Casey protected me from TJ.
Yes.
This is a pinnacle moment of my career.
I can't even see you guys' videos.
So I have no idea what's happening.
I'm talking to a black screen right now.
Yeah, yeah.
I've been just cracking up at your feed this whole time.
I just can't handle it.
It's pretty good.
It's very good.
Like, the result is top notch of the feed.
Unfortunately, I think this will be a stream-only feature.
I don't know if T.J.
will actually have this effect on the video,
because Riverside will be uploading it.
That's true.
Yeah, that's true.
He'll still be in his weird, like,
profile, his dating site profile pic phase,
but it will be much smoother, yeah.
So, so anyway, I wanted to add a hot,
I would agree that snapshot testing can be good.
It just depends on whether those snapshots are going to be stable.
I mean, that's the bottom line.
I would imagine what Prime and Trash were talking about
is if you're trying to do snapshot testing at a level below
something that's actually consistent across the refactor,
obviously that's not going to work, right?
And so if your snapshot testing is, like someone mentioned, a parse tree,
well, if it's a parse of something that could be parsed different ways
depending on how you chose to implement that subsystem,
those snapshots are useless.
If the parse always has to be the same
because it's like parsing for the C language spec,
and it can't change because, like, you know, it's hard-coded,
so if the thing's in C99 mode, then it had better produce that exact parse tree,
that's a great snapshot test, right?
And so I think the difference boils down to those things.
It has to be about something that actually is, you know, not going to change.
Idempotent to the release of the project.
Otherwise, it's a limited time test.
But, you know, that's just how it is.
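A rough sketch of a snapshot (golden) test over output that is fixed by a spec, in Python. The tiny `tokenize` function and the `GOLDEN` text are made up for illustration; they stand in for "the parse that has to be the same". The snapshot is compared as plain text, and a diff is produced when it drifts.

```python
import difflib
import re

# One token per match: either a run of digits or a single other character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split arithmetic text into (kind, text) tokens, skipping whitespace."""
    tokens = []
    for number, other in TOKEN_RE.findall(source):
        if number:
            tokens.append(("NUM", number))
        elif other.strip():
            tokens.append(("OP", other))
    return tokens


def render(tokens: list[tuple[str, str]]) -> str:
    """Render tokens as one 'KIND text' line each, the snapshot format."""
    return "\n".join(f"{kind} {text}" for kind, text in tokens)


# The golden: in a real project this would live in a file next to the test.
GOLDEN = """\
NUM 1
OP +
NUM 2
OP *
NUM 3"""


def snapshot_diff(actual: str, golden: str) -> list[str]:
    """Return unified-diff lines; an empty list means the snapshot matches."""
    return list(
        difflib.unified_diff(
            golden.splitlines(), actual.splitlines(), "golden", "actual", lineterm=""
        )
    )


def test_tokenizer_matches_golden():
    assert snapshot_diff(render(tokenize("1 + 2*3")), GOLDEN) == []
```

If the token format were an internal detail that each refactor is free to change, this same test would fail on every refactor without saying anything about correctness, which is the "limited-time test" case.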
Probably also the scope of the breaking, right?
If only 10% of your goldens are invalid because you made this change,
that's a lot different than 100% of your goldens are invalid
and you have to like rework all of them because of big kind of refactors.
That was my last experience is that I had to do a bunch of testing against this
games reporting thing at Netflix and to be able to do this auto dock and to be able to do it,
I had to like create the data model of the incoming events.
And then they were changing the incoming events like once a week because it was still an exploration phase.
So it's just like, well, not only is my input incorrect, now my output's incorrect.
So I have to change both sides.
So I'm just like, I guess I'm just redoing all the things again.
Here we go.
And so then it just became super hard to know if I was actually right or if I was breaking things.
So I think at the outset I made it pretty clear, I don't like test-driven development as an idea.
I don't like focusing the developer's attention on tests.
I think they should understand tests and know how to deploy them smartly, but I don't think TDD, right?
I do want to say there is a place where I do recommend literal test-driven development.
And it is as follows.
That is if the programmer is finding that they are low productivity on a particular task,
and that task does not provide feedback.
So, for example, if you have to implement some low-level feature component that has no visual output,
no auditory output, no anything, it's just like an abstract thing that happens inside the computer, right?
then sometimes programmers, like, they get bored working on these sorts of things,
but a lot of programmers will, like, focus on things like numbers that come out of a thing.
Like, if a thing, if you run a thing and the number, like, you know, 37% pops out,
suddenly the programmer's, like, super motivated to get it up to, like, 45% and 50%, and, right?
So sometimes it's like, oh, okay, if you can turn this problem, if you can make a
test that turns whatever you're working on into something with a concrete output that you feel
some reward about when it goes up or improves, that can be a valuable use of test-driven development.
And we do this in performance a lot.
We'll make a little test around something that outputs things like how many cache misses did
it have or how many cycles did it take to complete or whatever.
And it's like, ooh, now I'm like trying to get that number down.
and that's a lot more fun than I'm just trying to make this system faster or whatever.
So, like, a lot of times you can make it so that those things go together,
and then you get the test or the good statistics gathering or whatever it is that you actually wanted anyway,
and you get this motivation for the programmer, and they all kind of work together in this nice way.
So I just wanted to put that out there: there's one place where I think it does kind of work to drive development by a test of some kind.
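A small sketch of that "give the work a number" idea, in Python. Cache misses and cycle counts need hardware counters, so as a stand-in assumption this counts probes made by a search function; the functions and the metric are illustrative only. The harness prints one number per implementation, and the job becomes driving it down.

```python
def linear_search(items, target):
    """Return (index, probes), scanning left to right."""
    probes = 0
    for index, item in enumerate(items):
        probes += 1
        if item == target:
            return index, probes
    return -1, probes


def binary_search(items, target):
    """Return (index, probes) for a sorted list, halving the range each probe."""
    probes = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes


def average_probes(search, items, targets):
    """The number to chase: mean probes per lookup over a fixed workload."""
    return sum(search(items, target)[1] for target in targets) / len(targets)


if __name__ == "__main__":
    data = list(range(1024))
    for search in (linear_search, binary_search):
        print(f"{search.__name__}: {average_probes(search, data, data):.1f} probes/lookup")
```

The workload is fixed, so the number is comparable from run to run, and the harness doubles as the statistics gathering.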
First thing that comes to my mind is, like, code coverage reports.
And you see that number really low.
And I have the opposite reaction, where I just don't care
and I want to quit.
I don't think I've ever seen, like, a low code coverage report
and I was like, you know what?
Today we're making that number higher.
I'm kind of just like, guys, we suck,
and I'm not going to help us get better.
Just no way.
All right
I mean to be fair
If I go to a place
and it's like 100% code coverage,
I also go, uh-oh, right?
Like, I'm also kind of worried about the 100% code coverage
as well.
I'm not going to say the company I was at.
I've been at many.
Netflix.
And there was a team that required 100% test coverage
to the point where people were just like writing the most worthless test
just to get like this like PR bot to just shut up.
And to make it even like funnier,
there was like bugs every day like in production.
I was just like, what's happening here?
I don't know if anyone's ever experienced like a hundred percent like test coverage quota by their team.
Yes, I did it with Falcor when I was at Netflix.
100% code coverage, bugs every day of my life.
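A tiny Python illustration of how 100% line coverage and daily bugs can coexist. The `average` function is invented for the example and has nothing to do with the code being described. The one test executes every line, so a coverage tool would report 100%, yet the empty-list case is never exercised.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    total = 0
    for value in values:
        total += value
    return total / len(values)  # Divides by zero when values is empty.


def test_average():
    # Runs every line of average(), so line coverage is 100%.
    assert average([2, 4]) == 3
```

Coverage measures which lines ran, not which inputs were tried, so `average([])` still raises `ZeroDivisionError` in production.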
Oh, for Falcor, the one-line production takedown?
Wait, did you say Falcor?
Uh-huh.
Like the luck dragon from The NeverEnding Story?
Yeah.
What?
Why was it called that?
You don't know what Falcor is, though?
I do know what Falcor is.
The luck dragon from The NeverEnding Story.
Yes, it was named after the luck dragon from The NeverEnding Story.
Tell the story, Prime.
Tell the story. Tell it all so Casey can hear it and then we'll be done.
All right, I'll do a quick one. Casey, so how it works is that you would provide
effectively an array with keys in it as the path into your data model in the back end.
So say you want videos, some ID, title.
It would return to you the title in that, kind of as a tree, or as a graph, or maybe
as a forest is a better way to say it. Videos, ID, title.
So if you wanted, say, a list, I want list, zero through ten, title.
It would go list zero through ten, which would be IDs, which would then link to videos by ID, title.
So it's kind of like a graph like data structure that makes it so that if you make duplicate requests,
it's able to go, oh, okay, I already have videos, whatever.
I already have list item zero.
I can kind of not request so much from the back end.
The problem was, what happens if list seven doesn't exist?
You don't know.
So we have to return something back to you that says it does not
exist. So we return something called like a, it's a boxed value. It's an atom of undefined. It
lets you know, hey, there is a value there and the value there is nothing. You don't need to
request it again. It's not going to be something. So we created something called materialize.
On the back end, that means you need to materialize missing values. Well, what happens if you
request lists zero through 10,000, zero through 10,000, zero through 10,000, title? Well, you just
created a gigantic number that we're going to materialize everything for you. And so with one
simple line of code, a while loop in bash, you can DoS, or at one point, 2016, you could have
DoSed and taken down Netflix permanently. There was no rolling back. It was in production
for like two years. You could do it for all endpoints, for all devices, for everything, and just,
it would never run again. And that's it. It was a simple while loop, and I wrote a lot of beautiful
code about it, and I did a lot of beautiful stuff with it, and it was very nice code. And then I
also discovered it and reported it. And then later on, it got named the Repulsive Grizzly
attack.
Repulsive Grizzly? Yeah.
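To make the arithmetic of that request concrete, here is a back-of-the-envelope sketch in Python. It is not Falcor's API: the list-of-keys representation, the `(start, end)` tuple for a range, and both function names are assumptions made for illustration. It only counts how many concrete paths a path with ranges expands to, which is how many missing values would have to be materialized.

```python
from itertools import product
from math import prod


def count_paths(path_set):
    """Count the concrete paths a path set expands to.

    Each key is either a plain key or an inclusive (start, end) range.
    """
    return prod(
        (key[1] - key[0] + 1) if isinstance(key, tuple) else 1
        for key in path_set
    )


def expand_paths(path_set):
    """Lazily yield every concrete path as a tuple of keys."""
    axes = [
        range(key[0], key[1] + 1) if isinstance(key, tuple) else [key]
        for key in path_set
    ]
    return product(*axes)


if __name__ == "__main__":
    small = ["videos", (0, 1), "title"]
    print(list(expand_paths(small)))

    # The request from the story: three ranges of zero through 10,000.
    big = ["lists", (0, 10_000), (0, 10_000), (0, 10_000), "title"]
    print(f"{count_paths(big):,} paths to materialize")
```

Three ranges of 10,001 values each multiply to 10,001 cubed, roughly a trillion entries, from a request that fits on one line.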
So has no one at Netflix watched The
NeverEnding Story? Shouldn't it be called the Gmork
or something? Like it's a movie, it's
probably on Netflix.
Okay, no, no, this wasn't some
9,000. It was a completely
separate team that when they found out about it, they had to do
a bunch of work, and it ended
up being, you know, a fix on the Zuul gateway
and all this kind of stuff. Because there are no
grizzly bears in The NeverEnding Story. I didn't
read the original book, because I think it's in
German or something? I don't know.
Yeah. Yeah, but the nice news is I originally didn't come up with the materialize idea.
But I did do a lot of refactoring that while refactoring and recreating everything, I stopped and went, wait a second.
Something seems funny here. So then I trashed staging for a day where no one could use staging for a day.
And then that's when I realized we made a mistake.
When I say we, I mean me.
And that's it.
He's off.
It's like,
"I told my story,
and I'm not participating in this podcast anymore."
The transitions at the end of this podcast are just getting worse and worse.
Is that stream?
He walks off frame.
What's happening?
All right.
Is stream done anything you want to add to this all?
He's framed on.
Are we still live?
Yeah.
We're still live?
Dude, find a friend that smiles at you like TJ smiles at us.
I think today's big takeaway is don't bring Trash Dev into a project that needs help.
He'll just quit.
Yeah.
No, actually, I will help your product get better.
Bye.
He'll drive your code coverage metrics down to zero.
Yeah.
Code coverage is a lie anyways.
Are we live?
And then quit.
Oh, TJ's back again.
All right.
Well, what is happening?
Don't know.
See you later.
No blockers on my end.
I think we had a really productive meeting today, everyone.
Everybody, enjoy your weekend.
Next quarter, it's going to be big.
It's going to be the biggest quarter of our lifetime.
Absolutely.
You've got to hit our OKRs.
Yep.
Well, okay, then.
Ladies and gentlemen, another professional podcast from The Standup.
All right.
Ending it.
All right, hold on one second.
I'm going to end the stream.
Thanks, thanks everybody for being around here.
The TJ is back.
Big, whatever TJ is.
