Two's Complement - Integration Tests are a Scam

Episode Date: April 18, 2023

Ben and Matt borrow a title from J.B. Rainsberger and talk about how integration tests want to take all your money. Or time. Same thing....

Transcript
Starting point is 00:00:00 I'm Matt Godbolt. And I'm Ben Rady. And this is Two's Complement, a programming podcast. Hey Ben. Hey Matt. How on earth are you? I'm doing okay. I'm still a little sick.
Starting point is 00:00:24 I was going to say, you sound a bit nasally still. Yeah, I got a little bit of, not really congestion, but just sort of like... I don't know about you, but I feel like at some point in the winter, I get a cold. And this was obviously not true in the last two years of the pandemic or whatever. Because you didn't see other humans. Yeah, right. But at some point in the winter, I get a cold. And that cold lasts until spring
Starting point is 00:00:45 That's been my experience too. Yeah, it's like, I feel better, like I'm not really sick, but that little lingering cough, or that little rough sound in your throat... the only thing that really cures it is spring. I don't know. And you always sound the worst after you've stopped feeling bad. That's what I was just going to say. Like, you feel dreadful for a couple of days, and then you get no sympathy off anybody. And then you sound all stuffy and crap the next day.
Starting point is 00:01:14 And it was like, oh, are you all right? I'm like, I feel fine now, actually, but thanks for asking. Yeah, right. Exactly. So, yeah, probably a more long-winded answer than you wanted. That was, yeah. Look, this is the SYN, SYN-ACK of the podcast here. I'm not actually genuinely interested in how you are, I'm afraid.
Starting point is 00:01:33 I mean, I am. I am, but... Oh, man. Yeah, so what are we talking about today? That's an excellent question. Well, we were chatting yesterday at breakfast, such as it was. We'd got a coffee together, as we'd kind of got on the same train, and then we got talking, as we do, and the conversation was so interesting that instead
Starting point is 00:01:57 of going to our desks once we actually got to work, we sat down, still wearing our coats, and carried on the conversation about integration testing, of all the things. I mean, yeah, it seems unlike us to be so excited about something that many people might consider very dull. But maybe not our listener, who is now going to hear this podcast. Amazing. Tell me more. Right, right. Yeah, it's kind of shocking to me. I was looking back over some of the old episodes, and it's kind of shocking to me that we haven't talked about this, the best I can tell. I mean, we've definitely talked about shades of it, but...
We talked about acceptance testing, which to me, I sort of broadly think of as an integration test of a sort. I mean, it is pretty much an end-to-end test type thing, and that obviously has its place, but not sort of like the integration tests that we know and love... Or have written before now. Yeah, right, right. And often come to regret latterly. So, yeah, basically, as we were kind of talking about yesterday, as you said, integration tests are, I think, a bit of a double-edged sword. And one of the things that has stuck with me
Starting point is 00:03:09 for a very long time is a talk that J.B. Rainsberger gave a few times in a few different venues. I think he was mostly giving this talk back in like 2008, 2009. And I think he's updated it and revised it a few times since then. But the original title of the talk
Starting point is 00:03:26 was Integration Tests Are a Scam. And the point that he makes, and I think he's revised this over time. I think he's changed it slightly as like Integrated Tests Are a Scam, and he's got some sort of caveats. It's a very sort of click-baity title. Of course, in this world. But it makes you think, and I think the central point is sort of caveats. It's a very sort of click-baity title. Of course, in this world.
Starting point is 00:03:45 But it makes you think. And I think the central point is sort of bang on, which is integration tests, which we're going to define as tests that exercise large chunks of the system, perhaps spin up external dependencies like a database or some other sort of like network service you know if you had like a if your system was built out of microservices you
Starting point is 00:04:12 might spin up like a whole bunch of containers to run all your different microservices and have them all interacting with each other you know anything that that talks to like a third party service you know that that's kind of what we're defining as an integration test here. And usually in those kinds of tests, you're testing lots of different pieces of behavior all at once. You know, an example of this would be like,
Starting point is 00:04:36 all right, I'm going to log in as a user and I want to test to make sure that we can log in. It seems like a reasonable test, right? Make sure users can log into our system. Right, certainly very important. Yeah, so I'm going to start up the web server. I'm going to start up a database. I'm going to populate that database
Starting point is 00:04:49 with some user records. I'm going to go to the website. I'm going to enter in a value for the username. I'm going to enter in a value for the password. I'm going to hit the login button and then I'm going to get redirected to a dashboard, a user dashboard. And I'm going to confirm that all of that works
Starting point is 00:05:06 for some definition of confirm and some definition of works, right? And so those are the kind of tests that we're talking about here. Those are the kind of tests that I think JB was talking about in his talk.
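(For illustration, here is a rough sketch of the kind of end-to-end login test being described. Nothing in it is from the episode: the URL, element names, database schema, and a Selenium-driven Chrome session are all assumptions.)

```python
# A sketch of the integration-style login test described above.
# Hypothetical setup: the app under test runs at localhost:8080 and we can
# seed its Postgres database directly. Every name here is invented.
import psycopg2
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_user_can_log_in():
    # Seed the database (assumes a running Postgres instance).
    conn = psycopg2.connect("dbname=app user=test host=localhost")
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO users (name, salted_password) VALUES (%s, %s)",
            ("alice", "precomputed-salted-hash"),
        )

    # Drive the real UI through a real browser.
    driver = webdriver.Chrome()
    try:
        driver.get("http://localhost:8080/login")
        driver.find_element(By.NAME, "username").send_keys("alice")
        driver.find_element(By.NAME, "password").send_keys("hunter2")
        driver.find_element(By.ID, "login-button").click()
        # "Confirm that all of that works", for one definition of works:
        assert "/dashboard" in driver.current_url
    finally:
        driver.quit()
```

Every piece of that, the browser, the web server, the database, has to be up and agreeable for the test to pass, which is exactly the cost discussed next.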
Starting point is 00:05:27 And his central point was: those tests are a scam. They seem like a good idea. They seem like they give you a little bit of a safety net and some confidence that things work. And they do, but the costs of those kinds of tests are sometimes very subtle. They can be very hidden. And they are often paid in the future, with a ton of interest. And so if you build your system with the idea that you're going to test everything through integration tests, what you often find is that over time, your test suite gets very slow and it gets very brittle,
Starting point is 00:05:58 and it becomes extremely difficult to make changes to your system. Which was kind of the whole point of writing tests in the first place: you want to be able to make changes to your system quickly and have confidence that you haven't broken anything. And it's like, well, you still maybe, maybe
Starting point is 00:06:14 have the confidence that you haven't broken anything, but you can no longer make changes to your system quickly. Right? Yeah. And so that is why integration tests are a scam. Are a scam. Yeah, and that idea has kind of stuck with me for a long time. And so a natural question that one would ask, if you say, okay, Ben, let's take that as a premise, is: what should I do instead?
Starting point is 00:06:37 Yeah, should I just accept that occasionally I'm going to make an innocuous change and then everyone can't log in and that's okay? I mean, maybe it's okay if you're Twitter. Too soon, except not too soon, because by the time this goes out, no one will remember why that's funny. Sick, sick burn. Yeah, that's true. That's a good point.
Starting point is 00:06:59 But yeah, that is sort of what we were talking about yesterday: okay, well, if you're not going to do that, what are you going to do then? How can I develop confidence that I can make changes, and that I haven't painted myself into a corner where, in order to make those changes, I have to update the integration test because I've moved the login box up a little bit, and that was totally not the point of my change? Yeah. That kind of stuff. So what do I do?
Starting point is 00:07:24 Tell me. Well, I was going to ask you some of the things that you've done. Obviously, I've done a lot of things on this, but if you want to... No, I won't. I was playing the foil. Go on, you ask me. All right. So, yeah, we were again talking about this yesterday, which is not a good thing to keep bringing up on the podcast, because nobody was there yesterday, which is why we're recording this today. But we talked a lot about transitivity of tests, in terms of unit tests, in a lot of places. So let's take your login, for example. We have tests that say: does our interaction with a database work? Can I select rows out of a database and read them back in again?
Okay, that could be a test. Now, that does sound integration-y, and maybe it is integration-y, but I've developed some amount of confidence that I can talk to a database. That could be a first step. And then I go, well, now I understand how databases work. I'm going to just get rid of the database and replace it with some kind of fake, or a stub, or a mock, or... insert your favorite not-a-real-thing in technology here. And then I can write the majority, the lion's share, of my "can I access, quote, the database correctly" tests against that fake, stub, mock, blah, whatever. And now I'm pretty confident that I can get the user information out, and maybe their salted password, whatever it is. And then I can separately write tests that take a username and a salted password and say: is this the right user? Are these the right credentials for this user? Tell me yes or no, right? If that's how you're writing your system, obviously. And now I have a sort of transitive relationship where it's like: well, given that I trust the database code, given that I interact with the database code correctly, given that my login
Starting point is 00:09:11 system works with canned examples that I've set up, and given that I've tested the interactions between those two using mocks or fakes, now, transitively, I don't necessarily have to write the code that goes to the database and checks I can log in through the database directly. I just know the pieces work, and the bits between them work. And then I can keep building back and back and back, until I get to the point where I have 99.9% confidence in each of the individual steps along the way handing off correctly, all the way up to the web service handler, or the JavaScript code that says, when they click the login button I send this POST request. And I know that that will get to the endpoint, because I have tests there. And it follows all the way through to the point where I go: yes, the user is logged in.
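(Here is a minimal sketch of that decomposition, with all names invented for illustration: a fake user store standing in for the database, and the credential check tested separately against it.)

```python
# A sketch of testing the pieces separately and trusting transitivity.
# All names (FakeUserStore, check_credentials) are hypothetical.
import hashlib

def salted_hash(salt: str, password: str) -> str:
    return hashlib.sha256((salt + password).encode()).hexdigest()

class FakeUserStore:
    """Stands in for the real database-backed store in unit tests."""
    def __init__(self, users: dict[str, tuple[str, str]]):
        self._users = users  # name -> (salt, salted password hash)

    def get_salt_and_hash(self, username: str) -> tuple[str, str] | None:
        return self._users.get(username)

def check_credentials(store, username: str, password: str) -> bool:
    record = store.get_salt_and_hash(username)
    if record is None:
        return False
    salt, stored_hash = record
    return salted_hash(salt, password) == stored_hash

# Tests the login logic with canned examples -- no database required.
def test_credential_checks():
    store = FakeUserStore({"alice": ("s1", salted_hash("s1", "hunter2"))})
    assert check_credentials(store, "alice", "hunter2")
    assert not check_credentials(store, "alice", "wrong")
    assert not check_credentials(store, "nobody", "hunter2")
```

A separate, much smaller suite proves the real store honors the same get_salt_and_hash contract against an actual database; together, transitively, they cover what the end-to-end test covered.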
Starting point is 00:09:59 Now, what am I giving up in that? Well, you can always come up with Machiavellian reasons why you can break that, right? And that's fair, that's fair. But the marginal cost of dealing with the Machiavellian case, compared to the integration test cost, which is giant and, as you say, brittle and often fragile and, you know, flaky... It's definitely not as certain, because for each step I'm only 99.9% sure, let's say. And so you're multiplying a bunch of 99.9%s together, and maybe there's 20 of them, and that quickly becomes like 97%, rather than... my math is not perfect here, but you know what I mean. Right, you do lose something, I'm not going to question it. But yeah, the chances of a real bug falling in that three percent crack seem diminishingly small to me, compared to the cost of a big integration
Starting point is 00:10:52 test. Or at least of developing all your tests as integration tests. Now, maybe there's a call that says: it is so fundamentally and critically important that users can log in that you do have a once-a-day CI thing that stands up the whole system and does the thing, or a pre-deploy check, or you even have a human do it, right? I prefer not to do that, but if you need it, then there's a business case for it. But if you build a system where that's the primary way you make your tests, then you are definitely making it hard for yourself. You can't iterate as quickly. So that... I think I went all the way to the far end of the reply there. That's really good. I think that's a really good baseline, because I
Starting point is 00:11:36 think that describes the sort of basic answer, the short answer, to: okay, if I'm not going to do integration tests, what am I going to do? I'm going to break the problem down into lots of little bits, and I'm going to use the transitive property to make sure that all of these little bits can talk to each other. And each of those tests, for each of those little bits, can run super fast and be very reliable, because they're not necessarily talking to external systems or anything like that. And when they fail, they will give you a very focused answer: this bit is broken. Right. It's not like, well, something in the login process is broken and I don't know what. Let's say somebody changes the crypt method. You know, we pull in a new library, it's got
Starting point is 00:12:17 a different version of crypt, it's not quite compatible, and now users can't log in. And what your integration test says is: login failed. And you're like, why? Why has login failed? And then you have to, like, git bisect to find the change, or look through all the PRs, whatever. Whereas if your test is: I couldn't authenticate this user with this password... Right. Now maybe you've got a much more localized understanding of where the problem will lie.
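(One hedged way to catch exactly that kind of crypt-library drift is a known-answer test that pins the hashing function to a recorded vector. This reuses the hypothetical salted_hash from the sketch above; the vector is the well-known SHA-256 of the empty string, standing in for one recorded when the code was first written.)

```python
# A known-answer test: pin the hashing function to a recorded vector.
# If a library upgrade changes the algorithm's behavior, this one tiny
# test fails -- instead of "login failed" somewhere far away.
def test_salted_hash_matches_recorded_vector():
    # SHA-256 of the empty string: the simplest possible recorded vector.
    assert salted_hash("", "") == (
        "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    )
```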
Starting point is 00:12:44 Yes, yes, yes. And I think that this sort of, that sort of difference has a number of very interesting properties to it. One is, it's sort of the difference between, if you're doing any kind of validation, data validation, like checking individual fields versus checking like a checksum or a cryptographic shell. Right, right. Right? Like if you just take a whole bunch of data and you're trying to be like, what are the differences between these two pieces of data?
Starting point is 00:13:10 Well, if you run it through diff, if it's text data, you can see the differences, right? Like, oh, this line is here and it's not there. If you take both of them and you do an MD5 sum on them and you compare the MD5 sum, it's like, well, these sums are different. And it's like, well, why did it, what happened? Now, it's much easier to compare the md5 sums it requires no understanding of the data itself you don't have to look at it you don't have to understand what it is what each of the individual
Starting point is 00:13:34 lines are um it's very simple but all you really get out of that is if everything works properly, you knew that everything worked properly. Right. If anything breaks, you're left in this world of like, well, I know it doesn't work, but I have no idea why. And so I think that is why, again, creating the bulk of your tests using this sort of integration style will lead to a world in which you will operate very slowly.
Starting point is 00:14:07 You will be able to make changes very slowly because you will be constantly in this world of, hey, these two MD5 sums are different and I don't understand why. And so there's that property to it. The other thing that I think that it does that's kind of interesting is it does put a bit of a...
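(A tiny illustration of that asymmetry using only the standard library; the two "documents" are invented:)

```python
# Comparing two blobs by checksum vs. by diff. The checksum says only
# "different"; the diff says exactly where -- the same contrast as a broad
# integration test vs. a focused unit test.
import difflib
import hashlib

old = "alice\nbob\ncarol\n"
new = "alice\nrobert\ncarol\n"

# Checksum comparison: cheap, but the failure carries no information.
print(hashlib.md5(old.encode()).hexdigest() ==
      hashlib.md5(new.encode()).hexdigest())   # False... but why?

# Line-by-line comparison: the failure is localized. The printed hunk
# contains the lines "-bob" and "+robert".
print("".join(difflib.unified_diff(
    old.splitlines(keepends=True), new.splitlines(keepends=True))))
```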
Starting point is 00:14:23 The other thing that I think it does that's kind of interesting: the alternative that we're proposing here, where you break things up into small pieces and test them, does put more of a burden on software engineers who are changing the system, right? And the burden that it puts on them is that they have to make an effort to understand, when they're making a change, the parts of the system that interact with that part, right? Which is not strictly true if you say, well, I'm just going to test everything through integration tests,
Starting point is 00:15:29 because I can lean on the integration tests to tell me that, right? If I go and I make a change to a system, I can just ignore everything that it talks to and everything that talks to it, and I can trust, quote unquote, that the integration tests will catch that. Which they will... sometime after the heat death of the universe. Because the combinatorial explosion of all of the different possible code paths through all of the different possible conditionals
Starting point is 00:15:29 in that system is just... It's too immense. It's like grains of sand in the universe. An interesting observation that I'd never really considered is that you are definitely reducing the dimensionality of the problem by testing unit by unit. Like, I'm testing my password hashing thing. Now, of course, there is an almost infinite number of inputs to my password hashing thing, and there's,
Starting point is 00:15:51 you know, essentially 64 bits, 128 bits, whatever, of possible outputs, right? And I can't write tests... I knowingly can't write tests that cover all of that. But as I'm designing that thing, I know where the bodies are buried, and I can use the ZOMBIES approach, or whichever, to say: what if I give it an empty string? What if I give it a full string? And then I've tested it with all of those things, and I'm sure it works in that situation. And now the property that I transit to the outside world is not the high dimensionality of all those possibilities. It's: did they log in?
Starting point is 00:16:22 Yes or no. And that's the one and only thing that escapes my world. I've separated away all those degrees of freedom from the rest of the code. Right, right. Yeah, absolutely. And I think one of the cool things is that if you do what we're talking about here, you will arrive at that sort of naturally, and you will be naturally incentivized to make that contract with the outside world as small and as simple as you possibly can. So going back to our sort of database example with the login, right? Like, okay, if I'm going to break these things
Starting point is 00:16:58 apart, and I'm going to have some system that interacts with the database to say, hey, I need to get the salted password for this user, for example, right? Like, the fundamental operation of "get me a salted password", or "get me any field that's related to a user", has nothing to do with SQL or databases or servers or network connections at all. It's a very simple operation, right? And so there are various points in this sort of chain where you can create abstractions that are significantly simpler than the actual underlying system that's backing them. Right. You could reasonably have done otherwise; you could have exposed the SQL to everybody, for example. And then, in the first case, because it's
Starting point is 00:17:40 convenient: hey, select star from users where name equals, quote, whatever. That's one thing. But this forces you down the route that says: no, no, I want to provide the API of "bool does user exist", "bool check password for this user", whatever. That's off the top of my head and obviously terrible, not a great way of designing a system for anyone, but those kinds of things force the testability boundary to look more like what the downstream users will actually want, because, yes, you don't want to expose them to that. And then you can hide it away from everyone, including yourself, most of the time. Right, right, right.
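(A sketch of that boundary, with invented names: a narrow API that the rest of the code sees, and the SQL hidden behind one real implementation.)

```python
# The narrow contract the rest of the system depends on. SQL, drivers, and
# connections live behind it; tests can substitute anything satisfying it.
# All names here are illustrative, not from the episode.
from typing import Optional, Protocol

class UserStore(Protocol):
    def user_exists(self, username: str) -> bool: ...
    def get_salted_password(self, username: str) -> Optional[str]: ...

class PostgresUserStore:
    """The one place that knows the schema and speaks SQL."""
    def __init__(self, conn):
        self._conn = conn

    def user_exists(self, username: str) -> bool:
        return self.get_salted_password(username) is not None

    def get_salted_password(self, username: str) -> Optional[str]:
        with self._conn.cursor() as cur:
            # Parameterized query, not a templated string.
            cur.execute(
                "SELECT salted_password FROM users WHERE name = %s",
                (username,),
            )
            row = cur.fetchone()
        return row[0] if row else None
```

Everything downstream programs against the UserStore shape, so a fake like the one sketched earlier and this Postgres-backed version are interchangeable.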
Starting point is 00:18:37 And that has a lot of really good design benefits that are very practical. They're practical in the sense that it's helping you write better, faster, more reliable tests, but in my experience it also creates a nice framework for discussion among the people that you're working with. Right, right. Because people have these sort of esoteric, almost philosophical debates about software design all the time, and what actually kind of helps that, and I'm sure you've seen this with performance too, is having a constraint. Right. Saying, oh, well, we can't design it that way because it'll be slow, or we can't design it that way because it won't be testable, actually helps people come to an agreement and say: oh yeah, that design is better, and I can see why. Right. As opposed to: no, I want an object hierarchy here; no, we should use functions; no, we should use a list; no, we should use a set; whatever. If you can tie that back to something concrete that everyone agrees is valuable, of course, then you
Starting point is 00:19:21 can actually have much faster, more coherent, more aligned discussions around software design, because they're based on fundamental principles that you all share. Right. Yeah. And you don't get the opportunity to do that if you don't create the constraint, if you say, well, we can use whatever design we want, because we're just going to test this with an integration test anyway. So it doesn't matter if you create a nice layer of abstraction over the
Starting point is 00:19:44 database. You can just have the templated string with the SQL in it and, you know, inject whatever values you directly got from the UI into that string. Terrifying. I can't even... Because that's not going to... That's actually the easier thing for you to do,
Starting point is 00:20:00 as opposed to the easier thing being creating a nice abstraction that you can then swap out, right? And this actually leads me to another thing that we were kind of talking about, which is a technique that I do all the time, which is using the sort of test fakes that you create as part of doing this process as sort of the repository for institutional knowledge about strange errors. Right, right.
Starting point is 00:20:27 So, like, you build your interface to your data store, right? Right. There were air quotes for the listeners there around the data store. And it's a database under the covers. The normal implementation is a database, right? Some thin wrapper around some SQL driver or something like that that you have.
Starting point is 00:20:50 And maybe you even think about it as like a sql based thing right so it's like i'm giving this thing sql statements and it's giving me results i don't really understand what's going on behind that abstraction but that's what's happening yeah that's the interface that i have right and you have your sort of real okay we're using postgres so we use the postgres one and then maybe there's like some you there's some really stupid stub that you create maybe where you just have hard-coded query strings with hard-coded responses, and you can use that in some of your unit tests when you're just testing things like, yeah. Right. That's not the point of this test. It's not to test if the database works or if SQL's correct
Starting point is 00:21:23 or anything like that. I just want to make sure when i run this grade and i get this result do i process it correctly yeah yeah which is a very common thing to want to do right you know yeah exactly exactly but another form of implementation that you might have there in addition to the real one in addition to the super dumb fake one is a realistic one that lets you simulate error or failure modes right and this is something that like a lot of times you will have some strange esoteric error like oh yeah the cluster was rebalancing and in the middle of the rebalancing we executed this uh query and there was some sort of weird you know data partitioning condition that the database detected, and it threw this error that we'd never seen before, right?
Starting point is 00:22:10 And we have no way to reproduce it, because it just depends on the timing of the database actually doing its rebalancing. But we want to lock in the good behavior for if we see that again. Because we fell over in prod. Exactly, exactly. It created... yeah. So where does that knowledge go, right? Well, in a lot of organizations, it just goes into a ticket or a document,
Starting point is 00:22:30 and that's where it is, right? But there's an alternative here, which is: you can take that hard-won, hard-fought knowledge, where somebody had to wake up at two in the morning to deal with this thing, and encode it into a sort of test double. Like a fake database whose purpose is to mostly behave like a real database, except it lets you puppeteer all of the different failure modes that you've seen. So when you go and write the code to say, hey, we have this one-in-a-million error condition that only ever happened one time, and that happened 12 months ago, and I'm changing the error handling code for that right now because we're making another change, how do I know if this error handling code actually still works in that
Starting point is 00:23:18 weird case? Yeah, in that one weird case, right. Well, if you encode that information into your test doubles, you can have a lot of confidence that it works. Right. Because you can say: well, we reproduced the error that we got in the client code by re-raising that same weird error that we saw out of the database; the client code failed in the same way, so we knew that we'd reproduced it; and then we fixed the client code so that it handled that error, so that it wasn't an issue. And we have a test that makes sure that when that client code runs and this error occurs, it
Starting point is 00:23:49 works. So I can go back later and I can say: okay, well, if I change how that's handled, or I handle it in a different component, I can reuse that same fake database in a completely different component. It's like: oh yeah, when you're interacting with the storage system, watch out for that.
Starting point is 00:24:05 Sometimes it'll return that error. Because it can happen sometimes. And you can have a reasonable degree of confidence that if it happens again, you will handle it, right? So there is another sort of opportunity that you get by using this approach, to encode that information
Starting point is 00:24:20 in a way that is much more immediate than tucked away in a ticket somewhere that no one will ever look at again unless something goes terribly wrong. At the very least. I mean, the fake stuff, I can see that it could work. But in my experience, when those things have happened, definitely setting out and making a mock that walks the system through that exact set of circumstances, and then giving the test a name that explains it, and maybe has the reference to the ticket where it was filed, or the PagerDuty alert or whatever, is a great way of saying: here is the pile of wounds that we've recovered from, and here is
Starting point is 00:24:54 our list of, I think you said, institutional knowledge. Right. You know, when the greybeards go, oh, you probably hit that thing, then you're going to go, well, how was I supposed to know that? And it's like, well, if you'd have written a system for which the test fell out naturally, then maybe you would have found it, or maybe it would have been more obvious. And certainly refactoring the code later to not handle that will be picked up, and you'll be reminded that, yeah, yeah,
Starting point is 00:25:22 by the way, your new code, the new error handler, doesn't handle the strange network split that we saw. And if you just have a big suite of integration tests that are all talking to a real database, then you can't really write those kinds of tests. Like, you can, but they're just very expensive to write, and in some cases it's almost impossible to puppeteer the system into a situation where you can make it happen even remotely, or at all. You know, like some strange admin commands, where you're maybe even killing processes to try and make it do the error. No, that's not code you
Starting point is 00:26:03 want to write, in case it ever somehow makes it remotely near production, right? Right, right. Yeah, exactly. And then you wind up doing it. It's very, very expensive to write those tests and run them and make sure that they work.
Starting point is 00:26:15 And it's for this error that has only ever happened once. And you start asking yourself, like: yeah, we added two minutes to our build to, you know, bring up this database and tease it into a state where it produced this error and then crashed and then did the thing. Yeah, that's not good. And it's flaky. One time, right? It doesn't always work. Yeah. You could easily argue for, well, let's just cross our fingers and hope it doesn't happen again. Which isn't
Starting point is 00:26:42 a satisfying answer, compared to: here's a 15-line test that sets up the fake in the right way and calls the login method and says, yep, look, even if we get a split during the login, we just retry. Done. Yep. Exactly. Exactly.
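(Roughly what that 15-line test could look like. The fake, the error type, and the retrying login routine are all invented for illustration.)

```python
# A fake database you can puppeteer into the one-in-a-million failure,
# plus a login routine that retries through it. All names hypothetical.
class ClusterRebalancingError(Exception):
    """The weird error seen once in prod, encoded as a test-double behavior."""

class FakeDatabase:
    def __init__(self, fail_times: int = 0):
        self.fail_times = fail_times  # raise on this many calls, then succeed
        self._users = {"alice": "stored-hash"}

    def get_salted_password(self, username):
        if self.fail_times > 0:
            self.fail_times -= 1
            raise ClusterRebalancingError("data partition during rebalance")
        return self._users.get(username)

def login(db, username, password, attempts: int = 2):
    for attempt in range(attempts):
        try:
            # Simplified: real code would verify the password hash too.
            return db.get_salted_password(username) is not None
        except ClusterRebalancingError:
            if attempt == attempts - 1:
                raise
    return False

def test_login_retries_through_rebalancing_error():
    db = FakeDatabase(fail_times=1)  # fail once, then behave
    assert login(db, "alice", "irrelevant-for-this-test") is True
```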
Starting point is 00:26:56 So I think that there are a lot of benefits. In addition to the, I think, more immediate benefits, that the tests run faster and they're more reliable, there's a lot of design benefits to saying: okay, we're going to decompose our system in this way. We're going to be able to test it in this way. We're going to think about it in this way. There are trade-offs. Like I said before, it means that you need to think a little bit more about the interactions between components when you're changing them. You have to have a sort of wider view on the system. But I think that those trade-offs are very worth it. Yeah, I agree.
Starting point is 00:27:30 so there's another sort of aspect to quote integration tests that uh is something that certainly i've done and um i know other people do as well and that is like sometimes you don't know how a third-party API a third-party server and how it works right and so you want to kind of write tests in the same way that I write test to test code that I might be interacting with kind of exploratorily even if it's my own code you know TDD style like hey i wonder what happens if we pass in this string to my code the way that i reach to do that is to write a test that does it even if it's just called test foo and it doesn't i'm never going to use that test again right and i'm not even going to check it in like it's a great way of interacting with your own code to be able to say like i just want
Starting point is 00:28:17 to make run this bit of code but so it seems natural to reach for testing for the same kind of exploratory testing with a real environment because maybe you don't know what happens under these circumstances. Yeah, absolutely. There's a real power in writing tests that you intend not to keep. I sometimes refer to this, and you've heard this before, as make it till you fake it instead of fake it till you make it. So you can write tests against a real system. And it's like, I've definitely done a thing where it's like, all right, I'm going to get some read-only credentials
Starting point is 00:28:52 to this database. It might even be a production database because it's got some data in it. It's interesting to me, but I've got some read-only credentials. I'm confident that they're read-only credentials. And I'm going to temporarily put those credentials just hardwired into my test because I
Starting point is 00:29:06 Because I don't know how this particular API works. I've never used it before. I don't know how it works, I don't know how it breaks, I don't know what the error message actually looks like. Yeah, exactly, exactly. And so I will sit down and I will write tests to sort of explore that, knowing that I'm going to either dramatically change these tests before I actually go and check them in, or I'm just going to delete them, right? And that can be really useful,
Starting point is 00:29:36 especially if you have a setup where your tests run automatically. It almost becomes kind of like a REPL, where you can just be typing away in a test and be like: oh, what does this thing do? I'm going to assert that the return value of this get-user-count thing is null. It should always be an integer, right? And then I see: oh yeah, it's 105 in the production database. Cool. What if I point it at a table that doesn't exist? Oh, shit, it's null. It should never be null. But in this case, it is actually null, right?
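(A sketch of that throwaway, REPL-style probe, assuming a Postgres database reached through psycopg2; the connection details are invented, and the test is written to be deleted.)

```python
# A deliberately disposable exploration test -- "make it till you fake it".
# Temporary read-only credentials hardwired on purpose; never checked in.
import psycopg2

def test_foo():  # throwaway name, throwaway test
    conn = psycopg2.connect(
        "dbname=prod user=readonly password=temp host=db.example.com"
    )
    with conn.cursor() as cur:
        cur.execute("SELECT count(*) FROM users")
        count = cur.fetchone()[0]
    # Wrong on purpose: the failure output tells me the real answer
    # (105, as it turns out). Then point it at a table that doesn't
    # exist and find out what that error actually looks like.
    assert count is None
```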
Starting point is 00:29:57 So that kind of exploration can be a great way when you're in that stage where you're trying to figure out how to break this thing apart. Right. Because, again, one of the trade-offs of this approach is that you do actually have to think about the design of this: how do I create these easily testable seams? That's not always obvious when you're first using an API or thinking about a problem, right? You have to come up with that; that's your job. Right. So having everything just laid out in front of you in a messy way that doesn't try to do this, so that you can see it all and you can interact with it all, and you can understand how
Starting point is 00:30:39 the system works, and then say: okay, I think that if I built a little abstraction around this part, I could basically treat it like, let's say, for example, a sequence. Right? So I just have a sequence of lines that is coming out of this thing. I don't really know what it is, but I can create an abstraction on top of this API that will basically treat it like that. And then I can build my real implementation under that.
Starting point is 00:31:09 And all it does is produce a sequence of lines. And then, you know, lots of things are great at producing a sequence of lines, like a list of lines. So I can use that as my test double. And I can start doing the design work to tease this apart and say: okay, the code that I have that operates on a sequence of lines? Super easy to test; I can pass in whatever I want there. The API that I now know how it works, I can model as a sequence of lines, and so the wrapper around that is extraordinarily simple. And then I can do some additional things where, kind of like we were saying before, maybe I write an integration test for that real implementation. Maybe what I do is I get really heavy into mocks and I actually just mock out the interaction with the API.
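(A sketch of that shape, with invented names: downstream code depends only on an iterable of lines, the real API wrapper is a thin generator, and any plain list works as the test double.)

```python
# Code that consumes "a sequence of lines" is trivially testable, because
# a plain list satisfies the same shape as the real API wrapper.
from typing import Iterable, Iterator

def count_error_lines(lines: Iterable[str]) -> int:
    return sum(1 for line in lines if line.startswith("ERROR"))

def lines_from_api(client) -> Iterator[str]:
    """Thin wrapper: the only code that knows about the hypothetical API."""
    for record in client.fetch_records():
        yield record.as_text()

def test_counts_error_lines():
    # The test double is just a list of lines -- no API, no network.
    assert count_error_lines(["ERROR x", "ok", "ERROR y"]) == 2
```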
Starting point is 00:31:53 That's another approach. You know, you can sort of do it either way. Or maybe what I do, and this is something I do a lot, is: rather than having an integration test, I test for integration. So when my system starts up, I take all of the objects that I'm instantiating that represent real-world systems, the ones that I didn't want to write unit tests for because they would be slow and unreliable, and I make sure that they work
Starting point is 00:32:19 at the point when the system starts up, so that it will fail fast if they don't. So in the case of the database: I'm going to make sure that I can connect to the database. I'm going to make sure that my credentials are good. I'm going to make sure that the tables exist. I'm going to make sure that the tables have the schemas I expect. And if all of that stuff is not true
Starting point is 00:32:34 when the system starts up, it doesn't really matter what's going to happen after that, because nothing else is going to work. So it's just going to shut down and be like: hey, I expected there to be this table and it doesn't exist, so, sorry, I can't do anything.
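(A sketch of that fail-fast startup check, assuming a psycopg2-style connection and an invented expected schema:)

```python
# Test for integration at startup: verify the preconditions the unit tests
# assumed, then fail fast. Table and column names here are invented.
import sys

EXPECTED_SCHEMA = {"users": {"name", "salt", "salted_password"}}

def check_database_or_die(conn) -> None:
    with conn.cursor() as cur:
        for table, expected_cols in EXPECTED_SCHEMA.items():
            cur.execute(
                "SELECT column_name FROM information_schema.columns "
                "WHERE table_name = %s",
                (table,),
            )
            cols = {row[0] for row in cur.fetchall()}
            if not cols:
                sys.exit(f"expected table {table!r} to exist, and it doesn't")
            if not expected_cols <= cols:
                sys.exit(f"table {table!r} is missing columns "
                         f"{expected_cols - cols}")
```

Getting the connection at all already proves the host and credentials; the schema check covers the preconditions the unit tests took on faith.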
Starting point is 00:33:03 And I think that is a much more practical way to get confidence that those external dependencies really work the way that you think they do. Because you tend to build them up in this sort of iterative form where, at the moment when you're writing the code, you're pretty confident that it works. You know, you've done your make-it-till-you-fake-it, or you've written a little command-line test driver program to make sure that it works, or you've stepped through with the debugger and you've made sure that it works.
Starting point is 00:33:13 It's like: all right, I'm pretty confident this works. But the thing that you have no control over is the outside world, right? Did the configuration of the database change? Did the schema change? Did other things change? And so you can try to test for those
Starting point is 00:33:25 things in an integration test that runs in your CI, but that test is going to be unreliable. And it might be unreliable in a very unfortunate way, which is: if you introduce a bug that causes a problem with the database, and it's critical, you have to fix this right now, you may wind up in a situation where it's like, oh yeah, we fucked up the database, and I need to go change this code in order to fix it, but I can't deploy that code, because it depends on a continuous deployment system that actually checks the real database, which is not working. Right. And then you wind up having to do this thing, in a panic, of turning tests off. Which is like: okay, everything's on fire, you're panicking,
Starting point is 00:34:10 you're turning off all the safeties. Like, that doesn't seem like a good idea, right? Bad situation to be in. That's the very moment when you need those tests to give you the confidence that you can just very quickly get a production fix out there and not have broken anything. That's an interesting observation. Yeah, I hadn't thought about that. So that kind of style, I think, where you're testing your external dependencies at the sort of last responsible
Starting point is 00:34:37 moment, at the moment when you know, okay, this has to work now. It's kind of okay if it wasn't working before, but it has to work now. That kind of thing can be, I think, a better way. And then you can write those tests to be very focused, right? It's not this sort of general, like: oh, do databases work at all? Can I execute every possible query? Because my interface here is a general query interface, so I should test all these different things. It's like, well, no, you actually don't care about all those different things. All you care about is: does this table exist? Does it have this schema?
Starting point is 00:35:10 Does it have these types of columns? These are the preconditions under which I've performed all my other tests. If I can prove those at runtime, now I can transitively believe that the rest of the code should work. Exactly. Nice. Exactly, exactly. So I think that's another alternative
Starting point is 00:35:26 to sort of like, okay, I want to write integration tests just to prove that the database is configured correctly. And it's like, okay, yeah, I can see why you'd want to do that. But rather than writing like, you know, a unit test that does that,
Starting point is 00:35:40 maybe try a different approach. Yeah. Well, this has pretty much covered everything and more that I had in my list below here. So, I mean, I think we should exhort folks to seriously reconsider any integration testing that they're doing with these kinds of alternatives. Yeah, might be a scam.
Starting point is 00:35:59 Might be a scam. You heard it here. I was going to say you heard it here first, but no, this is what. How old is that talk? You heard it here at least second. Probably like 70 heard it here first, but no, this is what. How old is that talk? You heard it here at least second. Probably like 75. Yeah, 10 years old.
Starting point is 00:36:08 Oh, dear. All right. Well, that's how much we've got our finger on the pulse here. Yeah. Cool. All right, my friend. I'll see you the next time. Good as always.
Starting point is 00:36:30 You've been listening to Two's Complement, a programming podcast by Ben Rady and Matt Godbolt. Find the show transcript and notes at www.twoscomplement.org. Contact us on Mastodon. We are @twoscomplement@hachyderm.io. Our theme music is by Inverse Phase. Find out more at inversephase.com.
