Two's Complement - Lint and Other Fuzzy Bits

Starting point is 00:00:00 I'm Matt Godbolt. And I'm Ben Rady. And this is Two's Complement, a programming podcast. Hey, Ben. Hey, Matt. How's it going? Pretty great. Awesome.

Starting point is 00:00:23 Well, I was chatting to someone, you know, the other day and they were asking how much preparation work we do before we record these podcasts. And they were pleasantly surprised to learn that we don't do an awful lot. We're not scripted in any meaningful way. I mean, you and I have a couple of minutes' chat beforehand, and then we decide on a topic, and then we just basically hit record. So we're really putting that to the test today, because we literally said, hey, let's talk about linting, and then we've hit record. So let's talk about linting. What's linting, Ben? That is the extent of the prep. Um, what is linting? Linting is what you do when you don't have a compiler. That's one way to think about it.

Starting point is 00:01:09 That's an interesting point. Although I would say that I use as much linting in compiled languages as I do in non-compiled languages. So it's equally. But, you know, yeah, lint. I mean, so I don't even know. I've never really stopped to think about the etymology. But like lint is like... The entomology?

Starting point is 00:01:25 The entomology, no. The insect collectology of... That's going to have to be a running bit. It's like less and fewer. We just have to constantly confuse entomology. Confuse entomology. Yes. One day we're going to have to talk about insects and find a way to work that in.

Starting point is 00:01:43 I was going to say, what's the entomology of the word bug in computing? Right. Whoa. Oh, man. Yeah. So, yeah, but lint is like the stuff that you find at the bottom of your pocket after you've worn a pair of trousers for a while, and you go, well, there's all this nonsense and bits of fluff and whatnot. And presumably linting, the code term, comes from the removal of that: let's get rid of all that, let's clean it up, let's get the, you know, the little brush out. But I honestly don't know where it comes from. What I think of it as is essentially static analysis of code and the application of relatively straightforward rules to say, no, this is how the code ought to be. Right. The language probably gives you freedom to do things in lots of different ways, and as a sort

Starting point is 00:02:31 of choice, you say these ways are the ways that we're doing this thing. You know, whenever we call our log methods, we don't use Python f-strings. And I say this, and you're going to grin at me, because this is my personal bugbear: the logger automatically does some of the formatting for you. And if you don't use the logger's facility, you're paying a cost for something that's going to happen anyway.
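A minimal sketch of the cost being described, assuming the standard library `logging` module. The `Expensive` class and the logger name are hypothetical, introduced only to count how often a value is turned into a string when the message is below the logger's level.

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is turned into a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive value"


logger = logging.getLogger("lint_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # DEBUG messages will be discarded

eager = Expensive()
lazy = Expensive()

# The f-string is built before logger.debug() is even called,
# so the formatting cost is paid although nothing is logged.
logger.debug(f"value is {eager}")

# With %-style arguments the logger only formats the message
# if the record is actually going to be emitted.
logger.debug("value is %s", lazy)

print(eager.str_calls, lazy.str_calls)
```

The f-string version formats its argument even though the message is thrown away; the %-style version never touches it. That difference is the kind of thing a straightforward lint rule can flag mechanically.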

Starting point is 00:02:56 And you can say, well, that's straightforward. Let's write a rule for it, and then I can apply it automatically to my code. But there's thousands of other things, right? So it's basically code quality checking. And then you said something about compilers. Do you want to elaborate on why compilers, or the lack of? So if you don't have a compiler and all of your code is interpreted, then there's never an opportunity to do this kind of static checking, uh, naturally. But if your code is compiled,

Starting point is 00:03:26 you could make the argument in any compiled language that a linter is just all the things the compiler team didn't have time to add warnings for. Well, there's that, but there's also a lot of opinionated things that they may or may not want to add

Starting point is 00:03:40 as a general principle. There's definitely things that are known to be okay, but may be frowned upon. And it's maybe not the compiler's place to, uh, make that decision on behalf of you. And, you know, some of this stuff also comes from the fact that if you're writing in a compiled language that has stronger types, then you can make some of the sort of policy decisions about how things may or may not be used by saying, well, I don't accept anything other than this type here, and then the compiler enforces your rules. You get to write the rules using the language. But in, say, an interpreted language, we'll pick on Python, you don't get the opportunity to make those distinctions. You have to sort of say, no, this is

Starting point is 00:04:18 yeah, I need something else to uphold these rules that I'm making. Yeah, definitely. I mean, if you don't have the compiler there, then I think the linter has a much more... I'm not really trying to make the argument that if you have a compiled language, you don't need a linter. That's not what I'm saying. But I am saying that if you

Starting point is 00:04:39 don't have a compiled language, there are going to be more instances where you need a linter because there's just no other opportunity to do that. And I feel like that demand changes with various aspects of the team, like how junior the people are, how big the team is, how much consensus there is around

Starting point is 00:05:01 conventions that prevent bugs or enforce a particular style or, you know, just make, ensure that a pattern is repeated consistently and things like that. You know, it's, I think it's actually quite common to have situations where if you got everybody in a room to try to have them agree on how something should be, they would actually agree, maybe after a little bit of a discussion, but they would agree, but they don't know that they agree. And so you sort of have the linter there

Starting point is 00:05:34 as the objective judge to be like, no, no, no, we pre-agreed that we're agreeing on this. So if the linter's telling you something, then that's what we're doing. And then everyone's like, well, okay, seems reasonable. Right. So, you know,

Starting point is 00:05:45 there's sort of like varying degrees to which you might agree, but I think that can be a role that linters play, is if you have, you know, larger teams especially, but really any situation where you're trying to

Starting point is 00:06:01 create that consensus around, like, what the code should be doing and what it should not be doing, linters are a tool to achieve that. And I suppose the way that you could think about it is that we have tests for the functionality of our code, and we make them automated, and we have them, like, checked in, and they're run as part of a CI. What linting is is a level above that, like meta. This is how we said we're going to write the code. How do we check that we are keeping to the rules that we all agreed on? And, yeah, you know, I think probably the difficult aspect about linting is it is more subjective. There are things that are more subjective, and you have to make some... I mean, you alluded to something, you know, you get most people in a room and they say

Starting point is 00:06:43 yeah don't do X. And now I'm having trouble thinking about what X is that everyone would agree to not do. Well, the non-scalar values in Python default arguments, I think, is a great one. And it is actually one of the ones where it's like... I don't actually think it's maybe as clear as like, oh, the tests are testing for correctness and behavior

Starting point is 00:07:01 and the lints are more stylistic, because you can definitely find bugs in code with linters. Right. Just explain a little bit... oh, yeah, well, that one in particular. Yeah, so do you want to give us a quick rundown? Yeah, I think I can do it off the top of my head. So basically, when you declare a function or a method in Python, you can give it default arguments. So parameters that you're passing will have default values. So you can say, like, def say_hello(username="Matt"), which means if you call it with no args, it'll say Matt, and if you call it with an arg, it'll say what you passed in. So you get to default

Starting point is 00:07:42 the parameters that you're passing into a function. Got it. Yes, yes. And so the problem is that Python has no opinions at all about what those values can actually be. They can be anything. Right. And very unintuitively, I think, if you pass in something like a dictionary or a list, the declaration of that, the creation of that object is done once when the function is evaluated, not when it's executed. So every subsequent execution of that function will

Starting point is 00:08:15 reuse the same instance of a list or a dictionary over and over again. Everything that's not a scalar type like an integer or a float or a string is a reference to an object that exists, and so there is exactly one object that is the, like, empty dictionary you pass as the default. Exactly. And now, of course, if you mutate that, you're mutating the actual thing that's going to be passed in every single time. I see. Right, right, right. And so that seems like that's a language flaw. That seems like that's actually a problem with the language. Well, and that's why I kind of make a little bit... I mean, it's very tongue-in-cheek, but I make the argument, you know, with compilers and stuff, that the lints are just all the things the compiler team

Starting point is 00:08:52 didn't get around to making warnings about, um, which is not really true. No, but it carries some flavor of it. Yeah, yeah. So that's like an example of a lint rule that everyone who gets into a room would agree with, right? Would say, like, you know, there is almost no situation, yes, in which you want this to be true. And, I mean, I suppose the other thing is that with linting rules you can reasonably decide to turn them off on a per-line basis, which is something you normally don't do for other things. Like, in an actual compiler, there is no way to get past the fact the compiler is saying this is broken: your types are not compatible, you try to pass

Starting point is 00:09:34 them in and... But here you can say, like, no, no, just on this line, don't do it. But it's an opt-in thing to do. Maybe even turn those off, maybe, if you decide. But, like, if you did really, really, really mean to have a mutable default parameter, because maybe you're accumulating things into it or whatever you're doing, and it's very purposefully done, then you can turn it off. But it's not the default anymore, and it protects you from yourself, um, down the line. Right. Yeah, yeah. So that's an interesting... that's a lint rule that everyone can get behind. But I'm sure there are examples of rules that people could reasonably disagree about. Oh, yeah.
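The mutable-default behavior discussed above can be sketched as follows. The function names are hypothetical. The `# noqa: B006` comment assumes a flake8-style linter with the flake8-bugbear plugin, where B006 is the mutable-default rule; other linters use different codes and suppression syntax.

```python
def append_shared(item, bucket=[]):  # noqa: B006  (assumed flake8-bugbear code)
    """The default list is created once, when the def statement runs."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Common workaround: use None as a sentinel and build the list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared("a")
second = append_shared("b")
print(first is second, second)  # same list object, now holding both items

print(append_fresh("a"), append_fresh("b"))  # each call gets its own list
```

The first function is the deliberate "accumulating things into it" case, so the warning is silenced on that one line only. Everywhere else the rule stays on.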

Starting point is 00:10:12 I mean, a lot of times I feel like you'll have linters that have stylistic rules. Like, oh, this variable name is too short, right? That's a linting rule. Right. And like, you know, okay. Like, you know, I get that people would want to do that, but you're almost certainly going to run into situations where it's like, no, no, no, in this context.

Starting point is 00:10:33 It's fine to have X and Y. It's fine to have X, yes. Because those are what we call the parameters that get passed into, I don't know, a plotting library. Yeah, the graphing library that's plotting X and Y. And like, that's actually what they are. And you have to be very careful that you're not incentivizing people to start adding like x underscore version underscore one just to

Starting point is 00:10:50 make it long enough. Right. Yeah, yeah. Right, it's a sort of anti-pattern. Uh, yeah. And, you know, I suppose an extension of linting rules, and I don't know if you would agree with this: how do you think about automatic code formatting as essentially a linter? Or rather, you know, the checking of a format could be considered to be a lint. It's like, hey, if your code doesn't look the way that we said our code is going to look, then you're breaking the sort of style guide, and we value consistency in layout so that we don't spend our time parsing and recognizing,

Starting point is 00:11:30 Oh, Oh no, that's just a function declaration or whatever, but you did it in a different way for me or however it's done. Absolutely. Yeah. I, my,

Starting point is 00:11:36 my take on that is that you should use a consistent style like a hundred percent of the time. It is possible to do that without something to enforce it if you have a small enough and cohesive enough team. And I have been on teams like that where it's just like, yeah, we're just going to do it right. And that's fine. But no matter if it takes automatic formatting and if it takes checking of that automatic

Starting point is 00:12:02 formatting to make sure that you are doing it, then that's what you should do. It's sort of like, you know, if you were to model this over the evolution of a team: you start off with just, like, two engineers who are very senior and know each other well. It's just like, we're using that formatting that we always use, right? Yes, that's what we're doing. Right. And then you work for six months and you add two more people to the team and you start to notice as you open up files that the formatting isn't quite right. And you tell them like, hey, guys, you need to make sure that your formatting is okay. They go, oh, yeah, sorry.

Starting point is 00:12:30 And then maybe it gets a little better. And then you add two more people to the team and everything goes out the window. You're like, that's it. We're putting in automatic formatting. I can't do this anymore. I think, you know, but then there's a level above that, which we're grinning at each other about because I think we agree on this: being senior, or otherwise experienced, I think is probably a nicer way to put it, enough that I'm like, no, day one, in comes the automatic formatting. I don't want to have to argue about this. I don't even agree often with the way that the code ends up looking, but I value the consistency and the simplicity. And frankly, things like, um... so I'm talking specifically in, like,

Starting point is 00:13:06 a C++ code base, I will use, uh, clang-format, which is now pretty much across the board accepted by all editors. And it's an external tool. It just works. And it's configurable enough, it's got many bells and whistles, and you can usually find a decent enough compromise. And then you check in your format file and you say, like, that's just how our code is going to look. And you can turn it off in small regions if you really, really, really want to, like, have a data structure that has a sequence of hex numbers or something.

Starting point is 00:13:33 You're like, no, these are eight align and whatever. But for the vast majority of time, you just want to say, this is our code style. And what I have found is that the consistency is great and it allows me to do refactorings in my code base without having to worry about all those other funny little rules where you know if you did a

Starting point is 00:13:51 find and replace and you replace a name with a slightly shorter name, and you have, like, rules about things either aligning or not aligning, or fitting on a line or not fitting on a line... You know, we've all come back to code bases where you can clearly see things don't line up anymore because somebody renamed something, and it's like, oh, it's all shifted. And that just goes away, and that's kind of a nice property, especially with IDEs that can do refactoring and that also understand, uh, the format file, because it'll do the refactor and it'll do it in the style as described by the formatting you've got. So that's very valuable to me. And the final thing on that as well is that, um, it tends to make diffs between, uh, like, long-lived branches less painful. As you will know, if you've, like, separated from the baseline, the

Starting point is 00:14:40 main line and you've been effectively applying small changes that are equivalent, you'll run the formatter in both sides and you'll basically agree on everything, except for the parts where you're actually different. And that's fine. And then you can, like, merge them back in together. So I'm a big fan of that. But, you know, I just wasn't sure whether or not you considered this to be a linter.

Starting point is 00:14:59 I mean, here I am, like, testing to see whether or not the thing that I'm talking about matches our very important, as discussed, carefully decided topic for today. Yeah, right. Oh my god, am I going off ticket? Off topic, or ticket? Uh, no. And I would say anytime you can get automatic formatting in a consistent way, it's just less things you have to worry about, right? Fewer. Uh, you don't have to, like, spend any brain cycles at all thinking about the formatting. The one place where I think it gets weird, though, is if you have an IDE that does automatic formatting and the configuring... and I think this is becoming less and less true, but I think there are still situations

Starting point is 00:15:45 where this might be true, where the formatting operation cannot be externalized. So basically, in order to enforce the format, you have to have the IDE running. You can't enforce it in a build, or you can't use a different editor and have a command line tool that does it. Whatever you choose um to do

Starting point is 00:16:07 your formatting and to check your formatting, uh, needs to have a, like, character-for-character equivalent that can be run automatically. It can be run from the command line, it can be run in different tools, on the CI system, to make sure that you conform, all of these things. I wholeheartedly agree. I think that's one of the big deals. I think a lot of the JetBrains IDEs have always had their own idea about formatting, and for the longest time, I just used their defaults. But they are moving now more and more to where they're like,

Starting point is 00:16:36 well, here's the .config file that tells me how to format, and I'll use that instead. That's great. Brilliant. Don't use your own formatter. Use one that my friends who are using vi or, you know, edlin or whatever they're using... Yeah, it's just like tooling: you have to have intersubjective tooling. You have to have tooling that everyone and everything, like every part of your process, can use to produce the same result, right? You know, they can follow the same steps and produce the same result.
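As a rough sketch of what "can be run from the command line and on the CI system" might look like, the following check mode reports files a formatter would change and returns a non-zero status. The formatter here is a deliberately trivial stand-in (it only strips trailing whitespace), and all function names are hypothetical; a real project would call its actual formatter instead.

```python
import sys
from pathlib import Path


def format_text(text: str) -> str:
    """Stand-in formatter: strip trailing whitespace, end with one newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"


def unformatted_files(paths):
    """Return the paths whose contents the formatter would change."""
    changed = []
    for path in paths:
        text = Path(path).read_text()
        if text != format_text(text):
            changed.append(path)
    return changed


def main(argv):
    bad = unformatted_files(argv)
    for path in bad:
        print(f"would reformat {path}")
    return 1 if bad else 0  # non-zero status fails a CI job or a hook


# A script entry point would end with: sys.exit(main(sys.argv[1:]))
```

Because the check is a plain function of the file contents, an editor plugin, a command-line run, and a CI job all reach the same verdict, which is the intersubjective property being asked for.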

Starting point is 00:17:12 Because if you don't, it's just going to create a whole bunch of headache that you don't need. Right. I think what you've kind of implied in that also is that one way to achieve intersubjectivity is to just mandate these are the exact tools everyone must use, right? So I could just say, hey, everyone uses, you know, this editor. Yes. But, you know, we're all using PyCharm, we're all using CLion, we're all using IntelliJ IDEA. Right, yeah. Or Visual Studio Code, or vi with these... Yeah, that's one way of doing it. But, right, I think you and I have realized that that's a great recipe to make unhappy developers. The one thing you can do to a developer to make them feel, you know, not effective, you know, not able to

Starting point is 00:17:52 make the changes at the speed they would otherwise be able to, is to make them use an editor they're not familiar with. It's like wearing someone else's clothes. No one likes doing that. I mean, it's a funny thing, and this is sort of slightly different, I know, but, you know, I always used to think it was funny that people would buy expensive keyboards to work on, until I bought one myself and went, oh my gosh, my life is a ton better now. I don't have pains in my hands, I've got, like, all those good things, and I feel like I'm more effective in my job. And then it occurred to me: that's the primary way I communicate to do my job, through typing. Yeah. And so if I'm going to spend any money on anything, I should spend it on a decent keyboard, a decent mouse, have a decent editor that I'm happy

Starting point is 00:18:38 with, so that I can do my craft as effectively as I can in the way that I am happy with. And I've kind of got over that hump now, that it's not like, oh, I should just be able to pick up any old thing and get on with it. You know, like, you don't see musicians wandering around just picking up any old, you know, cello and kind of going, oh, this will do, I suppose. I mean, I'm sure they can, you know, a good musician will be able to make good music out of anything,

Starting point is 00:18:59 but you wouldn't expect them to. And I'm not saying that that's what I was thinking, but it was an epiphany moment I had. And it's certainly how I'm justifying, you know, hundreds of dollars' worth of keyboards to, uh, myself. Yeah, yeah. No, and I mean, you know, uh, you will definitely have a hard time hiring if you're like, we use this IDE and this toolchain. Very good point. And, you know, only MacBooks, and all the other constraints that you put on there. It's like, you better be putting them on there for a good reason, because you're eliminating a whole pool of potentially really good engineers, uh, that will not work at your company because they can't use vi or whatever.

Starting point is 00:19:37 Right. Yeah, no, that's an interesting thing. I mean, so anyway, going back to it, this is obviously us saying that, uh, any kind of linting tools or formatting tools need to be intersubjective, and the obvious way to say that is they need to be able to run headless on a CI server, if nothing else, right? If only that alone, that's a good reason not to have... That's like the minimum level, right? Yeah, totally. And yeah, talking about lint,

Starting point is 00:20:00 obviously you mentioned compiled languages, and there are definitely a million rules, sorry, a million warnings, that you can turn on in your C++ compiler, some of which are a little bit of a bridge too far because they are very, very opinionated about certain things. But, um, there are definitely linters that are useful to have run on your code as well. There is clang-tidy. I always get confused between clang-format and clang-tidy because they both sound like they would basically do the same thing. But there are a number of things that can do static analysis, and there are a number of commercial products as well. Um, and the thing about these is this is where it

Starting point is 00:20:40 becomes more subjective, very much. You know, you can definitely say it's not the compiler's job to tell you that you shouldn't have a naked new. This is like an unmanaged new that's not being held by a smart pointer, so you're basically promising: I'm going to write the code to clean up this object afterwards, I promise. And that's a completely reasonable thing. It's been allowed since the dawn of the language. Very, very occasionally it's what you want to do. Nowadays, it definitely should not be the compiler's position to tell you that you can't do that, is my opinion, but I think a linter could reasonably say under no circumstances should you use a naked new anymore. You should always use one of the managed things, a smart one, you know, a shared pointer, a unique pointer. In fact, always a unique pointer, never a shared pointer, but that's a whole

Starting point is 00:21:23 other talk. Um, and, you know, then you can maybe elect to turn it off, but I don't believe, morally, that that's a compiler's job. Um, it's a style thing, and this is certainly the way clang-tidy has moved. You know, a lot of companies have published their own guidelines: there's Google's guidelines, and then there's the LLVM project's own guidelines, and these are enshrined in rules that are inside of, uh, clang-tidy, so that a lot of those do say, yeah, you can't name this thing this way, you know, or you can't use this idiom of the language. It's just, you know, don't do it. It's widely accepted to have been a mistake, and we can't take it out of the language because of the way C++ is, but please

Starting point is 00:22:04 don't. Probably you didn't mean to do this. And obviously you can turn it off. So I just wanted to make a bit of a stronger case for the fact that I think linting and static code analysis, you know, these are great safety nets to have around. Let's just assume the programmer is a hostile witness. Make sure that we have tests for any changes they might make, make sure that we have style guidelines that are enforced, and make sure that every kind of thing that we can analyze can be thrown at them. And then I think it gives you a lot more confidence to make changes, which is, yeah, a nice property. There is a trade-off there, by the way, with, um, like, you know, basically build times or just deployment

Starting point is 00:22:46 times. You know, interesting point. Where, you know, all these static checks take time. Sometimes they're pretty fast, but sometimes they can take a long time, and you do have to kind of make a little bit of a trade-off between cycle time and the sort of safety or correctness or formatting or other checks that you get by putting those things in place. I mean, I think the ideal world is that you have, and, you know, we've been messing around with this a little bit recently, a pre-commit hook that just does all of these things for you. So it's impossible to check in code that fails them. I think one of the, I'm going to, I got to get on my soapbox for just a minute.
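A pre-commit hook of the kind mentioned here could be sketched roughly like this. The three checks are placeholders that just print a message; they are hypothetical stand-ins for a project's real formatter, linter, and test commands.

```python
import subprocess
import sys

# Hypothetical checks; a real hook would list the project's own
# formatter, linter and test commands here.
CHECKS = [
    ("format check", [sys.executable, "-c", "print('formatting ok')"]),
    ("lint", [sys.executable, "-c", "print('lint ok')"]),
    ("tests", [sys.executable, "-c", "print('tests ok')"]),
]


def run_checks(checks):
    """Run each check in order; return the name of the first failure, or None."""
    for name, command in checks:
        result = subprocess.run(command)
        if result.returncode != 0:
            return name
    return None


def main():
    failed = run_checks(CHECKS)
    if failed is not None:
        print(f"pre-commit: {failed} failed, commit aborted", file=sys.stderr)
        return 1  # git aborts the commit when the hook exits non-zero
    return 0
```

Saved as an executable `.git/hooks/pre-commit` that ends by calling `sys.exit(main())`, a script like this runs before each commit. The trade-off raised above still applies: every check added here lengthens the time between typing the commit command and getting an answer.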

Starting point is 00:23:20 All right. I think. Hang on, I'm just getting comfortable. Go. Yep. Get, you know, get a cup of tea, lay back. So I think that the software development world lost something when we started building CI tools.

Starting point is 00:23:38 Because in the very, very early days of CI, the way it worked was there was a machine that was the CI machine and you either, you know, copied your code over there or you walked over there with a floppy disk. Like you took your code over to the machine and you integrated your code on that machine. And then when it was all integrated and the tests were all passing, then you would leave it and when people wanted to update, they would go to the machine

Starting point is 00:24:08 and they would get the latest version of the code and they would pull it back. Now, obviously, that's a really slow process. I'm not actually suggesting that anyone does that. But one of the benefits of that was that it was basically impossible to share code with other people that wasn't correct.

Starting point is 00:24:26 What we have now with basically every single CI system that I've ever seen is you share the code and then the checks run, right? Now it could be that you're sharing it on a branch, right? Which makes it a little better, but the model is still the same, right? And you still have that moment when you take the branch and you merge it into master and then you find out whether or not it's correct. I think what you're sort of alluding to really is that sort of intersubjectivity. It's like, if you can't develop confidence that your code works intersubjectively without first checking it in and saying, hey, the first person

Starting point is 00:25:05 who's going to find out is the CI machine, oh, and anyone else who types git pull at the same time, right? Yep. Then you've got a problem, right? Yeah. Right, that's interesting. And I fully grant that the way that this problem was being solved before was with a really horrible lock that people would hold for hours, right, actually, while they integrated. So it's not like it was better, but there was this dimension of it that was lost. You know, in an ideal world, what I would want to do is not have any CI at all. What I want is a completely hermetically sealed, infinitely fast pre-commit hook. Not much, then. Yes, I know, right? I'm not asking for much here. That you could run

Starting point is 00:25:46 on any developer machine and it would be completely intersubjective, would not depend on the environment at all, and it would give you extraordinarily high confidence that the code met all the testing and the linting and all the other things, such that it was impossible to commit

Starting point is 00:26:01 code that was incorrect, and then you just push, and then there is no CI, because you have high confidence that the commits themselves can't be wrong. Yeah, you've proven it at a different level. And, you know, it's a distributed proof as well, which means that, you know, computationally, you haven't got the single bottleneck and all that kind of stuff. That's really interesting. Yeah, yeah. Um, you can kind of do this with server-side pre-commit hooks, but GitHub in particular... You get all of, like, five seconds to do it, right? Yeah, GitHub has a timeout on them.

Starting point is 00:26:29 If you run your own Git server, you, of course, can do this. Yeah. And I have done that, actually. Oh, interesting. Yeah, there was a small project where I actually... It was kind of like a little bit of an experiment, but basically, like, the production deployment server and my Git repository were all the same thing.

Starting point is 00:26:46 And I backed them up accordingly. And so you would push to the server. It would run the build in a pre-commit hook that took as long as it took. It didn't really matter. And then just deployed right onto that machine, restarted the services and did all that. And like, that was my only repository.

Starting point is 00:27:02 So it was sort of like the, the Heroku-style Git deployment, except the production server was also the Git repo. Interesting. And I would take frequent backups and snapshots of that, and that was it. That was all I had. Again, I'm not recommending that as an entirely good idea.

Starting point is 00:27:20 No, but it was an experiment for doing, yeah, now I get it. It gave me this property, which was, like, you know, you can't share code that isn't both correct and in production. Right, right. That's an interesting one. I guess, you know, you alluded to this earlier about the branch, but, you know, enabling, um, requirements on PRs in, say, something like GitHub, that gives you some of that ability. You are still using the fact that you're pushing it to some other place

Starting point is 00:27:51 as a sort of intersubjective test. Then, you know, you get it reviewed. And then if you don't allow merges without the passing of a CI, then you've kind of reclaimed that. But, you know, I think probably what you're alluding to is something that we've seen at work a bit, which is that if your tests take too long or are too hard to run locally, it can be quite easy to start using a pull request as a crutch for, uh, maybe

Starting point is 00:28:18 just, can someone else run my tests for me, you know? And that's not necessarily a bad thing if you've got other things to be getting on with. If you're like, okay, my tests take 15 minutes because they're a whole bunch of really complicated integration tests and I want to move on to the next thing, so I'm going to push it to a branch and then I'm going to move myself to a different branch and go, and then my machine is not bogged down running these complicated things. But I think, you know, obviously it would be much nicer if you had your infinitely fast pre-commit hook to run all these integration tests, and you wouldn't have to think about that kind of thing. You wouldn't have to context switch so much. Yeah, I definitely... this is maybe just a

Starting point is 00:28:56 thing for me, but I definitely have this sort of, like, idealistic mode that I have experienced in, like, brief moments, and I never let go of it. I don't always kind of like all of the tools and all of the techniques that I wind up seeing. I'm always sort of comparing to this sort of platonic ideal of infinitely fast feedback loops and all these other things. And I'm like, yeah, okay. I kind of get why we need to do this,

Starting point is 00:29:22 but let's never lose sight of the fact that it's not what we really want. Yeah. Yeah. One day I'll get you working on some C++ with me and then you'll suffer for the… I've done a little bit of that. You have.

Starting point is 00:29:35 Oh, you have. Yeah, I have. I mean, that's definitely a world where I'm just like, man, this is… There's a pretty big gap. There's got to be a better way of doing this. Right. Yeah. Yeah.

Starting point is 00:29:44 So, you know, our original goal and topic was linters. We covered a little bit of C++ and some Python. Have you ever had an experience of other languages, JavaScript or something? I've used some JavaScript linters. I've used some Java linters. It's the same sort of philosophy, but I am definitely in the camp of always turn on all the compiler warnings and then just do whatever they

Starting point is 00:30:09 say. Um, one of the things that I wanted to ask you is whether or not you had ever run across any domain-specific linters. Oh, um, so clang-tidy essentially has some of that built in. Um, you know, so you could consider the Google Style Guide as being domain-specific, and the LLVM code base style guide is domain-specific. Now, I know also there are things like MISRA, MISRA C, which are standards for the automotive industry that say, in English, you just can't do these things. This is not allowed. You can't ever do this. You know, those kinds of level of things. And I believe that there

Starting point is 00:30:49 are rule sets for those, and if not there, I believe there are commercial tools that will basically run over your code and tell you, yeah, you did something that is just disallowed by MISRA. You downcast a 32-bit value into a 16-bit value without checking to see if it was truncated. You know, those are the kinds of things that you might want to say reasonably is not a good idea. Yeah, yeah. There are also, I mean, an interesting sort of corollary to that is there is now something called the guideline support... Sorry, the C++ guidelines, which are sort of semi-normative. It's non-normative.

Starting point is 00:31:23 It's not part of the standard. But some of the people that are heavily involved in the standard are behind it, some of the sort of big names of C++, and they've kind of sounded like, these are definitely rules that we think you should follow, right? And you basically do this by default; you have to have a reason to not do things this way. There are clang-tidy rules for those as well, right, to say, like, please turn on all the GSLs that you can find. Sorry, GSL is the name of the guideline support library. And so they've published a library that says, you know, this is a way of casting a number from 32 bits to 16 bits, and we will throw an exception if it does fall outside of the bounds of what can be represented in a 16-bit number. And the linter is in cahoots

Starting point is 00:32:01 with the library and will say, okay, if you are doing this through that function, that's fine, because I know that it's covered. But otherwise I'll say, no, no, you can't just assign this value here. And of course C++ has always been very lackadaisical, or very relaxed, about you passing, you know, 12 into a function that wants a char. And you're like, well, okay, I can see that 12 is not going to overflow a char, but in general 12 is an integer, and an integer is probably bigger than eight bits, and so the char could overflow, and maybe I should be checking that. Now, obviously the thing about C++ is that you want performance as well, so sometimes you want to be able to say, no, I'm doing this on purpose and I'm promising, scout's honor, that this won't happen at runtime. And so you have to use a

Starting point is 00:32:40 different version, which again tells the linter, look, I'm doing this on purpose. It kind of bakes in the linter rule that says, no, this is fine. And so it's an interesting thing, right? You're using a language to almost build a DSL of, like, these are things that I'm telling you on purpose, that the language has always been fine with. Um, and it sort of almost builds on something... Now I'm going to go, I'll go off on one here, but, you know, it builds on something from when C++ came around. Um, you know, you remember C casts, you know, where you just put something in parens and say, like, this is an int? You do paren int x. It's like, hey, there you go, now it's an integer. And the problem with that is that you can cast pretty much any single thing to any other single thing, and the same operation was used to say, no, this is an integer and I want to pass it

Starting point is 00:33:22 as a floating point value on purpose, or, um, this is a pointer to void, which is just like a random address, and I'm going to cast it to some really complicated structure with behavior, like an actual object instance type. You know, wait a second, those are two very different operations. One is just reinterpret the data... sorry, not reinterpret, actually do a conversion. I know that an instruction is going to run that's going to turn an int into a float. Another one is like, just tell the compiler internally, by the way, um, just treat this memory address as if it's containing a completely different type of object. And when C++ came in, with the sort of, like, complicated virtual hierarchies of objects, as you were casting backwards and forwards between, say, um, objects that were in a hierarchy, an is-a hierarchy,

Starting point is 00:34:06 then potentially there were some fix-ups that were needed to be done. Like, oh, you know, like, hey, this is... I mean, don't use multiple inheritance, but if you did use multiple inheritance and you were casting up and down to the two different base classes, potentially things needed to be fixed up.
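
The fix-up described here can be seen directly by comparing addresses. What follows is only a minimal sketch, assuming C++17 and three hypothetical types (`Left`, `Right`, and `Both`, names invented for illustration and not taken from the episode): converting a pointer to the derived object into a pointer to its second base class generally yields a different address, and converting back undoes the adjustment.

```cpp
#include <cstddef>

// Hypothetical types for illustration only.
struct Left  { int left_value = 1; };
struct Right { int right_value = 2; };

// Both contains a Left subobject and a Right subobject.
struct Both : Left, Right { int own_value = 3; };

// Byte distance from the start of the Both object to its Left subobject.
std::ptrdiff_t left_offset(Both& both) {
    Left* as_left = static_cast<Left*>(&both);  // compiler may adjust the address
    return reinterpret_cast<char*>(as_left) - reinterpret_cast<char*>(&both);
}

// Byte distance from the start of the Both object to its Right subobject.
std::ptrdiff_t right_offset(Both& both) {
    Right* as_right = static_cast<Right*>(&both);  // compiler may adjust the address
    return reinterpret_cast<char*>(as_right) - reinterpret_cast<char*>(&both);
}

// Going Both* -> Right* -> Both* must land back on the original address,
// because static_cast knows the relationship and reverses the adjustment.
bool round_trip_restores_address(Both& both) {
    Right* as_right = static_cast<Right*>(&both);
    Both* back = static_cast<Both*>(as_right);
    return back == &both;
}
```

The two offsets cannot be equal, since two non-empty base subobjects occupy different storage. The exact byte counts are implementation-defined, so the sketch deliberately does not claim a particular number.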

Starting point is 00:34:19 And so that same parens cast is now going, oh, it's not actually a, uh, a Bob, it's an Ian, and now I need to add 16 to, like, the address of it. So there's an add hidden in there as well. Yeah. The problem with the C-style cast is that it would use whatever information it had at the time to do the best effort of what it thought it was. So if it didn't know that Bob and Ian were actually related in any way, because it had only seen the declaration of Bob and the declaration of Ian, it would just go, fine, they're the same thing, fine, you told me this. But if it had seen the class definitions that say, oh, they both inherit from two different, you know, somewhere up on the, uh,

Starting point is 00:34:58 further up in the hierarchy, then it would do some work. So, so this is a very long sidebar. No, this is good. This is good. So when C++ came along, later on in the sort of the C++, I just say, this was an, actually, I will make no bones about when this came along, but very early on in my experience of C++, there were these new style casts,

Starting point is 00:35:20 static_cast, dynamic_cast, reinterpret_cast, and const_cast. They have different powers. And so if you did a static cast, you're saying, I know these are related types, you can do a conversion from one to the other using knowledge like that. It will allow me to go from an int to a float, or a float to an int. It will allow me to go from Ian to Bob if the compiler knows that they have got an inherited base class. It would be an error if it didn't. It would say, like, I don't know that these are related, and so therefore it would stop. Yeah. And so, um, those

Starting point is 00:35:54 kind of things, they look like function calls: static_cast, template parameters, the thing. And so, all the way back to the beginning now: the GSL things that are like, hey, narrow_cast or narrow, look like the built-in casts in the language, but it's just a function call that does this for you. So there's kind of a precedent for it, I suppose, is as far as I'm going with this. Interesting. So what you're saying is we need to build a custom linter that can check for these kinds of relationships between Bob and Ian, and we should call it Cahoots. Because then they'd be like, you can't cast that.

Starting point is 00:36:29 It's like, no, fine, but Bob and Ian are in Cahoots. Right. I mean, well, that's... That is kind of what the static cast is doing, rather, unfortunately. But yeah, it's a funny old world. Yeah.
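
To make the checked and unchecked narrowing idea concrete, here is a minimal sketch, assuming C++17. The names `checked_narrow` and `unchecked_narrow` are hypothetical stand-ins written for this illustration; they are loosely modelled on the behaviour described above (throw if the value does not fit, or explicitly promise that it does) and are not the GSL's actual implementation. The sketch is restricted to integer types.

```cpp
#include <cstdint>
#include <stdexcept>
#include <type_traits>

// Hypothetical "I'm doing this on purpose" conversion: no runtime check,
// but the intent is visible at the call site (and to a linter).
template <typename To, typename From>
constexpr To unchecked_narrow(From value) noexcept {
    return static_cast<To>(value);
}

// Hypothetical checked conversion: throws if the value cannot be
// represented exactly in the target type.
template <typename To, typename From>
To checked_narrow(From value) {
    static_assert(std::is_integral_v<To> && std::is_integral_v<From>,
                  "this sketch only handles integer types");

    const To narrowed = static_cast<To>(value);

    // Converting back must reproduce the original value.
    if (static_cast<From>(narrowed) != value) {
        throw std::range_error("value does not fit in the target type");
    }

    // When mixing signed and unsigned types, the sign must also survive.
    if constexpr (std::is_signed_v<To> != std::is_signed_v<From>) {
        if ((narrowed < To{}) != (value < From{})) {
            throw std::range_error("value changed sign during conversion");
        }
    }
    return narrowed;
}
```

With these definitions, `checked_narrow<std::int16_t>(std::int32_t{12})` returns 12, while passing 70000 throws because the value does not survive the round trip. `unchecked_narrow<std::uint16_t>(std::int32_t{70000})` silently wraps to 4464, which is exactly the truncation the checked version exists to catch.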

Starting point is 00:36:39 In Cahoots. Oh, man. It's right there. I think... Cahoots.config. It's right there. Well, I guess that's, uh, lint has been thoroughly talked about. And, uh, code formatting. And code formatting, random C++. Uh, yes, and rants about how CI used to be back in the 19th century. I was going to say, you know, my earliest experience with CI was, uh, a CI server I wrote myself in Python, early, very early versions of

Starting point is 00:37:10 Python, and I actually came across the code the other day. It was hilarious to look through it and have a laugh. But one thing I'd forgotten, that I was reminded of by that, is that I had bought one of these, um, an outlet that I could control programmatically through the serial port on the CI machine. And I had a green lava lamp that was on when the build was green. And then I had a red spinning, like actual disco, like a police thing for when it went red, which was a great idea. I had great fun, you know, learning how to get the protocol to work over the serial cable, to get it to talk to this funny device, and putting it in there.

Starting point is 00:37:47 And as you might have guessed, the red spinning light lasted about three minutes before it was unplugged and removed. That's right. I think I even have... do I have? Oh, I have. All right, okay. All right, I have to make... I'm amusing myself here as Ben, unseen, has rushed off into the background to his shelving and has come back with, drumroll. This is exactly what this is. Oh, my gosh. We are the same person, Ben. Here's the cable, the USB cable.

Starting point is 00:38:20 Here's the power supply. Here's the flashing light. A little on-off switch. That's amazing. It does exactly the same thing. This is Two's Complement, man. That's what it's all about. We just branched back. We branched apart and came right back here.

Starting point is 00:38:34 We're in cahoots. We're in cahoots. All right, my friend. Well, we're going to stop there because I don't think we can top that. Yeah. I'll see you next time, my friend. Yep. See you later. You've been listening to Two's Complement, a programming podcast by Ben Rady and Matt Godbolt.

Starting point is 00:38:52 Find the show transcript and notes at twoscomplement.org. Contact us on Twitter at @twoscp. That's at T-W-O-S-C-P.

Two's Complement - Lint and Other Fuzzy Bits

Matt and Ben talk about code linters, and meander into various topics. Matt describes the (approximately) 37 different ways to cast variables in C++. Ben argues that continuous integration was better ...in the 19th century.