Tech Over Tea - BDFL Of curl & libcurl | Daniel Stenberg

Starting point is 00:00:00 Good morning, good day, and good evening. Welcome to episode, I want to say, 185 of Tech Over Tea. I'm, as always, your host, Brodie Robertson. And today, we have a very special guest. Welcome to the show, Daniel Stenberg, who, alongside being the developer of curl, also happens to be the first guest on this show that has a Wikipedia page. How you doing? Welcome. Hi, good to be here. It's a very short Wikipedia page. It's like one paragraph long, but there is a Wikipedia page. Yeah, I think it's actually longer in a few other

Starting point is 00:00:37 languages because it started in Swedish, actually. Oh, that makes sense. I got a prize in 2017, and I think someone made a page back then. Yeah, the... Why did the overlay break? Good start, good start. Here we go. We'll fix that up right now. Yeah, I think the English one is... Sorry, it's four paragraphs long.

Starting point is 00:01:00 Either way, it's more Wikipedia than I have, that's for sure. The picture they have there... in every picture I've seen taken of you, you're staring dead straight at the camera. It's slightly unnerving, but it's kind of your thing from what I can see. I guess so, yeah. I don't know what it is, but there's something about looking right down the lens that's just... I don't know. But we're not going to focus on that. Before we get into the actual important stuff about curl, there is one thing I did want to ask you about. What's the deal with Badger? Why is that the handle you use in a lot of places? Well, it's actually the

Starting point is 00:01:49 misspelled version, so it's Bagder, not the animal. Wait, is it? Exactly, yeah. And that's a common reaction, too, because I actually... It's a long time coming, because back in the mid-80s, I was playing around with the Commodore 64.

Starting point is 00:02:10 And we created demos. We made a little demo team, me, my brother, and a friend. And everyone had handles back then in demo groups and everything, so I needed one too. So I just figured one out, and I wanted it to be the animal, but I misspelled it immediately.

Starting point is 00:02:25 That's sort of my thing. And I realized it pretty soon, but then I figured, well, it makes it a little bit more unique and special. But of course that has the side effect that everyone first reads it as badger in their head and then they have to re-read it. So it's sort of a reverse typo or whatever it is. Well, it also makes it a handle that's actually possible to get. Obviously you're on mastodon.social, I thought you actually had your own server, but if it was actually badger, getting that handle anywhere would be basically impossible. Exactly. Now at least I have a small chance to have it, because it's more unique than the animal. Okay, well, yeah, sure. Anyway, curl. Yeah, curl. So that's the project you're most known for. I'm sure most people in the Linux world have made use of curl in some form or another, and on other

Starting point is 00:03:21 operating systems, because there was the whole Windows thing where people were deleting curl and then trying to update it, which was a fun time. We might talk about that as well. But just for anyone who doesn't know, briefly explain what curl is. Well, so curl is the project, and in the curl project we actually make two products. We make a command line tool called curl, and we make a library that's called libcurl. So libcurl is the library that the curl command line tool is using to do internet transfers, really, to or from a URL, basically. So most users are used to downloading a web page or something on the command line: curl and a URL, and it'll download it and spew it on the terminal if you don't give it any other options,

Starting point is 00:04:11 but it also speaks, I think we now count, 28 different protocols. It does uploads and downloads, and it speaks, you know, countless different protocols and variations, TLS versions, flavors. So it's really a machine with a lot of different protocol tweaks and abilities in different combinations to basically do internet transfers in pretty much any shape or way you think you want it. Mm-hmm. So right now, I'm not going to read the entire list, but notable ones: FTP, Gopher, HTTP, HTTPS, IMAP,

Starting point is 00:04:52 SCP, SMTP, SMTPS, Telnet, and a bunch of others. If you want to read it, go to the curl website. It's probably the easiest way to find it. Is that list up-to-date, or are there other things it supports? No, I think it's up-to-date. Sometimes I go through that to make sure that it actually reflects my opinion. So yes, it is. Actually, the most recent one I'm working on is the

Starting point is 00:05:19 WebSocket support, and they count that as two different ones: the WS and WSS URL schemes. Okay. So when curl first started, why did you make curl in the first place? What didn't other tools achieve at the time? Or were there other tools that were doing a job that was even close to what you wanted curl to do? There were other tools around, of course. I mean, there were always tools. But it started, of course, before I named it curl, because it started already in 1996, when I wanted a tool to just download HTTP data every once in a while, because I wanted to download currency rates for my IRC bot.

Starting point is 00:06:06 So I wanted a little tool to download HTTP. How do you do that? I had no idea how HTTP worked. So I just looked around for a tool that did that for me. And I found one, so I started with a tool called httpget. That was released in November 1996. That was around 160 lines of code. It downloaded HTTP.

Starting point is 00:06:27 Fine. That's what I wanted. But, of course, it had some bugs and some tweaks to do. So I sent back some patches. It was written by a Brazilian developer, and I started to send him small patches to improve it. It was still tiny, right? But that's how I got into it.

Starting point is 00:06:44 I learned how HTTP works. It's pretty easy, at least it was easy back then, or I thought it was easy. It was easy to get started, or I was tricked into believing it was easy. So that's what I had. And Rafael, the developer of it, sort of grew bored with me pretty quickly, me sending in patches, because I think he didn't really want to do much other than, you know, throw it out the door: here's something for everyone, I don't want to maintain this, you can take over. So I took over the maintenance of that tool. I think I started in late 1996 with that httpget tool, and I played with that for a while until I found a Gopher site to download

Starting point is 00:07:27 currency rates from as well. So I added Gopher support, and then httpget was a stupid name, right? So we renamed it to urlget. And already back then I started to get some patches from other people too, so it wasn't just me even in those early days. So we started to do some other urlget releases. And one day we had support for FTP, because I found more currencies. I was a lot into currency rates. So I added FTP.

Starting point is 00:07:57 And then after a while, we added support for uploads as well. And then urlget became a really silly name because it wasn't only get. There were URLs, but not only get. So what should we call it? And also at the time I realized that urlget was too much of a sort of ordinary name,

Starting point is 00:08:18 in that there were also a lot of other urlget tools, right? So how would you know which urlget you were talking about? It was just too simple a name to use. So, combining those: let's call it curl. By the time I called it curl, in March 1998, it was already 2,000 lines of code. It did those three protocols. It had 24 command line options. So that was sort of the embryo. It was already a little command line tool that could do a lot of transfers. So at that point, there wasn't a libcurl. There was just the command line tool. Exactly. It started with a command line tool. And after a while, actually,

Starting point is 00:08:56 so I managed to get that going, got patches back, people started to use it. And I always had in the back of my mind that this could become a cool library, because there's a lot of power in being able to do good internet transfers. So I sort of figured or suspected that others might want that feature, that power, in their applications too, and not everyone wants to write their own HTTP code. So I turned that into a library in the summer of 2000. So from 2000, I shipped libcurl and sort of rewrote curl the tool to use the library. That was a big shift internally. But since then, we have them both. And I've always kept shipping them both in the same package. So they're sort of bundled when I ship the code, but they're separate, so you can use just the library if you want to.

Starting point is 00:09:48 So in a sense, maybe this wasn't your intention, but in a sense, the curl command line tool now acts as kind of a reference use case of the library. Yeah, exactly. More like a binding for the shell, sort of, to access the library. Yeah, exactly. And it's actually also a very good way for me to work. When I implement something in the library, it's good to make it available in the command line tool as well, because it makes testing easier and access to that feature much easier. So it's much easier to develop it when I can use it from

Starting point is 00:10:21 the tool rather than just have it in the library. So when you're saying you renamed the project to curl, you kind of skipped over what curl actually meant. Why was that the name that you went with? Yeah. So urlget being a silly name, I had to have a different name, and I wanted it to have URL in the name, because it works with URLs and it still does that. So what, you know, naming is really hard. What do you name something? So for me, it just struck me that, well, curl, that's an English word.

Starting point is 00:10:56 You can pronounce it. And the C could be for client, maybe. I also enjoy that you could actually say it like "see URL". You could actually see the URL, sort of type it out in the terminal. So, I mean, that was all I needed. And it was short. So, you know, command line style. I'm a command line user.

Starting point is 00:11:15 So I wanted it to be short, sort of in the Unix tradition. So: short, pronounceable, includes URL. That's all I needed. So I went with that. Well, yeah, you could have gone with what a bunch of other projects from the time did, which is either you name it after the place it was made or you name it after yourself. There's a ton of early... Exactly, yeah, I could have done that. But, you know, naming is really hard. So what do you call something? I actually think I ended up with something pretty decent. So I was happy with that.

Starting point is 00:11:46 Like if you look at things like what's generally considered the first Linux distro, MCC Interim Linux, which is just Manchester Computing Center. Or then there's other distros like TAMU, which is Texas A&M University. That's just the entire name. Naming, especially when you're trying to keep it short... If you're just trying to give it some sort of name, that's obviously difficult as well. But if you want to make it so it's still easy to use in a command line context, getting something short and easy to remember, especially with how few good options

Starting point is 00:12:26 that are available in that one-to-six character range, is going to be quite difficult. Yeah. Yeah. And if you want it pronounceable too. So it's also sort of limited. So yes, it was,

Starting point is 00:12:39 it's hard, but it worked like that. And usually when I tell people about this early journey, they mention Wget, because Wget also downloads HTTP, right? And Wget actually existed before that. The first version of Wget was released earlier in the year 1996, so I think they were like half a year before me or something. January 1996, it says on...

Starting point is 00:13:09 Yeah. But I didn't know about them. I guess also that by the time, you know, when I wanted that first little thing to just download currency rates, I just wanted something dead simple. So I guess that if I had found Wget at that time, I would probably have ditched it anyway, because I wanted something much smaller and simpler, not knowing exactly how things worked. But I didn't. I don't know when I realized that Wget existed. It was at least a long time afterwards. And by the time I understood it, I sort of thought of the differences to be bigger than just, you know, ditching my

Starting point is 00:13:46 project and going with that, because already when I understood that Wget existed, we were different projects and we had different mindsets and different ideas about how to do things. We have an overlap, sure, both tools can download stuff, but otherwise I think they differ in so many ways that it makes sense for them to be two separate projects. So what do you see as the advantages of your project over something like Wget? Well, I really shouldn't say what Wget does or what their primary focus is. But Wget, for example, is very good at downloading stuff recursively, because it knows how to parse HTML and CSS and stuff like that to find everything.
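
To make the "understanding content" point concrete, here is a minimal Python sketch of the kind of work a recursive downloader has to do and a pure transfer tool does not: parsing HTML to find further URLs. It is an illustration only, not how Wget is implemented; the names `LinkCollector` and `extract_links` and the example.com URLs are made up for this sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: gathers absolute URLs from href/src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative references against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return every linked resource found in html_text, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    collector.close()
    return collector.links


if __name__ == "__main__":
    page = '<a href="/docs/">Docs</a> <img src="logo.png">'
    for url in extract_links(page, "https://example.com/index.html"):
        print(url)
```

A tool that only moves bytes never needs this step, which is the boundary described here: the transfer ends where interpreting the payload begins.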

Starting point is 00:14:39 Curl has nothing of that. It doesn't understand content at all. So if you download HTML, sure, it can download HTML, but it doesn't know what HTML is. So it can't find any links or images or anything. So there's no recursive ability and stuff like that. So that's one of the primary things. And that's one of the things I've always tried to maintain in Curl.

Starting point is 00:14:58 Curl never understands the content, neither upload nor download. It's just a transfer machine. It doesn't know anything about what it transfers, right? So I always wanted curl to be specific, just for the transfer up or down, basic transfers from the beginning, and then you add the fancy stuff with options so that you can craft it exactly how you want the transfer to be done. It's there, it does the transfer, and then you customize it accordingly,

Starting point is 00:15:33 however you want to change it. That also means you know how it works, and it remains working like that. Even if we add new features, we rarely add them by default, right? So it pretty much works the same way as it did last year. But next year, maybe you can add some new options to make it do other things,

Starting point is 00:15:52 to improve that or change that. I would imagine not focusing on the content of the pages also makes it a lot easier to add in new protocols and then not have to worry about how to handle the specific kind of data being transferred over that protocol, which would add a lot

Starting point is 00:16:10 more maintenance burden if you were really worrying about, like, okay, so we have HTML here, we have Markdown files from this protocol, we have these files and these files. It's going to add a lot of extra work, versus just being like, okay, it doesn't matter what the protocol is. We download, we upload from it. And that's pretty much where the important relationship ends. Yeah. And I think that is perhaps most notable when you do email protocols with curl. Like if you download over IMAP or you upload over SMTP, because I want to see them as download and upload, right? You don't read emails, you download emails. You don't send emails, you upload data to SMTP.
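
As a rough illustration of the email case: if the transfer tool only uploads bytes to an SMTP server, the complete message, headers and body, has to be composed beforehand by something else. The sketch below uses Python's standard `email` package for that step; the function name `build_message`, the addresses, and the subject are placeholders invented for the example.

```python
from email import policy
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Compose a complete mail message and return it as raw bytes.

    The result is what a content-agnostic uploader would be handed:
    headers, a blank line, then the body, with CRLF line endings.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    # policy.SMTP serializes with the CRLF line endings SMTP expects.
    return msg.as_bytes(policy=policy.SMTP)


if __name__ == "__main__":
    raw = build_message(
        "alice@example.com",
        "bob@example.com",
        "Currency rates",
        "Nothing dramatic happened to the rates today.\n",
    )
    # Write the finished message to a file that a transfer tool can upload.
    with open("message.txt", "wb") as fh:
        fh.write(raw)
```

With the curl tool, a file like this would typically be uploaded using options such as `--mail-from`, `--mail-rcpt`, and `--upload-file` against an `smtp://` URL; the server and addresses would of course be your own.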

Starting point is 00:16:50 Curl doesn't know anything about the contents, right? It doesn't know email protocols. So if you want to send an email with curl, it can speak SMTP, but it doesn't format the content for you. So you have to write the entire content correctly yourself and then use curl to send it. That way you can do it, but curl doesn't know about it. So yeah. And I think for

Starting point is 00:17:10 me, it has helped me all the way to keep the focus really on what curl is and what it isn't. I mean, it's still grown ridiculously much anyway, but, you know, you have to have the line in the sand somewhere. This is curl. This is not curl. So it's good to have that in some ways, at least to know that no, no, no, you're talking about content. Content is for someone else. So you create the content, you use curl to send the content or receive the content. So you're focusing on this singular use case rather than trying to expand out to be, I guess, an entire... I don't know how you'd really describe it if you did expand out. Like, an entire... protocol consumption and... Like...

Starting point is 00:17:56 Yeah, like, you know what I'm trying to get at? If you don't have that line in the sand, you're going to end up massively increasing the scope of the project and sort of losing this idea of just, we're trying to obtain and post content. Exactly. And sure, I mean, that could have been possible, but I think then we would easily have grown the scope just immensely. And someone would have to do that work. And we would easily have, you know, wandered off in one direction.

Starting point is 00:18:30 That certainly is not where we are today. So, no, I'm pretty happy with sticking to that. So, basically, we have a few rules for what is curl. One of them is that we don't know about content. And it's supposed to be upload or download. So no real sessions or, you know... we support Telnet, but that's sort of one of those things we probably never should have.

Starting point is 00:18:57 How did Telnet get in there then? Like, is there a story there? It was one of those early ones. I think I was a little bit too happy to just add another protocol. Sure, it's another protocol, we can do it. But looking back at it, I think that was the wrong decision. But hey, that was 20 years ago. And then I will say, I want the protocol to be something you can specify with a URL, and with a URL that has at least some attempt at being a real syntax. So, I mean, not made up by just a guy in the cellar somewhere, but some attempt to make it a standard. Ideally a standard, because URLs are supposed to be interoperable, right?
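
The rule that a protocol must be selectable from the URL can be sketched as scheme-based dispatch. This is a simplified Python illustration, not curl's actual logic: `KNOWN_SCHEMES` contains only the schemes named in this conversation, and `pick_protocol` is a hypothetical helper.

```python
from urllib.parse import urlsplit

# Only the schemes mentioned in this episode; the real list is longer.
KNOWN_SCHEMES = {
    "ftp", "gopher", "http", "https", "imap",
    "scp", "smtp", "smtps", "telnet", "ws", "wss",
}


def pick_protocol(url):
    """Return the scheme that selects the protocol for this URL.

    Raises ValueError when the URL carries no scheme, or one that
    this sketch does not claim to handle.
    """
    scheme = urlsplit(url).scheme.lower()
    if not scheme:
        raise ValueError(f"no scheme in URL: {url!r}")
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    return scheme


if __name__ == "__main__":
    for candidate in ("https://example.com/", "gopher://gopher.example.org/1/"):
        print(candidate, "->", pick_protocol(candidate))
```

Because the protocol choice lives in the URL itself, any other tool that follows the same URL syntax can interpret the same string, which is the interoperability being asked for.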

Starting point is 00:19:39 It's not supposed to be just one tool. So as long as you sort of fulfill all those conditions, then it could be a curl protocol. Well, by keeping this focus, it also adds in this possibility for someone to take what curl is as this upload and download tool and then build upon that, build some sort of extension to the tooling that then does understand the content

Starting point is 00:20:02 rather than you trying to do that yourself and sort of getting in the way of other projects that might have different approaches for handling that content. You just focus on your one thing, and then if someone wants to be like, okay, we're going to use curl to absorb HTML content and then parse it in a certain way and display it in a certain way, that's up to them to go and do. Yeah, exactly. So that is really the intention. We provide the engine that gets and sends the data for you, and you manage the data however you want. And that has turned out to be really effective, sort of a good layer to be

Starting point is 00:20:38 in. So most people don't want to implement their own data transfer engine. It's good enough if they can just get someone else's for getting data up and down. And I mean, I have to say that curl also came at a pretty good time, at the same time as internet use exploded, and internet protocols and everything. So curl could take off with the entire explosion. I mean, who knew back in the 1990s that any device we ever own from now on would have internet access, you know,

Starting point is 00:21:15 cars, printers, fridges, whatever. They didn't have that back then. We had a single computer somewhere that had perhaps internet access every now and then when we dialed up. So when you were first getting started, I know that you said Wget already existed, but what other tools did exist at the time that maybe don't exist now, that maybe you'd looked at when you were thinking of doing this stuff with curl? Well, HTTP-wise there weren't many tools. Networking-wise, you know, that was sort of just at the end of the Gopher era, and a lot of people then used, when you

Starting point is 00:21:53 automated downloads, you often used the ftp command line tool, right? And it was always a bit of a problem to do automated downloads with the ftp tool, because you had to sort of pipe in the commands and hope that it worked. So that was the kind of tools we had back then: very quirky, and not at all good to use in, you know, a cron job to download something every week or day. So there were not many different options. For downloading an HTTP URL, for example, as I said, I didn't find any other tool than this. So it was early days. And then as I was developing this, I of course stumbled over others in those simple early stages that I was in. So other people did httpget tools and urlget tools in those early days. But there wasn't really anything sort of really established just yet. It was just a

Starting point is 00:22:51 lot of people trying out different things. Exactly. So there was just a plethora of tiny ones that appeared and vanished. So, yeah. Okay. That makes sense. So, where are we going from here? I just got sidetracked. So, we talked about where curl... Okay, so, as you were developing curl, what have you found to be some of the biggest challenges along the way?

Starting point is 00:23:22 Oh, the biggest challenges. That's a very broad question. We've been around for a while, right? Curl itself turned 25 earlier this year. So we've certainly been doing this for a long time. I think one of the early lessons, which has been with us ever since, is of course about speaking protocols like HTTP. They are protocols, they're defined in standards somewhere.

Starting point is 00:23:56 You can read the specs and it's there, but it never actually works exactly that way. So one of the lessons is that you really need a long time, and you really need to just run it with different servers and different use cases everywhere across the internet, to figure out exactly how strictly you can follow the standards. When do you have to be lenient? When is the standard maybe just a guide? And sometimes you have to do the opposite, just because you want it to work in a specific way. And then along the way, when the browsers do things, for example, we suddenly realize that the browsers do things one way, the specs say something another way, and curl does it a third way. So which one is the right way?

Starting point is 00:24:45 We always have to deal with that. So I think that's a constant battle we have. Often we go with the way the browsers do it, because often users want to be able to mimic browser operations with curl. So if they do things a certain way, then we usually go roughly the same way. So that's one of the things that we've, of course, learned along the way. And that's one of the reasons why it takes a long time to get an internet transfer tool or code to actually become good, because you have to have a lot of time to be able to polish the rough

Starting point is 00:25:24 corners everywhere and understand how things actually should work, apart from what the standards say. Do you have an example of browsers doing things differently than what the spec would expect to be the case? Oh yes. So I worked for Mozilla for a while, right? So I worked a lot on the HTTP implementation in Firefox. And one example, maybe an easy one, but perhaps a little bit of an internal detail, is when you speak HTTP/1. That's a text-based protocol, right? So you send a request, and there are some text-based headers that come before the content in both directions. So when you send a request,

Starting point is 00:26:03 you send: I want to get this path, here are some headers, and then the body comes. And then the response comes with some headers and then the response body. And for example, when you get an HTTP response from a server, there are different ways for the server to tell you how big this response is, how much data is following the headers here,

Starting point is 00:26:24 two bytes, a hundred bytes, or anything at all. So how do you know how big it is? Well, there's this standard header called Content-Length, pretty straightforward, right? It says how big the response is. What happens if the response has a different size than what's in there? Okay. Well, first, how do you know it's different? Because if it says 12, right, then you know that it's 12 bytes. You just read 12 bytes off the network and then you're done. But you can't do that, because there are enough

Starting point is 00:26:54 of servers out there that will say the wrong number, right? And deliver something less or something more. And that is one of those ways where it takes a long time until you realize that. And when you realize that sort of, okay, but that's just wrong. The server is just wrong, right? So you just return an error then because the server is wrong. But that's not how you do it in browser land, right? You never say the server is wrong. You try to do the best out of the situation. And basically, browser never says the server is wrong.

Starting point is 00:27:25 It's just, okay, it delivered 11 bytes, so let's try to show those 11 bytes as a web page somehow. Even if it said it was going to deliver 12, or something in that style. And that's one thing. In curl, we usually don't have to be that lenient when it comes to that particular validation. But it's in that kind of, in that vein that sometimes, for example, browsers have a completely different motivation for doing things.

Starting point is 00:27:56 They really, really go very hard on displaying something to a user rather than showing an error. While in curl, many times we would actually rather have the error, so that the user has an understanding of what's going on and can take precautions, maybe do the request again or something like that. And that also makes it really, really hard to make those decisions. What's the right choice here?
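
The Content-Length dilemma can be shown with a small, self-contained Python sketch. It parses a raw HTTP/1.x response held entirely in memory and applies either a strict or a lenient policy when the declared size and the delivered size disagree. This is not how curl or any browser implements it: `split_response`, `read_body`, `LengthMismatch`, and the `lenient` flag are invented for illustration, and the sketch assumes the connection has already closed, so the buffer holds everything the server sent.

```python
class LengthMismatch(Exception):
    """Raised in strict mode when Content-Length and the body disagree."""


def split_response(raw):
    """Split a raw HTTP/1.x response into (headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:  # lines[0] is the status line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def read_body(raw, lenient):
    """Return the body, handling a wrong Content-Length per the policy.

    lenient=True  -> browser-like: use whatever arrived, up to the declared size
    lenient=False -> report the mismatch to the caller as an error
    """
    headers, body = split_response(raw)
    declared = int(headers["content-length"])
    if len(body) == declared:
        return body
    if lenient:
        return body[:declared]
    raise LengthMismatch(f"declared {declared} bytes, got {len(body)}")


if __name__ == "__main__":
    # The server promises 12 bytes but only delivers 11.
    short = b"HTTP/1.1 200 OK\r\nContent-Length: 12\r\n\r\nhello world"
    print(read_body(short, lenient=True))
    try:
        read_body(short, lenient=False)
    except LengthMismatch as err:
        print("strict mode:", err)
```

Neither branch is "correct" in the abstract. The lenient path favours showing something, while the strict path gives the caller the chance to retry or investigate, which is the trade-off described here.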

Starting point is 00:28:20 The specs say one thing. A lot of users then see that, well, when I use this other tool it does this, and it's completely different. So yeah, those things can be really hard. And there are, of course, countless examples when you really dig into those details. It's supposed to do this, but one out of 25 servers does this. What do you do then? Do you call it an error or do you not call it an error? Yeah. So it's sort of about judging the user expectation there and the way people are going to be using the tool. Yes. And sort of balancing... I mean, you cannot sacrifice security, for example, just because

Starting point is 00:29:05 the users might get a better experience. Then you have to put your foot down. So it's sometimes a tricky balance. What's the right choice here? Like, for example, the silly thing, you know, checking the certificate in TLS from a server. Of course, sometimes it's wrong. You can't just say continue anyway, because that's completely insecure. You just have to say fail. There's something wrong with the certificate. Even if the user sometimes says, well, I don't care about it.

Starting point is 00:29:34 I just want the data. So there's a lot of that. So it's hard sometimes to know which way is the correct way forward. You have to sort of try different things. And, of course, listen to what users say is the right way forward, understand the protocols, see what others are doing, and yeah. And I would assume in certain cases, it would also make sense to include like an override to have both options available where,

Starting point is 00:30:08 like obviously if it's like really important security stuff, maybe not, but if it's maybe the option of downloading like a broken webpage or putting an error, like maybe you want the option to download that data anyway. Exactly. So, so that's sort of, that opens up for this sort of option explosion because suddenly it comes up. So no, no, most users wanted to do this, but there are always these exceptions, but I have this broken thing that I really want to download from. How do I do that?

Starting point is 00:30:30 So then you have to add an option to override this check for these kinds of stupid servers. So you add that option, and then someone comes up with a different broken server that they also want to download from, and, you know, you add and add. And suddenly you have 24 of those options that add flags for different weird or broken crap. But they're all there and there are users of them. So yes. And that's actually also an explanation for why the number of command line options to curl really has taken off like crazy since then.

Starting point is 00:31:03 How many are you at now, if you know the number off the top of your head? 257, since a few days ago. So 24 in March 1998, 257 now. I don't even know what half of these options do. I'm sure they've all got a use case for somebody, but yeah, I probably use three of the options at most. Right. Well, some of them suffer a little bit from our decision to not break existing use cases. So basically, we introduce them once and we don't remove them again. So some of them were perhaps a little bit prematurely or stupidly added, you know, in 2004. Maybe we didn't really anticipate this to survive forever. So if you look back at them, maybe we should have

Starting point is 00:31:53 designed them differently if we had done it again, but it's too late now. So we add a new one. That's the better one. So forget about the old one. We don't use that, but it's still there because it still has to be. So it suffers a little bit from that. Maybe we could have removed 20 of them if we didn't, you know, want them to still work. It's a little bit of that. And then there's a lot of

Starting point is 00:32:15 exactly what you talked about, adding different things for different use cases. Basically one wants this, but we don't do that normally. And we add stuff all the time. And over time, when we add, there's also this combination explosion in protocols and ability to do things. And then you have a lot of different options to control what to do when you add different layers and protocols of different versions and blah, blah, blah.
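To put a rough number on the combination explosion mentioned here, the following is a back-of-the-envelope sketch only. It assumes, purely for illustration, that every option is an independent on/off switch, which is not how curl's options really behave (many take values, and some exclude each other). The function name is made up for this example.

```python
def on_off_combinations(option_count: int) -> int:
    """Ways to set `option_count` independent on/off switches (simplifying assumption)."""
    return 2 ** option_count


early = on_off_combinations(24)    # option count quoted for March 1998
today = on_off_combinations(257)   # option count quoted in this interview

print(f"24 options:  {early:,} combinations")
print(f"257 options: a {len(str(today))}-digit number of combinations")
```

Even under this simplified model the count goes from about 16.8 million to a 78-digit number, which is the sense in which nobody can test every combination.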

Starting point is 00:32:40 So this really, I mean, the number of combinations is of course, I don't know, I can't even count it, but it's ridiculous, right? And the number of combinations of options, and then the number of combinations of what kind of protocols are actually then used over the wire. So it's, yeah, you can do a lot. So is this not wanting to break previous use cases the reason why, even though you don't think Telnet should have been added, it's still there at this stage? Yes. Yeah.

Starting point is 00:33:11 That plus sort of there's not a big of a gain to remove it. I mean, I get that question quite a lot when I say, sure, it supports Gopher too. And they say, Gopher, isn't that sort of, why do we have that around? But in most cases- If you go into like the Linux YouTube space, there's like five people that talk about gopher all the time exactly but but there's also that usually those tiny edge case protocols they're very easy

Starting point is 00:33:36 to just keep around. There are not many users, so we don't get a lot of problems with them. There's not a lot of development, there's not a lot of friction. So they basically, you know, sit over there in the corner. Nobody looked at them for the last year. That's not a big problem for us. So usually it's, why remove that? It's just more work to remove them than to just leave them in the corner. Don't bother them. I mean, it's the big protocols that everyone is using. Those are the ones that get all the attention, get the problems, get the development, all the time. So outside of HTTP, what are the protocols you see discussed the most, with issues or anything just relating to the project? Obviously HTTP is probably the biggest one, but what are the other ones? Yeah, I'd say in HTTP,

Starting point is 00:34:18 they're really by far the biggest ones, but otherwise the SSH ones, SFTP, SCP. Right. FTP is still some traffic, and I have users, I mean, still happening on IMAP, SMTP. But then I know that we also have occasional reports on RTSP, so there are users of that. And I have this annual survey where I ask users, well, they voluntarily participate, of course: please help me answer some questions about curl. And every year, all the protocols are selected by more than one user. And I usually get at least several hundred responses. So at least every year, at least three people mark the checkboxes for all of those protocols, including RTSP, Gopher and the... I get Gopher, but how is there at least one person that's still looking at Telnet?

Starting point is 00:35:15 I want to know what use case that person has for Telnet. I really do. In most cases, what I get to hear about is people saying, well, I use this when I run my pen tests or when I want to do some other. Right. And a lot of people have removed the Telnet clients from their operating systems, so curl is still there to do their Telnet stuff when you want to run it on some weird port to do manual fiddling or whatever. So there's a little bit of that, trying out crazy things that

Starting point is 00:35:43 nobody really thought you should do. But then, of course, I don't know. I mean, people say that you're using it. Sometimes I suspect that people just say that so that I won't remove them. Sure, by sympathy. No, no, no, keep this. I promise I use this. I have no idea what it is.

Starting point is 00:36:01 So I tend to actually check how many users are checking all the protocols because that seems highly unlikely, right? That someone would say, sure, I used 28 different protocols last year. No way, I don't believe it. But it's actually very rare that someone claims that. Now, you say that, but I did have someone on here before who has, so he, because of weird American laws, he owns the telephone lines on his property

Starting point is 00:36:25 and has a local dial-up network that he uses to control a local weather system so you say that's weird but there are people doing things like that out there so i'm surely surely someone doing that yes so yeah exactly and and when when you've shipped something long enough and you have all those weird things i mean who knows there could be two people somewhere doing all that. And sometimes I get that question. So, I mean, it's also a matter of counting users, right? For me, it's not more work to support something for two users than two billion users as long as code is there. As long as someone maintains it it it doesn't matter the

Starting point is 00:37:05 number of users. It's still the same code and the same effort. So I don't know, as long as things keep running, it's there, everyone is happy. That makes sense. As long as it doesn't cause some sort of issue in something else that's more important, it just stays in its own place and does its thing. Exactly. If it would have gotten in the way for something more important, like sort of if the Gopher code would have, you know, been in the way for the HTTP,

Starting point is 00:37:33 then we would have gotten rid of it because I think that's much more important because that's sort of the main protocol that we do. But it isn't. And we actually have a pretty good architecture in curl so that most of the protocol implementations are off in their own corners. So we don't have to deal with them if we don't deal with that protocol specifically. And most of the transfer engine stuff is a generic thing that just shuffles data in either direction.

Starting point is 00:37:58 Is that architecture something that was thought about very early on as you got from HTTP to FTP and all of this, or was that something that had to come a bit later when you realized that this was going to become a lot more than just a couple of protocols? Yeah, it's actually just been a gradual development for a very long time. So no, it started out the most simple, basic way. And I think mostly by accident or by luck, we chose a pretty good API for libcurl when we selected how to do transfers. So we came to pick an abstraction

Starting point is 00:38:33 where you don't know a lot of the protocols when you ask for transfers. So you basically say, here's a URL, and then you change behavior with a lot of different options that are not super close to the protocol. Sometimes they are, sometimes they're not. But at least that abstraction layer made it possible for us to actually change quite a lot of internals without changing the externals. And that helped

Starting point is 00:38:56 us a lot. So it has been re-architectured and refactored quite a lot of times over the years. So no, it started out really stupid, but we have been able to sort of clean things up pretty, it still has its ugly corners, of course. And I mean, 25, 27 years of code. So, but it's pretty good, I would say. And I'm sure you've learned a lot about development as you've been going through this project. It's not just, oh, you knew what you knew then, you still know only that amount.
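The abstraction described just above, where the caller hands over a URL plus protocol-neutral options and the protocol code lives "off in its own corner", can be illustrated with a small sketch. This is not libcurl's API or its internal design; every name here (`register`, `transfer`, `upload_data`, the handlers) is hypothetical, and the handlers only return the command they would send instead of touching the network.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit

# Hypothetical registry mapping a URL scheme to its protocol handler.
Handler = Callable[[str, dict], str]
HANDLERS: Dict[str, Handler] = {}


def register(scheme: str):
    """Decorator that files a handler under the given URL scheme."""
    def wrap(func: Handler) -> Handler:
        HANDLERS[scheme] = func
        return func
    return wrap


@register("http")
def http_transfer(url: str, options: dict) -> str:
    verb = "POST" if "upload_data" in options else "GET"
    return f"{verb} {urlsplit(url).path or '/'}"


@register("ftp")
def ftp_transfer(url: str, options: dict) -> str:
    verb = "STOR" if "upload_data" in options else "RETR"
    return f"{verb} {urlsplit(url).path.lstrip('/')}"


def transfer(url: str, **options) -> str:
    """Generic entry point: the caller never names the protocol explicitly."""
    scheme = urlsplit(url).scheme.lower()
    try:
        handler = HANDLERS[scheme]
    except KeyError:
        raise ValueError(f"unsupported protocol: {scheme!r}") from None
    return handler(url, options)


print(transfer("http://example.com/index.html"))
print(transfer("ftp://example.com/pub/file.txt", upload_data=b"x"))
```

Because callers only see `transfer(url, **options)`, the handlers behind it can be rewritten or regrouped without changing the external interface, which is the property Daniel credits for making the later refactors possible.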

Starting point is 00:39:27 Like, you've learned a lot more about how you can structure this code in a way that's going to be useful long into the future. Exactly. So, exactly. And I really did not know that in the beginning, so I wouldn't have been able to do it. So, no, I think, I mean, as a developer, I think it's pretty good to just do things that work now and not anticipate so much

Starting point is 00:39:50 what happens in 10 years, because we don't know. So I think it's been good that way. And of course I've learned how to do things, and, you know, we added protocols that I didn't know about then, and we sort of realized how to do things. You know, for example, I mentioned I did FTP back in 1997. And FTP is a very different protocol compared to Gopher and HTTP, because FTP is such a back and forth command response. You know, you have to do a lot of commands, get response, command response, command response. And then it turns out, I think roughly 10 years later or something,

Starting point is 00:40:21 2000, I don't remember, I added support for IMAP, SMTP, and POP3. And they happened to be very FTP-like, right? And that was an opportunity. Oh, suddenly I had the FTP, and then I added three new protocols that were very command response driven, more or less like FTP, but with different commands and different ways to get responses. But still, so then I, of course course re-architectured that part so now we have a generic engine to do those back and forth protocols and so i think it makes sense to re-architecture and and handle what's sort of the

Starting point is 00:40:56 next workload right what comes at you now and when we handle. We actually did a pretty good refactor of curl again late last year when we added support for new protocol stuff. These days we had, well, we have HTTP3 support since a few years, but HTTP3 is adding quite a

Starting point is 00:41:20 few new challenges, I would say, problems, interesting quirks for someone who's doing internet transfers, because it's done over QUIC, which is done over UDP. So it's a completely different network stack. You don't have that in the kernel, you have different libraries and stuff. So, you know, it opens up a range of new different ways to do things and new paths in the code. So we had a pretty big refactor quite recently to be able to handle all that in a decent fashion. Are those refactors a relatively common thing or does it just sort of happen whenever it really needs to happen?

Starting point is 00:42:00 I think, yeah, so they happen every once in a while. But so we do them when we sort of, eventually we feel that, well, we've sort of come to the end of the road when it comes to handling things this way, we really have to redo something here so that it becomes easier to manage. And maybe, you know, we've added four different things and now someone says we should add a fifth to sort of everything. Oh, wait a minute. Adding the fifth one is just going to break all of this. We have to rethink how we do this and do it in a better way so that we can manage number five, six, seven as well and so on.
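The earlier example of this kind of rethink, one generic engine for the back-and-forth protocols (FTP, IMAP, SMTP, POP3), might look roughly like the sketch below. It is an illustration under stated assumptions, not curl's actual code: `run_dialogue` and the helper names are invented here, and the "server" is a dictionary of canned replies. The one protocol fact it leans on is that FTP and SMTP mark the last line of a reply with a three-digit code followed by a space, while continuation lines use a hyphen.

```python
from typing import Callable, Iterable, List, Tuple


def run_dialogue(
    commands: Iterable[str],
    send: Callable[[str], List[str]],
    is_final_line: Callable[[str], bool],
    is_success: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Send each command, find its final reply line, stop at the first failure."""
    transcript = []
    for command in commands:
        final = next((line for line in send(command) if is_final_line(line)), None)
        if final is None:
            raise RuntimeError(f"no final reply to {command!r}")
        transcript.append((command, final))
        if not is_success(final):
            break
    return transcript


# Protocol-specific pieces for FTP/SMTP-style replies such as "230-..." / "230 ...".
def numeric_final(line: str) -> bool:
    return len(line) >= 4 and line[:3].isdigit() and line[3] == " "


def numeric_ok(line: str) -> bool:
    return line[0] in "123"  # 1xx, 2xx and 3xx are positive replies in FTP


# Canned replies standing in for a real server (hypothetical session).
fake_ftp = {
    "USER anonymous": ["331 Password required"],
    "PASS guest": ["230-Welcome", "230 Logged in"],
    "RETR missing.txt": ["550 No such file"],
    "QUIT": ["221 Bye"],
}

result = run_dialogue(fake_ftp.keys(), fake_ftp.__getitem__, numeric_final, numeric_ok)
for command, reply in result:
    print(f"{command!r} -> {reply!r}")
```

Supporting a POP3-like protocol in this sketch would only mean passing different `is_final_line` and `is_success` functions (for replies starting with `+OK` or `-ERR`); the loop itself stays untouched, which is the point of sharing the engine.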

Starting point is 00:42:36 So let's go back a little bit to when you were first getting started. When you were writing Curl initially, did you have any sort of formal programming education? Were you self-taught? How did you actually learn to do what you were doing? I was already working full-time as a developer, as a professional. So yes, development has always been my thing. I learned as a teenager at home. I was always into programming and development. So yeah, I worked as a developer then. I did my first TCP networking open source, that IRC bot I mentioned, before I started Curl.

Starting point is 00:43:15 So I had a little bit started doing TCP networking stuff open source before that. But of course, I mean, I didn't start it with any anticipation that it would become anything. It was just my little toy to download a few things. And then it just grew a little bit and people had features. And so it was never, I didn't have any lofty goal

Starting point is 00:43:34 or ambitions to do anything major. It just grew gradually. And of course, I learned things and then we did things. And so, yeah, I've learned a lot of things, of course, not only in development. I think at some point in time doing things open source, suddenly you realize that it's not really code or development. That is the tricky part is the management of the volumes of people and

Starting point is 00:43:58 communication and talking about things and information and documentation and all of that. So in the end, so nowadays I would say maybe writing code is the easy part, right? So it's all of the other things to get everything, everyone on board and all that. I did notice your post recently on Mastodon that you had close to 1,200 total contributors over the years, obviously not all active contributors, but close to 1,200 people that have contributed something to the project over the years. Yeah. How many people do you reckon you have actively working on the project?

Starting point is 00:44:33 So out of those 1,200 who ever committed anything, I actually count contributors too, who actually also reported bugs. And that's 3,000 now. But anyway, so I think there's about 75% of those committers only committed once. Right, right, right, right. So that's the first, you know, that's 700 something people. And then I would say that I also have, of course, I like graphs of the projects. I've noticed you post them a lot.

Starting point is 00:45:01 So I know that roughly, in a year, the number of people required to do something like 80% of all the commits is like seven, eight, nine, ten persons in the team. So if I would say how big the sort of core team is, maybe we are 10, 15. We don't have any, it's not a pronounced team, we're all just here. So I have a bunch of people who have the permission to merge code and do that, but mostly people just hang around and contribute code when they can, and we hang around. So I don't know really

Starting point is 00:45:39 how we don't have any i mean explicit core team it's just people running around no I sort of meant just people that actively show up like obviously you're gonna have people that pop in just for one thing but you're gonna have people that sort of they want to make this their like main project they focus on helping out yeah so so I would say that's between 10 and 20 people are actively hanging out and then of course and maybe if you say people will show up once a month, they would say maybe that number would be a hundred people. So I recognize when they come back again, because there's always those people pretty far away, but repeat contributors. They probably possibly use Curl in maybe their products or some commercial things somewhere.

Starting point is 00:46:23 And they show up, you saw them two years ago and now they have patches again. So that's a very long tail of people, of course. So what do you find to be the... I know you were saying that code is the easiest part now. What do you find to be difficult about managing the human aspect? Is it just dealing, like different people are going to have different intentions for the project? Is it sort of just dealing with the different ways people communicate with the project? All of that. People have different goals, so they all work on, they all want something,

Starting point is 00:47:02 well, they all want Curl to work, but they want it to work for their particular purposes and that's a different kind of important to them. As I said, a lot of people are using curl right in their commercial products. So they are employed by someone who's shipping this in a product. So when they get assigned a thing to fixing curl, they do that for their employer. But for them, it's important to do that. But for me, that's not my main concern, right? If they do the fixing in a weird way, I tell that to them. No, no, we can't have that.

Starting point is 00:47:29 And of course, that's one kind of problem. And then there's the regular challenge of just culture and language and people, you know, how do you actually say something online? And there's a lot of just abuse, because people try to be too short or, I don't know, people are just being very unpleasant at times, and that's kind of annoying and difficult to deal with. Sometimes you can just of course dismiss them if they go over a certain line, but sometimes you don't quite understand: is this a cultural thing? Is that maybe just the way you try to articulate right now? And things like that. All of that, and of course, the regular thing that it's hard sometimes to deliver review comments, just the human side of things. How much

Starting point is 00:48:22 can you critique someone's code until they give up and think you're an annoying practitioner, you know? So all of that. And then the educational part: did you actually understand what you are trying to do here? Right, right. There's a protocol underneath, it does things on the network, possibly with another server at the other end, right? So you have to follow protocol standards, blah, blah, blah. So yeah, it's just a lot of humans and a lot of communication and code in there and things to do. And then of course, you know, sticking to the code style and following things. And yeah, we think we should write code this way, but maybe they

Starting point is 00:49:05 don't think that or why do you have that weird way of doing code? Why do we insist on short lines when I want to write my lines very long? Why do you insist on? So it's all of that. Do you have a- Usually it's not a really big problem, more like that's just the things to work with all the time. When you're saying about code styling, do you have a defined style guide for the project,

Starting point is 00:49:28 or is it sort of you just judge it as you go? We have a pretty... We actually started out without anything, of course, because we pretty much didn't have anything back then. But over the years, we have gradually become stricter and stricter in many ways. I think I've tried to tighten the bolts and screws everywhere. So it's code style, test cases, documentation, everything has become... So 20 years ago, it was really easy to just add code, just add it, submit it, and we'll ship it. But nowadays, you have to actually follow the code style. You have to bring tests, bring the documentation,

Starting point is 00:50:08 motivate what you want to do and explain it to us and make us sort of buy your proposal, and then you can do it. So yes, we have. But that also comes with me over time realizing the importance of it or sort of, you know, for example, code style, that might sound silly, but I actually believe in when you can read the code and it actually looks like it's a single author is actually helping because it makes it much easier to understand. And when you do debugging code, reading code, you want it to be a single flow, right? You don't want to have a jump. Wait

Starting point is 00:50:37 a minute, what changed here? Different styles of doing things. So I actually think that's a very good way to just make the code easier to read and understand and work with. So that's one reason why I sort of nowadays insist on a very unified code style. But we check all that with tools. So nowadays you get red lines in the CI build. So usually we don't have to point that out as a user. I'm sure that makes things a lot easier, not having to manually check over styling, now that you just set up your style guide in whatever software you're using and it just does it for you. And hopefully that means they just give it to you in the correct style. It's easier, and I think it also

Starting point is 00:51:18 helps for the human side because i think the contributors it's easier for them to take it when there's you when there's a tool pointing out the problem, rather than a human saying you should remove a white space there. Because I think if humans does it, it's a different sort of, you interface humans differently than you interface just the error output from a tool. Right. No one's going to complain the compiler is telling them their code's wrong.

Starting point is 00:51:43 It's just the code's wrong, fix it. Exactly, then you're just, okay, I'll fix it. Sort of rather than argue with the tool that the tool is wrong, yeah. Now, you might think the compiler's wrong and then try to run it again, it still doesn't work, but, you know. Yeah, yeah, of course. There's always those, no, no, it must be a compiler bug. Yeah, for sure, for sure. I know, certainly uh i've done that plenty of times myself it is always funny when something actually does start working it's like wait what what happened there did i just forget to save the code or something like there are there are definitely

Starting point is 00:52:15 times when i've i do think something has gone wrong but it's just some error that i missed in some of the locations like oh i tried to recompile the exact same code without saving my changes. And it, like, and then it, whatever. It's just, usually if the compiler says it's wrong, you're actually wrong. Don't argue with it. Just do it. Yeah. Yeah.
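As an illustration of the tool-driven style checking discussed above, where CI rather than a human reports the stray whitespace, here is a minimal sketch of such a checker. It is not the curl project's actual tool; the function name, the message format, and the specific rules (a 79-column limit, no trailing whitespace, no tabs) are assumptions chosen for this example.

```python
from typing import List

MAX_COLUMNS = 79  # assumed limit for this sketch; a real project sets its own


def check_style(filename: str, source: str) -> List[str]:
    """Return one 'file:line: message' string per style problem found."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_COLUMNS:
            problems.append(f"{filename}:{number}: line longer than {MAX_COLUMNS} columns")
        if line != line.rstrip():
            problems.append(f"{filename}:{number}: trailing whitespace")
        if "\t" in line:
            problems.append(f"{filename}:{number}: tab character")
    return problems


sample = "int main(void)\n{\n  return 0; \n}\n"
for problem in check_style("demo.c", sample):
    print(problem)
```

A CI job would typically run a check like this on every pull request and fail the build when the list is non-empty, so the contributor sees an impersonal tool message instead of a review comment about whitespace.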

Starting point is 00:52:37 So, do you find that having these extra restrictions sort of acts as a barrier to entry for bringing people into the project, or do you find that most people tend to just accept that they are there and then work with them? I don't think those restrictions are a particularly big sort of hurdle for people to come over. So no, I think we actually have, usually we have pretty low bars for people to come into the project. So we try to be as low friction as possible and sort of stick to the GitHub working model of doing things, pull requests. So you just have to go to GitHub, fork it, make a pull request and submit it. And we don't have any, there is no paperwork, you don't have to do any agreements.

Starting point is 00:53:26 You just submit it and we can accept it. So I think in general, that's, it's low friction. I think what people find harder is when you, when you want to do bigger things, because then we insist on doing, as I mentioned already,

Starting point is 00:53:38 test cases and documentation. And that might be a bigger hurdle because writing test cases and doing that in a way that works everywhere, that might be a bit of a problem, but then you have a bigger motivation too, right? If you want this feature added, then you have to just accept that you need to do it in the proper way. It's more than just bringing the code for the feature. You also have to prove that it works and describe how it works and things like that.

Starting point is 00:54:06 I know there are a lot of projects out there, obviously, most notably things like Linux, which have kept using their old model that they've pretty much been using forever. It's like the mailing list model. You send an email in, you get your code reviewed by people there. At what point did you decide to start using the GitHub model? Well, at some point I did.

Starting point is 00:54:25 So we switched to using Git in 2010. Okay. And 2010 was actually already fairly late. We used CVS before that. But from the curl perspective, that was not a big problem because we've always had a pretty simple way of developing, you know, linear main branch.

Starting point is 00:54:42 It wasn't that hard to use CVS. But at some point we we switched to Git, and it's, of course, a much better tool in many ways. But I think we didn't switch to the GitHub model immediately. We just switched tools first. And then after a while, I just realized that there's so many users with GitHub and with now switching to that mindset, really, of development.

Starting point is 00:55:04 So at some point in time it's more of a, sure, you can keep sending emails, you can still do that, send an email patch on the email list if you prefer that way. But if you ask users these days, that's a minority. Yeah. So people don't want to do it. So, you know, I ideally wanted to offer either way: you go with the GitHub way or you send an email. And just over time it's just apparent that, well, we get an email every third month nowadays with a patch, and we get, you know, many pull requests every day. So that's just where people are and how they want to contribute. So again, that goes with the, I want to be there where the users are, where the developers are, and I want to have a low friction. So, and to allow them to, you know,

Starting point is 00:55:50 if they find a spelling error in the documentation, I want it to be easy for them to just send a correction to me this afternoon without, you know, adding problems or hurdles or no, no, no, you have to send an email formatted correctly. No, no, you can't use these different emails that you've used your entire life. You have to do it in another way. I think that is, I mean, some projects can do that. And always when we compare to the Linux kernel, sure, no one can compare us with them because they're unique. They can set whatever rules they want. Everyone will still- It's a very special project. Most people are not going to be able to do that.

Starting point is 00:56:21 Exactly. And they can also set whatever rules they want because it doesn't matter. You just have to go there and follow their rules anyway. And I feel that I don't have that because, I mean, we all have a limited number of contributors out there, right? There are open source people, that bunch. That bunch can only contribute to so many open source projects. So if you're adding friction and problems to them, maybe they just, ah, then I won't do that. I will go to the other project

Starting point is 00:56:47 because I had a patch for them as well. So I want it to be easy so that they can submit their patch and continue with their lives. I do respect what the Linux kernel is doing there. Like they've decided that this is how we're going to do it and just basically have stuck to it. Like there's no point changing the model that works. It's, this is the model and if you go- I think it also works really well for the kernel because of the fact that

Starting point is 00:57:11 it does filter out those people that may not have the best of intentions. Like if you're a project the size of the kernel, there are going to be people that are trying to do things in a malicious way, and it is going to filter out some of those people. Obviously, it does filter out the people who are maybe fairly new to development, who haven't really been introduced to this model before. But in a way, it does help them. But you are definitely right that most projects simply can't do that. It definitely helps them.

Starting point is 00:57:42 But they're also unique in so many other ways, as you say, in the volume, in their sort of... So, of course, they have to do a lot of things just to manage the grand scope of the code and the contributors and everything. So I'm sure that doing things my way with their volume, that would break immediately. So, of course, they have a different situation.

Starting point is 00:58:03 So we can't compare like that. I wouldn't dream of comparing myself to that project. That's huge. Well, yeah, a tiny project in comparison. Yeah, your project, I think you were saying it's past like 160,000 lines or something along that. Yes. Yeah, I don't know how many lines are in the kernel at this point. It's a couple million, so it's, you know, a little bit. Yeah, I think there's more than a couple. I think it's more than 10, 20, 30 million. I haven't checked the number in a while, but I wouldn't be surprised if it's somewhere at that point.

Starting point is 00:58:34 So many, many magnitudes longer. And also just the number of commits they have for every single release, like thousands of commits per release. Yeah, yeah, yeah. The commit traffic there is insane. So yes, they obviously need to manage that in a completely different way and more streamlined.

Starting point is 00:58:52 So I understand that they need different tools, different models, different concepts than we do. That makes sense. So with Curl, how do you handle... Let me think what I'm trying to say. You have the GitHub model as your sort of development model, but how do you handle communication? Do you have communication outside of that, or is it just communication is done through the issues

Starting point is 00:59:17 and merge requests on there? Is there a separate IRC, or what do you actually have? Yeah, we have an IRC channel, but that's more, there are a few core people that hang out. So IRC is still such a niche way to communicate. We have mailing lists because, you know, we exist since a long time.

Starting point is 00:59:35 So we started out with mailing lists. We have that. And we have, nowadays, I would say most communication is done in GitHub issues and GitHub pull requests because people have problems or proposed changes. We also use the GitHub discussions these days. So I would say those are the different channels that people communicate with the project.

Starting point is 00:59:55 And of course, I try to complain, yell at the world through my blog and stuff like that. Yeah. There are definitely some fun posts you made on there for sure. So we'll definitely talk about that in a moment. But nowadays curl runs on pretty much every system you could possibly want it on, even systems that nobody out there is using. Yes. But when curl was first getting started, were you developing that on Linux or was it for something else initially? It started out before I started using Linux.

Starting point is 01:00:30 So I hadn't tried Linux in 96, actually. So it started on some, I think, SunOS or Solaris. Okay. Because we had those machines at work and I started because I ran my cron job at work. So I started out with that. And I already before that had sort of, I mentioned my bot and that was already before that working on SunOS and Solaris already in a sort of cross platform way. So I already knew about different Unixes and how to do things to work differently.

Starting point is 01:01:03 So already from day one, I wanted it to work and compile on different platforms. And that was also one of the reasons why I already made it open source from day one, because I wanted it to be open so that people could get it to run on their different platforms. But because you remember back in those days, we had a bazillion of different Unixes. So it wasn't obvious then which Unix people were going to use, right? There were so many different ones. And Linux certainly was not an obvious choice in the 90s. It became a more obvious choice as time passed. So then I switched to Linux as a primary development platform, I think early 2000s or something. Okay. Oh wow. That was, okay. That's a lot later than I thought it was.

Starting point is 01:01:49 Yeah. I don't remember, but it was something like that. Right. Right. Okay. Do you remember what you first switched to when you started using Linux? I think I used Red Hat. Ah, that makes sense. Yeah. Some of those early things. I think I started trying it out in, yeah, it was actually the late 90s. I think 1999 maybe. Because I remember I installed that and ran a web server at work. What do you happen to use nowadays?

Starting point is 01:02:17 Nowadays, I'm sticking to Debian. Debian unstable on most of my machines. Any particular reason why you've gone down the Debian route or it just sort of works for you? I did it a long time ago. I stuck with it. I'm sort of now I'm familiar, so it works for me. So I'm just not rocking the boat. I'm happy with it. So. No point changing it. It just does what you need to do. Exactly.

Starting point is 01:02:41 Right. That's totally fair. So at what point did it start working on the Windows side, if you happen to recall? Pretty early on, because we shipped it open source from the beginning. I believe already when I called it urlget, we had Windows versions of it. I mean, I didn't build it, so someone else built it, but I accepted patches. So we had it buildable on Windows from very early on. So I think the first version of curl we shipped, that one built on Windows. So it has always worked on Windows as well. So how does it feel to know that a project you've written is getting shipped as a part of Windows now? You know where we're going with this.

Starting point is 01:03:26 We're going to talk about the pixel this. No, I think it's pretty cool. I mean, it's shipped with macOS forever. It's been in commercial operating systems since a very long time. So yeah, it's really cool. And I think it's cool to be one of those

Starting point is 01:03:42 first true open source components that Microsoft actually sort of incorporated into Windows as a component. So yeah, really cool. So yeah, you know where we're going with this. We're going to talk about, obviously... Why did that... No, don't open on that one. Okay, I just opened the link on the wrong screen. So, the whole Windows people deleting curl from their system, that happened back in April. Right. Yeah.

Starting point is 01:04:13 That was a situation. Yes, and I suspect that might happen again. But, so, yeah. Since, yeah, since, yeah, that's part of my entire argument, discussion, fight, combat with NVD, the National Vulnerability Database. So when we report security vulnerabilities in curl,

Starting point is 01:04:41 which we do at the rate of a few every year, maybe more than a few, a total of 145 now, I think, through the years. So we take that seriously, because that's sort of what people expect from us. So, you know, people report a security problem, we research it, figure it out, document it, and report it, and there's a CVE ID number and blah, blah, blah. And when we do that, I mean, anyone who does that, we submit those CVEs, so they get collected in the database,

Starting point is 01:05:12 and NVD is a U.S. government organization that's funded by the U.S. government. It's called National Vulnerability Database, whatever nation that is, I wonder. But yeah, okay. So anyway, when they import those CVEs that people submit, they then re-evaluate the score, or they set a score for that vulnerability. And that is a fun subject, because who are they to say how important a bug in, for example, curl is? I've documented it. This is the vulnerability, this is what happens. This is sort of the bug and here's the fix. So

Starting point is 01:06:01 everything is open, right? So if you would like to, you can go and read the code and figure out exactly what the problem is, if you understand C code and want to spend some time on it. So, I mean, it doesn't take a genius, but you have to invest some time and have some clues. But I don't think they have that. I mean, they might have clues, but I don't think they have the time for this. So they don't really actually investigate the problem. They just put a finger up in the air and say, oh, this is the severity. And they have a CVSS score thing. So yeah, they end up with usually a very inflated score.

Starting point is 01:06:38 So in the Windows case that we talked about this spring, it was a problem. I think I graded it as a low-severity thing, because it's a use after free. Yes. It's a very limited time window where we actually free and then we use the data. So it was sort of, OK, this is a potential security problem. Really, really hard for anyone to take advantage of.

Starting point is 01:07:02 It will probably maybe cause a crash sometimes. Usually you won't even see that. So, okay. But NVD considered that to be... I don't remember the grade they set. I think they said critical first and then they lowered it to high or whatever. But anyway, they set a score for it. So NVD comes up with, oh, this CVE has this score and the number, right? And fine, we can just disagree, right? They can set a score. I can set a different score. Who's to say which was right?

Starting point is 01:07:34 That would be fine if it wouldn't be the sort of the follow-up is that nowadays there's a lot of these security scanners who are also importing all those CVEs into their tools. So now they have a bazillion different CVEs and they scan operating systems for vulnerable components. Who is vulnerable for these different CVEs? And they started then to find curl vulnerable for this CVE rated this high. And a lot of organizations apparently nowadays also then have a contractual obligation to fix all problems that have CVE graded higher than seven, eight or whatever. And they have a contract to say they must fix this problem within this number of days. So sure, now you have.

Starting point is 01:08:20 And then when they find curl embedded in Windows, vulnerable to this CVE, very serious one, critical, you have to fix it within five days. What do you do as a user? How do you fix it? Well, you can, I guess most of these people, they don't actually call someone at Microsoft. No. I don't know if they can or if it's even possible. Well, especially if you're like an individual user who is running one of these security scanners. Yeah, exactly.
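The policy described here, where a scanner flags every component whose database score crosses a contractual threshold, can be sketched roughly as follows. This is only an illustration in Python: the `Finding` type, the `must_fix` helper, the 7.0 threshold, and the placeholder CVE ID are all hypothetical and do not come from any real scanner.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    component: str         # software the scanner found on the system
    cve_id: str            # placeholder identifier, not a real CVE
    database_score: float  # score assigned by the vulnerability database
    project_severity: str  # severity stated by the project itself


def must_fix(findings, threshold=7.0):
    """Return findings a contract would force you to fix.

    Only the database score is consulted; the project's own
    assessment is ignored, which is the situation described above.
    """
    return [f for f in findings if f.database_score >= threshold]


# Hypothetical example: the project says "low", the database says 9.8.
findings = [
    Finding("curl", "CVE-0000-0001", 9.8, "low"),
    Finding("other-tool", "CVE-0000-0002", 4.3, "medium"),
]

flagged = must_fix(findings)
for f in flagged:
    print(f"{f.component}: {f.cve_id} scored {f.database_score}, "
          f"project says {f.project_severity}")
```

Because the filter never looks at `project_severity`, an inflated database score alone is enough to trigger the fix deadline.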

Starting point is 01:08:45 And so then you can do, well, a lot of people did one of two things. A lot of people downloaded curl from us, you know, a new Windows executable, the latest one, and just replaced the one in Windows, you know, overrode it with a new one. And some just deleted the one in Windows, which is in, you know, C colon win, blah, blah, blah, whatever it is, and just delete it. And the security scanner is happy. Mission complete. Until, of course, a few days later or whatever,

Starting point is 01:09:15 when you try to do Windows update. And then it says, nope, I'm not going to update. You have tampered with the Windows installation. I love Windows. It's great. You know, on a Linux system, if you tamper with it, it's like, the file's gone. Okay. Whatever. It's fine. It's not my problem. We just downloaded it. Then you put it back in the next, because you get a new version of that. Yeah. But that's not how Windows works, apparently. So yeah, I've got quite a few questions, and people actually pretty upset about that, because suddenly their Windows updates didn't work. So how do you actually upgrade curl for Windows?

Starting point is 01:09:56 And, you know, then you end up like me. You end up in a really weird position, because I didn't ship that curl to them. They didn't even download that curl from us. They downloaded that as a part of the Windows installation, right? So they got it from Microsoft, who built it, compiled it, and shipped it as part of Windows. It's really part of the Windows. Did you get bug reports directly to the curl project during that? Yes. And also, since I sell curl support, I also got people trying to buy help from me to solve that. But I mean, I couldn't. It's not that easy to fix, because it's

Starting point is 01:10:35 not our problem, right? It's a problem with that user and their Windows installations. It's not... Yeah. What I found kind of the worst part about that was when people were going to places like answers.microsoft.com, and there were people who were marked as, like, official help there who were suggesting deleting it. Yes, exactly. It's one thing if you go to Reddit and there's just a random person who tells you. When you are marked as, like, the official help guy, and your advice is to tamper with the Windows system, which is going to break the updates, like, that's not good advice. No, no, that's really horrible. So of course that is totally, I mean, a good explanation of why people would do it, right?

Starting point is 01:11:22 It says so on the Microsoft site. That's the solution. So, I mean, that's not even, I mean, of course you would do that if it says so on their site. And it seems to be a trustworthy person who has sort of, yeah. So, yeah, they kind of led the users into that trap. And then, bam, now you're in deep doo-doo. And I did notice there was a more recent one you talked about in your blog, the CVE-2020-19909.

Starting point is 01:11:54 Yes. So I didn't actually read the blog. I briefly skimmed over it. But what's the deal with this one? Because that's a CVE from three years ago, judging by the number. Yeah, this is a totally different level of crap, really. Because this is multilayer stupidity. I don't mean to say that I'm unique here either. This is not the first time it happens. This is just a sign of the CVE system being a rather flawed system. So basically CVE is a way to, you know, this is a bug tracker ID.

Starting point is 01:12:26 So anyone can submit a CVE or create a CVE ID. It's just submit it and say, here's a bug in this product, blah, blah, blah. And someone then did that. And this is an old bug from 2019. Right. It was actually a bug. Someone actually reported as a security problem

Starting point is 01:12:46 back then. I dismissed it because it's not a security problem. It's an integer overflow. So it makes curl miss... So it's a fun overflow too, because it's the retry option. So if you say retry in nine bazillion years, instead of retrying in nine bazillion years, it actually wraps around, it becomes 12 seconds instead. Right, right. But it's a highly unlikely scenario to actually happen, because who is going to use that? Apart from, okay, so I determined this is a bug, because it wraps and it behaves stupid. So it's a bug, I fixed that. But there's nothing security related. Three years later, someone created a CVE for this. Right.
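The kind of wraparound described here can be sketched in a few lines. This is a Python simulation, not curl's actual code: it assumes the delay is given in seconds and multiplied by 1000 into a 64-bit signed integer, and the helper names are made up for illustration. The input value is chosen to make the wrap visible and does not reproduce the 12 seconds mentioned in the conversation.

```python
INT64_MAX = (1 << 63) - 1


def to_signed64(value: int) -> int:
    """Reduce an arbitrary Python int to what a 64-bit signed integer would hold."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def delay_ms_wrapping(delay_seconds: int) -> int:
    """Seconds to milliseconds with no overflow check (the buggy pattern)."""
    return to_signed64(delay_seconds * 1000)


def delay_ms_checked(delay_seconds: int) -> int:
    """Seconds to milliseconds, clamping instead of wrapping (one possible fix)."""
    if delay_seconds > INT64_MAX // 1000:
        return INT64_MAX
    return delay_seconds * 1000


huge = (1 << 64) // 1000 + 1     # an absurdly long delay, in seconds
print(delay_ms_wrapping(huge))   # 384: wraps around to a fraction of a second
print(delay_ms_checked(huge))    # clamped to the largest representable value
```

The wrapped result is simply a shorter wait than the user asked for, which is why the bug was treated as a plain bug rather than a security problem.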

Starting point is 01:13:28 It ended up in the NVD database again. And what do they do? They set a score. And do they check it out? As I said before, they probably don't have the knowledge and skill. And in this case, they probably just roll the dice or they just pick the default value or whatever.

Starting point is 01:13:43 They said 9.8 critical, which is, you know, 9.8 out of 10. That has to be a pretty nasty bug, right? Yeah, yeah, yeah. Instead of that wraparound for the retry time. So they could actually retry in 12 seconds instead of at the end of the universe. So now there's this totally bogus CVE said to be 9.8. And it's not even a bug.

Starting point is 01:14:08 I mean, and we fixed it three years ago. It's totally silly. And then nowadays, of course, that bug now appears everywhere because everyone is adapting to those new CVEs that pop out. So now everyone has to say if their operating system is vulnerable to this security flaw or not. But this is not really. This is just a stupid thing. So I've, of course, tried to reject this now in the database.

Starting point is 01:14:34 But it's a slow process. And it's much harder to take it away than for someone to put it there. So I would say it's a sign of a rather non-optimal system for doing these things. I wasn't aware that just anybody could submit a CVE in that way. I thought there was some sort of oversight in that regard. I'm telling you now, I'm opening a Pandora's box. No, it's actually slightly more complicated than so, because there's also a way to just make it slightly more complicated. So when you ask for one of these CVEs, an ID, there are hundreds of different organizations that work on giving them

Starting point is 01:15:18 out. So it's a distributed system. So it's one of those that you ask. And those, they're called CNAs, CVE Numbering Authorities. I think they can actually lock down and say, only this authority can give out numbers for that particular product. So you can actually have a particular authority. So then you can actually limit how much. So that's why you cannot just, you know, get a CVE on any known brand product you can think of, because then they will say, oh, well, you have to talk to this CNA. And they will, so if you make that up,

Starting point is 01:15:54 they will say, no, no, this is a bogus. We won't give you one. So there is some kind of system in place, but that system just protects a certain subset of products. I don't have a product that is protected this way by any CNA. So you can, for our products, anyone can create a CVE at any time. It seems like the, I was just having a look at the CVE now. It seems like there has been some update here. It is now marked as disputed and links to your blog post about it.

Starting point is 01:16:24 So it seems like that's good. There seems like there's been some effect there, which is good. It's still marked as 9.8 critical, but it's at least got disputed there. So there's something. That is good. That is good, yeah. So since I have this conversation with NVD, since I think they've inflated the scores on our CVEs many times, I tend to email them every now and then. So I emailed them about this as well. So I've sort of said, this is completely wrong, you should just mark it somehow, remove it ideally. And mitre.org is the organization that actually hosts sort of the source database. So now I have asked for it to be rejected there as well. So ideally

Starting point is 01:17:06 hopefully at some point someone will say, yes, it's silly, we remove it. But I think also my blog post actually sort of got noticed quite a bit. I noticed that the Ubuntu one also says now it's disputed and links to my

Starting point is 01:17:21 release. It says this CVE is not a security issue and the curl author intends on disputing the CVE. Marking as not affected. Priorities... So there's hope that there's going to be some level of sanity, at least. Surely if you just complain to them enough, they will eventually fix the system. Yeah, and I'm going to write the exact... You're going to write the exact same blog post three months from now, I guarantee it. I fear that, but I also have some other plans in motion to address this in a wider and in a better way too. So actually, I'm probably going to apply to become a CNA myself. Okay. For the curl organization, in that way, because then I can, as I mentioned, limit

Starting point is 01:18:06 that scope so that I can limit who can submit CVEs for our products and also hopefully reduce the rescoring that NVD does on the issue. So hopefully that action can fix both of those. I think it's a bit sad that I need to go that path because it doesn't seem like a scalable way because certainly not the entire world can solve it like this. But I'm trying the best I can to solve my issues at least. I'm sure it's not just an issue in regards to your project. This is like a widespread thing with CVEs. Absolutely, exactly. So it happens all the time. Is becoming a CNA, like, a massive undertaking? Like, just doing it

Starting point is 01:18:53 in regards to your project? No, no. Okay. Well, I'm only in the beginning, so I'm not the right guy to ask. So ask me again in a few months. From what you've done so far, does it seem like it's going to be, at least from what you've looked at, like, a massive deal? Exactly. So since my scope is limited, it seems to be a rather limited amount of effort and energy to waste on this. And there's no fees and there's no contracts and stuff like that. So it seems to be a decent, pretty good way to do it, I hope. Right. Okay. So then if that does happen, then you are basically, like, the people... People can then come to you and be like, hey, there's a problem here, but you are sort of the single source of truth on whether there are these

Starting point is 01:19:38 vulnerabilities. Is that the understanding? Exactly, exactly. Right, that makes sense. So hopefully then, you know... Exactly, that will hopefully prevent me from writing the same blog post in two months. It'll be slightly different. Well, that would definitely be good because I know... I feel like I've read that blog post a couple of times. It's just gotten gradually more and more

Starting point is 01:20:03 and more annoyed with NVD. Exactly, exactly. Yeah, I feel that way as well. It feels like, well, wait a minute, didn't we already have this problem? But it's, yeah, variations of the same issue. When do you feel like this first became a problem for you? Because obviously there was a point where, you know, there wasn't really any tension in that CVE way. But at what point did you start realizing, like, maybe there was a bit of a problem? I think it started to become a problem when people actually put some weight to these scores they put on the issues. Because previously I've never bothered about them,

Starting point is 01:20:37 you know, setting whatever, it doesn't matter to me if they set nine or four, you know, seven, I don't care. I've always sort of put an honor on documenting our stuff properly and well and well-reasoned and sort of argued this is how we grade it. And I think that is the sort of the canonical truth, if you will, about our problems. And then I haven't really cared about what they then think about it because

Starting point is 01:21:03 everyone can have their own opinion. But now, when everyone puts so much importance on their scores, suddenly it makes a much bigger difference. Because now, when they put such high scores and people take that so seriously, and they run into problems and they come back to us, we sort of have to spend time and effort on those problems caused by them inflating the scores. That's when I think it starts affecting us in a much more negative way. And that's the time when I start to sort of say, hey, well, why are you doing this? Because it affects us in a negative way. Right, right. Okay. So it's sort of just now that you have these

Starting point is 01:21:43 security scanners everywhere now the organizations do take it as like a big deal it's sort of just now that you have these security scanners everywhere and now the organizations do take it as like a big deal, it's now become more of a problem that you need to actually do something about. Exactly, exactly. Right, right. That makes sense. That makes sense. So. I heard a beep again.

Starting point is 01:22:01 I don't know where that beep came from. It was like slightly in the background. I don't know where that beeping is. It was slightly in the background. I don't know what that... I'm sure there's something weird going on with either Jitsi or with... I don't know. Because that came back a little bit. I cannot hear any beeping here, at least. Yeah, I don't know what the deal is there.

Starting point is 01:22:20 I don't... Maybe I'm just hearing things. That's entirely possible. And people are going to listen to the recording, and they will notice absolutely nothing. But, I hope that's- Raise the volume to max now, and listen for the beep in the background. Is it there, or isn't it? It just came back for a moment, like, it's not been there the entire time. Whatever.

Starting point is 01:22:39 Whatever, doesn't really matter. So, actually, where can we go from here? So, you know what we can talk about? We can talk about what you're involved in besides Curl. Like, obviously, Curl is your main focus, but, like, there are other projects that are related to Curl that you get involved with as well. Yes.

Starting point is 01:23:01 So, yeah, I work full-time with Curl, and Curl really is my big thing since forever, as I said. So, and during the development of curl, actually, well, many years ago, I also started out actually when I wanted to do proper asynchronous name resolving in curl. And then I sort of joined and then forked and became the maintainer of the project that nowadays is called c-ares, which is an asynchronous DNS library for name resolves and DNS operations. So that is a library that also keeps living. I'm nowadays a less active maintainer of that project, but it's still there, still used quite a lot, you know, in a lot of big places. So that is one of those libraries,

Starting point is 01:23:50 Curl can still use c-ares for name resolving. And a few years after that, I added support for SCP and SFTP, right? And that's based on the libssh2 project that I also became a maintainer of. See a pattern here? When the maintainer had to step away. So I became, I was the head maintainer of that for many years. I don't do a lot of maintaining in that now either, but

Starting point is 01:24:14 those are two libraries that I sort of... I'm still there in the shadows, and I do releases in those projects. So in that aspect, I've tried to work with projects that I use in curl to make curl a better product. Right, right. And for example, this year I started a new project called trurl, which is just a new command line tool for URL manipulations, URL parsing, URL manipulation. Because it's more of a companion tool to curl, really. If you want to work more with URLs

Starting point is 01:24:47 like in shell scripts and extract host names, parse query parts from URLs in shell scripts or create URLs in different ways. Pretty much because it's a use case that people have ran into in many cases when they have

Starting point is 01:25:03 written scripts and stuff with curl. Right, right, right. But it wasn't really a thing for curl to do. So then I did this. It actually uses libcurl for the URL parsing things. Okay. So sort of all of these tooling. That tooling obviously relies on libcurl.
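The URL chores mentioned here (extracting a host name, reading query parts, building a modified URL) look roughly like the sketch below. It uses Python's standard `urllib.parse` purely for illustration; it is not trurl and not libcurl's URL parser, and the helper names and example URL are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def extract_host(url: str) -> str:
    """Return the host name part of a URL, without port or credentials."""
    return urlsplit(url).hostname or ""


def query_params(url: str) -> list[tuple[str, str]]:
    """Return the query string as a list of (name, value) pairs."""
    return parse_qsl(urlsplit(url).query)


def with_query_param(url: str, name: str, value: str) -> str:
    """Return a new URL with one more name=value pair appended to the query."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query)
    params.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(params)))


example = "https://example.com:8080/search?q=curl"
print(extract_host(example))
print(query_params(example))
print(with_query_param(example, "page", "2"))
```

Different URL parsers can disagree on edge cases, which is one reason a tool built on the same parser that curl uses is a natural companion for scripts that already call curl.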

Starting point is 01:25:22 But then your other tooling sort of builds into curl as well to make it all sort of pieced together and sort of it helps you out along the way it's not just you're just doing random other things because you just feel like it no so most of my other things have been like that to sort of support curl right in different ways i mean i've done other things as well but these are the ones that have survived the most, and that's the most footprint today. So, at what point did you realize that going full-time with Curl was actually going to be possible? Oh, yeah, I didn't, I never really realized that. Well, you know, since, well, it was my spare time project for so long.

Starting point is 01:26:13 And then I started working for Mozilla in 2014, I want to say. Yeah, I think so. Started 2014. And then I worked with HTTP and stuff in Firefox. And, you know, quite a big overlap with the curl stuff because it's HTTP networking. So stuff I knew, stuff I was familiar with was fun. And then I left Mozilla at the end of 2018 after about five years. And at that point, I really, I didn't know what to do. I didn't know that it was going to be able to, I was going to be able to work on curl

Starting point is 01:26:43 full time or anything. I just, one, it felt like it was a good time to try it. And I talked to my friends who work at this company called wolfSSL, who I work for now. And they basically said, sure, we want to do this with you. So you come work for us. You get a paycheck from the first day, and we sell support to customers. We do it like this.

Starting point is 01:27:00 And wolfSSL already sells support like this for a bunch of other libraries that they host, also all open source stuff. So they basically have the business set up for this kind of business, support contracts with commercial customers for open source libraries. So it was a really good fit for me, and they were willing to take the risk that we wouldn't find any customers. And then, so it was pretty easy to just jump and try it out and see, does it work? Maybe, maybe it doesn't. So I just did it and they sort of believed in this. And since then we're doing it.

Starting point is 01:27:37 That's awesome. It's great that you've, obviously Curl, like by that point, Curl was already a tool that sort of everybody would have had some sort of usage for. But I guess you just weren't really sure if there was going to be that commercial requirement for support there. Because obviously it's in every Linux system, but whether there was additional support there that was desired was unclear. Yeah, and I always struggle with how to take that first step, right? How do you, at day one, actually offer commercial support?

Starting point is 01:28:14 It's really hard. You have a lot of systems set up to deal with support issues. You have to have someone write contracts. You have to do all of that. And how do you do that as a single person? It's really hard to just scale that up one day. Or at least I struggled with how to do that. So when

Starting point is 01:28:30 wolfSSL said that, sure, we can do it like this, it solved that problem. And then, of course, I still actually struggle with that thing. As you mentioned, Curl already existed. Curl already works. Curl is already a solid, stable product. So I always

Starting point is 01:28:45 have that struggle when I talk to customers, right? It's already working, it's already finished, we don't have any problems, we don't need to pay anyone any support. Because they've used this for 10 years already and never had a problem. So tell me again, why are we going to pay you anything at all for this? You gave it away. So of course it's that struggle too. So what then goes into the commercial support? Usually it is... Most of my commercial support actually starts with them finding a problem. So usually it is that they actually fall into pits somewhere and then want help with that. And then they realize that it's pretty good to have that relationship. So when they have

Starting point is 01:29:30 a problem, I can fix it in much shorter time. So it's guaranteed response time and a communication and a little bit of a hand-holding and helping them with questions and stuff like that. So that's the normal just support thing. But there's also a decent business in expanding things. People fall into, well, we'll miss this particular little feature around this. So maybe you can do that. And then we can do that as just contracting around it,

Starting point is 01:30:03 which in addition to the support. So you get more options in curl. Yes. And usually, and also just completely open source. So contract customers, they pay us to do something and then it ends up in the open source version and everyone can use it. So win for everyone. Well, yeah, that is the nice thing about doing everything in this open source way. Like, you're getting paid to basically help everybody receive this feature for free, even if it's not something that most people are going to be using. Like those weird people out there that mark yes on every single protocol, like, there's going to be someone else out there that probably needs this feature as well. So, like, it is great that, like, this whole idea of open source does exist.

Starting point is 01:30:50 Right. And in many cases, people don't actually know today that they will actually use this in five years, right? So, the company that pays for something, I add it today. And people today say, wait, why are you adding that now? Because, I mean, who's going to use that? But it's something that we're going to switch over to. And in the future, we are going to do that way because that's the way we are going. So it's a lot of that too, that we don't really know, or most people don't really keep track of what we are using today or what we are going to use

Starting point is 01:31:19 tomorrow, right? So who knows where we're... And one of these days you just end up in a situation, now I need to use this obviously, because I noticed that the browsers are doing something. And then you wanna do this with a tool, which tool is going to support that. And then it's a really good thing when we already had that support because someone paid us to do that three years ago.

Starting point is 01:31:43 So one thing I did see you talk about the other day, I kind of have to scroll through your Mastodon, because you're posting on Mastodon more than I do. I don't know how you have time to post on Mastodon as much as you do. But I did see you post this, I guess it's a slide

Starting point is 01:31:59 of people sort of discrediting a lot of the work that Curl has done. Like, you know, I could write Curl in a weekend. Yes, yeah. I collect some of them, some screenshots of them. I think it goes back to the sort of notion people actually think that some of the network stuff is so easy.

Starting point is 01:32:26 And I mean, it looks so easy at times, and it's often actually also easy to do it in a, you know, half broken way that works most of the time against your particular server in a controlled way, right? So, I mean, as I said, the first version of httpget, that was just a few hundred lines of code, and it actually downloaded HTTP fine, right? Well, I'm sure, yeah, copying that would probably take a weekend to do. Exactly. And I think that is the mindset with so many people. So sure, I mean, it doesn't take a genius to write those hundred lines of code. You can do that in a weekend. Sure, everyone can do that. And actually, sure, you can do that. But that is not really what curl is. And I think people have a hard time sometimes

Starting point is 01:33:08 to grasp exactly what it means to be curl. Yeah. My favorite one is the one you put in the middle. I think you can replace 99% of the uses of curl, download one file via HTTPS with like 100 lines of Python or Rust or Go. It's not critical infrastructure in the same way that OpenSSL or LLVM or WebKit is.

Starting point is 01:33:26 Like, yeah, you could replace downloading a file with 100 lines of Python script. But what about the rest of it that it does? Yeah. And that's such a stupid comment. Because, yeah, you would use that. But how would that download it? There would still be a library in there that would download HTTP, right? So someone would still have written that anyway. So yeah, and in many cases they would end up using

Starting point is 01:33:53 libcurl anyway somewhere below there, right? So of course, and I also get that quite a lot, I think, because from the outside, you know, it doesn't look that complicated and it looks the same as it has looked like forever. And a very common comment is that I used curl 10 years ago and I used it yesterday and didn't spot this, there's no difference. I used it exactly the same way. It looks the same way, behaves the same way. So what did you actually do with the last 10 years? You work full time on this and it's exactly the same. Well, that's sort of the thing you're trying to go for. Like you're, as I said, you're not trying to break old use cases.

Starting point is 01:34:34 So it should look the same that it looked 10 years ago. Like that's the point. Exactly. That is so sure. You know, it is exactly. And that takes quite a lot of effort to make sure that it actually looks and works exactly the same. But when you're sort of, if you don't think about it properly,

Starting point is 01:34:50 it might look easy or stupid or I don't know. Well, it's also really easy to, this is why I like to bring developers onto this podcast. It's really easy to criticize a project when you don't really know the person behind it. Like, you look at the project as this giant block of code. It's like, this is a thing. It doesn't matter who made it. It's just, this is a thing that exists and ignore all of the effort that's gone into that over the years, making sure the test case is there, making sure the documentation is there, making sure that all of this fits together in a way that makes sense and doesn't unnecessarily introduce security vulnerabilities. Like, dealing with all this stuff, like, someone has to do that. It's not just magically appearing.

Starting point is 01:35:55 And I think a lot of people tend to forget that there are these people in the open source space who spend a lot of time working on these projects. I guess a lot of people just ignore the human aspect entirely. Yeah. Yeah. And the time it takes to mature. I often try to come back to that: one of the biggest competitive advantages curl has is time. We have sort of been out there. So, I mean, if you want to compete with curl, sure, you can write a replacement. But can you write a replacement that has, you know, proven itself for two decades already without breaking functionality? That is some of the power of curl. Not that it actually works today, because you can write something that seems to work today.

Starting point is 01:36:26 That's easy. But to keep working, and keep working the same way 15 years from now, that is harder. That is a lot of work. Well, what also helps now with making a copy of curl is that curl has come up with this model that works. So it's not like you're starting from where you were back in 1996, where you didn't know what you were doing. If someone wants to copy it, they can do it, they

Starting point is 01:36:50 can use it as a reference. Yeah, that is so true, because sometimes people also seem to think that where we are now is sort of natural, that of course we ended up here, right? Look back from '96 to here: of course this is where we ended up. But it was never... I mean, why would we end up with this? Every single thing we do is a decision. Should we do it that way? Should we go that way? Is this the right thing? Is this the wrong thing? Should we do this, should we do that? Every little decision we make, building things in a particular direction, brick after brick after brick, and then eventually today we have this. So sure, you're completely right. So replicating or reproducing, writing a

Starting point is 01:37:30 replacement today, that would be much easier, because we've already shown that this is how you can do it. So if you want to do a replacement, this is sort of one way to mimic it. Of course, that is much faster than all the wrong turns we have taken over time, or, as we've talked about, the rearchitectures and refactors we've done numerous times because we came up with things like: oh, wait a minute, we didn't think about this before, and now we have to do it this way. So of course it takes a lot of that.

Starting point is 01:37:57 But ideally we are also, underneath the surface, actually pretty well suited for what possibly comes next as well, right? Because we've done this so many times already. We know that down the road, around the next corner, there are going to be new protocols. There are going to be new versions of every little subcomponent in here coming, maybe not this year, maybe not next year, but they're coming. And then we know that we are going to have to add support for them. And we know how to do that.

Starting point is 01:38:26 And, you know, adding more combinations into this lovely mess of different protocols and versions. As you've been adding these protocols over the years, which protocol did you find was the hardest to work with? Obviously the first one would have been the one you weren't really sure how to do. But besides the very first thing you did, what have you found to be difficult to add over the years? Well, I think, I mean, HTTP is by far the most complicated and the biggest one. So I think that is, we always come back to that

Starting point is 01:39:07 anyway. And also we have so many different versions, and versions that are so different from each other. So we have HTTP 1, 2, and 3, and they are so different. Sure, they might look the same on the outside, but they're so different on the inside. So we have a lot of different libraries, a lot of different handling and fallbacks and different code paths. And a lot of this is legacy too, since it was created in the nineties, right? And we did a lot of things differently then, but we

Starting point is 01:39:37 still did things and we have that around still. So it's a lot of that. So just all the bulk for HTTP, all the different versions of HTTP and proxies and authentication and down to cookies and all of that. So yeah, it's super complicated. If we would cut out HTTP from curl, there's not a lot left. The other ones are very simple in comparison.

Starting point is 01:39:59 Well, maybe I should possibly say that other stuff like TLS is probably also complicated. Right. But from our point of view, it's not because we use libraries for that. So we don't implement TLS ourselves, for example. So therefore, I don't include TLS in my, it's not in my scope. That's someone else's problem to deal with.

Starting point is 01:40:20 Exactly. So I don't include that when I think about complications. If you had to just guess off the top of your head, how much of the code base is dedicated to just getting HTTP working? It's hard to say. I don't know. I mean, I say 160k lines of code, but that's also because we have so many different backends for different things. So that adds up to a lot of code that is not used at the same time.

Starting point is 01:40:51 For example, we support, now I think it's 12 different TLS libraries, but you usually just go with one of them, right? So we have a lot of code, but you just go with one of those TLS libraries. So I guess you would have different backends for different operating systems. Yeah, that makes sense.

Starting point is 01:41:06 Yeah. And different alternatives on any of them as well. And we have different HTTP3 backends, we have different SSH backends, we have different name resolving backends. So that means that you pick and choose a lot. That's also why the total amount of code is 160k, but you never build all of that in a single build. You're probably at less than 100k even if you do a particularly huge build yourself. So maybe out of that, for HTTP, it's going to be, I don't know,

Starting point is 01:41:36 40k, 50k, I don't know. So right now, obviously you probably would be working on it if this is the case. What do you find right now is an area where curl is lacking? Like, is there anything where you feel like maybe it needs to be cleaned up? Maybe there needs to be some additional functionality there? Or do you feel like right now curl is in a really good state and it's just adding things as they come up?

Starting point is 01:42:02 I think curl is covering a lot of what I think it should cover. It covers it really well and in a good way. So I think it has pretty good coverage of a lot of the protocols that I mentioned. So I think most of the stuff is just

Starting point is 01:42:20 there, right? So I think what we mostly adapt to is new edge cases and things sort of changing. As in, coming now, for example, I mentioned HTTP3, and we still only have that as experimental. So we don't even have that enabled by default. So that's one of the things that's sort of going to change things. But then around the corner is, for example, adding new things to do proxying over HTTP3. And then you can do HTTP3 over the older HTTP protocols. So you can proxy HTTP3 through H1 and H2

Starting point is 01:42:53 proxies and things like that. So we don't have that today. So that's not included in the coverage. But when we add it, users are going to start doing new things outside of what curl does today. And then we're going to see that, well, maybe curl is lacking in that area. So then we're going to spot those things that curl can't do. And I think we see that happening when we change networking, or how we do networking in general. You know, the IETF invents new things. The browsers follow and do new things. Now we can do, you know, like, for example, when I still was working for Mozilla, we did

Starting point is 01:43:34 DNS over HTTPS, right? Now suddenly we opened up an entirely new way to do name resolving, over HTTPS instead of just over UDP as we had done before. Suddenly, the browsers were doing that, and whoa, curl didn't do that. So curl could do everything you could do with a browser, but suddenly there was this thing that curl could not do that you could do with the browsers, as people started doing. So then it was suddenly an area where, hey, here's something we should

Starting point is 01:44:05 just add to curl so that we can keep up and offer that feature as well. So I think it's a lot of that. We cover most of it, and then something changes and it opens up an area where we don't, or people fall into a weird combination of things where, hey, it turns out that you can't do FTP with a SOCKS proxy and stuff like that. Okay. So one thing I want to talk about is the governance of curl.
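To make the DNS-over-HTTPS idea mentioned above a bit more concrete, here is a minimal sketch of how a DoH GET request can be formed according to RFC 8484: a DNS query in wire format, base64url-encoded without padding, passed in a `dns` query parameter. This is not how curl implements it (curl exposes the feature through its `--doh-url` option); the function names and the resolver URL are made up for illustration, and the sketch only builds the URL without sending anything.

```python
import base64
import struct


def build_dns_query(hostname: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query in wire format (hypothetical helper)."""
    # Header: ID 0 (suggested by RFC 8484 for cache friendliness),
    # flags 0x0100 (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels, terminated by a zero byte.
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    # QTYPE (1 = A record) followed by QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(resolver_url: str, hostname: str) -> str:
    """Return the URL for a DoH GET lookup of hostname (hypothetical helper)."""
    query = build_dns_query(hostname)
    # base64url encoding with the trailing '=' padding stripped.
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


if __name__ == "__main__":
    # "https://doh.example/dns-query" is a placeholder, not a real resolver.
    print(doh_get_url("https://doh.example/dns-query", "curl.se"))
```

A client would fetch that URL with an `Accept: application/dns-message` header and parse the binary DNS answer from the response body, so name resolution travels over ordinary HTTPS instead of plain UDP.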

Starting point is 01:44:34 So curl, at least on the website, says that it sort of follows a BDFL-style model. Right now, obviously, you're very much involved in the project, but do you ever see a time where you would willingly step away? Obviously there might be health concerns that come up at some point, but do you see a point where you would ever just want to let someone take over because you feel like stepping away? Sure. Or rather, I don't see that coming in the near term. But I do lay out things. I try to work in the project so that we can manage that point in time whenever it comes, voluntarily or not.

Starting point is 01:45:22 So yeah, I actually try to work really hard on making sure that everything is done in the open, everything is done with a team, everything is sort of documented in procedures and things: how we do releases, how we handle security stuff, how we do whatever, so that there are not too many weird secrets just in my head.

Starting point is 01:45:40 So if I go on a vacation tomorrow, the ones who are still in the team and have the right credentials can just step up and do the things without me if they want to. I'm going to bring this up because of the recent situation with Bram Moolenaar: the Vim project is in a state where they're not entirely sure what's going to happen. Like, there are people in place that can maintain the project, but nobody really has the vision that Bram had. He's got things laid out,

Starting point is 01:46:08 but there was a lot of stuff going on that's just in Bram's head that nobody's entirely sure how they're going to implement or if there's really a way to implement it. Right, right. Yeah, I would imagine that... I don't think I have that much of a vision outlined. So if I would go away, people wouldn't really know what I have anticipated doing in the coming year. But I also don't think that is a problem. I would also say that if I would go on vacation tomorrow and say, hey, I don't want to

Starting point is 01:46:45 do this anymore, you go ahead and do it. I would rather that people that step up sort of did that thinking instead of following what I had in mind, because someone who leads the project, who wants to drive it somewhere, should drive it where they think it should go. So I don't think anyone would actually need my roadmap for the next year if I would go away. I think we have a pretty good track record, we have a direction, we have a general way of doing things. I think people can... If I would leave and people think that, sure, it was a good way how we did it in the past, we could quite easily just follow the same sort of trajectory going forward.

Starting point is 01:47:29 Or they could determine that it wasn't really good and do it in a different way. I don't think that should be a problem. I want to make sure that there are no secrets and no, you know, hidden things anywhere that would make people not know how to do things, how to find things, how to learn things. Then there are soft things like direction, guidance, vision. Sure, I leave that to the afterworld to figure out. I don't want to impose my... I'm not sure that would work

Starting point is 01:48:05 in a way. And I don't work that way myself either. I don't have a very strong idea of what to do the next year. I live very much, you know, looking at my feet and doing things immediately next to me. So I don't really have a vision for the long-term future anyway, so I couldn't write it down even if I wanted to. Well, curl's also a very different kind of project. You've got this idea of, you know, you're getting and sending data over protocols. It's not like, hey, we have this text editor, and we're adding this Vim script, and we want to have Vim script be a certain style of, um, like, yeah, there is a

Starting point is 01:48:46 very different kind of appeal to what Vim was. Basically, Vim was Bram's opinionated version of vi, whereas curl is this very clear thing: we are, you know, operating over web protocols. It doesn't entirely matter how it's done in the backend, as long as, to the user, you're able to operate over these protocols. Yeah. Hmm. So, yeah, I guess that answers it pretty well.

Starting point is 01:49:22 Because, you know, there is always this concern with projects in the BDFL style of something happening to someone in the future. Obviously, there was the case a couple of years back where Python changed from being a BDFL

Starting point is 01:49:37 to being a more... I don't know what they call it now. It's like a community board, something along those lines, where the previous BDFL is still involved in the management but isn't holding that BDFL position anymore. And then obviously you have the Linux kernel, where, you know, that's still very much BDFL, but Linus has also stepped away from the project in the past, so they do know

Starting point is 01:50:01 what's going to happen when Linus eventually does have to step away. Like, it's very clear that Greg is the next person in line. He will then take over and pretty much nothing will change at that point, because he's already been in that position where Linus has had to step away in the past, and the project pretty much just kept going the same way it's always been going. There are already those protocols in place to know exactly what to do then. Yeah. But I think it could work the same way for us too. If I didn't do it, someone else could do it. I mean, other people are already merging code.

Starting point is 01:50:35 I think I still do it at a much higher pace than anyone else, but I'm also the only one who works on it full time. I compete differently than all the others. So if I wouldn't do it, it would be a completely different situation, and then others would have to step up and do it instead. Yeah. So if it was still like a side project, obviously your amount of committing would be nowhere near the level it is today. Right. Um, I don't think there's anything else that important I want to address. I think we talked about pretty much

Starting point is 01:51:08 everything, unless there's anything else on the top of your mind you want to talk about. We covered a lot, I think. I don't know, there's so much to say, but I think we did a good job

Starting point is 01:51:25 of talking about a lot of different things so far. Well, yeah, we're closing in on the two-hour mark. Yeah, that's pretty good. I've enjoyed it. Thank you. I definitely enjoyed talking to you as well. This has been a lot of fun, and I hope the people got something out of this.

Starting point is 01:51:42 Maybe they didn't know about the history of curl or anything like that. I hope people got something out of it at least. If not, I guess, I don't know, leave me an angry comment. Exactly. So I guess let people know where they can get involved with the curl project, how they can find your stuff, or anything you want to direct people to. Well, to get involved with the curl project, you just go to curl.se and everything is there. We have most of everything documented, so even getting started, and the known bugs, or whatever you want to do. I always try to encourage people to work on whatever they feel is fun first. If you think something is fun or if you find a problem, fix that and send us a pull request. Otherwise, I'm, of course, on Mastodon, at Bagder, the wrongly spelled animal, at mastodon.social.

Starting point is 01:52:42 Otherwise, I blog, and yell at NVD every once in a while, and write other blog posts too, at daniel.haxx.se, h-a-x-x dot se. And just scrolling through your Mastodon, I noticed you had at least three or four talks just listed. You seem to go out of your way a lot to talk about what's going on with curl. I do. I do it a lot. I try to talk a lot about what's happening in curl.

Starting point is 01:53:12 I try to involve people. So, you know, when we have a pull request for something weird, maybe a proposal, I sometimes try to post on Mastodon about that, so that if you happen to be interested in this weird topic, you can read about it, or you can state your opinion, or just scroll past. So I try to do that so that people can follow

Starting point is 01:53:36 and see what's happening if they want to, and they can ignore it if they want to. But I think, you know, the information part is really important too, that people see that things are happening and they can speak up: oh, wait a minute, why are you doing that silly thing? I tried that last year. Don't do it. Or: great, do that. Or: have you thought about doing it this way? It's much better to do it this way.

Starting point is 01:53:55 That kind of thing. And also just informing: now we're adding this weird option 257. We're adding this cool thing. You should know about it because you'll want to use it in the future. So, things like that. I just want to make sure that people are aware of both the development part and the new things that we use, because it's, I think

Starting point is 01:54:16 it's fun. Yeah, I do find that a lot of projects really struggle to get that information out there. Like, they have their blog posts on their website, they have their issue tracker, they have their merge requests, but that's all going to be great

Starting point is 01:54:29 for people already involved. But if you're someone who happens to just use the project and doesn't keep a close eye on it, or you happen to be someone who maybe wants to find something to get involved in, having that sort of outreach

Starting point is 01:54:42 does make it a lot easier to get people at least somewhat interested. Right. And I think for it to be really successful, you have to sort of hit all of those different sub areas of the project, right? You have to talk with your core people you talk to every day. They have to be in it.

Starting point is 01:55:01 They have to know it. But you also have to distribute the knowledge in other ways, to people who don't follow the Git commits because they don't care. But they might care about something else that you did and stuff like that. So therefore, I try to do all of those different ways. I do videos about new features. I, you know, post silly things on Mastodon. And I do blog posts. So I try to hit a little differently depending on different things.

Starting point is 01:55:27 It could be technical stuff about protocols. It could be APIs in the library. It could be command line options. And it could even be how to use old command line options we've had for 10 years, but that people tend to abuse or misunderstand. And I try to write: this is how I think it should be done, really, or this is how we anticipated it at least, and things like that, so that it could at least find an audience. It's really hard, you know. There's so much information and stuff out there, so how do you actually get people to see things? I don't know. Well, it seems like you're doing, you know, doing something is better than doing nothing. Like, yeah, I think so too.

Starting point is 01:56:09 Is there any other place? So you mentioned the curl website, you mentioned your blog. Is there anything else you want to direct people to? Nope, that's it. Okay, I guess I'll do my outro then. So go check out the main channel. That is Brodie Robertson. I do Linux videos there six-ish days a week. I also have the gaming channel.

Starting point is 01:56:27 Right now, I'm probably playing through Final Fantasy XVI and Armored Core 6. Maybe. I might be playing something else. I don't know. That's Brodie on Games on YouTube and Twitch. If you are watching the video version of this, you can find the audio version on basically any podcast platform. There is an RSS feed.

Starting point is 01:56:44 Stick it in your favorite app. If you're listening to the audio version and you want to find the video version, that is on YouTube at Tech Over Tea. It has been an honour and a pleasure to have you on the show. You know, I didn't think I would ever have someone who has made a project as important as yours on here. Like, that's just awesome. I'm very happy to the... who was it?

Starting point is 01:57:11 I just think whoever got us in contact. What was his name? Yeah, who was that? I forgot. Aloni Zero over on Mastodon. Thank you. I wouldn't have ever thought to try to get in contact with Daniel otherwise. So thank you for doing this. It's absolutely awesome that you did.

Starting point is 01:57:30 It was. So I'll give you the final word. What do you want to say? How do you want to end it? I don't want to end it. I don't have any final words. Just curl it. Curl it.

Starting point is 01:57:44 See you guys later.
