The Changelog: Software Development, Open Source - Richard Hipp returns (Interview)

Starting point is 00:00:00 What's up? Welcome back. I'm Adam Stacoviak, and you are listening to The Changelog. On this show, Jared and I talk with the hackers, leaders, and innovators from all areas of the software world. We face our imposter syndrome, so you don't have to. Today on The Changelog, Richard Hipp returns to catch us up on all things SQLite, his single-file web server written in C called althttpd, and Fossil, the source code manager he wrote and uses to manage SQLite development instead of Git. Big thanks to our partners,

Starting point is 00:00:28 Linode, Fastly, and LaunchDarkly. We love Linode. They keep it fast and simple. Get $100 in credit at linode.com slash changelog. Our bandwidth is provided by Fastly. Learn more at fastly.com and get your feature flags powered by LaunchDarkly,

Starting point is 00:00:43 get a demo at launchdarkly.com. This episode is brought to you by Gitpod. Gitpod lets you spin up fresh, ephemeral, automated dev environments in the cloud in seconds. And I'm here with Johannes Landgraf, co-founder of Gitpod. Johannes, you recently opened up your free tier to every developer with a GitLab,

Starting point is 00:01:05 GitHub, or Bitbucket account. What are your goals with that? Thanks, Adam. As you know, everything we do at Gitpod centers around eliminating friction from the workflow of developers. We work towards a future where ephemeral cloud-based development environments are the standard in modern engineering teams. Just think about it. It's 2021 and we use automation everywhere. We automate infrastructure, CI/CD build pipelines, and even writing code. The only thing we have not automated is developer environments. They are still brittle, tied to local machines, and a constant source of friction during onboarding and ongoing development. With Gitpod, this stops. Our free plan gives devs access to cloud-based developer environments for 50 hours per month.

Starting point is 00:01:45 Companies such as Google, Facebook, and most recently GitHub have internally built solutions and moved software development to the cloud. I know I'm biased, but I can fully relate. Once you experience the productivity boost and peace of mind that automation offers, you never want to go back. Gitpod is open source, and with our free tier, we want to make cloud-based development available for everyone. Very cool. All right, if this gets you excited, learn more and get started for free at gitpod.io.

Starting point is 00:02:11 Again, gitpod.io. All right, we have Richard Hipp here, a long-awaited return to The Changelog. Richard, welcome back. Thank you for having me. So excited to have you back. We first had you on the show back in 2016 talking SQLite, and I will pronounce that correctly and do my best. There's no correct pronunciation. You call it whatever you want. Well, Adam was slapping my wrist yesterday because we were talking in prep for this, and I kept calling it "sequel-ite."

Starting point is 00:02:54 He's like, now you know Richard pronounces it "S-Q-L-ite." And I said, I just can't do it. I'm trying, but I'll do my best. And ever since then, Richard, I've been on your side, out there just spreading the word on how it's truly spoken. And I guess if you don't feel strongly about that, then we won't enforce it. But you said the right way, so I've been following your rules. Well, I think we actually broke news, and probably the most cited episode of ours out there on the internet is episode, is it 201? 201, yeah, with Richard Hipp, "How You Pronounce SQLite." And we set the record straight. And that's probably the most linked-to episode.

Starting point is 00:03:29 Not only that, Richard, but we've had many people over the years say, you've got to get Richard back on the show. So we're happy to have you. We're here to get an update on SQLite. We're also here to talk about Fossil, which is your own SCM, which does lots of interesting stuff. What's an SCM, Jared?

Starting point is 00:03:49 See, I had to look this up because I thought it was source control management, but I think it's software configuration manager. Richard, what's SCM stand for? I always thought software configuration management, but source control management works too, I guess. Sort of. I mean, well, I guess we'll find out that it does more than just source control, right? Like it does a lot of things. But also, do you configure software, software configuration? I don't know.

Starting point is 00:04:08 Neither one of them fits all that Fossil does. SCMs, definitely a thing. A thing that isn't discussed so much anymore because I think everybody for the most part, except for you and your community, Richard, is just using Git and GitHub. It's fun to find an alternative that's viable and long-lasting and beloved by those who use it. We're going to learn a bunch about Fossil. Maybe we'll have some converts after this episode. But let's catch up with SQLite

Starting point is 00:04:35 first. It's been five years. It's probably still the most used software in the world, maybe second place to zlib, or maybe curl is catching up. I don't know. There's a few of those that are just ubiquitous, but what else? Is it on Mars? Yeah. Is SQLite on Mars? Do you know? I don't know. But every time we have this conversation, somebody writes in and says, oh yeah, it's definitely here or there. It's in just about every electronic device you have. It's in your car, if you've got a recent car. It's in most of your computers, required to boot up these days. It's certainly in all of your phones.

Starting point is 00:05:10 I think that there's probably more instances of SQLite running than all other database engines combined. Which is amazing to just think about. It's scary. Well, it's scary for you because you're the one managing the configuration of the software, right? Yeah, well, it does change your worldview. I mean, suddenly it's like, um,

Starting point is 00:05:30 boy, I need to pay attention to this, don't I? Right. I can't mess this up. So does development go at a slow pace because of that nowadays, or does it still move pretty fast? Or is SQLite pretty mature, so that you don't do too much to it? It has slowed from the early days, but I mean, we still are adding a lot of features and we do a lot of changes. We don't talk about the rate of code churn very much because that would scare people.

Starting point is 00:05:56 Because it's high? It is for a piece of software that's used this widely and is used so much. But we actually spend most of our time testing it, you know, because that's important. Oh, a few years ago, we were talking with a young college graduate, a young woman, and she was talking to me, and she was in software too, and she said, well,

Starting point is 00:06:25 I just do testing. I'm just a tester. She was very self-deprecating and I thought, shoot, that's all I ever do is test. I spend all my day testing. I'm just a tester.

Starting point is 00:06:37 Yeah. Because people write in, they'll have some issue, or we'll do a new feature, and adding the feature takes an hour and then we'll spend weeks just testing it. And yeah, but even then, there is a lot of code churn. I know that,

Starting point is 00:06:53 like OpenBSD, they had for a while adopted SQLite into their core set of packages because it was being used for their, I think for the search engine on their man pages. But they wanted to stay up to date and they feel compelled to do a code audit for every line of code that changes. Oh, wow. And so we were changing SQLite faster than the rest of the entire core package combined.

Starting point is 00:07:22 And they said, no, we just can't keep up. So they had to write their own database engine for their... Oh, they dropped you as a dependency. Yeah, they had to drop it because the code churn was just too high. Wow. Don't you have in your license where you can, or is that with Fossil? Did I misread that? Where you can, I think your words were steal the code and use it however you want, even

Starting point is 00:07:41 for commercial use. That's for Fossil. Yeah, well, SQLite is public domain, and you can do anything you want with that. Right. I wish I could say I'd thought of this. It's kind of evolved.

Starting point is 00:07:51 I mean, we do have a lot of public tests that are out there that are public domain as well, but some of our test code is proprietary. Some of it. Why is that? Because it was paid for by somebody? Originally, we thought we were going to sell this and make money from it. And that's how we were going to support ongoing development. That didn't really play out. Nobody ever bought it. It does sort of become our business value,

Starting point is 00:08:14 our intellectual property. I mean, you can take the SQLite code and fork it and start your own thing. The tests. But you don't have the full test suite. You've got a lot of tests, but not all of them. And so we've got a little bit of advantage over you there. So is most of your business income from support contracts for SQLite? It's pretty much all support. We have some extensions like the encryption extension that we'll sell to people on a license basis, but the bulk of the revenue is from support contracts. And a lot of people do that because if your business depends on this, you want to protect your supply chain

Starting point is 00:08:54 and we can sell them a support contract, which is a lot cheaper than them hiring somebody to support it themselves. So why not hire the experts, right? Right. The ones with all the tests. And if we're doing our job well, they never call us, you know? That's right.

Starting point is 00:09:10 How does that play into the makeup of the business then? Like when you think about growing the business, essentially you have to make worse software, right? To some degree, right? Software that requires, you know. Yeah, that requires maintenance. That's right. In order to sell more maintenance contracts, we have to deliberately introduce bugs. Okay, I'm not sure. I don't want to go there. That's not the way I want to do it. People, I have talked to a number of people who have made

Starting point is 00:09:37 a lot of money in the software business, and they look at what we're doing, and they say, oh, Richard, you could make a lot of money doing this. Let me show you how. Yeah. And they're probably right. I don't doubt that if they had been the manager of this project, we would have made a lot of money. But, you know, I'm just – I'm not gifted that way. That's not who I am. Right.

Starting point is 00:09:59 I'm much more the hacker. You know, lock me in a room with a computer and push pizzas under the door and leave me alone. So, the business – we've kept the business small. It's not a promise, but we want to support SQLite until the year 2050. And, you know, you have to be careful. And that changes your way of thinking. We want to make sure that everything we do is sustainable in a business sense. Yeah.

Starting point is 00:10:22 Yeah. So you still slinging code? Yeah, absolutely. Every day, pretty much every day. Yeah. Nice. What's your discipline towards that? Do you have a time block in your calendar? It's two o'clock, time to code? No, no. I do code on an as-needed basis. Which is daily, apparently. Well, it just depends on when things come up. I mean, customers will write in with questions or, you know, I'll think of an idea. I'll be out running.

Starting point is 00:10:47 And I think, this is a feature we really need. And then I'll cut the run short and come home and clean up and get busy coding. There you go. It's just. And you'll test it for two weeks. Or a month or whatever. Yeah. So how big is the company?

Starting point is 00:11:00 Like how many people are working on this, this support contract supporting? I've got three guys working on it with me right now. And we're all distributed, so it's always been that way. I'm kind of living the dream. If that's what you like doing, why not keep doing it?

Starting point is 00:11:19 Is there any plans for a SQLite cloud? There are other companies working on that as we speak. Gotcha. Yeah, so one thing that has changed, or maybe hasn't changed, but Adam and I have become aware of this, is last time we talked, 2016,

Starting point is 00:11:35 of course it was already pervasive, right? It's already out there in tons of things. But it's not client-server. And so, I guess what you'd call server-side, write-heavy, web-server-style usage is really the place where SQLite wasn't playing quite as much, because you would switch to a Postgres or something at that point. But it seems like a lot of people are taking it more seriously, even for backends on web servers nowadays. We know Ben Johnson has his Litestream project,

Starting point is 00:12:06 which is streaming replication. So there's tooling around, hey, I actually want to use this in a production capacity on a web server or a web application backend. Whereas it didn't seem like people were doing that then, or maybe they just weren't talking about it as much, they're doing it and talking about it now. Yeah, so SQLite was originally designed to be more of the database engine for the edge

Starting point is 00:12:26 of the network. Yeah, like embedded. Versus the core of the network. It's out on the peripheral devices, not in the core data center. But, for example, I can talk about now Bloomberg. Their entire organization runs off

Starting point is 00:12:42 of SQLite. Now, it's a customized version of SQLite called Comdb2. They have their own storage engine, which spans multiple data centers and is highly redundant. But the SQL query planner and executor is all SQLite. And then Expensify uses a stock version of SQLite to run everything. Dave Barrett, the founder of the company, wrote this product called Bedrock. And he open sourced it. It's out there on GitHub.

Starting point is 00:13:18 It's sort of a wrapper around SQLite. His idea is that he builds a server for the application that is doing the database processing, and the front-end devices, they don't speak SQL directly. They call essentially stored procedures. And so you don't have any concern with SQL injection because everything is done with stored procedures. But this server thing, Bedrock, uses SQLite for all of its underlying processing. He's published stuff where he's getting like, I think, 3 million transactions per second. It's incredible. Yeah.
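The stored-procedure idea described above can be sketched in a few lines. This is a minimal illustration using Python's standard `sqlite3` module, not Bedrock's actual API: the procedure names (`add_expense`, `total_for_user`), the `call` dispatcher, and the `expenses` table are all hypothetical. The point is only that clients name an operation and pass values, and the values are bound as parameters rather than spliced into SQL text.

```python
import sqlite3

# Hypothetical "stored procedures": the only operations a client may invoke.
def add_expense(db, user, amount_cents):
    # Values are bound with ? placeholders, never spliced into the SQL text.
    db.execute("INSERT INTO expenses(user, amount_cents) VALUES (?, ?)",
               (user, amount_cents))
    db.commit()
    return "ok"

def total_for_user(db, user):
    row = db.execute(
        "SELECT COALESCE(SUM(amount_cents), 0) FROM expenses WHERE user = ?",
        (user,)).fetchone()
    return row[0]

PROCEDURES = {"add_expense": add_expense, "total_for_user": total_for_user}

def call(db, name, *args):
    """Dispatch a client request to a named procedure; reject anything else."""
    if name not in PROCEDURES:
        raise ValueError(f"unknown procedure: {name}")
    return PROCEDURES[name](db, *args)

def open_db():
    """Create an in-memory database with the illustrative schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE expenses(user TEXT, amount_cents INTEGER)")
    return db
```

Because a client can only pick from `PROCEDURES`, a string such as `"alice'; DROP TABLE expenses; --"` is stored as an ordinary user name instead of being executed.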

Starting point is 00:13:56 It's an insane amount of volume. So there are cases of that. But still, I think the predominant use case is cell phones and Raspberry Pis and the internet of things. Does your business then have a relationship with Expensify and Bloomberg and this open source project you mentioned? We do. Yes. Okay. We support it for them and a few other companies like that, some of which wish to be public and others which don't.

Starting point is 00:14:25 And that's fine. I mean, we're happy to work either way. I think what's interesting here is just a side note on this really, is this sort of desire or this one-way thinking that because you built the database that's amazing and widely used, that it has to be this massive company or it has to have $2 million in recent funding with billions of

Starting point is 00:14:45 dollars of venture, you know, of valuation, you know, like this, that's the way you have to do it. And I love that you push back on, I mean, based on what you say here, that you push back on the idea that you said you're not equipped for that and that you like the small company feel, you know, you like to code every day, you know, that you're not influenced out of your norm, out of your comfort zone, your love, your passion, to build a company you don't actually want to run. Yeah, it's hard to know exactly what to do. But I have made that choice, and it's worked out really well.

Starting point is 00:15:17 Now, who knows? Maybe I would have been happier another way, but we'll never know, right? I'm happy now, and so I guess that's what counts, huh? Yeah. You can't go back and fork your life at that point and just run both tracks and see which one would have worked out better, but. No, everything's worked out really well. And we've been able to solve a lot of problems for a lot of people, and it's been just an amazing journey. One of the great things is I've been able to go out and visit so many different companies and so many different cultures and see so many different styles of development. It's really been an eye

Starting point is 00:15:49 opener. I would have never imagined that there was such a diversity of corporate cultures and development styles out there. Jared mentioned Litestream and Ben Johnson. What are your thoughts on Ben's project in particular, this idea that you can, you know, use replication with SQLite and be done with that? Yeah, you know, I think it's an interesting idea. We actually, Dan, one of the other developers, and I, had a Jitsi conversation with Ben at one point, and we really appreciate what he's doing.

Starting point is 00:16:22 He's not the only one doing that, let me say. There are other groups that are working on that as we speak. You know, I think it's a great idea. I really applaud him doing it. Whether or not it gets traction, takes off, I can't predict. I just don't know. I want to keep what we're doing here with us focused on the Internet or the database for the edge of the network. I don't personally want to get involved with making it massively scalable like that. I think

Starting point is 00:16:53 it's a great thing. It's a very important problem that needs to be solved. But just what we have now is enough to keep us busy. And if I try and take on too much, we would lose focus and we'd start making mistakes. You have to find the right balance there. And right now, SQLite is pushing the limits of what a small team like this can reasonably control. To go further, I would no longer be able to understand everything that's in the code. And we'd have to start delegating and who knows where that might lead. I don't think that I would be very good at that, and I don't think that I would enjoy that, so we're not going to do that. Stay focused on the small.

Starting point is 00:17:34 Stay focused on one thing that we can do well. That gives people like Ben an opportunity to do their thing as well. We're contributing to him. Well, it creates an ecosystem around the thing versus you having to be the ecosystem, which I think is healthy and, like you said, it's opportunity. Do you ever see things out there that people are doing with

Starting point is 00:17:53 SQLite or building on top of or around, similar to Litestream, where you think, either I wish I would have thought of that, or actually, I am going to take this one and put it into the code base? You ever done that? Yeah.

Starting point is 00:18:14 I can't call specific instances to mind, but yeah, I'm always watching what other people are doing and thinking, well, that's a good idea. We should try and do that. Or how can we make SQLite solve that problem directly rather than having this add-on? The thing to watch right now is DuckDB. I don't know if you've seen that one. I have not. Duck? DuckDB. Okay.

Starting point is 00:18:30 It's a column store instead of a row store. So it's optimized for big aggregate queries. And so if you've got a large set of data and you're running analytics on it, they say DuckDB runs a lot faster. And DuckDB has borrowed a lot of the ideas that we pioneered with SQLite where they do amalgamation. It's just a single file of source code. I think they stole our command line client and just reused it, which is fine. I'm cool with that. Let them do that. Well, it's public domain, so you better be cool

Starting point is 00:19:01 with it, right? Yeah, of course. So, you know, that's inspired me to think about, well, can we have a column store option for SQLite as well? What would that look like? How can we build that out in a backwards compatible way so that it, you know, it doesn't break legacy applications? Yeah. Because a big part of what we do is the SQLite file format is very carefully defined and we guarantee that it's going to be unchanged for years to come, or at least not changed in incompatible ways for years to come through the year 2050. It'd be much easier to write a column store if we could go back and redo the file format. Right.
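The row-store versus column-store distinction mentioned above can be shown with a toy example. This is only a sketch of the two layouts in plain Python, with made-up data; it says nothing about how DuckDB or SQLite actually store pages on disk.

```python
# The same three-column table stored two ways (toy data, illustrative only).
rows = [  # row store: each record's fields sit together
    ("widget", "east", 120),
    ("gadget", "west", 75),
    ("widget", "west", 200),
]

columns = {  # column store: each column's values sit together
    "product": ["widget", "gadget", "widget"],
    "region":  ["east", "west", "west"],
    "sales":   [120, 75, 200],
}

def total_sales_row_store(rows):
    # Must walk every whole record even though only one field is needed.
    return sum(record[2] for record in rows)

def total_sales_column_store(columns):
    # Touches only the one column the aggregate needs.
    return sum(columns["sales"])
```

Both functions return the same total; the difference is how much unrelated data an aggregate has to read past, which is why a column layout tends to suit analytics over large tables.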

Starting point is 00:19:44 There's lots of things I would have done differently knowing now, if I'd known back then what I know now. But we're kind of locked in by legacy. We need to support the literally trillions of SQLite databases that are already in the field. So how can we do that and do a column store at the same time? Couldn't you just have another file format that's like column store mode? And it's like, now it uses this file. Yeah, but then you've got added complexity. The other thing we need to balance is that because SQLite runs on small devices, we need

Starting point is 00:20:17 to be careful not to let the footprint of the library grow too big. There's been steady growth in the size of the library. We're pushing 600 kilobytes right now. That doesn't sound like very much. Yeah, these days it doesn't sound like very much. But back 15 years ago, folks like Nokia were just, and Motorola were just beating us up. Can you save another 100 bytes? You know, I mean, these days it's less of a concern, but at the same time, we just don't want to let it go wild and suddenly turn into a 10 megabyte library that you have to link into your application. So there's a balance there.

Starting point is 00:20:51 I mean, adding a column store means a totally new query planner. You know, how much extra space would that be? So, I mean, that's something that I'll be looking at in the coming year, coming couple of years probably. Well, here's a couple examples of application size. So here I'm looking at my iOS app updates. Zoom Cloud Meetings update, 86 megabytes. Audible update, 119 megabytes. Google Maps, this one will probably be big, 206 megabytes. So I feel like, you know, maybe that one dependency could be a little bit larger and nobody would notice. But point taken. Especially with the edge, too.

Starting point is 00:21:27 You got, you know, edge devices probably have SD card for the most part or smaller drive types that just don't have the capacity. You know, things like that that really come into play. Something that you kind of made me think of there was when I asked you before about the business and optimizing for needing support, I think actually you're optimizing for something worth supporting, you know? That's a good way of looking at it. Yeah. Because, you know, it's not worth supporting unless people are using it. Unless it's useful. Sure. You know, needing support is one thing, but being worth supporting is a different thing. Yeah. So I'm not very good at sales. And so in order to get customers, we really have to make it so that their business utterly depends upon SQLite.

Starting point is 00:22:10 Because it's just so stinking good, right? Yeah. So that encourages me to make it better all the time. Yeah. So the reason SQLite is so reliable is because I'm such a bad salesman. Ha ha! This episode is brought to you by our friends at LaunchDarkly, feature management for the modern enterprise, powering testing in production at any scale. Here's how it works. LaunchDarkly enables development teams and operation teams

Starting point is 00:22:50 to deploy code at any time, even if a feature isn't ready to be released to users. Wrapping code with feature flags gives you the safety to test new features and infrastructure in your production environments without impacting the wrong end users. When you're ready to release more widely, update the flag status and the changes are made instantaneously by the real-time streaming

Starting point is 00:23:08 architecture. Eliminate risk, deliver value, get started for free today at LaunchDarkly.com. Again, LaunchDarkly.com. So I think this will lead us into Fossil, but I wanted to touch briefly on althttpd because I saw this and it just made me laugh. Of course, Richard Hipp wrote his own web server to power sqlite.org. Tell us about this. I mean, I understand you like to write your own tools, but, you know, Apache existed, Nginx existed.

Starting point is 00:23:52 Maybe it was very young, but it existed. Well, no, no. Well, Apache existed when I first wrote this. Nginx was out there. But it was big and complicated, and I said, well, I'll just stand up Apache. We'll do that. I looked at the documentation. I read through the documentation multiple times.

Starting point is 00:24:10 And I said to myself, can I configure this in a way that will be secure? Maybe with some trial and error. But how would I know that it's secure? I wouldn't really know. I mean, you really have to spend some time and become an Apache expert to know that it's secure. Maybe they have better tools now, two decades on. But it occurred to me, in order to write something that I would really trust to run on my servers, I need to write it myself. And so, I put together althttpd. It's very, very simple. It's a single file of C code so that you can audit it and make sure that

Starting point is 00:24:47 it's not doing anything weird. And I put it up there and it works. I make no claim that it's the most efficient. It is not the web server that you want to deploy at scale. This is not the web server you want to use if you're building the next Facebook. But for small websites, it works great. It's the traditional fork-a-new-process-to-handle-each-HTTP-request design. So we handle one HTTP request. It calls exit, and the operating system cleans up the mess. And so that's really simple, secure. We don't have to worry about memory leaks or anything like that

Starting point is 00:25:25 and it handles the load fine. I mean, it's not a huge load, though. We're getting, what, 10 HTTP requests per second, about 20% of which are CGI requests. And so that's fine. You know, a Linode will handle that without any trouble. Would it be more efficient to do it with Nginx? Maybe, but this works. And so I'm going to stick with it. I'm not recommending that you go out and deploy this on your website. But if you want something quick and easy to set up that you can read in a couple of hours and understand, it's out there. You're welcome to use it.
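The fork-per-request design described above looks roughly like the following. This is a Unix-only sketch in Python rather than the C that althttpd is written in, and it is not althttpd's code: the helper names (`make_listener`, `serve_one`, `handle_request`) and the fixed reply are assumptions made for illustration.

```python
import os
import socket

RESPONSE = (b"HTTP/1.0 200 OK\r\n"
            b"Content-Type: text/plain\r\n"
            b"Content-Length: 6\r\n"
            b"\r\n"
            b"hello\n")

def handle_request(conn):
    """Read one request and send a fixed reply (no parsing; sketch only)."""
    conn.recv(4096)
    conn.sendall(RESPONSE)

def serve_one(listener):
    """Accept one connection and handle it in a freshly forked child."""
    conn, _addr = listener.accept()
    pid = os.fork()
    if pid == 0:
        # Child: handle exactly one request, then exit. The OS reclaims
        # memory and file descriptors, so leaks cannot accumulate.
        try:
            listener.close()
            handle_request(conn)
            conn.close()
        finally:
            os._exit(0)
    # Parent: drop its copy of the connection and reap the child.
    # (Waiting here keeps the sketch simple; a real server would reap
    # children without blocking the accept loop.)
    conn.close()
    os.waitpid(pid, 0)

def make_listener(port=0):
    """Listen on localhost; port 0 lets the OS pick a free port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(16)
    return s
```

A server loop would simply call `serve_one(listener)` repeatedly. The cost is one process per request, which is the efficiency trade-off acknowledged in the conversation.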

Starting point is 00:26:04 So I wrote it back around the year 2000. It's over two decades old. I put it under – it sort of lived in other version control systems for a while. I split it out as its own project only just recently. So don't get the idea that I wrote it just recently. We've been using this for decades. It says on the website that it's been in use since 2004 and NGINX was released in 2004. So I thought NGINX existed, but maybe when you originally wrote it. Maybe it did exist. I just had never heard of it. Yeah. That's entirely the case. Have you

Starting point is 00:26:36 ever heard of not-invented-here syndrome? Yeah. And you could make the case that I have a lot of that in me. I think maybe it leads us a little bit into Fossil, but go ahead, continue. Yeah. Oh yeah. You know, I tend to write a lot of my own stuff and maybe this is just because for me, it's easier to write my own than to figure out how somebody else's works. This came up with SQLite when SQLite version 1, we're on version 3 of SQLite, which came out in 2004. Version 1, the storage engine was GDBM, the GNU Database Manager. It was a key value store.

Starting point is 00:27:13 It was hashing based. It was GPL'd, so we needed something better. And I thought, oh, well, I'll get Berkeley DB and I'll use that as the storage engine. And I spent literally two days studying the documentation, trying to figure out how it worked, and the documentation's okay. But there were a lot of corner cases that I needed to

Starting point is 00:27:30 understand, and I recognized that in order to understand these corner cases, I'm either going to have to read the entire source code to BerkeleyDB, or I'm going to have to write a bunch of test programs to see what it does really. And I thought, you know what? It's going to be easy to write my own. I'll just write my own storage engine. And so I did. And I got lucky that worked out well in the end because having control of your own storage engine, it allows you to do optimizations and features

Starting point is 00:27:59 that you couldn't do if you had to maintain compatibility to somebody else's API. So these sorts of things help a lot. With althttpd, I can do things on the website that I can't do easily with Nginx and Apache because it does things that they don't do. And so I can't really easily convert the website over to those now because I'd have to recode it to the Apache/Nginx style.

Starting point is 00:28:24 Do you have a for instance? Like something that you can do there? Well, with althttpd, there's no configuration file. You just point it to a directory that contains your content. And if the files

Starting point is 00:28:39 in that directory are executable, they're CGI. And if they're not executable, they're static content. Okay. So any executable file can live there. You can throw a PHP script in there or a Ruby file

Starting point is 00:28:52 and it will just run it like a CGI. Or run it like a CGI. Yeah. Sounds kind of dangerous. So you don't put executables there that you don't want. I just messed up with that. But the other thing is it also drops itself into a chroot jail.

Starting point is 00:29:09 So the executables you put there need to be statically linked because they're not going to be able to find the shared libraries in /lib that they need. So you statically link them, and you put just a few that you really do need, like Fossil. Like Fossil. Yeah. It's also got one use case, too, which is your use case. So it can be that strict, whereas mainstream might be like, that's kind of painful. Right. But I've never tried to push it.
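The "executable means CGI, otherwise static" rule described above can be sketched as a permission-bit check. This is an illustration in Python, not althttpd's actual C code; the function name `classify` and its return strings are hypothetical.

```python
import os
import stat

def classify(path):
    """Return 'cgi' if the file has any execute bit set, else 'static'."""
    mode = os.stat(path).st_mode
    if not stat.S_ISREG(mode):
        return "not-a-file"  # directories and the like are neither
    is_executable = mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return "cgi" if is_executable else "static"
```

With this rule, the file system permissions are the configuration: `chmod +x` turns a file into a program to run, and removing the bit turns it back into content to serve. That is also why the warning in the conversation applies, since anything executable in the content directory becomes runnable.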

Starting point is 00:29:37 I've never tried to publish it or never tried to get other people to use it. A few other people have downloaded it and use it, and they say it's great. And if that works for them, that's wonderful. But I wrote it for my own use, and if nobody ever else uses it, it's still been a great job. The other thing is every now and then we get these very pernicious robots that come invading the website and trying to bring the server down. And because I control the web server, I can just put a little test in there that identifies the malicious robot. And whenever I see one, I call exit. Are you just detecting a certain request signature or a user agent?

Starting point is 00:30:15 Or how do you do that? IP address? It depends on the robot, yeah. So you've been doing like a tower defense game you've been playing all these years. Yeah, it's a whack-a-mole because there are always new ones coming up. Oh, I played a lot of whack-a-mole in my day. But there was one a few years ago that it tried to pretend to be an ordinary web browser, but in the user agent string, they'd misspelled one of the words. Gotcha.

Starting point is 00:30:40 So I just looked for that misspelling in the user agent string, and if I see that misspelling, call exit. You're done. You're done. Is there anything you learned, though, along this journey? Like you mentioned writing your own software. It may not be what everyone else might do. But is there any lessons you've learned in particular of writing this web server that you've been able to apply to SQLite or to Fossil, which we'll talk about?
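As a rough sketch of the robot test described here: the check is just a substring match on the user agent string. The misspelling `Mozlila` below is a hypothetical stand-in, since the actual misspelled word is not given in the conversation, and `handle_request` is an invented name. The real server calls exit on a match, which works when each request is handled by its own process; this sketch returns a decision instead so it can be run safely.

```python
# Hypothetical marker: the real misspelled word is not stated in the interview.
MISSPELLED_MARKERS = ("Mozlila",)


def is_malicious(user_agent: str) -> bool:
    """True if the user agent contains a known-bad misspelling."""
    return any(marker in user_agent for marker in MISSPELLED_MARKERS)


def handle_request(headers: dict) -> str:
    """Decide what to do with a request based only on its User-Agent header."""
    user_agent = headers.get("User-Agent", "")
    if is_malicious(user_agent):
        # A process-per-request server could simply exit here.
        return "drop"
    return "serve"
```

For example, `handle_request({"User-Agent": "Mozlila/5.0 (X11; Linux)"})` yields `"drop"`, while a correctly spelled agent string is served.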

Starting point is 00:31:03 What have you learned doing it that may be a lesson that you wouldn't have learned otherwise? I can't point to specific lessons. I do find that it does work well to control your own tools. One, if you do a diff between althttpd and the web server that's built into Fossil, you'll find a lot of commonality there. Okay, so they kind of borrowed heavily between the two.

Starting point is 00:31:32 But what I've found is that when you control your own tools, you can go further and do things that you can't do if you're depending on somebody else for your tools. And I won't use althttpd as the example, but rather Lemon, the parser generator that I use in SQLite. Now, most people, when they're doing a language parser, they'll bring up Yacc or Bison. But I'd written my own version back in the 1980s because I was dissatisfied with the interface for Yacc. And I used that for SQLite.

Starting point is 00:32:05 And I've had it out there for open source for a long time, and nobody ever noticed it until it appeared in SQLite. But by using Lemon as the parser generator, I was able to add new features to Lemon to support language features in SQLite that would just not be possible to do with Yacc. So, for example, when you use a new keyword, just recently in SQLite, we added the materialized keyword.

Starting point is 00:32:33 But suppose there's somebody with a schema out there and they've got a column named materialized. If that became a proper keyword, then suddenly when they tried to read their database in, it wouldn't be able to parse the schema because it was using a keyword as a column name. That wouldn't work. So we have this feature in Lemon so that if it sees a keyword in a context where it thinks it needs an identifier and it can't use the keyword there, it will change the keyword into an identifier and use it as an identifier. You can't do stuff like that in Yacc, but because we control the parser generator, we can pull little tricks like that and maintain backwards compatibility.
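The keyword-to-identifier fallback can be illustrated with a toy hand-written parser. This is a minimal sketch of the idea only, not how Lemon or SQLite implement it: the grammar (a bare `CREATE TABLE name (col, ...)`), the keyword sets, and the function names are all assumptions made up for the example.

```python
import re

KEYWORDS = {"CREATE", "TABLE", "MATERIALIZED"}
# Assumption for illustration: keywords that may degrade to identifiers.
FALLBACK_TO_ID = {"MATERIALIZED"}


def tokenize(sql: str) -> list:
    """Split SQL into (kind, text) pairs; keywords get their own kind."""
    tokens = []
    for text in re.findall(r"[A-Za-z_]\w*|[(),]", sql):
        if text.upper() in KEYWORDS:
            tokens.append((text.upper(), text))
        elif text in {"(", ")", ","}:
            tokens.append((text, text))
        else:
            tokens.append(("ID", text))
    return tokens


def parse_create_table(sql: str) -> tuple:
    """Parse 'CREATE TABLE name (col, ...)' and return (name, columns)."""
    tokens = tokenize(sql) + [("EOF", "")]
    pos = 0

    def take(kind: str) -> str:
        nonlocal pos
        found, text = tokens[pos]
        if found != kind:
            raise SyntaxError(f"expected {kind}, got {text!r}")
        pos += 1
        return text

    def take_identifier() -> str:
        nonlocal pos
        kind, text = tokens[pos]
        # The fallback: a listed keyword is accepted where an ID is required.
        if kind == "ID" or kind in FALLBACK_TO_ID:
            pos += 1
            return text
        raise SyntaxError(f"expected identifier, got {text!r}")

    take("CREATE")
    take("TABLE")
    name = take_identifier()
    take("(")
    columns = [take_identifier()]
    while tokens[pos][0] == ",":
        take(",")
        columns.append(take_identifier())
    take(")")
    take("EOF")
    return name, columns
```

With this sketch, `parse_create_table("CREATE TABLE t (id, materialized)")` succeeds and treats `materialized` as a column name, while `CREATE TABLE table (a)` still fails because `TABLE` is not in the fallback set. An old schema keeps parsing even after the word becomes a keyword.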

Starting point is 00:33:15 And we were also able to optimize the code generated by the parser generator so that it runs very fast since a big part of the time for an SQL database engine is actually parsing the SQL. I like that principle because something I've learned over the years is certain jobs require certain tools, basically. And it's kind of what you're saying, but sometimes when you have the right tool, hard jobs become easy. And if you control your tool, then you can have the right tool to make a hard job easy, essentially. Sure. And think back years ago, I mean, the concept of a tool and die maker,

Starting point is 00:33:53 you know, companies that had a big staff of tool and die makers, they could make their own machinery and they could out-compete. If you had to buy your machinery from somebody else and it just came as is, you had to make do with whatever they had. But if you can make your own tools, you can fine-tune your processes and out-compete. Well, it's not just the market being able to offer the tooling, too. It's all the effort that goes into it. Survey the options, evaluate the options, test the options, deploy the options, maintain the options, and then if that thing doesn't suit a future need, re-evaluate the options and rinse and repeat the thing.

Starting point is 00:34:32 Yeah, you don't want to make all your tools. I mean, I am using other people's operating systems and compilers. What else, though? Because, I mean, you told us last time, you wrote your own editor, so you go to that depth. Is there any tools beyond your OS and Bedrock that you do use? And you're like, this is actually good enough for me. I like Zed, or I like this browser.

Starting point is 00:34:56 What are some tools that you use that you don't feel compelled to write? Well, I use commercial web browsers. I normally use Firefox, but I'll use Chrome or Safari on occasion as well, or some of the other ones like Brave. I certainly use the standard compilers. Linux, Mac, Windows, use all of those. Did you write your own spreadsheet? Did you write your own? No, no. I use NeoOffice, OpenOffice. Excel's actually really winning, even in enterprise today. There's a lot of stuff about people trying to overturn the use of Excel because of the way work has changed, and they can't kill it, basically.

Starting point is 00:35:33 It survives. Well, it's so malleable and powerful. It's very powerful. I see a lot of people use Excel as they use it for making documents. It's not just a spreadsheet. It's a formatting engine. And a database, you know. And a database, absolutely.

Starting point is 00:35:51 So, yeah, you use the tools that are appropriate, but I have my own text editor. I have my own web server, my own parser generator, my own version control system, etc. So, yeah, I keep threatening to do my own email transfer agent, and I've actually put work into that, and that turns out to be a really, really hard problem. That's a harder problem than writing a database, actually, because of all the legacy you have to support. But I'm really dissatisfied with what we have available in terms of email systems. And so if you want to host your own email, that's kind of hard to do these days.

Starting point is 00:36:34 It's super challenging. I mean, it takes so much work to do that. We'd actually just logged something. I can't think of all the details, but they were like giving a walkthrough of how essentially to host your own email and all the things you would have to do. And I'm just like, no, I mean, it's just so much, you know, it's just so much. I put an enormous number of hours into trying to come up with a single unified system that will simplify that in some way. I don't have anything to show for that yet. It's a hard problem. Still working on it.

Starting point is 00:37:01 Well, the cool thing, though, is the law of numbers, essentially. If you keep writing your own tools, sure, SQLite has been the winner of the tool. It's what you built your company around. It's where you and your team get your livelihood from. But there may be the next big thing behind a tool you decide to make your own. Like this editor. Are you the sole user of it? I think I'm the sole user. Yeah, yeah.

Starting point is 00:37:27 None of the other team members use it. They all use VI or Emacs. Yeah, but you never know. You never know, right? You never know. And I never expected SQLite to go viral like it did. That was a complete surprise to me. If I were you and I wrote an editor,

Starting point is 00:37:44 I would name it Hipp, H-I-P-P. That's such a cool name for an editor, right? Yes. What's it called? Well, I call it just the letter E because it's easy to type. Okay. That's editor, yeah.

Starting point is 00:37:55 Yeah, E for editor. That's easy. Yeah. E for easy. Easy does it. I think you should release that thing and just let the world decide, you know? Okay. I think you'll be disappointed.

Starting point is 00:38:08 I'm very easily impressed. So you're not going to tackle mail quite yet because there's a lot there. Oh, I've been working on it. I've had no success at it. Yeah, you just haven't gotten a tackle. It's a tough nut. But you have tackled, as we've said and teased up, version control, source control management, software configuration management, Fossil. Tell us the story of Fossil because you've been working on it.

Starting point is 00:38:31 This is not a new thing. You've been doing this for a very long time, not as long as SQLite, but they're kind of symbiotic. You're probably the only person since, I don't know, did the Mercurial people hang it up at this point? Are they still working on Mercurial? No, I think Mercurial's still viable. They're still making additions and releasing new features and so forth. Okay, that's cool.

Starting point is 00:38:52 So there's not just you versus Git, but there's lots of people that just Git has won the mindshare. There's Git, and then there's Fossil and a bunch of others, yeah. So tell us what existed when you started Fossil. Was Git there?

Starting point is 00:39:07 I mean, SVN was probably the mainstay. Maybe it was before this. Tell us the history. All right. So when I first started writing SQLite, everything was CVS. And I know that CVS has a bad rap with moderns because Linus had some very bad things to say about it. And, you know, most of the criticisms of CVS are correct. I mean, it's not good. On the other hand, I'm unwilling to say

Starting point is 00:39:31 anything bad about CVS because I had to use the things that came before. And if you'd ever used the version control systems that came before CVS, you'd think CVS is really great. So, but yeah, it has its issues. And so we started out with SQLite and CVS because back in 2000, that's what everybody's using for everything. And that went on for a while, but I recognized that it was inadequate.

Starting point is 00:39:56 And Git had just started to come out. It really hadn't gotten the traction it has now. Mercurial was out and it was still an open question, you know, do I use Git or Mercurial? And this was a, that was a big, big debate back then. This was before GitHub. And I had been doing some work with, on SQLite with some avionics companies. And I'd come to understand this quality standard called DO-178B. And this is used, the quality standard used in avionics. And I thought, well, I'm going to apply this to SQLite. And part of the DO-178B standard is version control

Starting point is 00:40:32 or source control management. And I looked at the requirements that they have. And in my opinion, which doesn't really count for much, but my opinion was that neither Git nor Mercurial really filled the bill here. And I thought, well, I'm going to do my own. The other one that had influenced me was called Monotone. And Monotone, if you've never heard of it, I think was one of, as far as I know, it was the first version control system that was Git-like in the sense that it used SHA-1 hashes to name everything. And I was influenced by Monotone as well. But I wanted a version control system that would,

Starting point is 00:41:14 one, it would work easily from behind a shared hosting environment. This was before the age of ubiquitous virtual private servers. Back then, when you wanted to lease space on a server, they just gave you a shell account, and you had your home directory, and you put your stuff in your – they ran Apache for you, and it just pointed to your directory and did its thing. So I wanted something that I could run out of just a simple shared hosting account like that. And nothing was available. And I wanted something that would meet the standards of DO-178B as I understood them. And there was nothing available. So I thought, well, shoot, I'll just write my own. So I played around with it for a couple of years. I started working on it about even before Git came out.

Starting point is 00:42:05 And then Git came out, and I kept working on it. And I think it was about two years after Git came out that Fossil became self-hosting. And the same principle as Git in the sense that you have immutable artifacts that get added in. And we were using SHA-1 at the time as well. And you've got a directed graph design and you commit things to it and other people can commit simultaneously and everybody has a copy of everything. All of that's all the same. Now we have different names for things, but it works very much the same. But we have some very different concepts and a very different focus. Git is very much designed for Linux kernel development. And if you're a Linux kernel developer, Git is absolutely

Starting point is 00:42:53 the best version control system in the world. It is perfectly designed for that role. But SQLite has a very different development environment. With Linux, you've got thousands of people around the world working on this simultaneously. And then they upload their changes and it goes through layers of review and administration. And Linus does not want to see every check-in that's made by every hacker that wants to contribute to the kernel.

Starting point is 00:43:20 He wants summarized and vetted patches to consider to go into the main line. And Git's ideally suited for that. But SQLite development is very different. It's a small team. Everybody knows each other. Everybody sees everybody else's work all the time. And

Starting point is 00:43:39 Fossil is very much optimized for that use case. So with for example Git, when you make some changes, you make your changes, then you push them up to somebody else. Where with Fossil, the default configuration is every time you commit a change, it automatically pushes your changes up so that everybody else can see them right away. Is it still distributed or is it client-server? It's still distributed, but when you're on network, it behaves as if it's client-server because as soon as you do a commit, it immediately pushes your changes out to the server if that server is available.

Starting point is 00:44:19 And so if your system catches on fire, you haven't lost anything. I remember a few years ago, that actually happened to Linus. He had caught fire or somehow went inoperable, and he lost a couple of days' worth of commits or something. I don't remember the details of the story. Wow. Because he wasn't pushing it out to another server until he got ready. Whereas with Fossil, that's kind of automatic. That would never happen. And which approach you want to take, I guess, really depends on what you're trying to

Starting point is 00:44:49 do and what your development style is. As it happens, the Fossil development style exactly suits what SQLite wants to do. And the Git development style exactly suits what the Linux kernel wants to do. So apart from those minor differences, they're really kind of the same thing. The storage is quite a bit different. Of course, Fossil keeps all of its data in an SQLite database. So Fossil was designed to control the SQLite source code. And it uses SQLite to store all of its information. So I'll let you and your listeners ponder that recursion later. It's kind of double self-hosted. Yeah, it's sort of – there's this little loop here.

Starting point is 00:45:37 But that's really worked out really well for us because – and I didn't plan this. It just worked out that Fossil has become a great dogfooding opportunity for me. Because Fossil is a big user of SQLite, when I'm working on Fossil, I see SQLite from the point of view of a user of SQLite, not as a developer. And it's happened many times where developers come to me and say, oh, we need this feature, we need that feature. And I'm thinking to myself, I try to be nice to people, but I think to myself, stop whining.

Starting point is 00:46:13 You don't need this. But then a few weeks later, I'll be working on Fossil and I'll see things from the application developer's perspective and think, you know, it really does need that after all. And then I'll go back and put it in. And then apologize.

Starting point is 00:46:30 No, apologize never. No, why would we do that? We're a full team on Sanjit that's featured out there. Just kidding. Yeah, it does. It really makes a huge difference to be able to experience SQLite from the application developer's perspective. It changes your whole view. And in fact, it takes me about a day to switch between developing three products because I'm looking at the world from a very different lens when I'm developing SQLite versus when I'm developing Fossil. Wow. So you can't context switch back and forth very easily. Not easily. It's hard. It's a big context swap for me to do that.

Starting point is 00:47:09 I tend to spend days working on one or the other rather than flipping back and forth between the two. So that's been a very good thing. The other big difference, I guess, is Fossil does try to... People talk about Git and Mercurial as they're distributed. Well, Fossil is distributed too in the sense that everybody has copies of all of the files. But Fossil is non-distributed in a good sense of the word. It's not just the source files that it controls.

Starting point is 00:47:41 It also controls your bug tickets, your wiki, your forum, your chat room, and you can hyperlink between all of these things, and it manages them all together. And it keeps everything in a single file on disk. So Fossil is non-distributed in the sense that you only have one place to go to find all of your tools and all of your files. Whereas if you're using another system, whatever that might be, you've got this system for version control and oh, I'm pulling in the wiki from here and I've got that.

Starting point is 00:48:14 And oh, we're using this bug tracking system and we've got a separate webpage for that. You might have slightly different looks and feels. If you're using Markdown as your markup language, you've probably got three or four different dialects of markup that get involved. Whereas with Fossil, it's all together. It's all in one file. And there's one place to go in the web to see it all.

Starting point is 00:48:40 Yeah, so is that one file per project then? One file per project. Okay. So if I have two, I have a SQLite, and I also am working on Fossil, they'll have separate files, like the two projects' source code. Yes, they are separate files. Now, Fossil does have a feature that it keeps track of all of your Fossil repositories. So one thing that I like about it is the Fossil All command, A-L-L.

Starting point is 00:49:06 So if I'm getting ready to go off network, take my laptop off network for some reason, I can go on my laptop and I can say fossil all sync. And it'll go and sync every single repository that's on my laptop, pulling down all the latest changes. Then I can go off network, do lots of work on multiple projects. Then I go back on network and do fossil all sync, and it will, again, sync everything that's on that laptop and push it back out to the cloud. So it does keep track of all of your repositories, but each repository is itself distinct.

Starting point is 00:49:49 And is the way that it handles branching, merging, conflict resolution, is that all, would that be familiar to Git users or not? That's going to be familiar. It does have the difference that Fossil retains the names of the branches. That's part of the synced logic. So with Git, I'm not sure how Mercurial works, but with Git, Git doesn't have branch names. It only remembers the names of the leaves of the graph. And it infers branches based on those leaves.

Starting point is 00:50:22 Fossil actually names every branch. And every check-in, every commit, there's a tag on it that shows what branch are you a part of. And so that's part of the historical record. So everybody's talking about the same branch. With Git, if you've got multiple people working on the same project, everybody's got their own master or main or whatever they call it these days. But with Fossil, we use the term trunk. And there's only one trunk. And if you talk about trunk, everybody's talking about the same thing. If we're talking about branch version 3.26.0, then everybody's talking about the

Starting point is 00:50:56 same branch. So the branch names are part of what gets synced. But other than that, the whole idea is the same. You have separate branches and people go off and work on branches and then we merge the branches onto trunk. The thing is, because it's hosted with relational database, we can follow branches forward in time in addition to backwards in time. If you think about it with Git, if you know a check-in, it's really easy to find the check-ins that came before. But if, say, you've bisected and landed on a check-in, or say a customer's coming and says, hey, we're having trouble with this check-in, you can't easily find out what came afterwards,

Starting point is 00:51:37 what things were added to this check-in later in time. You have to go searching the Git log or do some stunts like that, and the GUIs don't typically provide you with this information because it's hard to find. Because the internal data structure, it has a pointer to the ancestors, to the things that came before, but there are no pointers going forward in time because the check-in is immutable. And at the time of the check-in, you don't know what's going to come next. But if you store this information in a relational database, then you can create an index and you can follow that index forward in time. And so, given a point in time, we can see what's going on in all branches simultaneously, both forwards and backwards. It's a very powerful feature to maintain situational awareness. And I talk to Git users and say, oh, I don't need that. I've never used that.
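Here is a minimal sketch of why a relational store makes forward traversal cheap, using Python's built-in `sqlite3` module. The table name `link`, its columns, and the check-in names are invented for the illustration and are not claimed to match Fossil's real schema. Each immutable check-in only records its parents, but an index on the parent column lets a query walk the graph in the other direction.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny check-in graph: c1 -> c2 -> c3 -> c4, with branch b1 off c2."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE link(parent TEXT NOT NULL, child TEXT NOT NULL);
        CREATE INDEX link_by_parent ON link(parent);  -- walk forward in time
        CREATE INDEX link_by_child ON link(child);    -- walk backward in time
    """)
    db.executemany(
        "INSERT INTO link(parent, child) VALUES (?, ?)",
        [("c1", "c2"), ("c2", "c3"), ("c2", "b1"), ("c3", "c4"), ("b1", "c4")],
    )
    return db


def descendants(db: sqlite3.Connection, checkin: str) -> list:
    """Everything that came after the given check-in, on any branch."""
    rows = db.execute("""
        WITH RECURSIVE later(id) AS (
            SELECT child FROM link WHERE parent = ?
            UNION
            SELECT link.child FROM link JOIN later ON link.parent = later.id
        )
        SELECT id FROM later ORDER BY id
    """, (checkin,)).fetchall()
    return [row[0] for row in rows]


def ancestors(db: sqlite3.Connection, checkin: str) -> list:
    """Everything that came before the given check-in."""
    rows = db.execute("""
        WITH RECURSIVE earlier(id) AS (
            SELECT parent FROM link WHERE child = ?
            UNION
            SELECT link.parent FROM link JOIN earlier ON link.child = earlier.id
        )
        SELECT id FROM earlier ORDER BY id
    """, (checkin,)).fetchall()
    return [row[0] for row in rows]
```

In this toy graph, `descendants(db, "c2")` finds the later check-ins on both the main line and the side branch, and `ancestors(db, "c3")` walks back to the root. The two queries are symmetric, which is the point: once the links sit in an indexed table, "what came next" costs no more than "what came before".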

Starting point is 00:52:31 And, you know, fair enough, but I never needed Bisect until I had the capability, and now I can't live without it. Once you start using this powerful feature, being able to see what comes next, what came after this check-in, it's hard to go back. So you mentioned the Git GUIs don't make it easy. Does Fossil have a GUI itself? Fossil has a built-in web interface.

Starting point is 00:52:56 So if you're working from the command line, you can type just Fossil space UI and that will automatically bring up a web browser pointed at your repository. So it's running, it's got a web server running there in the product, and it automatically brings up your web browser and points it at the homepage. And then you can click down through that.

Starting point is 00:53:17 And the web interface, I mean, Mercurial has the command hg serve, which is a similar concept. But with Mercurial, hg serve doesn't automatically bring up your web browser. You have to type hg serve, and then over somewhere else, you have to type a URL into your web browser to get it going. And the web interface is not nearly as rich. With the Fossil web interface, you can see everything you need to do. You can see all your tickets. You can see your you need to do. You can see all your tickets. You can see your wiki. You can get very detailed listings of branch history and diffs and blames and all of this. And so that is essentially your GUI is the web interface. And the nice thing is that then when you set up a server,

Starting point is 00:53:59 if you want to, you don't have to have a server to use Fossil. You can do it peer-to-peer. But if you do set up a server, you have the exact same interface on your server. You run this same web interface, and you get exactly the same views on the server as you do on your local machine. And the way it's set up, when you do Fossil UI, it's got a little mini web server running locally, but you can also run it from CGI or SCGI or whatever hosting mechanism you prefer. Same interface, either way. This episode is brought to you by our friends at Square. For our listeners out there building applications with Square, if you haven't yet, you need to check out their API Explorer.

Starting point is 00:54:50 It's an interactive interface you can use to build, view, and send HTTP requests that call Square APIs. API Explorer lets you test your requests using actual sandbox or production resources inside your account, such as customers, orders, and catalog objects. You can use the API Explorer to quickly populate sandbox or production resources in your account. Then you can interact with those new resources inside the seller dashboard. For example, if you use API Explorer to create a customer in your production or sandbox environment,

Starting point is 00:55:18 the customer is displayed in the production or sandbox seller dashboard. This tool is so powerful and will likely become your best friend when interacting with, testing, or playing with your applications inside Square. Check the show notes for links to the docs, the API Explorer, and the developer account signup page, or head to developer.squareup.com slash explore slash square to jump right in. Again, check for links in the show notes or head to developer.squareup.com slash explore slash square to play right now. So, back to the... I'm going to hop us back to the branching and merging, if you don't mind.

Starting point is 00:56:09 One thing that I do often is throw stuff away, you know? Yeah, you've hit upon the point of contention, haven't you? Yeah, you did this in person. Yes, so I wrote this famous article called Rebase Considered Harmful, which has created a lot of ire amongst people. It is a difference in philosophy, and I try and understand other people's point of view, and I have come to appreciate the rebase point of view more as people have pushed back. So a lot of people use Git not so much as a version control system, but as a distributed versioned file system. The difference here is subtle, but yeah, if you're doing a distributed versioned file system,

Starting point is 00:57:00 oftentimes you want to delete files, which is kind of what rebase or throwing things away does. And if that's what you're doing, that makes sense. It really does. But my view of version control, which came out of this DO-178B document that I referred to earlier, is that you always keep everything. There's no way to delete stuff. Now you can shuttle stuff off into a branch that's labeled mistake or something if it doesn't work out. Mistake one, mistake two, mistake three.

Starting point is 00:57:34 We have lots of that. Actually, well, one of the things is because it's a relational database backing it up, it's okay to have multiple branches with the same name. Now that can get confusing to humans, but the database doesn't care. Okay. It's really cool with that. So we have lots of branches named mistake actually. And you can move stuff onto a branch after you've checked it in. You can attach and you do this without changing the check-in in any way. You

Starting point is 00:58:05 just add a new tag to that check-in that says, oh, I want you in this branch, not the one I put you in. So that happens a lot. We'll put something up there and say, oh, that was a boo-boo. Let's move this off into the mistake branch. And if you go searching on the mistake branch, you'll find lots of entries there. Just call it trash. Or You could call it trash if you wanted to. Call it whatever you want. Call it whatever you want. You can also add a tag to these check-ins that say that they're hidden

Starting point is 00:58:33 so that they don't show up on normal timelines and things. You can still dig in and find them if you're doing forensic analysis, but they would be hidden from Common View. And so this is just a difference in philosophy is that we believe in keeping everything. And this is going to store all of history, the good, the bad, and the ugly. There was a situation I saw with Git in particular, which maybe in this case would be bad,

Starting point is 00:58:58 that someone had actually included copyrighted code into an open source project. And they were faced with litigation essentially, or at least the threat at that time. And so they had to go into the Git repository and perform some Git-fu, which required experts and people who could go through all the different things, essentially, more than your average Git user would do.

Starting point is 00:59:22 You had to get a witch doctor who knows the incantation. Yeah, the witch doctor. Somebody who really knew Git. We do have that capability. It's a system called purging. Or no, shunning. Excuse me. Shunning.

Starting point is 00:59:33 You can shun. Yes, you can shun artifacts. And so if somebody checks in something that is copyrighted and you get sued. Yeah. Or a developer goes rogue and checks porn into your repository, or a private access token or something, or whatever it is, you can go back and shun it. And it's the same drill where you need to bring in somebody with a large amount of Fossil-fu to make this happen. But it does happen, and we actually have had to do it once or twice, but

Starting point is 01:00:03 it doesn't come up in your daily routine. But it's possible. It's reserved for emergencies such as the one that you described. So it really depends on the development style. I really push for, look, record everything. Disk space is cheap. Other people say, well, I want to work by myself and get everything perfect, and then once it's all perfect, then I will push it up so that everybody else can

Starting point is 01:00:31 see it. I'm going to argue that's not the best way to do it. I think that you need to have the humility to push up your mistakes as well as your successes. It makes it a performance, really, right? Like a pull request can be a performance. Yeah. You've done all the work. You've prettified this thing. You've put up this great pull request. Yeah.

Starting point is 01:00:52 You've explained it very well. And it's a presentation. And it can be very performative in that case where it's like, I'm going to perform for my team rather than be who I really am, potentially the one who's bumbling and making mistakes. And maybe that mistake was actually a smart thing, you know, or a really dumb thing, but you never know. But it can become, it can essentially inject the requirement of performance in the flows of things.

Starting point is 01:01:17 And my view is I'm very much opposed to that because I would get sucked into that trap very easily because I want to always make myself look good. So Fossil is somewhat designed to force you to show your mistakes as well as your successes, which is important to me. I have to do that for myself. I don't think of it quite so much as an ego thing or a performance as it's signal versus noise. I mean, why would I want to give you all my noise when I could just hand you my signal? If you're doing noisy stuff, you can do that off in a branch.

Starting point is 01:01:50 And then once you're ready to, to blend it in with your, to blend with your coworkers, then you merge it into whatever they're working on. And the good point is there, if you go on vacation for two weeks or something happens to you and you land in the hospital for a few weeks, you know, I hope that never happens, but it could because it's on a branch and it's being checked

Starting point is 01:02:09 in and synced, your coworkers go into, oh, what was Jared working on? We got to take this over for him while he's recovering. And they can do that. Whereas if it's in your own little private branch, it's kind of dead for a while. Yeah, I've definitely seen that meme where it's like, in case of fire, and it's like, git push, then run out of the building kind of thing. Right, and that wouldn't happen with Fossil because everything's out. That's right. When the fire alarm goes off, first type git push, then exit the building. Yeah, I've definitely had those moments where I'm like, dang, I actually haven't pushed for a few days. I should go do that before my laptop dies and I regret it, you know? I've had those feelings. So I like that about Fossil. I definitely would like to not have that feeling. But I do also think there's

Starting point is 01:02:55 value in, I guess, maybe, I wouldn't call it the privacy, but like the cheapness of being able to just sling, and then be like, this doesn't have to ever go anywhere. Because maybe it's not going to go anywhere. Yeah. In fairness, I think yours is the majority view. Sure enough. Yeah, I think it is. But there's enough people out there that like my way of doing things

Starting point is 01:03:17 that we have a small but devoted following. I believe it. I like how everything's built in. I think it's more difficult to buy in as a user because there's so much. Like maybe I love Fossil's single file model and the things you're talking about, but I really hate the wiki or I really don't like the chat. You know, here's the thing, and I encourage people to do this. I wrote Fossil for SQLite, and if it accomplishes nothing but supporting SQLite, it's achieved its mission, and it's done that very well. And any

Starting point is 01:03:52 other use is just gravy. So look, even if you want to keep using Git, I'm fine with that. You're not going to hurt my feelings in any way. But it's worth it to you to study what we've done and look at the ideas and then take these ideas and move them to whatever other version control system that you're using. Say, hey, they had this cool idea over here. Why can't we do this in GitHub or GitLab? Why doesn't GitHub or GitLab support this? That will make your experience better, but maybe blended with your work style. That was my next question: how do you take some of these features

Starting point is 01:04:26 or really just ideas and transplant them to the Git world essentially, GitLab, GitHub, because it seems like something that's happened, I think with GitHub or GitLab and these centralized repositories, these places where a lot of people congregate essentially, which is great for the progression and innovation of our own software. We've seen a massive uptick in innovation because of GitHub over the last 12 years or more even.

Starting point is 01:04:56 I think they're 13 years old. use Fossil or even if they believe in your ideas, they've got to essentially ostracize themselves, eject from the norm, the social norm of where to code. And how do you share that code back to, I suppose, that world? I guess you could do like mirrors, right? You can run Fossil locally and do mirrors with GitHub or something like that, I guess, if you wanted to. Yeah, we have like a GitHub mirror for SQLite that's completely automated. I mean, every time somebody commits, it automatically goes into GitHub. And it's a funny thing: we do that for a client that is not actually using Git, but all of their import infrastructure assumes that everything's on GitHub. So we export to GitHub and then they import from GitHub into their own

Starting point is 01:05:43 proprietary version control system and use it from there. You don't have to give that world up then. You can live in the Fossil world, except for it's – how many letters then? F-O-S-S-I-L. That's six letters? A lot of people abbreviate it with just F or something. You said E for your editor. I'm assuming F for just Fossil, right?

Starting point is 01:06:00 You know, F push. I guess you could do that, or FX or something. People use different things. The key differentiator, I think, and one of the things that's really helped us to innovate in Fossil is the fact that it has a relational database behind it. Git is a key-value database, and that limits what you can do. So my idea is that, look, you could backfit a relational database into Git by just making another file in the .git directory. And whenever you want to use this relational database, it would look at the Git log and say, well, what's happened since I was last updated? And then it would have to go back and, oh, there's been three new commits since then. Let me pull those in and parse out all the information I need and build up my relational tables from that.
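
To make the idea above concrete, here is a minimal sketch in Python of what "building relational tables" from commit history might look like. It is not how Fossil or any Git tool actually does it: the commit list is hypothetical in-memory data standing in for parsed `git log` output, and the table names `commits` and `parents` are illustrative assumptions. The point is only that, once the parent links are in tables, "what check-ins came after this one?" becomes an ordinary recursive query.

```python
import sqlite3

# Hypothetical commit history as (commit_id, parent_ids) pairs; in practice
# these would be parsed out of something like `git log`.
COMMITS = [
    ("a1", []),
    ("b2", ["a1"]),
    ("c3", ["b2"]),
    ("d4", ["b2"]),        # a second child of b2, i.e. a fork
    ("e5", ["c3", "d4"]),  # a merge commit
]


def build_index(commits):
    """Load commits into relational tables (table names are illustrative)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE commits(id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE parents(child TEXT, parent TEXT)")
    for commit_id, parent_ids in commits:
        db.execute("INSERT INTO commits VALUES (?)", (commit_id,))
        db.executemany(
            "INSERT INTO parents VALUES (?, ?)",
            [(commit_id, parent) for parent in parent_ids],
        )
    return db


def descendants(db, commit_id):
    """Return every check-in that came after commit_id, via a recursive query."""
    rows = db.execute(
        """
        WITH RECURSIVE later(id) AS (
            SELECT child FROM parents WHERE parent = ?
            UNION
            SELECT p.child FROM parents AS p JOIN later ON p.parent = later.id
        )
        SELECT id FROM later ORDER BY id
        """,
        (commit_id,),
    ).fetchall()
    return [row[0] for row in rows]


def leaves(db):
    """Return commits that have no children, whether or not a branch names them."""
    rows = db.execute(
        "SELECT id FROM commits WHERE id NOT IN (SELECT parent FROM parents) ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `descendants(build_index(COMMITS), "b2")` walks forward through both sides of the fork, and `leaves` lists every tip in the graph, which is the kind of lookup that makes an unnamed head findable.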

Starting point is 01:06:53 And then let you use the database. But it would be completely backwards compatible. It would not change anything. It's just adding a new file to the repository. And then once you had a relational database in Git, you could very easily do things like say, what check-ins came after this one? It would completely eliminate the whole question

Starting point is 01:07:15 of a disconnected head. You would never again have a disconnected head because they would all be findable using the relational database. Richard, what if Nat Friedman was listening to this show right now? And he's like, you know what? I like these ideas.

Starting point is 01:07:28 I want to hire Richard. I want to borrow him, borrow his ideas. Well, he couldn't hire me. We could certainly talk and I would certainly be happy to give him these ideas and say, run with them and you do not need to give me credit. I would be thrilled if Git or GitHub or something would improve the usability so that people could

Starting point is 01:07:49 be more productive. I'm not going to move off of Fossil. It's ideally designed for the SQLite development environment. But if these ideas can be imported to other design methodologies, then that would be great. So there's a fellow named Patrick DeVivo

Starting point is 01:08:06 who has a website, askgit.com, and he has done a lot of work around, basically I think he is retrofitting a relational database around a Git repository's history. He allows you to basically query Git as if it was SQL. And I haven't looked at how he's doing it. I think he might be doing exactly kind of what you're describing. But I think the power that you're describing and having a relational database on your source

Starting point is 01:08:34 control history would allow for a lot of interesting mining and visualizations and connecting of the dots that you're describing. And he's doing some of that with Git, but he's having to add tools in order to provide that kind of a thing. Sure, sure. But once you get the relational database there, innovation tends to happen because, hey, we need a wiki.

Starting point is 01:08:57 Well, shoot, we've got this relational database. We'll just stuff it in there. Right. Or we need a forum. We'll just stuff it in the relational database. It's sitting right there. We'll just use it. If you build it, people will come, and lots of interesting things will happen.

Starting point is 01:09:12 If you were to do something like that, you can even use a different relational database other than SQLite, and you won't hurt my feelings; use.db if you want to. You're not going to make me mad. So one thing you did different with Fossil, we touched on it at the beginning of the show, is that you didn't go public domain. You went BSD-style license. Was that a reaction to something

Starting point is 01:09:32 that happened with public domain? Or why did you decide to switch? Because it's still very permissive, but obviously it's less permissive than public domain. I started out in GPL. And early on, within a year, I got requests from proprietary people, hey, we want to use this behind our firewall. And our lawyers say we can't use GPL because of the viral nature of it. And you can argue that that didn't make sense there, but it's easier to change the license than to argue with lawyers. So, um... Truth. You know, I got everybody who had contributed at that point. At that point there hadn't been that many contributors,

Starting point is 01:10:11 and I got everybody to sign a release to BSD. And so we re-licensed it to BSD, and that just allowed more people to use it, and in different... Public domain, it turns out, is hard to do. I didn't realize this when I started SQLite. I thought public domain would be really easy. I'd just say it's public domain and we're done. But there are many jurisdictions that discourage that or don't recognize that. And I didn't know this at the time. And there's actually a lot of paperwork that you have to go through to release your code to the public domain. Whereas we have the standard CLA, Contributor License Agreement, for people to contribute for a BSD-style license. So it just worked out better to go with

Starting point is 01:10:59 a traditional BSD-style license than trying to public domain again. Is it possible SQLite will change to non-public domain considering that? No. And this is just force of tradition and legacy. I think that it's always been public domain and we're going to keep doing it that way just because at this point it's too late to change. Maybe if I'd known now, known in 2004 what I knew now, or I guess 2002 when I did this,

Starting point is 01:11:35 if I'd known in 2001, 2002 when I did this what I know now, I would have done it differently. But no, we've got too much legacy behind it now. It's 20 years of tradition in public domain, so we're going to do that. I even went to the trouble of... there's a set of standard licenses, and they have codes. I forget what this is called. But I got the software blessing that is at the head of every SQLite source code file, I got that registered as one of the acceptable licenses so that the automated tools that were scanning things would see this and say, oh, that's okay.

Starting point is 01:12:10 We can accept that. Actually, I had a whole show on the first one was you may do good, not evil, which really made it challenging for a maintainer to maintain the software. Eventually, it went by the wayside. And it actually had a massive change in, I guess, their contribution and others to it because of the whole, what is good, what is evil? How can you really, you know? It seems black and white in terms of opposites, but it was just difficult to actually put into practice. Yeah, and the blessing on SQLite is not a requirement. It's literally a blessing. It says, may you do good and not evil. It doesn't say you must.

Starting point is 01:12:49 That's the difference there. Truth. Yeah. Good point. It's like grace versus law there. Yeah. There you go. Absolutely.

Starting point is 01:13:06 Well, one thing you do say in regards back to the license, you said, quote, you are free to steal bits of the Fossil source code to use in other projects, including proprietary projects. That means that you're not really holding these ideas to you and others can use these ideas essentially. Absolutely. Encourage other people to use it. So let me throw a startup idea at you and you tell me if it's good or bad. You're asking the wrong person, but I'll give you my opinion. Okay. It's one word, two syllables, three syllables: Fossil Hub. You know, there's this thing called... That's already been done. It's called, um, oh, why can't I call the name of it? Chisel. No, that's a good name. And it's hosted by Roy Keene. But the thing with Fossil, it's really designed for self-hosting. We make it really easy to set up your own Fossil server on a $5 a month VPS or on a spare Raspberry Pi that you happen to have lying around. It takes very little hardware to run Fossil.

Starting point is 01:14:05 I know some of these other systems, they say, oh, you've got to have at least a $40 a month VPS in order to support this. It's so heavyweight. But Fossil is very low resource. And so you can just plop this up there. So a single executable, you plop it on your machine, a two-line CGI script gets it running, and it just does everything for you. And so the motive for having a service like GitHub for Fossil is greatly reduced. Because if you were to just take raw Git or raw Mercurial and want to set up a collaborative development site like GitHub, that's a lot of work. GitHub provides a very valuable service. With Fossil, the amount of work to set this up is greatly reduced. And so the need for that is

Starting point is 01:14:52 also greatly reduced. Now, what people have told me, though, is that for some people who live in other countries, coming up with $5 per month in hard currency for a VPS is a hard problem, and for them, having access to a free repository like that is a big deal. But for those of us who are fortunate to live in the U.S. or other Western countries, it's probably easier just to set up your own. And then think of... all right, let me just, I'm coming back to this subject, but think with me just a second. If you talk with people that like to go backpacking, do you have any friends or do you like to go backpacking yourself or do you have friends that do that?

Starting point is 01:15:36 Yeah. And you go out in the wilderness and you're on your own for five days and people ask, why do you do that? And people say, well, it's the freedom. It's just enjoying being outdoors and having drinks. Think about this. Freedom means taking care of yourself.

Starting point is 01:15:54 That's what people like about backpacking and wilderness adventures is they go out and they're responsible for themselves, every aspect of their lives. They're carrying their house on their back and all of their food. That's what they like. Freedom means taking care of yourself. And Fossil tries to promote that.

Starting point is 01:16:09 It gives you the tools to make it easier for you to take care of yourself. Because you can take this one standalone binary, plop it on a server, add a two-line CGI script, and suddenly you've got a complete developer website up and running. Can you do that with other systems? Absolutely. But there's a lot more moving parts, a lot more you have to install and a lot more to maintain. I'd say that's pretty cool. Think of Fossil as your ultralight backpacking tent. There you go. Camp anywhere. It's not as nice as a Hampton Inn, but you're taking care of yourself. There's your new tagline, Fossil. Not as nice as the Hampton Inn, but you're taking care of yourself.
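
For readers wondering what the "two-line CGI script" mentioned above amounts to, here is a minimal sketch in Python that writes one. It assumes the usual shape of a Fossil CGI script, a `#!` line naming the Fossil binary followed by a `repository:` line naming the repository file; the helper name `write_fossil_cgi` and all paths are hypothetical, and a web server configured to run CGI scripts is still needed around it.

```python
import os
import stat


def write_fossil_cgi(cgi_path, fossil_binary, repository):
    """Write a two-line Fossil CGI script and mark it executable.

    Line 1 is the interpreter line pointing at the fossil executable;
    line 2 tells fossil which repository file to serve.
    """
    script = f"#!{fossil_binary}\nrepository: {repository}\n"
    with open(cgi_path, "w", encoding="utf-8") as handle:
        handle.write(script)
    # CGI scripts must be executable for the web server to run them.
    mode = os.stat(cgi_path).st_mode
    os.chmod(cgi_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return script
```

For example, `write_fossil_cgi("project.cgi", "/usr/bin/fossil", "/home/www/project.fossil")` would produce a file whose entire contents are those two lines; both paths here are placeholders.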

Starting point is 01:16:55 Taking care of yourself. I like it. That's the essence of freedom is taking care of yourself. Yeah. But there's also balancing that out as community, I think. And so the thing that GitHub has that's even, I think, better than Git, more powerful than Git, is that's where it's the hub part, right? Yep. And everybody, I'm going to use your analogy and kind of abuse it to a certain degree,

Starting point is 01:17:19 everyone wants to climb a mountain, but eventually they come back down to the base camp. You said backpack, now we get a base camp. You want to hang out with people and you want to see what they're doing. Is there any way with Fossil to at least federate or have a directory or like, here's my cool open source stuff, here's my Fossil instance, here's Adam's Fossil instance,

Starting point is 01:17:42 he's out there over there. Let's get together and collaborate because that's what I think. I think that's the magic on GitHub. Yeah. Federation is interesting, Jared. Sure. I agree with you. And now if you talk to the people at GitHub, they will be quick to tell you that their company is not about Git. It's about Hub. Yeah. Absolutely. And I agree a hundred percent. It's a place for people to gather and collaborate.

Starting point is 01:18:07 And they're quite open about the fact that, well, they started on Git, but they stay with Git simply because that's what everybody uses. If Git were to vanish tomorrow and everybody were to go to Mercurial or Monotone, GitHub would change. But it would still be the same company because it's about the hub. Yeah.

Starting point is 01:18:25 And so, yeah, I think it'd be really cool if GitHub allowed you to have Fossil repositories. Yeah. That would be interesting. I don't think that'll happen. How would that work then? And, you know, cast some vision for how that might work. How could you have a repository on GitHub that was not a Git repository? What would it take to make that happen

Starting point is 01:18:47 behind the scenes? I don't understand their infrastructure enough to really say, but I know that SourceForge allows different kinds of repositories, don't they? Yeah. How do they do that? I'm not sure. I think you can actually

Starting point is 01:19:02 have Fossil repositories on SourceForge, if I'm not badly mistaken. I've never done that myself. The underlying data model of Git and Fossil is the same. You've got commit objects, and you've got file objects, and the commit objects link together to form a directed graph. And you walk the graph to pull out the pieces you need. So the underlying data model is the same. Now, the details of the file formats are completely different, but the overall concepts are the same. So it seems like you should be able to

Starting point is 01:19:35 use the same infrastructure to build a GitHub with Fossil. Yeah, you probably have to introduce an abstraction layer somewhere in there that says, here's my interface, and I'm going to put Fossil on one side of it and Git on the other, and it's going to unify to what their frontend does. Exactly. Frontend not meaning their web UI,

Starting point is 01:19:56 but everything that's in front of that layer. So there would be some work involved, but... Yeah, a lot of work, which is why I think it'll probably never happen, but... It would be just as easy then to fossilize Git, right? To borrow some of the ideas of Fossil that you talked about, the relational database, some of the different principles and practices that you live upon, that, if they agree, might carry over. Maybe you make the backwards and forwards history. I mean, because how many times do people get stuck behind some Git issues that sure seem to be solved by some of the things you've made simple with Fossil? Or the running out of the building on fire, git push, you know. And I mean, there's

Starting point is 01:20:37 certain things like streaming the Git repository to GitHub or whatever. Like, there could be some ideas that you've laid claim to that could be translated: fossilized Git, maybe. I think that would be a better solution, because what I hear a lot is people, they look at Fossil, it's really cool, but it doesn't have rebase.
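
The abstraction layer discussed a moment ago, one interface with Fossil on one side and Git on the other, can be sketched roughly as follows in Python. Everything here is hypothetical: `RepositoryBackend` and `InMemoryBackend` are illustrative names, and a dictionary stands in for a real Git or Fossil reader. It only illustrates the shared data model described earlier, commit objects linked into a directed graph that you walk to pull out what you need.

```python
from abc import ABC, abstractmethod


class RepositoryBackend(ABC):
    """Hypothetical interface a hosting front end could code against."""

    @abstractmethod
    def parents(self, commit_id):
        """Return the list of parent commit ids for commit_id."""


class InMemoryBackend(RepositoryBackend):
    """Stand-in for a real Git or Fossil reader; holds the graph in a dict."""

    def __init__(self, graph):
        self._graph = graph

    def parents(self, commit_id):
        return list(self._graph.get(commit_id, []))


def ancestors(backend, start):
    """Walk the commit graph from start back to the roots, visiting each commit once."""
    seen, order, stack = set(), [], [start]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue  # merges make some commits reachable by more than one path
        seen.add(commit)
        order.append(commit)
        stack.extend(backend.parents(commit))
    return order
```

Code written against `ancestors` never learns which system produced the graph, which is the property a hosting service would need in order to put two different version control systems behind the same front end.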

Starting point is 01:20:55 That's the number one complaint. Well, just take the cool features out of Fossil and land them in Git, and then you've got rebase. And all of your old tools continue to work. All of your build infrastructure that depends on Git continues to work as it did before. But you've got cool features like git ui, and it brings up a web browser and points it at your repository.

Starting point is 01:21:20 It gives you a cool timeline. Or git all sync, that goes around and finds all of your Git repositories and syncs all of them. That would be cool, honestly. I mean, you know, you're essentially at a repository level for most commits, I mean, every commit really. Like, you're not... Yeah. ...unanimously across all of your Git repositories inside of your code directory, which I think is probably standard for most developers. You got your home directory, your user, and you got a directory in there called code or source or something that you'd put all of your source code in. Then you've got multiple directories beneath that

Starting point is 01:21:51 which were basically individual repositories. But Git, my Git, doesn't know about any of those other things. It just knows about its own single repository. Right, and wouldn't it be cool to be able to sync them all with a single command? Yeah. Yeah, that'd be really cool. Especially if you're going off network.
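
As a rough illustration of the "sync them all with a single command" idea, here is a small Python sketch that scans a code directory for Git repositories and runs one command in each. The scanning approach and the function names are assumptions made for this example, not a description of how Fossil's own all-repositories commands work; `git fetch --all` is used as the default because it is an existing Git command that updates every remote without touching the working tree.

```python
import os
import subprocess


def find_git_repos(root):
    """Return sorted paths of directories under root that contain a .git entry."""
    repos = []
    for dirpath, dirnames, filenames in os.walk(root):
        # .git is normally a directory, but can be a file in linked worktrees.
        if ".git" in dirnames or ".git" in filenames:
            repos.append(dirpath)
            dirnames[:] = []  # do not descend into a repository's own tree
    return sorted(repos)


def sync_all(root, command=("git", "fetch", "--all")):
    """Run command inside every repository under root; returns {path: exit code}."""
    results = {}
    for repo in find_git_repos(root):
        results[repo] = subprocess.run(list(command), cwd=repo).returncode
    return results
```

Running `sync_all(os.path.expanduser("~/code"))` before boarding a plane would fetch every repository under a hypothetical `~/code` directory; any non-zero exit code in the result marks a repository that failed to sync.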

Starting point is 01:22:08 There's nothing worse than getting on the airplane with no Wi-Fi and suddenly remembering, oh, I failed to sync the one key repository that I need to make this work. And you can't work for those four hours or whatever it is. Exactly. Sucks to be that person. Yeah. So, um, I think that would be a great way to move forward. I really do. And I'm happy to contribute ideas to anybody who wants to undertake this. Anybody who's listening to this who wants to build these, go look and see what Fossil has. You don't have to agree with my view of

Starting point is 01:22:43 how things should be done, but look at the ideas and steal them. You don't even have to give me credit. fossil-scm.org if you're listening. It'll be in the show notes, of course, but that's a good place to start. Yeah. Take the ideas and run with them. I love it. Let me just throw this out here. A year ago, we used Markdown. Markdown has become the de facto language for documentation and so forth. I needed to draw diagrams in my Markdown documents, stick diagrams for architecture diagrams and stuff. And so I took the legacy language from 1980s Bell Labs called PIC, P-I-C, and I created my own implementation of it that works on the web. And it's called Pikchr, P-I-K-C-H-R. And so in the middle of a Markdown document, you can have just a little bit of code that does these elaborate diagrams.
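
To show roughly how a Markdown engine could hook in a diagram language this way, here is a minimal Python sketch that pulls out fenced code blocks tagged `pikchr` so they could be handed to a renderer. The sample document, the regular expression, and the function name are assumptions made for illustration; the three-line diagram is a made-up example, and actually rendering it to a picture would require the Pikchr implementation itself, which this sketch does not call.

```python
import re

# A Markdown document with one fenced block tagged "pikchr". The diagram
# source is a small illustrative example, not taken from the episode.
DOC = """# Architecture

~~~ pikchr
box "Fossil"
arrow
box "SQLite"
~~~

Some prose after the diagram.
"""

# Matches a fence of three or more backticks or tildes, an optional info
# string, the block body, and a closing fence identical to the opening one.
FENCE = re.compile(
    r"^(?P<fence>`{3,}|~{3,})[ \t]*(?P<info>[^\n]*)\n(?P<body>.*?)^(?P=fence)[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_blocks(markdown, language="pikchr"):
    """Return the bodies of fenced blocks whose info string starts with language."""
    bodies = []
    for match in FENCE.finditer(markdown):
        info = match.group("info").split()
        if info and info[0] == language:
            bodies.append(match.group("body"))
    return bodies
```

Because the diagram lives inside an ordinary fenced code block, a Markdown engine that knows nothing about the tag still shows the source as plain code, which is what makes this an allowed extension rather than a change to the language.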

Starting point is 01:23:37 It's a really cool feature. Pikchr was originally written for Fossil, but I put it out in a separate repository, which is mirrored on GitHub, with the hopes that other people who have their own Markdown engines would pick it up and integrate it into their Markdown implementations as well. It conforms with the Markdown standard for fenced code blocks. So it's not a language extension. Well, it's an extension in the sense of, it's an allowed extension that's specified in the

Starting point is 01:24:09 Markdown documentation. So if you want to look for ideas, please, please look at that. I wish you would adopt it. Yeah. Well, Richard,

Starting point is 01:24:17 it's good to have you back. I mean, it's been too many years, I think. I think we should make this more frequent if possible. I love just hearing your ideas. I love hearing, really, I think, your spirit.

Starting point is 01:24:25 The programmer spirit I think you bring. And the freedom you bring, the ideas you bring, this aspect of freedom, this aspect of blessing, just this aspect of giving, really. I love that about who you are. And I appreciate all that you've given this world and all the ideas you've shared here today. And it's been awesome. Thank you. Thank you for having me on the show. All right.

Starting point is 01:24:51 That's it for this episode of The Changelog. Thank you for tuning in. We have a bunch of podcasts for you at changelog.com you should check out. Subscribe to the master feed. Get them all at changelog.com slash master. Get everything we ship in a single feed. And I want to personally invite you to join the community at changelog.com slash community.

Starting point is 01:25:10 It's free to join. Come hang with us in Slack. There are no imposters and everyone is welcome. Huge thanks again to our partners, Linode, Fastly, and LaunchDarkly. Also, thanks to Breakmaster Cylinder for making all of our awesome beats. That's it for this week. We'll see you next week. Game on.

This week, Richard Hipp returns to catch us up on all things SQLite, his single file webserver written in C called Althttpd, and Fossil -- the source code manager he wrote and uses to manage SQLite development instead of Git.
