The Changelog: Software Development, Open Source - Richard Hipp returns (Interview)
Episode Date: August 19, 2021
This week, Richard Hipp returns to catch us up on all things SQLite, his single file webserver written in C called Althttpd, and Fossil -- the source code manager he wrote and uses to manage SQLite development instead of Git.
Transcript
What's up? Welcome back. I'm Adam Stachowiak, and you are listening to The Changelog.
On this show, Jared and I talk with the hackers, leaders, and innovators from all areas of the software world.
We face our imposter syndrome, so you don't have to.
Today on The Changelog, Richard Hipp returns to catch us up on all things SQLite,
his single-file web server written in C called Althttpd,
and Fossil, the source code manager he wrote and uses to manage SQLite development
instead of Git.
Big thanks to our partners,
Linode, Fastly, and LaunchDarkly.
We love Linode.
They keep it fast and simple.
Get $100 in credit at linode.com slash changelog.
Our bandwidth is provided by Fastly.
Learn more at fastly.com
and get your feature flags
powered by LaunchDarkly.
Get a demo at launchdarkly.com.
This episode is brought to you by Gitpod.
Gitpod lets you spin up fresh, ephemeral,
automated dev environments in the cloud in seconds.
And I'm here with Johannes Landgraf,
co-founder of Gitpod.
Johannes, you recently opened up your free tier
to every developer with a GitLab,
GitHub, or Bitbucket account. What are your goals with that? Thanks, Adam. As you know, everything we
do at Gitpod centers around eliminating friction from the workflow of developers. We work towards
a future where ephemeral cloud-based development environments are the standard in modern engineering
teams. Just think about it. It's 2021 and we use automation everywhere. We automate
infrastructure, CI/CD build pipelines, and even writing code. The only thing we have not automated
are developer environments. They are still brittle, tied to local machines, and a constant source of
friction during onboarding and ongoing development. With Gitpod, this stops. Our free plan gives devs
access to cloud-based developer environments for 50 hours per month.
Companies such as Google, Facebook, and most recently GitHub have internally built solutions
and moved software development to the cloud.
I know I'm biased, but I can fully relate.
Once you experience the productivity boost and peace of mind that automation offers,
you never want to go back.
Gitpod is open source, and with our free tier, we want to make cloud-based development available for everyone.
Very cool.
All right, if this gets you excited, learn more and get started for free at gitpod.io.
Again, gitpod.io.
All right, we have Richard Hipp here, a long-awaited return to The Changelog.
Richard, welcome back.
Thank you for having me.
So excited to have you back.
We first had you on the show back in 2016 talking SQLite, and I will pronounce that correctly and do my best.
There's no correct pronunciation.
You call it whatever you want.
Well, Adam was slapping my wrist yesterday because we were talking in prep for this, and I kept calling it SQLite.
He's like, now you know Richard pronounced it SQLite.
And I said, I just can't do it.
I'm just trying, but I'll do my best.
And ever since then, Richard, I've been on your side, like, out there just spreading the word how it's truly spoken. And I guess if you don't feel strongly about that, then we won't enforce
it. But you said the right way, so I've been following your rules. Well, I think we actually
broke news, and probably the most cited episode of ours out there on the internet is episode, is it
201? 201, yeah, with Richard Hipp: How You Pronounce SQLite. And we set the record straight.
And that's probably the most linked to episode.
Not only that, Richard, but we've had many people over the years say,
you've got to get Richard back on the show.
So we're happy to have you.
We're here to get an update on SQLite.
We're also here to talk about Fossil, which is your own SCM,
which does lots of interesting stuff.
What's an SCM, Jared?
See, I had to look this up because I thought it was source control management, but I think it's software configuration manager.
Richard, what's SCM stand for?
I always thought software configuration management, but source control management works too, I
guess.
Sort of.
I mean, well, I guess we'll find out that it does more than just source control, right?
Like it does a lot of things.
But also, do you configure software, software configuration? I don't know.
Neither one of them fits all that Fossil does.
SCMs, definitely a thing.
A thing that isn't discussed so much anymore because I think everybody
for the most part, except for you and your community, Richard,
are just using Git and GitHub.
It's fun to find out about an
alternative that's viable and long-lasting and beloved by those who use it. We're going to learn a
bunch about Fossil. Maybe we'll have some converts after this episode. But let's catch up with SQLite
first. It's been five years. It's probably still the most used software in the world, maybe second
place to C lib, or maybe curl is catching up. I don't know. There's
a few of those that are just ubiquitous. But what else? Is it on Mars? Yeah. Is SQLite on Mars?
Do you know? I don't know. But every time we have this conversation, somebody writes,
says, oh yeah, it's definitely here or there. It's in just about every electronic device you have.
It's in your car. If you've got a recent car, it's in most of your computers required
to boot up these days.
It's certainly in all of your phones.
I think that there's probably
more instances of SQLite
running than all other database engines
combined. Which is amazing to just
think about. It's scary.
Well, it's scary for you because you're the one
managing the configuration of the
software, right? Yeah. Well, it does change your worldview. I mean, suddenly it's like,
boy, I need to pay attention to this, don't I? Right. I can't mess this up. So does development move
at a slow pace because of that nowadays, or does it still move pretty fast? Or is SQLite pretty
mature, so that you don't do too much to it?
It has slowed from the early days,
but I mean, we still are adding a lot of features
and we do a lot of changes.
We don't talk about the rate of code churn very much
because that would scare people.
Because it's high?
It is for a piece of software that's used this widely
and is used so much.
But we do have, we actually spend most of our time testing it, you know,
because that's important.
Oh, a few years ago, we were talking with a young college graduate. It was a young woman, and
she was in software too, and she said,
well,
I just do testing.
I'm just a tester.
She was very self-deprecating
and I thought,
shoot,
that's all I ever do is test.
I spend all my day testing.
I'm just a tester.
Yeah.
Because people write in,
they'll have some issue
or we'll do a new feature
and adding the feature
takes an hour
and then we'll spend
weeks just testing it. And yeah, but even then, there is a lot of code churn. I know that
OpenBSD, for a while, adopted SQLite into their core
set of packages because it was being used, I think, for the search engine on their man pages.
But they wanted to stay up to date
and they feel compelled to do a code audit
for every line of code that changes.
Oh, wow.
And so we were changing SQLite faster
than the rest of the entire core package combined.
And they said, no, we just can't keep up.
So they had to write their own database engine for their...
Oh, they dropped you as a dependency.
Yeah, they had to drop it because the code churn was just too high.
Wow.
Don't you have in your license where you can, or is that with Fossil?
Did I misread that?
Where you can, I think your words were steal the code and use it however you want, even
for commercial use.
That's for Fossil.
Yeah.
Well, SQLite is public domain,
and you can do anything you want with that.
Right.
I wish I could say I'd thought of this.
It's kind of evolved.
I mean, we do have a lot of public tests that are out there
that are public domain as well,
but some of our test code is proprietary.
Some of it.
Why is that?
Because it was paid for by somebody?
Originally, we thought we were going to sell this and make money from it. And that's how we were going to support ongoing development.
That didn't really play out. Nobody ever bought it. It does sort of become our business value,
our intellectual property. I mean, you can take the SQLite code and fork it and start your own
thing. The tests. But you don't have the full test suite. You've got a lot of tests, but not all of
them. And so we've got a little bit of advantage over you there. So is most of your business income
from support contracts for SQLite? It's pretty much all support. We have some extensions like
the encryption extension that we'll sell to people on a license basis, but the bulk of the revenue is from support contracts.
And a lot of people do that
because if your business depends on this,
you want to protect your supply chain
and we can sell them a support contract,
which is a lot cheaper than them hiring somebody
to support it themselves.
So when I hire the experts, right?
Right.
The ones with all the tests.
And if we're doing our job well, they never call us, you know?
That's right.
How does that play into the makeup of the business then?
Like when you think about growing the business,
essentially you have to make worse software, right?
To some degree, right?
Software that requires, you know.
Yeah, that requires maintenance. That's right. In order to sell more maintenance contracts,
we have to deliberately introduce bugs. Okay, I'm not sure. I don't want to go there.
That's not the way I want to do it. People, I have talked to a number of people who have made
a lot of money in the software business, and they look at what we're doing, and they say,
oh, Richard, you could make a lot of money doing this. Let me show you how.
Yeah.
And they're probably right.
I don't doubt that if they had been the manager of this project, we would have made a lot of money.
But, you know, I'm just – I'm not gifted that way.
That's not who I am.
Right.
I'm much more the hacker.
You know, lock me in a room with a computer and push pizzas under the door and leave me alone.
So, the business – we've kept the business small.
It's not a promise, but we want to support SQLite until the year 2050.
And, you know, you have to be careful.
And that changes your way of thinking.
We want to make sure that everything we do is sustainable in a business sense.
Yeah.
Yeah.
So you're still slinging code?
Yeah, absolutely. Every day, pretty much every day.
Nice. What's your discipline towards that? Do you have, like, a time block in your calendar? Is it, it's two o'clock, time to code?
No, no. I code on an as-needed basis.
Which is daily, apparently.
Well, it just depends on when things come up. I mean, customers will write in with questions or, you know, I'll think of an idea.
I'll be out running.
And I think, this is a feature we really need.
And then I'll cut the run short and come home and clean up and get busy coding.
There you go.
It's just.
And you'll test it for two weeks.
Or a month or whatever.
Yeah.
So how big is the company?
Like how many people are working on this, this support contract supporting?
I've got three guys working on it with
me right now.
And we're all distributed, so
it's always been that way.
I'm kind of living the dream.
If that's what you like doing,
why not keep doing it?
Is there any plans for a
SQLite cloud? There are other
companies working on that as we speak.
Gotcha.
Yeah, so one thing that has changed,
or maybe hasn't changed,
but Adam and I have become aware of this,
is last time we talked, 2016,
of course it was already pervasive, right?
It's already out there in tons of things.
But it's not client-server.
And so the, I guess what you'd call server-side, write-heavy, web-server-style usage is really the place where SQLite wasn't playing quite as much, because you
would switch to a Postgres or something at that point. But it seems like a lot of people are
taking it more seriously, even for backends on web servers nowadays. We know Ben Johnson has his Litestream project,
which is streaming replication.
So there's tooling around,
hey, I actually want to use this in a production capacity on a web server or a web application backend.
Whereas it didn't seem like people were doing that then,
or maybe they just weren't talking about it as much,
they're doing it and talking about it now.
Yeah, so SQLite was originally designed
to be more of the database engine for the edge
of the network. Yeah, like embedded.
Versus the core of the network.
It's out on the peripheral
devices, not in the core
data center.
But, for example, I can
talk about now Bloomberg.
Their entire organization runs off
of SQLite. Now,
it's a customized version of SQLite called Comdb2.
They have their own storage engine, which spans multiple data centers and is highly redundant.
But the SQL query planner and executor is all SQLite.
And then Expensify uses a stock version of SQLite to run everything.
Dave Barrett, the founder of the company, wrote this product called Bedrock.
And he open sourced it.
It's out there on GitHub.
It's sort of a wrapper around SQLite.
His idea is that he builds a server for the application that is doing the database processing, and the front-end devices, they don't speak SQL directly.
They call essentially stored procedures.
And so you don't have any concern with SQL injection because everything is done with stored procedures.
But this server thing, Bedrock, uses SQLite for all of its underlying processing.
He's published stuff where he's getting like, I think, 3 million transactions per second.
It's incredible.
Yeah.
It's an insane amount of volume.
So there are cases of that.
But still, I think the predominant use case is cell phones and Raspberry Pis and the internet of things.
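The stored-procedure pattern described above for Bedrock can be sketched in a few lines. This is not Bedrock's actual API: the `PROCEDURES` table, the `call` function, and the `expenses` schema are hypothetical names invented for illustration. The idea is only that clients name an operation and pass arguments, while the server alone holds the SQL and binds values as parameters instead of splicing them into the statement text.

```python
import sqlite3

# Hypothetical schema and procedure names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expenses (id INTEGER PRIMARY KEY, owner TEXT, cents INTEGER)"
)


def add_expense(owner, cents):
    # Values are bound with "?" placeholders, never formatted into the SQL text.
    cur = conn.execute(
        "INSERT INTO expenses (owner, cents) VALUES (?, ?)", (owner, cents)
    )
    conn.commit()
    return cur.lastrowid


def total_for(owner):
    row = conn.execute(
        "SELECT COALESCE(SUM(cents), 0) FROM expenses WHERE owner = ?", (owner,)
    ).fetchone()
    return row[0]


# The only operations a client may request; clients never send SQL.
PROCEDURES = {"add_expense": add_expense, "total_for": total_for}


def call(name, *args):
    if name not in PROCEDURES:
        raise ValueError(f"unknown procedure: {name}")
    return PROCEDURES[name](*args)


call("add_expense", "alice", 1250)
# A hostile-looking argument is stored as plain text and changes nothing else.
call("add_expense", "alice'; DROP TABLE expenses; --", 100)
print(call("total_for", "alice"))
```

Because the front end can only pick from a fixed set of named operations, there is no place for injected SQL to go, which is the property Richard attributes to this design.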
Does your business then have a relationship with Expensify and Bloomberg and this open source project you mentioned?
We do.
Yes.
Okay.
We support it for them and a few other companies like that, some of which wish to be public and others which don't.
And that's fine.
I mean, we're happy to work either way.
I think what's interesting here is just a side note on this really,
is this sort of desire or this one-way thinking that
because you built the database that's amazing and widely used,
that it has to be this massive company
or it has to have $2 million in recent funding
with billions of
dollars of venture, you know, of valuation, you know, like this, that's the way you have to do it.
And I love that you push back on, I mean, based on what you say here, that you push back on the
idea that you said you're not equipped for that and that you like the small company feel, you know,
you like to code every day, you know, that you're not influenced out of your norm,
out of your comfort zone, your love, your passion,
to build a company you don't actually want to run.
Yeah, it's hard to know exactly what to do.
But I have made that choice, and it's worked out really well.
Now, who knows?
Maybe I would have been happier another way, but we'll never know, right?
I'm happy now, and so I guess that's what counts, huh?
Yeah. You can't go back and fork your life at that point and just run both tracks and see which one
would have worked out better, but. No, everything's worked out really well. And when we've been able
to solve a lot of problems for a lot of people, and it's been just an amazing journey. One of the
great things is I've been able to go out and visit so many different companies and so many different
cultures and see so many different styles of development. It's really been an eye
opener. I would have never imagined that there was such a diversity of corporate cultures and
development styles out there.
Jared mentioned Litestream and Ben Johnson. What are your thoughts on Ben's project in particular, this idea of, you know, using a replication process with SQLite?
Yeah, you know, I think it's an interesting idea.
We actually, Dan, one of the other developers and I,
had a Jitsi conversation with Ben at one point,
and we really appreciate what he's doing.
He's not the only one doing that, let me say.
There are other groups that are working on that as we speak.
You know, I think it's a great idea.
I really applaud him doing it.
Whether or not it gets traction, takes off, I can't predict.
I just don't know.
I want to keep what we're doing here with us focused on the Internet or the database for the edge of the network.
I don't personally want to get involved with making it massively scalable like that. I think
it's a great thing. It's a very important problem that needs to be solved. But just what we have now
is enough to keep us busy. And if I try and take on too much, we would lose focus and we'd start making mistakes.
You have to find the right balance there. And right now, SQLite is pushing the limits of what
a small team like this can reasonably control. To go further, I would no longer be able to
understand everything that's in the code. And we'd have to start delegating and who knows where that
might lead. I don't think that I would be very good at that,
and I don't think that I would enjoy that, so we're not going to do that.
Stay focused on the small.
Stay focused on one thing that we can do well.
That gives people like Ben an opportunity to do their thing as well.
We're contributing to him.
Well, it creates an ecosystem around the thing versus you having
to be the ecosystem, which I think is
healthy and, like you
said, it's opportunity. Do you ever see things
out there that people are doing with
SQLite or building
on top of or around similar to
Litestream where you think, either
I wish I would have thought of that, or actually, I am going
to take this one and put it into
the code base? You ever done that?
Yeah.
I can't call specific instances to mind, but yeah, I'm always watching what other people are doing and thinking, well, that's a good idea.
We should try and do that. Or how can we make SQLite solve that problem directly rather than having this add-on?
The thing to watch right now is DuckDB.
I don't know if you've seen that one.
I have not.
Duck?
DuckDB.
Okay.
It's a column store instead of a row store.
So it's optimized for big aggregate queries.
And so if you've got a large set of data
and you're running analytics on it,
they say DuckDB runs a lot faster.
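The row store versus column store distinction can be illustrated with a toy in-memory layout. This is only a sketch of the idea, with made-up sample data; it says nothing about how DuckDB or SQLite actually arrange bytes on disk.

```python
# Hypothetical sample data: (id, region, amount).
rows = [
    (1, "east", 10.0),
    (2, "west", 20.0),
    (3, "east", 5.5),
]

# Row store: each record is kept together, so summing one field
# still walks over every whole record.
row_total = sum(record[2] for record in rows)

# Column store: each field is kept in its own contiguous sequence,
# so an aggregate only touches the one column it needs.
columns = {
    "id": [r[0] for r in rows],
    "region": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}
col_total = sum(columns["amount"])

print(row_total, col_total)
```

Both layouts give the same answer; the difference is how much unrelated data an analytic query has to read along the way.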
And DuckDB has borrowed a lot of the ideas that we pioneered with SQLite where they do amalgamation. It's just a single
file of source code. I think they stole our command line client and just reused it, which
is fine. I'm cool with that. Let them do that. Well, it's public domain, so you better be cool
with it, right? Yeah, of course. So, you know, that's inspired me to think about, well, can we have a column store option for SQLite as well?
What would that look like?
How can we build that out in a backwards compatible way so that it, you know, it doesn't break legacy applications?
Yeah. Because a big part of what we do is the SQLite file format is very carefully defined and
we guarantee that it's going to be unchanged for years to come, or at least not changed
in incompatible ways for years to come through the year 2050.
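One small, visible piece of that carefully defined format is the header: every SQLite 3 database file begins with the 16-byte string "SQLite format 3" followed by a NUL byte. The sketch below creates a throwaway database and reads those bytes back. The file name and the `read_header` helper are made up for the example.

```python
import os
import sqlite3
import tempfile


def read_header(path):
    # Return the first 16 bytes of the file, where the magic string lives.
    with open(path, "rb") as f:
        return f.read(16)


# Create a small database in a temporary directory (hypothetical file name).
tmpdir = tempfile.mkdtemp()
db_path = os.path.join(tmpdir, "example.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE t (x INTEGER)")
conn.commit()
conn.close()

header = read_header(db_path)
print(header)
```

A tool that checks this header can recognize a SQLite 3 database regardless of which library version wrote it.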
It'd be much easier to write a column store if we could go back and redo the file format.
Right.
There's lots of things I would have done differently knowing now, if I'd known back then what I know now.
But we're kind of locked in by legacy.
We need to support the literally trillions of SQLite databases that are already in the field.
So how can we do that and do a column store at the same time?
Couldn't you just have another file format that's like column store mode?
And it's like, now it uses this file.
Yeah, but then you've got added complexity.
The other thing we need to balance is that because SQLite runs on small devices, we need
to be careful not to let the footprint of the library grow too big.
There's been steady growth in the size of the library.
We're pushing 600
kilobytes right now. That doesn't sound like very much. Yeah, these days it doesn't sound like very
much. But back 15 years ago, folks like Nokia were just, and Motorola were just beating us up.
Can you save another 100 bytes? You know, I mean, these days it's less of a concern, but at the same
time, we just don't want to let it go wild and suddenly turn into a 10 megabyte library that you have to link into your application.
So there's a balance there.
I mean, adding a column store means a totally new query planner.
You know, how much extra space would that be?
So, I mean, that's something that I'll be looking at in the coming year, coming couple of years probably.
Well, here's a couple examples.
Application size. So here I'm looking at my iOS app updates. Zoom Cloud Meetings update: 86 megabytes. Audible
update: 119 megabytes. Google Maps, this one will probably be big: 206 megabytes. So I feel like, you
know, maybe that one dependency could be a little bit larger and nobody would notice. But point taken.
Especially with the edge, too.
You got, you know, edge devices probably have SD card for the most part or smaller drive types that just don't have the capacity.
You know, things like that that really come into play.
Something that you kind of made me think of there was when I asked you before about the business and optimizing for needing support,
I think actually you're optimizing for something worth supporting, you know? That's a good way of
looking at it. Yeah. Because, you know, it's not worth supporting unless people are using it.
Unless it's useful. Sure. You know, needing support is one thing, but being worth supporting
is a different thing. Yeah. So I'm not very good at sales. And so in order to get customers, we really have to make it so that their business
utterly depends upon SQLite.
Because it's just so stinking good, right?
Yeah.
So that encourages me to make it better all the time.
Yeah.
So the reason SQLite is so reliable
is because I'm such a bad salesman. Ha ha!
LaunchDarkly: feature management for the modern enterprise. Power testing in production at any scale.
Here's how it works.
LaunchDarkly enables development teams and operation teams
to deploy code at any time,
even if a feature isn't ready to be released to users.
Wrapping code with feature flags gives you the safety
to test new features and infrastructure
in your production environments
without impacting the wrong end users.
When you're ready to release more widely,
update the flag status and the changes are made instantaneously by the real-time streaming
architecture. Eliminate risk, deliver value, get started for free today at LaunchDarkly.com.
Again, LaunchDarkly.com. So I think this will lead us into Fossil,
but I wanted to touch briefly on Althttpd
because I saw this and it just made me laugh.
Of course, Richard Hipp wrote his own web server
to power sqlite.org.
Tell us about this.
I mean, I understand you like to write your own tools, but, you know, Apache existed, Nginx existed.
Maybe it was very young, but it existed.
Well, no, no.
Well, Apache existed when I first wrote this.
Nginx was out there.
But it was big and complicated, and I said, well, I'll just stand up Apache.
We'll do that.
I looked at the documentation.
I read through the documentation multiple times.
And I said to myself, can I configure this in a way that will be secure?
Maybe with some trial and error.
But how would I know that it's secure?
I wouldn't really know.
I mean, you really have to spend some time and become an Apache expert to know
that it's secure. Maybe they have better tools now, two decades on. But it occurred to me,
in order to write something that I would really trust to run on my servers, I need to write it
myself. And so I put together Althttpd. It's very, very simple. It's a single file of C code, so that you can audit it and make sure that
it's not doing anything weird. And I put it up there and it works. I make no claim for it to be
the most efficient. It is not the web server that you want to deploy at scale. This is not the web
server you want to use if you're building the next Facebook. But for small websites, it works great.
It's the traditional fork a new process to handle each HTTP request design.
So we handle one HTTP request.
It calls exit, and the operating system cleans up the mess.
And so that's really simple, secure.
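The fork-per-request design can be sketched as follows. This is not Althttpd's code (which is C and speaks HTTP over a socket); it is a minimal Unix-only illustration in which a pipe stands in for the network connection, and `handle_request` and `serve_one` are hypothetical names.

```python
import os


def handle_request(request_line):
    # Hypothetical handler: build a response body for one request.
    return f"handled {request_line}\n".encode()


def serve_one(request_line):
    """Fork a child to handle a single request, then reap it."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the work for exactly one request, then exit.
        try:
            os.close(read_fd)
            os.write(write_fd, handle_request(request_line))
        finally:
            # Exiting lets the OS reclaim everything the child allocated.
            os._exit(0)
    # Parent: collect the child's output and wait for it to finish.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), status


body, status = serve_one("GET /index.html")
print(body.decode(), status)
```

Since each child lives for one request only, any memory it leaks disappears when it exits, which is the cleanup-by-exit property described above. The cost is one process creation per request.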
We don't have to worry about memory leaks or anything like that
and it handles the load fine. I mean, it's not a huge load, though. We're getting,
what, 10 HTTP requests per second, about 20% of which are CGI requests. And so that's fine.
You know, a Linode will handle that without any trouble.
Would it be more efficient to do it with Nginx?
Maybe, but this works.
And so I'm going to stick with it.
I'm not recommending that you go out and deploy this on your website. But if you want something quick and easy to set up that you can read in a couple of hours and understand, it's out there.
You're welcome to use it.
So I wrote it back around the year 2000.
It's over two decades old.
I put it under – it sort of lived in other version control systems for a while.
I split it out as its own project only just recently.
So don't get the idea that I wrote it just recently.
We've been using this for decades.
It says on the website that it's been in use since 2004 and NGINX was released in 2004. So I thought NGINX existed, but maybe when you originally wrote
it. Maybe it did exist. I just had never heard of it. Yeah. That's entirely the case. Have you
ever heard of not-invented-here syndrome? Yeah. And you could make the case that I have a lot of
that in me. I think maybe it leads us a little bit
into Fossil, but go ahead, continue. Yeah. Oh yeah. You know, I tend to write a lot of my own
stuff and maybe this is just because for me, it's easier to write my own than to figure out
how somebody else's works. This came up with SQLite when SQLite version 1, we're on version 3 of SQLite, which came out in 2004.
Version 1, the storage
engine was GDBM, the GNU
Database Manager. It was a key value store.
It was hashing
based. It was GPL'd, so we needed something
better. And I thought, oh, well, I'll get
Berkeley DB and I'll use that as the storage
engine. And I
spent literally two days studying
the documentation, trying to figure out
how it worked, and the documentation's okay. But there were a lot of corner cases that I needed to
understand, and I recognized that in order to understand these corner cases, I'm either going
to have to read the entire source code to BerkeleyDB, or I'm going to have to write a bunch
of test programs to see what it does really. And I thought, you know what? It's going to be easy to write my own.
I'll just write my own storage engine.
And so I did.
And I got lucky that worked out well in the end
because having control of your own storage engine,
it allows you to do optimizations and features
that you couldn't do
if you had to maintain compatibility
to somebody else's API.
So these sorts of things help a lot.
With Althttpd, I can do things on the website that I can't do easily with Nginx and Apache
because it does things that they don't do.
And so I can't really easily convert the website over to those now
because I'd have to recode it to the Apache Nginx style.
Do you have a for instance?
Like something that you can do there?
Well, with
Althttpd,
there's no configuration file.
You just point it to a directory
that contains your content.
And if the files
in that directory are
executable, they're CGI. And if they're not executable,
they're static content.
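The "executable means CGI, otherwise static" rule needs no configuration file, only a look at the file's permission bits. The sketch below is an assumption about how such a check could be written, not Althttpd's implementation: `classify` is a hypothetical helper, and "executable" is taken to mean that any execute bit is set on a regular file.

```python
import os
import stat
import tempfile

EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def classify(path):
    """Return "cgi" for executable regular files, "static" otherwise."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode) and (mode & EXEC_BITS):
        return "cgi"
    return "static"


# Build a throwaway document root with one page and one script.
docroot = tempfile.mkdtemp()
page = os.path.join(docroot, "index.html")
script = os.path.join(docroot, "hello")
with open(page, "w") as f:
    f.write("<h1>hi</h1>\n")
with open(script, "w") as f:
    f.write("#!/bin/sh\necho 'Content-Type: text/plain'\necho\necho hello\n")
os.chmod(page, 0o644)    # readable, not executable: served as-is
os.chmod(script, 0o755)  # executable: would be run as CGI

print(classify(page), classify(script))
```

The trade-off is the one raised in the conversation: deployment is just copying files, but a stray execute bit turns a file into a program.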
Okay.
So any executable file can live there.
You can throw a PHP script in there
or a Ruby file
and it will just run it like a CGI.
Or run it like a CGI.
Yeah.
Sounds kind of dangerous.
So you don't put executables there
that you don't want.
I just messed up with that.
But the other thing is it also drops itself into a chroot jail.
So the executables you put there need to be statically linked, because they're not going to be able to find the shared libraries in /lib that they need.
So you statically link them, and you put just a few that you really do need, like Fossil.
Like Fossil.
Yeah.
It's also got one use case, too, which is your use case.
So it can be that strict, whereas mainstream might be like, that's kind of painful.
Right.
But I've never tried to push it.
I've never tried to publish it or never tried to get other people to use it.
A few other people have downloaded it and use it, and they say it's great.
And if that works for them, that's wonderful.
But I wrote it for my own use, and if nobody ever else uses it, it's still been a great job.
The other thing is every now and then we get these very pernicious robots that come invading the website and trying to bring the server down.
And because I control the web server, I can just put a little test in there that identifies the malicious robot.
And whenever I see one, I call exit.
Are you just detecting a certain request signature or a user agent?
Or how do you do that? IP address?
It depends on the robot, yeah.
So you've been doing like a tower defense game you've been playing all these years.
Yeah, it's a whack-a-mole because there are always new ones coming up.
Oh, I played a lot of whack-a-mole in my day.
But there was one a few years ago that it tried to pretend to be an ordinary web browser,
but in the user agent string, they'd misspelled one of the words.
Gotcha.
So I just looked for that misspelling in the user agent string,
and if I see that misspelling, call exit.
You're done.
You're done.
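The robot check described here amounts to a substring test followed by ending the per-request process. The transcript does not say which word was misspelled, so the marker below is a made-up stand-in, and `is_malicious` and `handle` are hypothetical names.

```python
import sys

# Hypothetical misspelling; the actual word is not given in the conversation.
BAD_MARKERS = ("Mozzila",)


def is_malicious(user_agent):
    # Flag any request whose User-Agent contains a known-bad marker.
    return any(marker in user_agent for marker in BAD_MARKERS)


def handle(user_agent):
    if is_malicious(user_agent):
        # In a fork-per-request server, exiting drops only this one request.
        sys.exit(0)
    return "200 OK"


print(handle("Mozilla/5.0 (X11; Linux x86_64)"))
```

This only works cheaply because each request runs in its own process; in a long-lived multi-request server the same check would have to close the connection instead of exiting.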
Is there anything you learned, though, along this journey?
Like you mentioned writing your own software.
It may not be what everyone else might do.
But is there any lessons you've learned in particular of writing this web server that you've been able to apply to SQLite or to Fossil, which we'll talk about?
What have you learned doing it that may be a lesson
that you wouldn't have learned otherwise?
I can't point to specific lessons.
I do find that it does work well to control your own tools.
One, if you do a diff between Althttpd
and the web server that's built into Fossil,
you'll find a lot of commonality there.
Okay, so they've kind of borrowed heavily between the two.
But what I've found is that when you control your own tools, you can go further and do things that you can't do if you're depending on somebody else for your tools.
And I won't use Althttpd as the example,
but rather Lemon, the parser generator that I use in SQLite.
Now, most people, when they're doing a language parser,
they'll bring up Yacc or Bison.
But I'd written my own version back in the 1980s
because I was dissatisfied with the interface for Yacc.
And I used that for SQLite.
And I've had it out there for open source for a long time,
and nobody ever noticed it until it appeared in SQLite.
But by using Lemon as the parser generator,
I was able to add new features to Lemon
to support language features in SQLite
that would just not be possible to do with Yacc.
So, for example, when you use a new keyword,
just recently in SQLite, we added the materialized keyword.
But suppose there's somebody with a schema out there
and they've got a column named materialized.
If that became a proper keyword,
then suddenly when they tried to read their database in,
it wouldn't be able to parse the schema because it was using a keyword as a column name.
That wouldn't work.
So we have this feature in Lemon so that if it sees a keyword in a context where it thinks it needs an identifier and it can't use the keyword there, it will change the keyword into an identifier and use it as an identifier.
You can't do stuff like that in Yacc, but because we control the parser generator, we can pull little tricks like that and maintain backwards compatibility.
And we were also able to optimize the code generated by the parser generator so that
it runs very fast since a big part of the time for an SQL database engine is actually parsing the SQL.
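[Editor's sketch] The keyword fallback described above can be modeled roughly as follows. This is a toy illustration in Python, not Lemon's generated code: the FALLBACK table and the resolve_token function are hypothetical names, and the real mechanism lives inside the generated parser. The second half uses Python's standard sqlite3 module to show the user-visible effect: a column named materialized is still accepted.

```python
import sqlite3

# Keyword token types that may be reinterpreted as a plain identifier.
# Hypothetical one-entry table for illustration only.
FALLBACK = {"MATERIALIZED": "ID"}


def resolve_token(token_type, acceptable):
    """Pick the token type a parser should use in its current state.

    `acceptable` is the set of token types the grammar allows right now.
    """
    if token_type in acceptable:
        return token_type
    fallback = FALLBACK.get(token_type)
    if fallback in acceptable:
        # The keyword is not legal here, so treat it as an identifier.
        return fallback
    raise SyntaxError("unexpected token " + token_type)


def column_named_materialized_works():
    """Create and query a table whose column is named like a keyword."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE t(materialized INTEGER)")
    db.execute("INSERT INTO t(materialized) VALUES (1)")
    return db.execute("SELECT materialized FROM t").fetchall()
```

The design point is that the decision is made per parser state, so the same word can be a keyword in one position and an ordinary name in another.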
I like that principle because something I've learned over the years is
certain jobs require certain tools, basically.
And it's kind of what you're saying, but sometimes when you have the right tool,
hard jobs become easy.
And if you control your tool, then you can have the right tool to make a hard
job easy, essentially.
Sure. And think back years ago, I mean, the concept of a tool and die maker,
you know, companies that had a big staff of tool and die makers, they could make their own
machinery and they could out-compete. If you had to buy your machinery from somebody else and it
just came as is, you had to make do with whatever they had.
But if you can make your own tools, you can fine-tune your processes and out-compete.
Well, it's not just the market being able to offer the tooling, too. It's all the effort
that goes into it. Survey the options, evaluate the options, test the options, deploy the options, maintain the options,
and then if that thing doesn't suit a future need,
re-evaluate the options and rinse and repeat the thing.
Yeah, you don't want to make all your tools.
I mean, I am using other people's operating systems and compilers.
What else, though?
Because, I mean, you told us last time,
you wrote your own editor, so you go to that depth.
Are there any tools beyond your OS and bedrock that you do use?
And you're like, this is actually good enough for me.
I like Zed, or I like this browser.
What are some tools that you use that you don't feel compelled to write?
Well, I use commercial web browsers.
I normally use Firefox, but I'll use Chrome or Safari on occasion as well, or some of the other ones like Brave.
I certainly use the standard compilers. Linux, Mac, Windows, use all of those.
Did you write your own spreadsheet? Did you write your own?
No, no. I use NeoOffice, OpenOffice.
Excel's actually really winning, even in enterprise today.
There's a lot of stuff about people trying to overturn the use of Excel
because of the way work has changed, and they can't kill it, basically.
It survives.
Well, it's so malleable and powerful.
It's very powerful.
I see a lot of people use Excel for making documents.
It's not just a spreadsheet.
It's a formatting engine.
And a database, you know.
And a database, absolutely.
So, yeah, you use the tools that are appropriate,
but I have my own text editor.
I have my own web server, my own parser generator,
my own version control system, etc. So, yeah, I keep threatening to do my own
email transfer agent, and I've actually put work into that, and that turns out to be a
really, really hard problem. That's a harder problem than writing a database,
actually, because of all the legacy you have to support. But I'm really dissatisfied with what we have available in terms of email systems.
And so if you want to host your own email, that's kind of hard to do these days.
It's super challenging.
I mean, it takes so much work to do that.
We'd actually just logged something.
I can't think of all the details, but they were like giving a walkthrough of how essentially
to host your own email and all the things you would have to do.
And I'm just like, no, I mean, it's just so much, you know, it's just so much.
I put an enormous number of
hours into trying to come up with a single unified system that will simplify that in some way.
I don't have anything to show for that yet. It's a hard problem. Still working on it.
Well, the cool thing, though, is the law of numbers, essentially. If you keep writing your own tools, sure, SQLite has been the winner among the tools.
It's what you built your company around.
It's where you and your team get your livelihood from.
But there may be the next big thing behind a tool you decide to make your own.
Like this editor.
Are you the sole user of it?
I think I'm the sole user.
Yeah, yeah.
None of the other team members use it.
They all use VI or Emacs.
Yeah, but you never know.
You never know, right?
You never know.
And I never expected SQLite to go viral like it did.
That was a complete surprise to me.
If I were you and I wrote an editor,
I would name it Hipp. H-I-P-P.
That's such a cool name for an editor, right?
Yes.
What's it called?
Well, I call it just the letter E because it's easy to type.
Okay.
That's editor, yeah.
Yeah, E for editor.
That's easy.
Yeah.
E for easy.
Easy does it.
I think you should release that thing and just let the world decide, you know?
Okay. I think you'll be disappointed.
I'm very easily impressed.
So you're not going to tackle mail quite yet because there's a lot there.
Oh, I've been working on it.
I've had no success at it.
Yeah, you just haven't gotten a tackle.
It's a tough nut.
But you have tackled, as we've said and teased up, version control, source control management, software configuration management, Fossil.
Tell us the story of Fossil because you've been working on it.
This is not a new thing.
You've been doing this for a very long time, not as long as SQLite, but they're kind of symbiotic.
You're probably the only person since, I don't know, did the Mercurial people hang it up at this point?
Are they still working on Mercurial?
No, I think Mercurial's still viable. They're still making additions and releasing new features and so forth.
Okay, that's cool.
So it's not just you versus Git. There are lots of people, it's just that Git has won the mindshare.
There's Git, and then there's Fossil and a bunch of others, yeah.
So tell us what existed when you started Fossil.
Was Git there?
I mean, SVN was probably the mainstay.
Maybe it was before this.
Tell us the history.
All right.
So when I first started writing SQLite, everything was CVS.
And I know that CVS has a bad rap with moderns because Linus had some very bad things to say about it.
And, you know, most of the criticisms of CVS are correct. I mean, it's not good. On the other hand, I'm unwilling to say
anything bad about CVS because I had to use the things that came before. And if you'd ever used
the version control systems that came before CVS, you'd think CVS is really great. So, but yeah, it has its issues.
And so we started out with SQLite and CVS
because back in 2000,
that's what everybody was using for everything.
And that went on for a while,
but I recognized that it was inadequate.
And Git had just started to come out.
It really hadn't gotten the traction it has now.
Mercurial was out and it was still an open question,
you know, do I use Git or Mercurial?
And that was a big, big debate back then. This was before GitHub. And I had been doing
some work on SQLite with some avionics companies. And I'd come to understand this
quality standard called DO-178B. This is the quality standard used in avionics. And I thought, well,
I'm going to apply this to SQLite. And part of the DO-178B standard is version control
or source control management. And I looked at the requirements that they have.
And in my opinion, which doesn't really count for much, but my opinion was that neither Git
nor Mercurial really filled the bill here.
And I thought, well, I'm going to do my own.
The other one that had influenced me was called Monotone.
And Monotone, if you've never heard of it, I think was one of, as far as I know,
it was the first version control system that was Git-like in the sense that it used SHA-1 hashes to name everything.
And I was influenced by Monotone as well. But I wanted a version control system that would,
one, it would work easily from behind a shared hosting environment. This was before the age of
ubiquitous virtual private servers.
Back then, when you wanted to lease space on a server, they just gave you a shell account, and you had your home directory, and you put your stuff in your – they ran Apache for you, and it just pointed to your directory and did its thing.
So I wanted something that I could run out of just a simple shared hosting account like that. And nothing was available.
And I wanted something that would meet the standards of DO-178B as I understood them.
And there was nothing available.
So I thought, well, shoot, I'll just write my own.
So I played around with it for a couple of years. I started working on it even before Git came out.
And then Git came out, and I kept working on it.
And I think it was about two years after Git came out that Fossil became self-hosting.
It's the same principle as Git in the sense that you have immutable artifacts that get added in.
And we were using SHA-1 at the time as well. And you've got a
directed graph design and you commit things to it and other people can commit simultaneously
and everybody has a copy of everything. All of that's the same. Now we have different names
for things, but it works very much the same. But we have some very different concepts and a very different focus. Git is very much
designed for Linux kernel development. And if you're a Linux kernel developer, Git is absolutely
the best version control system in the world. It is perfectly designed for that role. But SQLite
has a very different development environment. With Linux, you've got thousands of people
around the world working on this simultaneously.
And then they upload their changes
and it goes through layers of review and administration.
And Linus does not want to see every check-in
that's made by every hacker
that wants to contribute to the kernel.
He wants summarized and vetted patches to consider to go into the main line. And Git's ideally suited for that.
But SQLite development is very different. It's a small team. Everybody knows each other.
Everybody sees everybody else's work all the time. And Fossil is very much optimized for that use case.
So with Git, for example, when you make some changes, you make your changes, then you push them up to somebody else.
Whereas with Fossil, the default configuration is every time you commit a change, it automatically pushes your changes up so that everybody else can see them right away.
Is it still distributed or is it client-server?
It's still distributed, but when you're on network, it behaves as if it's client-server
because as soon as you do a commit, it immediately pushes your changes out to the server
if that server is available.
And so if your system catches on fire, you haven't lost anything.
I remember a few years ago, that actually happened to Linus.
His system had caught fire or somehow went inoperable, and he lost a couple of days' worth of commits or something.
I don't remember the details of the story.
Wow.
Because he wasn't pushing it out to another server until he got ready.
Whereas with Fossil, that's kind of automatic.
That would never happen. And which approach you want to take, I guess, really depends on what you're trying to
do and what your development style is. As it happens, the Fossil development style exactly
suits what SQLite wants to do. And the Git development style exactly suits what the Linux
kernel wants to do. So apart from those minor differences, they're really kind
of the same thing. The storage is quite a bit different. Of course, Fossil keeps all of its data
in an SQLite database. So Fossil was designed to control the SQLite source code. And it uses SQLite to store all of its information.
So I'll let you and your listeners ponder that recursion later.
It's kind of double self-hosted.
Yeah, it's sort of – there's this little loop here.
But that's really worked out really well for us because – and I didn't plan this.
It just worked out that Fossil has become a great dogfooding opportunity for me.
Because Fossil is a big user of SQLite, when I'm working on Fossil, I see SQLite from the point of view of a user of SQLite, not as a developer.
And it's happened many times where developers come to me and say,
oh, we need this feature, we need that feature.
And I'm thinking to myself,
I try to be nice to people,
but I think to myself, stop whining.
You don't need this.
But then a few weeks later,
I'll be working on Fossil
and I'll see things
from the application developer's perspective
and think, you know,
it really does need that after all. And then I'll go back and put it in.
And then apologize.
No, apologize never. No, why would we do that?
We're a full team on Sanjit that's featured out there. Just kidding.
Yeah, it does. It really makes a huge difference to be able to experience SQLite from the application developer's perspective. It changes your whole view. And in fact, it takes me about a day to
switch between developing the two products because I'm looking at the world through a very different
lens when I'm developing SQLite versus when I'm developing Fossil.
Wow. So you can't context switch back and forth very easily.
Not easily.
It's hard.
It's a big context swap for me to do that.
I tend to spend days working on one or the other
rather than flipping back and forth between the two.
So that's been a very good thing.
The other big difference, I guess,
is Fossil does try to...
People talk about Git and Mercurial as being distributed. Well, Fossil is distributed too in the sense that everybody has copies of all of the files.
But Fossil is non-distributed in a good sense of the word.
It's not just the source files that it controls.
It also controls your bug tickets, your wiki, your forum, your chat room,
and you can hyperlink between all of these things, and it manages them all together.
And it keeps everything in a single file on disk. So Fossil is non-distributed in the sense that
you only have one place to go to find all of your tools and all of your files. Whereas if you're using another system,
whatever that might be,
you've got this system for version control
and oh, I'm pulling in the wiki from here
and I've got that.
And oh, we're using this bug tracking system
and we've got a separate webpage for that.
You might have slightly different looks and feels.
If you're using Markdown as your markup language,
you've probably got three or four different dialects of Markdown that get involved.
Whereas with Fossil, it's all together.
It's all in one file.
And there's one place to go in the web to see it all.
Yeah, so is that one file per project then?
One file per project.
Okay. So if I have two, say I have SQLite and I'm also working on Fossil,
they'll have separate files, like the two projects' source code.
Yes, they are separate files.
Now, Fossil does have a feature that it keeps track of all of your Fossil repositories.
So one thing that I like about it is the fossil all command, A-L-L.
So if I'm getting ready to go off network, take my laptop off network for some reason,
I can go on my laptop and I can say fossil all sync.
And it'll go and sync every single repository that's on my laptop,
pulling down all the latest changes.
Then I can go off network, do lots of work on
multiple projects. Then I go back on network and do fossil all sync, and it will, again,
sync everything that's on that laptop and push it back out to the cloud. So it does keep track of
all of your repositories, but each repository is itself distinct.
And the way that it handles branching, merging, conflict resolution,
would that all be familiar to Git users or not?
That's going to be familiar.
It does have the difference that Fossil retains the names of the branches.
That's part of the synced logic.
So with Git, I'm not sure how Mercurial works, but with Git, Git doesn't have branch names.
It only remembers the names of the leaves of the graph.
And it infers branches based on those leaves.
Fossil actually names every branch.
And every check-in, every commit, there's a tag on it that shows what branch are you a part of.
And so that's part of the historical record.
So everybody's talking about the same branch.
With Git, if you've got multiple people working on the same project, everybody's got their own master or main or whatever they call it these days.
But with Fossil, we use the term trunk. And there's only one trunk. And if you talk about trunk, everybody's talking about the same thing.
If we're talking about branch version 3.26.0, then everybody's talking about the same branch.
So the branch names are part of what gets synced.
But other than that, the whole idea is the same. You have separate branches and
people go off and work on branches and then we merge the branches onto trunk. The thing is,
because it's backed by a relational database, we can follow branches forward in time in addition
to backwards in time. If you think about it with Git, if you know a check-in, it's really easy to find the check-ins that came before.
But if, say, you've bisected and landed on a check-in, or say a customer's coming and says,
hey, we're having trouble with this check-in, you can't easily find out what came afterwards,
what things were added to this check-in later in time. You have to go searching the Git log or do
some stunts like that, and the GUIs don't typically provide you with this information because it's hard to find.
Because the internal data structure, it has a pointer to the ancestors, to the things that came before, but there are no pointers going forward in time because the check-in is immutable.
And at the time of the check-in, you don't know what's going to come next. But if you store this information in a relational database,
then you can create an index and you can follow that index forward in time. And so,
given a point in time, we can see what's going on in all branches simultaneously,
both forwards and backwards. It's a very powerful feature to maintain situational
awareness. And I talk to Git users and they say, oh, I don't need that. I've never used that.
And, you know, fair enough, but I never needed Bisect until I had the capability, and now I can't
live without it. Once you start using this powerful feature, being able to see what comes next,
what came after this check-in, it's hard to go back.
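[Editor's sketch] The forward lookup described above can be sketched with Python's standard sqlite3 module. The link table and its column names are hypothetical and are not Fossil's actual schema; the point is only that once parent/child edges sit in an indexed relational table, "what came after this check-in?" becomes an ordinary query.

```python
import sqlite3


def build_demo_graph():
    """Create a small in-memory check-in graph: a -> b -> (c, d), d -> e."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE link(parent TEXT, child TEXT);   -- hypothetical schema
        CREATE INDEX link_by_parent ON link(parent);  -- forward in time
        CREATE INDEX link_by_child ON link(child);    -- backward in time
        INSERT INTO link VALUES ('a','b'), ('b','c'), ('b','d'), ('d','e');
    """)
    return db


def descendants(db, checkin):
    """Return every check-in that came after `checkin`, on any branch."""
    rows = db.execute("""
        WITH RECURSIVE later(id) AS (
            SELECT child FROM link WHERE parent = ?
            UNION
            SELECT link.child FROM link JOIN later ON link.parent = later.id
        )
        SELECT id FROM later ORDER BY id
    """, (checkin,))
    return [row[0] for row in rows]


def ancestors(db, checkin):
    """Return every check-in that came before `checkin`."""
    rows = db.execute("""
        WITH RECURSIVE earlier(id) AS (
            SELECT parent FROM link WHERE child = ?
            UNION
            SELECT link.parent FROM link JOIN earlier ON link.child = earlier.id
        )
        SELECT id FROM earlier ORDER BY id
    """, (checkin,))
    return [row[0] for row in rows]
```

The check-ins themselves stay immutable; the index on the parent column is derived data that can be rebuilt at any time, which is why it can point forward without changing any artifact.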
So you mentioned the Git GUIs don't make it easy.
Does Fossil have a GUI itself?
Fossil has a built-in web interface.
So if you're working from the command line, you can type just fossil space ui
and that will automatically bring up a web browser pointed at your repository.
So it's running, it's got a web server running there in the product, and it automatically
brings up your web browser and points it at the homepage.
And then you can click down through that.
And the web interface, I mean, Mercurial has the command hg serve, which is a similar concept.
But with Mercurial, hg serve doesn't automatically bring up your web browser.
You have to type hg serve, and then over somewhere else, you have to type a URL into your web browser to get it going.
And the web interface is not nearly as rich.
With the Fossil web interface, you can see everything you need to do.
You can see all your tickets. You can see your wiki. You can get very detailed listings of branch history and diffs and blames and all of this. And so that is
essentially your GUI is the web interface. And the nice thing is that then when you set up a server,
if you want to, you don't have to have a server to use Fossil. You can do it peer-to-peer. But if you do set up
a server, you have the exact same interface on your server. You run this same web interface,
and you get exactly the same views on the server as you do on your local machine. And the way it's
set up, when you do fossil ui, it's got a little mini web server running locally, but you can also run it from CGI or SCGI or whatever hosting mechanism you prefer.
Same interface, either way.
This episode is brought to you by our friends at Square.
For our listeners out there building applications with Square,
if you haven't yet, you need to check out their API Explorer.
It's an interactive interface you can use to build, view,
and send HTTP requests that call Square APIs.
API Explorer lets you test your requests using actual sandbox
or production resources inside your account,
such as customers, orders, and catalog objects.
You can use the API Explorer to quickly populate sandbox or production resources in your account.
Then you can interact with those new resources inside the seller dashboard.
For example, if you use API Explorer to create a customer in your production or sandbox environment,
the customer is displayed in the production or sandbox seller dashboard.
This tool is so powerful and will likely become your best friend when interacting with, testing,
or playing with your applications inside Square.
Check the show notes for links to the docs, the API Explorer, and the developer account
signup page, or head to developer.squareup.com slash explore slash square to jump right in.
Again, check for links in the show notes or head to developer.squareup.com slash explore
slash square to play right now
So I'm going to hop us back to the branching and merging, if you don't mind.
One thing that I do often is throw stuff away, you know?
Yeah, you've hit upon the point of contention, haven't you?
Yeah, you did this in person.
Yes, so I wrote this famous article called Rebase Considered Harmful, which has created a lot of ire amongst people.
It is a difference in philosophy, and I try and understand other people's point of view, and I have come to appreciate the rebase point of view more as people have pushed back. So a lot of people use Git not so much as a version control system,
but as a distributed versioned file system.
The difference here is subtle, but yeah,
if you're doing a distributed versioned file system,
oftentimes you want to delete files,
which is kind of what rebase or throwing things away does.
And if that's what you're doing, that makes sense. It really does. But my view of version control,
which came out of this DO-178B document that I referred to earlier, is that you always keep
everything. There's no way to delete stuff.
Now you can shuttle stuff off into a branch that's labeled mistake or something
if it doesn't work out.
Mistake one, mistake two, mistake three.
We have lots of that.
Actually, well, one of the things is
because it's a relational database backing it up,
it's okay to have multiple branches with the same name.
Now that can get confusing to humans, but the database doesn't care. Okay. It's really cool
with that. So we have lots of branches named mistake actually. And you can move stuff onto
a branch after you've checked it in. And you do this without changing the
check-in in any way. You
just add a new tag to that check-in that says, oh, I want you in this branch, not the one I put you
in. So that happens a lot. We'll put something up there and say, oh, that was a boo-boo. Let's move
this off into the mistake branch. And if you go searching on the mistake branch, you'll find lots
of entries there.
Just call it trash.
You could call it trash if you wanted to.
Call it whatever you want.
Call it whatever you want. You can also add a tag to these check-ins that
says that they're hidden so that they don't show up on normal timelines and things.
You can still dig in and find them if you're doing forensic analysis, but they would be hidden from common view.
And so this is just a difference in philosophy: we believe in keeping everything. This is going to store all of history,
the good, the bad, and the ugly.
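[Editor's sketch] The idea of moving a check-in to another branch by adding a tag, without touching the check-in itself, can be modeled as an append-only table. This is a minimal Python sketch under assumed names: the checkin and branch_tag tables and the three helper functions are hypothetical and are not Fossil's real schema or API.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE checkin(id TEXT PRIMARY KEY, content TEXT);  -- never updated
    CREATE TABLE branch_tag(                                   -- append-only
        seq INTEGER PRIMARY KEY,
        checkin_id TEXT,
        branch TEXT
    );
""")


def commit(checkin_id, content, branch):
    """Record an immutable check-in together with its initial branch tag."""
    db.execute("INSERT INTO checkin VALUES (?, ?)", (checkin_id, content))
    db.execute("INSERT INTO branch_tag(checkin_id, branch) VALUES (?, ?)",
               (checkin_id, branch))


def move_to_branch(checkin_id, branch):
    """Reassign a check-in by appending a newer tag; nothing is rewritten."""
    db.execute("INSERT INTO branch_tag(checkin_id, branch) VALUES (?, ?)",
               (checkin_id, branch))


def current_branch(checkin_id):
    """The most recently added tag decides which branch a check-in is on."""
    row = db.execute(
        "SELECT branch FROM branch_tag WHERE checkin_id = ? "
        "ORDER BY seq DESC LIMIT 1", (checkin_id,)).fetchone()
    return row[0] if row else None
```

For example, commit("c1", "oops", "trunk") followed by move_to_branch("c1", "mistake") leaves the check-in's content untouched while current_branch("c1") now reports the mistake branch, and the earlier trunk tag remains in the history.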
There was a situation I saw with Git in particular, which maybe in this case would be bad,
that someone had actually included copyrighted code into an open source project. And they were
faced with litigation, essentially, or at least the threat at that time.
And so they had to go into the Git repository
and perform some Git-fu,
which required experts and people who could go through
all the different things, essentially,
more than your average Git user would do.
You had to get a witch doctor who knows the incantation.
Yeah, the witch doctor.
Somebody who really knew Git.
We do have that capability.
It's a system called purging.
Or no, shunning.
Excuse me.
Shunning.
You can shun.
Yes, you can shun artifacts.
And so if somebody checks in something that is copyrighted and you get sued.
Yeah.
Or a developer goes rogue and checks porn into your
repository, or a private access token, or whatever it is, you can go back and shun it.
And it's the same drill, where you need to bring in somebody with a large amount of Fossil-fu
to make this happen. But it does happen, and we actually have had to do it once or twice, but
it doesn't come up in your daily routine.
But it's possible.
It's reserved for emergencies such as the one that you described.
So it really depends on the development style.
I really push for, look, record everything.
Disk space is cheap.
Other people say, well, I want to work by myself and get
everything perfect, and then once it's all perfect, then I will push it up so that everybody else can
see it. I'm going to argue that's not the best way to do it. I think that you need
to have the humility to push up your mistakes as well as your successes.
It makes it a performance, really, right? Like a pull request can be a performance.
Yeah.
You've done all the work.
You've prettified this thing.
You've put up this great pull request.
Yeah.
You've explained it very well.
And it's a presentation.
And it can be very performative in that case where it's like, I'm going to perform for
my team rather than be who I really am, potentially the one who's bumbling and making mistakes.
And maybe that mistake was actually a smart thing, you know, or a really dumb thing, but
you never know.
But it can essentially inject the requirement of performance into the flow
of things.
And my view is I'm very much opposed to that because I would get sucked into that trap
very easily because I want to always make myself look good.
So Fossil is somewhat designed to force you to show your mistakes as well as your successes, which is important to me.
I have to do that for myself.
I don't think of it quite so much as an ego thing or a performance as it's signal versus noise.
I mean, why would I want to give you all my noise when I could just hand you my signal?
If you're doing noisy stuff,
you can do that off in a branch.
And then once you're ready to blend in with your coworkers,
then you merge it into whatever they're working on.
And the good point is there,
if you go on vacation for two weeks
or something happens to you
and you land in the hospital for a few weeks,
you know, I hope that never happens, but it could. Because it's on a branch and it's being checked
in and synced, your coworkers can go in: oh, what was Jared working on? We've got to take this over
for him while he's recovering. And they can do that. Whereas if it's in your own little private
branch, it's kind of dead for a while.
Yeah, I've definitely seen that meme where it's like, in case of fire: git push, then run out of the building, kind of thing.
Right, and that wouldn't happen with Fossil because everything's out.
That's right. When the fire alarm goes off, first type git push, then exit the building.
Yeah, I've definitely had those moments where I'm like, dang, I actually haven't pushed for a few days.
I should go do that before my laptop dies and I regret it, you know? I've had those feelings. So I
like that about Fossil. I definitely would like to not have that feeling. But I do also think there's
value in, I guess, I wouldn't call it the privacy, but the cheapness of being able to
just sling, and then be like, this doesn't have to ever go anywhere.
Because maybe it's not going to go anywhere.
Yeah.
In fairness, I think yours is the majority view.
Sure enough.
Yeah, I think it is.
But there's enough people out there that like my way of doing things
that we have a small but devoted following.
I believe it.
I like how everything's built in.
I think it's more difficult to buy in as a user because there's so much.
Like maybe I love Fossil's single file model and the things you're talking about, but I really hate the wiki or I really don't like the chat.
You know, here's the thing, and I encourage people to do this.
I wrote Fossil for SQLite, and if it accomplishes
nothing but supporting SQLite, it's achieved its mission. And it's done that very well, and any
other use is just gravy. So look, even if you want to keep using Git, I'm fine with that. You're not
going to hurt my feelings in any way. But it's worth it to you to study what we've done and look
at the ideas, and then take these ideas and move them to whatever other version control system you're using.
Say, hey, they had this cool idea over here.
Why can't we do this in GitHub or GitLab?
Why doesn't GitLab support this?
That will make your experience better, but maybe blended with your work style.
That was my next question: how do you take some of these features
or really just ideas and transplant them
to the Git world essentially, GitLab, GitHub,
because it seems like something that's happened,
I think with GitHub or GitLab
and these centralized repositories,
these places where a lot of people congregate essentially,
which is great for the progression and innovation of our own software.
We've seen a massive uptick in innovation because of GitHub over the last 12 years or more even.
I think they're 13 years old. But for someone to use Fossil, even if they believe in your ideas, they've got to essentially ostracize themselves,
eject from the norm, the social norm of where to code. And how do you share that code back to,
I suppose, that world? I guess you could do like mirrors, right? You can run Fossil locally and
do mirrors with GitHub or something like that, I guess, if you wanted to.
Yeah, we have a GitHub mirror for SQLite that's completely automated. I mean,
every time somebody commits, it automatically goes into GitHub. And it's a funny thing: we do that for a client that
is not actually using Git, but all of their import infrastructure assumes that
everything's on GitHub. So we export to GitHub, and then they import from GitHub into their own
proprietary version control system and use it from there.
You don't have to give that world up then.
You can live in the fossil world, except for it's – how many letters then?
F-O-S-S-I-L.
That's six letters?
A lot of people abbreviate it with just F or something.
You said E for your editor.
I'm assuming F for just fossil, right?
You know, F push.
I guess you could do that, or FX or something.
People use different things. The key differentiator, I think, and one of the things that's really helped us to innovate in Fossil, is the fact that it's built on a relational database. Git is a key-value database, and that limits what you can do.
So my idea is that, look, you could backfit a relational database into Git by just making another file in the .git directory. And whenever you want to use this relational database,
it would look at the Git log and say, well, what's happened since I was last updated? And then it would have to go back and,
oh, there's been three new commits since then.
Let me pull those in and parse out all the information I need
and build up my relational tables from that.
And then let you use the database.
But it would be completely backwards compatible.
It would not change anything.
It's just adding a new file to the repository.
And then once you had a relational database in Git,
you could very easily do things like say,
what check-ins came after this one?
It would completely eliminate the whole question
of a disconnected head.
You would never again have a disconnected head
because they would all be findable
using the relational database.
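As a rough illustration of the idea described above, here is a minimal Python sketch that loads commit/parent pairs into SQLite and answers "what check-ins came after this one?" with a recursive query. The table names, the `build_index` and `descendants` helpers, and the toy history are all assumptions made for this example; it is not how Fossil or Git store anything, and it skips the incremental "what's happened since I was last updated" step.

```python
import sqlite3


def build_index(commits):
    """Build a small relational index from (commit_id, [parent_ids]) pairs.

    The pairs stand in for whatever would be parsed out of the commit log.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE commits(id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE parents(child TEXT, parent TEXT)")
    for commit_id, parent_ids in commits:
        db.execute("INSERT INTO commits VALUES (?)", (commit_id,))
        db.executemany(
            "INSERT INTO parents VALUES (?, ?)",
            [(commit_id, parent) for parent in parent_ids],
        )
    return db


def descendants(db, commit_id):
    """Return every commit that came after commit_id, sorted by id."""
    rows = db.execute(
        """
        WITH RECURSIVE later(id) AS (
            SELECT child FROM parents WHERE parent = ?
            UNION
            SELECT p.child FROM parents p JOIN later l ON p.parent = l.id
        )
        SELECT id FROM later ORDER BY id
        """,
        (commit_id,),
    ).fetchall()
    return [row[0] for row in rows]


def leaves(db):
    """Return commits with no children, whether or not a branch names them."""
    rows = db.execute(
        """
        SELECT id FROM commits
        WHERE id NOT IN (SELECT parent FROM parents)
        ORDER BY id
        """
    ).fetchall()
    return [row[0] for row in rows]


# Toy history: a -> b, then b forks into c and d.
history = [("a", []), ("b", ["a"]), ("c", ["b"]), ("d", ["b"])]
db = build_index(history)
print(descendants(db, "b"))
print(leaves(db))
```

Because the child-to-parent links sit in an ordinary table, following them forward is just a join, which is the kind of query the key-value layout makes awkward.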
Richard, what if Nat Friedman
was listening to this show right now?
And he's like, you know what?
I like these ideas.
I want to hire Richard.
I want to borrow him, borrow his ideas.
Well, he couldn't hire me.
We could certainly talk and I would certainly be happy to give him these ideas and say,
run with them and you do not need to give me credit.
I would be thrilled if Git or GitHub or something
would improve the
usability so that people could
be more productive.
I'm not going to move off of Fossil. It's
ideally designed for the SQLite
development environment.
But if these ideas can be
imported to other design
methodologies, then that would be great.
So there's a fellow named Patrick DeVivo
who has a website, askgit.com,
and he has done a lot of work around,
basically I think he is retrofitting
a relational database around a Git repository's history.
He allows you to basically query Git as if it was SQL.
And I haven't looked at how he's doing it.
I think he might be doing exactly kind of what you're describing.
But I think the power that you're describing and having a relational database on your source
control history would allow for a lot of interesting mining and visualizations and connecting of
the dots that you're describing.
And he's doing some of that with Git, but he's having to add tools
in order to provide that kind of a thing.
Sure, sure.
But once you get the relational database there,
innovation tends to happen
because, hey, we need a wiki.
Well, shoot, we've got this relational database.
We'll just stuff it in there.
Right.
Or we need a forum.
We'll just stuff it in the relational database.
It's sitting right there. We'll just use it.
If you build it, people will come,
and lots of interesting things will happen
if you were to do
something like that.
You can even use a different relational database
other than SQLite, and you won't hurt my feelings.
use.db if you want to
you're not going to make me mad
So one thing you did differently with Fossil, and we touched on it at the beginning of the show, is that you
didn't go public domain. You went with a BSD-style license. Was that a reaction to something
that happened with public domain? Or why did you
decide to switch? Because it's still very permissive, but obviously it's less permissive than public
domain.
I started out in GPL. And
early on, within a year, I got requests from proprietary people,
hey, we want to use this behind our firewall. And our lawyers say we can't use GPL because of the
viral nature of it. And you can argue that that didn't make sense there, but it's easier to change
the license than to argue with lawyers.
Truth.
So I got everybody who had contributed at that point, and there hadn't been that many contributors, to sign a release to BSD. And so we relicensed it to BSD, and that just allowed more people to use it, and in different ways.
Public domain, it turns out, is hard to do. I didn't realize
this when I started SQLite. I thought public domain would be really easy. I'd just say it's
public domain and we're done. But there are many jurisdictions that discourage that or don't
recognize that. And I didn't know this at the time. And there's actually a lot of paperwork
that you have to go through to release your code to the
public domain. Whereas we have the standard CLA, Contributor License Agreement, for people to
contribute under a BSD-style license. So it just worked out better to go with
a traditional BSD-style license than trying to do public domain again.
Is it possible SQLite will change to non-public domain considering that?
No.
And this is just force of tradition and legacy.
I think that it's always been public domain and we're going to keep doing it that way
just because at this point it's too late to change.
Maybe if I'd known in 2001, 2002, when I did this, what I know now, I would have done it differently.
But no, we've got too much legacy behind it now.
It's 20 years of tradition in public domain,
so we're going to do that.
I even went to the trouble of... there's a set of standard licenses, and they have codes.
I forget what this is called.
But I got the software blessing that is the head of every SQLite source code file.
I got that registered as one of the acceptable licenses so that the automated tools that were scanning things would see this and say, oh, that's okay.
We can accept that.
Actually, I had a whole show on this. The first one was "you may do good, not evil," which really made it challenging for a maintainer to maintain the software.
Eventually, it went by the wayside. And it actually had a massive change in, I guess, their contribution and others to it because of the whole, what is good, what is evil?
How can you really, you know?
It seems black and white in terms of opposites, but it was just difficult to actually put into practice.
Yeah, and the blessing on SQLite is not a requirement.
It's literally a blessing.
It says, may you do good and not evil.
It doesn't say you must.
That's the difference there.
Truth.
Yeah.
Good point.
It's like grace versus law there.
Yeah.
There you go.
Absolutely.
Well, one thing you do say in regards back to the license, you said, quote, you are free to steal bits of the fossil source code to use in other projects, including proprietary projects. That means that you're not really holding these ideas to you and others can use
these ideas essentially. Absolutely. Encourage other people to use it. So let me throw a startup
idea at you and you tell me if it's good or bad.
You're asking the wrong person, but I'll give you my opinion.
Okay. It's one word, two syllables... three syllables: FossilHub.
You know, there's this thing called...
That's already been done. It's called, um... oh, why can't I call the name of it? Chisel.
No, that's a good name.
And it's hosted by Roy Keene. But the thing with Fossil, it's really designed for self-hosting.
We make it really easy to set up your own Fossil server on a $5 a month VPS or on a spare Raspberry Pi that you happen to have lying around.
It takes very little hardware to run Fossil.
I know some of these other systems, they say, oh, you've got to have at least a $40 a month VPS in order to support this.
It's so heavyweight. But Fossil is very low resource.
And so you can just plop this up there.
So a single executable, you plop it on your machine, a two-line CGI script gets it running, and it just does everything for you.
And so the motive for having a service like GitHub for Fossil is greatly reduced.
Because if you were to just take raw Git or raw Mercurial and want to set up a collaborative development site like GitHub, that's a lot of work.
GitHub provides a very valuable service.
With Fossil, the amount of work to set this up is greatly reduced. And so the need for that is
also greatly reduced. Now, what people have told me though, is that for some people who live in
other countries, coming up with $5 per month in hard currency for a VPS is a hard problem,
and for them, having access to a free repository like that is a big deal. But for those of us
who are fortunate to live in the U.S. or other Western countries, it's probably easier just to set
up your own. And then think of... all right, let me just, I'm coming back to this subject, but think with me just
a second.
If you talk with people that like to go backpacking, do you have any friends or do you like to
go backpacking yourself or do you have friends that do that?
Yeah.
And you go out in the wilderness and you're on your own for five days and people ask,
why do you do that?
And people say, well, it's the freedom.
It's just the joy of being outdoors and having drinks.
Think about this.
Freedom means taking care of yourself.
That's what people like about backpacking and wilderness adventures is they go out
and they're responsible for themselves,
every aspect of their lives.
They're carrying their house on their back
and all of their food.
That's what they like.
Freedom means taking care of yourself.
And Fossil tries to promote that.
It gives you the tools to make it easier for you to take care of yourself.
Because you can take this one standalone binary, plop it on a server, add a two-line CGI script, and suddenly you've got a complete developer website up and running.
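For reference, the two-line CGI script mentioned here has roughly the following shape, as described in Fossil's CGI server documentation. Both paths are assumptions for illustration: the first line points at wherever the fossil binary is installed, and the second names a repository file on the server.

```
#!/usr/bin/fossil
repository: /home/fossil/repo.fossil
```

The web server runs this file as a CGI program, and the single fossil binary handles the rest.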
Can you do
that with other systems? Absolutely. But there's a lot more moving parts, a lot more you have to
install and a lot more to maintain. I'd say that's pretty cool. Think of Fossil as your
ultralight backpacking tent. There you go. Camp anywhere. It's not as nice as a Hampton Inn, but you're taking care of yourself.
There's your new tagline, fossil.
Not as nice as the Hampton Inn, but you're taking care of yourself.
Taking care of yourself.
I like it.
That's the essence of freedom is taking care of yourself.
Yeah.
But there's also balancing that out with community, I think. And so the thing that GitHub has that's even, I think, better than Git,
more powerful than Git, is the hub part, right?
Yep.
And everybody, I'm going to use your analogy and kind of abuse it to a certain degree,
everyone wants to climb a mountain, but eventually they come back down to the base camp.
You said backpack, now we get a base camp.
You want to hang out with people and you want to see what they're doing.
Is there any way with Fossil to at least federate
or have a directory or like,
here's my cool open source stuff,
here's my Fossil instance,
here's Adam's Fossil instance,
he's out there over there.
Let's get together and
collaborate because that's what I think. I think that's the magic on GitHub.
Yeah. Federation is interesting, Jared.
Sure. I agree with you. And now if you talk to the people at GitHub,
they will be quick to tell you that their company is not about Git. It's about Hub.
Yeah.
Absolutely. And I agree a hundred percent. It's a place for people to gather and collaborate.
And they're quite open about the fact that,
well, they started on Git,
but they stay with Git simply because that's what everybody uses.
If Git were to vanish tomorrow
and everybody were to go to Mercurial or Monotone,
GitHub would change.
But it would still be the same company because it's about the hub.
Yeah.
And so, yeah, I think it'd be really cool if GitHub allowed you to have fossil repositories.
Yeah.
That would be interesting.
I don't think that'll happen.
How would that work then?
And, you know, cast some vision for how that might work.
How could you have a repository on GitHub that was not a Git repository?
What would it take to make that happen
behind the scenes?
I don't understand their infrastructure enough to
really say, but
I know that SourceForge
allows different kinds of repositories,
don't they?
Yeah.
How do they do that? I'm not sure. I think you can actually
have fossil repositories on SourceForge,
if I'm not badly mistaken.
I've never done that myself.
The underlying data model of Git and Fossil is the same.
You've got commit objects, and you've got file objects, and the commit objects link together to form a directed graph.
And you walk the graph to pull out the pieces you need.
So the underlying data model is the same. Now, the details of the file formats are
completely different, but the overall concepts are the same. So it seems like you should be able to
use the same infrastructure to build a GitHub with Fossil.
Yeah, you'd probably have to introduce
an abstraction layer somewhere in there
that says, here's my interface,
and I'm going to put Fossil on one side of it
and Git on the other,
and it's going to unify to what their frontend does.
Exactly.
Frontend not meaning their web UI,
but everything that's in front of that layer.
So there would be some work involved, but...
Yeah, a lot of work,
which is why I think it'll probably never happen, but...
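To make the shared data model and the abstraction layer a little more concrete, here is a minimal Python sketch. The `Repo` interface, the `DictRepo` stand-in backend, and the `ancestors` function are all hypothetical names invented for this example; a real Git or Fossil backend would sit behind the same `parents` method but read its own file format.

```python
from abc import ABC, abstractmethod


class Repo(ABC):
    """Hypothetical interface: anything that can report a commit's parents."""

    @abstractmethod
    def parents(self, commit_id):
        """Return the list of parent commit ids for commit_id."""


class DictRepo(Repo):
    """Stand-in backend that keeps the commit graph in a plain dict."""

    def __init__(self, graph):
        self.graph = graph

    def parents(self, commit_id):
        return self.graph.get(commit_id, [])


def ancestors(repo, start):
    """Walk the directed graph from start, visiting each commit once."""
    seen = set()
    order = []
    stack = [start]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue
        seen.add(commit_id)
        order.append(commit_id)
        stack.extend(repo.parents(commit_id))
    return order


# Toy history with a merge: m has two parents, which share the root r.
toy = DictRepo({"m": ["x", "y"], "x": ["r"], "y": ["r"], "r": []})
print(ancestors(toy, "m"))
```

Code written against `Repo` does not care which version control system is underneath, which is the point of putting the interface in the middle.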
It would be just as easy then to fossilize Git, right?
To borrow some of the ideas of Fossil that you talked about: the relational database, some of the different principles and practices that you live by that, if they agree, might carry over, maybe the backwards and forwards history. I mean, how many times do people get stuck
behind some Git issues that sure seem to be solved by some of the things you've made simple
with Fossil? Or the running-out-of-the-building-on-fire git push, you know? And there are
certain things like streaming the Git repository to GitHub or whatever. There could be some
ideas that you've laid claim to
that could be translated.
Fossilized Git, maybe.
I think that would be a better solution
because what I hear a lot
is people, they look at Fossil, it's really cool,
but it doesn't have rebase.
That's the number one complaint.
Well, just take the cool features out of fossil
and land them in Git
and then you've got rebase.
And all of your old tools continue to work.
All of your build infrastructure that depends on Git continues to work as it did before.
But you've got cool features like git ui, and it brings up a web browser and points
it at your repository.
It gives you a cool timeline.
Or git all sync, which goes around and finds all of
your Git repositories and syncs all of them.
That would be cool, honestly. I mean, you're essentially at a repository level for every commit. You're not operating
across all of your Git repositories inside of your code directory, which I think is
probably standard for most developers. You've got your home directory, your user, and you've got a directory in there called code or source
or something that you'd put all of your source code in.
Then you've got multiple directories beneath that
which were basically individual repositories.
But Git, my Git, doesn't know about any of those other things.
It just knows about its own single repository.
Right, and wouldn't it be cool to be able to sync them all
with a single command?
Yeah.
Yeah, that'd be really cool.
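The "sync them all with a single command" idea can be sketched in a few lines of Python. This is only an illustration: the `find_repos` and `sync_all` helpers are made up for this example, the sketch scans a directory tree for `.git` entries (which is an assumption about how repositories would be discovered), and it defaults to a dry run that only lists the commands.

```python
import subprocess
from pathlib import Path


def find_repos(root):
    """Return directories under root that contain a .git entry, sorted by path."""
    return sorted(path.parent for path in Path(root).rglob(".git"))


def sync_all(root, dry_run=True):
    """Build one 'git fetch --all' command per repository found under root.

    With dry_run=True the commands are only returned, not executed.
    """
    commands = []
    for repo in find_repos(root):
        command = ["git", "-C", str(repo), "fetch", "--all"]
        commands.append(command)
        if not dry_run:
            # check=False: one unreachable remote should not stop the rest.
            subprocess.run(command, check=False)
    return commands
```

Calling `sync_all` on a code directory before getting on the airplane, with `dry_run=False`, would fetch every repository found beneath it.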
Especially if you're going off network.
There's nothing worse than getting on the airplane with no Wi-Fi and suddenly remember, oh, I failed to sync the one key repository that I need to make this work.
And you can't work for those four hours or whatever it is.
Exactly.
Sucks to be that person.
Yeah. So I think that would be a great way to move forward, I really do. And I'm happy to contribute
ideas to anybody who wants to undertake this. Anybody who's listening to this who wants to
build these, go look and see what Fossil has. You don't have to agree with my view of
how things should be done, but look at the ideas and steal them. You don't even have to give me credit.
Fossil-SCM.org if you're listening. It'll be in the show notes, of course, but that's a good place to start. Yeah. Take the ideas and run with them. I love it. Let me just throw this out here. A year
ago, we used Markdown. Markdown has become the de facto language for documentation and so forth.
I needed to draw diagrams in my Markdown documents, stick diagrams for architecture diagrams and stuff.
And so I took the legacy language from the 1980s Bell Labs called PIC, P-I-C, and I created my own implementation of it that works on the web.
And it's called Pikchr, P-I-K-C-H-R.
And so in the middle of a markdown document, you can have just a little bit of code that does these elaborate diagrams.
It's a really cool feature.
Pikchr was originally written for Fossil, but I put it out in a separate repository, which is mirrored on GitHub, with the hopes that other people who have their own Markdown engines would pick it up and integrate it into their Markdown implementations as well.
It conforms with the Markdown standard for fenced code blocks.
So it's not a language extension. Well, it's an extension in the sense that it's an allowed extension that's specified in the Markdown documentation.
So if you want to look for ideas, please look at that. I wish you would adopt it.
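As a sketch of what integrating this into a Markdown engine might involve, the Python below pulls the bodies of fenced code blocks tagged `pikchr` out of a Markdown string so they could be handed to a renderer. The `fenced_blocks` helper, the sample document, and the regular expression are assumptions for illustration, and the rendering step itself is left out.

```python
import re

# The fence marker is built at runtime so this listing contains no literal
# run of three backticks of its own.
MARK = "`" * 3

# A fence line with an info string, the block body, then a closing fence line.
FENCE = re.compile(
    r"^" + MARK + r"[ \t]*(\w+)[^\n]*\n(.*?)^" + MARK + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def fenced_blocks(markdown, language):
    """Return the bodies of fenced code blocks whose info string is language."""
    return [body for lang, body in FENCE.findall(markdown) if lang == language]


# A tiny Markdown document with one diagram embedded as a fenced block.
DOC = "\n".join(
    [
        "Some text.",
        "",
        MARK + " pikchr",
        'box "Commit"; arrow; box "Sync"',
        MARK,
        "",
        "More text.",
        "",
    ]
)

for source in fenced_blocks(DOC, "pikchr"):
    # A real engine would pass this diagram source to a Pikchr renderer here.
    print(source, end="")
```

Because the diagram lives in an ordinary fenced block, an engine that does not know about Pikchr would still show the source as plain code.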
Yeah.
Well, Richard, it's good to have you back. I mean, it's been too many years, I think. I think we should make this more frequent if possible. I love just hearing your ideas. I love hearing, really, I think, your spirit, the programmer spirit I think you bring.
And the freedom you bring, the ideas you bring, this aspect of freedom, this aspect of blessing, just this aspect of giving, really.
I love that about who you are.
And I appreciate all that you've given this world and all the ideas you've shared here today.
And it's been awesome.
Thank you.
Thank you for having me on the show.
All right.
That's it for this episode of The Changelog.
Thank you for tuning in.
We have a bunch of podcasts for you at changelog.com.
You should check out.
Subscribe to the master feed.
Get them all at changelog.com slash master.
Get everything we ship in a single feed.
And I want to personally invite you to join the community at changelog.com slash community.
It's free to join.
Come hang with us in Slack.
There are no imposters and everyone is welcome.
Huge thanks again to our partners, Linode, Fastly, and LaunchDarkly.
Also, thanks to Breakmaster Cylinder for making all of our awesome beats.
That's it for this week.
We'll see you next week. Game on.