The Changelog: Software Development, Open Source - Picking a database should be simple (Friends)

Episode Date: August 9, 2024

Database aficionado, Ben Johnson, joins Jerod to answer the age ol' question: which database should you use? Answering that isn't always easy, which means it's time to play the "It Depends" jingle & w...eigh (some of) the options.

Transcript
Starting point is 00:00:00 Welcome to Changelog and Friends, a weekly talk show about materialized views. Thanks to our partners at Fly.io, the home of changelog.com. Launch your app near your users. Fly makes it easy. Learn how at Fly.io. Okay, let's talk. What's up, friends? I'm here with Dave Rosenthal, CTO of Sentry. So Dave, when I look at Sentry, I see you driving towards full application health, error monitoring where things began, session replay, being able to replay a view of the interface a user had going on
Starting point is 00:00:53 when they experienced an issue with full tracing, full data, the advancements you're making with tracing and profiling, cron monitoring, code coverage, user feedback, and just tons of integrations. Give me a glimpse into the inevitable future. What are you driving towards? Yeah, one of the things that we're seeing is that in the past,
Starting point is 00:01:13 people had separate systems where they had like logs on servers, written files. They were maybe sending some metrics to Datadog or something like that or some other system. They were monitoring for errors with some product, maybe it was Sentry. But more and more what we see is people want all of these sources of telemetry logically tied together somehow. And that's really what we're pursuing at Sentry now. We have this concept of a trace ID, which is kind of a key
Starting point is 00:01:38 that ties together all of the pieces of data that are associated with the user action. So if a user loads a web page, we want to tie together all the server requests that happened, any errors that happened, any metrics that were collected. And what that allows on the back end, you don't just have to look at like three different graphs and sort of line them up in time and try to draw your own conclusions. You can actually like analyze and slice and dice the data and say, hey, what did this metric look like for people with this operating system versus this metric look like
Starting point is 00:02:08 for people with this operating system and actually get into those details. So this kind of idea of tying all of the telemetry data together using this concept of a trace ID or basically some key, I think is a big win for developers trying to diagnose and debug real-world systems, and it's something where we're kind of charting the path for everybody. Okay. Let's say you get there. Let's say you get there tomorrow. Yeah. Perfectly. How will systems be different? How will teams be different as a result? Yeah. I mean, I guess again, I just keep saying it maybe, but I think it kind of goes back to
Starting point is 00:02:43 this debuggability experience. When you are digging into an issue, you know, having a sort of a richer data model that's, you know, your logs are structured. They're sort of this hierarchical structure with spans. And not only is it just the spans that are structured, they're tied to errors, they're tied to other things. So when you have the data model that's kind of interconnected, it opens up all different kinds of analysis that were just kind of either very manual before, kind of guessing that maybe this log was, you know, happened at the same time as this other thing, or were just impossible. We get excited not only about the new kinds of issues that we can detect with that interconnected data model, but also just for every issue that we do detect, how easy it is to get to the bottom of it.
Starting point is 00:03:23 I love it. Okay, so they mean it when they say code breaks, fix it faster with Sentry. More than 100,000 growing teams use Sentry to find problems fast. And you can too. Learn more at Sentry.io. That's S-E-N-T-R-Y.io. And use our code CHANGELOG. Get $100 off the team plan. That's almost four months free for you to try out Sentry. Once again, sentry.io. We are back with yet another It Depends episode. That means I have to play my jingle. It depends.
Starting point is 00:04:03 You know, there are no silver bullets, so the best way that we can help you build great software is to equip you with knowledge. Much of that knowledge can only be gained through experience. And that's why on this It Depends miniseries, I sit down with experienced devs to discuss their decision-making process. Today I have with me Ben Johnson. Ben, welcome back. Hey, thanks for having me, Jerod. It's good to be back. It's great to have you. For some reason, your name always comes to mind when I think about databases. Is that weird? You're just living rent-free in my mind, I guess. I guess so. I don't know. I think it databases? Why do you focus on that particular part of the technology stack? I mean, that's a good question. I mean, I was a UI developer. I did data visualization for a big part of my career. I actually, I started off doing like Oracle databases and I thought I was like really going to be an Oracle DBA early on in my career. But then everyone else that you're
Starting point is 00:05:26 competing against has 20 years of experience at the time, and I just had to find a different route for a while. I think there's something raw about databases that's kind of like the lowest level kind of abstraction you get to. I think it's kind of interesting. It's kind of the most important part, too,
Starting point is 00:05:42 at the end of the day. If you mess it up, then you really... You mess it up and you have problems, right? The security of course is a big issue, but it's also like the most lasting part of most valuable applications is the data and you know, the application code and the features come and go, but the underlying data is valuable. Probably even after the business is gone or the application is no longer in use. It's like that data has an inherent value to it. So it's definitely down in there and oftentimes the most lasting part. So today we want to talk about databases. And this is an It Depends where we kind of share decision-making processes.
Starting point is 00:06:21 How do you pick this? How do you decide that? Knowing that everybody has a different context. And so we can't just simply say to you, just use this, because that might not be true all the time. However, longtime listeners of the changelog know that I probably am going to say just use Postgres. And, you know, it depends doesn't apply here. But Ben, you might say something different, because I know you're an SQLite fan. Was that going to be your stock answer? Just use SQLite.
Starting point is 00:06:47 I would actually go with just use Postgres generally. Oh, you would? Yeah. I think if you don't know databases very well, there's probably the most amount of information out there for how to use it, how to set it up, how to debug it, that kind of stuff. You get a lot of GUI tools, you get
Starting point is 00:07:03 all that kind of stuff. I mean, I think it also depends on the community. Like I think PHP does a lot more MySQL. So maybe in that world you do MySQL, but yeah, I think Postgres, honestly, you can't, it's hard to go too wrong with Postgres. There's obviously sharp edges as well, but it's probably the least worst option out there. All right, that's our show.
Starting point is 00:07:24 You've got it. Thanks for having me on. Well, maybe we can adjust that because there are so many specific use cases and so many scenarios for applications where the type of database you're using really does matter and you can play to a database's strengths and get a lot of benefits.
Starting point is 00:07:40 But maybe we adjust it to say, just start with Postgres and you're probably going to be a safe starting place or MySQL. But I agree, Postgres has the mind share, it has the tool share at this point, it has the momentum as an open source database. Although SQLite has a ton of momentum right now, but I think there's more caveats to SQLite depending on what you're up to. And we can probably get into that as we talk here. But just the breadth of the types of databases
Starting point is 00:08:09 is somewhat overwhelming. I mean, here we are talking about three different databases, but they're all pretty much the same. I mean, they probably share like 90% overlap. And then the devil's in the details. Yeah, and a lot of them,
Starting point is 00:08:21 even like SQLite has support for MySQL-style SQL and Postgres-style SQL, so you can kind of swap out parts of them. They definitely try to overlap. And because there are extensions and plugins, there's crossover into different types. But if we're going to lay out some of the major types of databases, because before you pick a specific database,
Starting point is 00:08:44 sometimes you have to even decide what kind of style of database is going to play to my strengths. I made a list here, and this is certainly not comprehensive. But if we were to talk different types of databases, we have relational, graph, document,
Starting point is 00:08:59 key value, columnar, time series. Of course, there's vector DBs, which are all the rage right now because of semantic search. I didn't see an XML database in there, or object-oriented database. Those are some old-school ones. Oh yeah, what did I miss there? Oh yeah, OODBs. Yeah, I actually really like that concept inside of working with a language. If you can map your data store directly on your object graph and you're just talking about like hydrating and dehydrating it at the end of the day, isn't that kind of what you
Starting point is 00:09:31 want in an object-oriented language? Yeah, I mean, that's kind of what ORMs do. I mean, they just do it on top of a different model. But yeah, I mean, there's like this holy grail idea of just like you write your application and the objects just magically persist and you don't have to worry about them. Why can't we have that? I mean, I want that. No, yeah, for sure. You know, I mean, I think that there are I think, you know, there's obviously like recursive
Starting point is 00:09:56 or, you know, like self-referential types and some complications in there. I think, though, like one thing I noticed when I was doing a lot of development with Bolt is you kind of need like a schema layer but beyond like just a schema layer, it's nice to have like kind of a data language layer because you do a lot of stuff where you're like, oh, I need to pull down a certain set of data. I need to do some kind of export transform or whatnot that you don't necessarily want to tie to your application language.
Starting point is 00:10:20 It might be like a migration you might do. So in that case, it is kind of nice to have some separation between your generally relational model and your actual application model. Right. That leads me to a whole other line of thinking, which is beyond database choice. It's like, how much of your application logic should reside inside of your database? And I feel like there's been a pendulum swing in both directions over time, where it used to be stored procedures
Starting point is 00:10:48 and all the things in there. And we found problems with that, operational problems, all kinds of things that you could say, well, that's not ideal because of this. And then I think Ruby on Rails, at least in the web dev space, really swung it in the opposite direction of your data store is a dumb thing that you treat
Starting point is 00:11:07 as an input-output mechanism that will store things on disk. And you put everything inside your application code. Your consistency rules, your foreign keys, your relationships, all that stuff is in app code. And that way you can actually even just swap out the backend and not even worry about it. And I feel like that was a move too far in the other way because of the reasons you said. All of a sudden you want to, like, use it outside of the context of your Ruby code and you're like, oh,
Starting point is 00:11:34 where's all my consistency rules? Where's all my, you know, constraints and all that? Well, they're over there in your Ruby on Rails app, so you can't use it in any other context. You lose it, and now you have data problems on the back end. So that's another thing is like, where do you put stuff? Where do you fall on that usually? I mean, I think the high-level things like foreign key constraints, some checks, check constraints maybe, I would probably just put in the database. Because, yeah, you do have a lot of users or clients for a database, not just your own single application.
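As a rough illustration of keeping foreign key and check constraints in the database rather than in application code, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and the try_insert helper are hypothetical and chosen only for the example; they are not from the episode. The point is that any client of the database, not just one application, gets the same rules enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE orders (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    quantity    INTEGER NOT NULL CHECK (quantity > 0)
);
""")
conn.execute("INSERT INTO customers (id, email) VALUES (1, 'a@example.com')")


def try_insert(customer_id, quantity):
    """Hypothetical helper: attempt an insert and report what the database decided."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (customer_id, quantity) VALUES (?, ?)",
                (customer_id, quantity),
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


print(try_insert(1, 2))   # valid row
print(try_insert(99, 1))  # no such customer: foreign key violation
print(try_insert(1, 0))   # quantity must be positive: check violation
```

Because the rules live in the schema, a migration script or a second service writing to the same tables cannot bypass them by accident.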
Starting point is 00:12:03 So I think it's tough. I mean, I think putting too much into your application layer, again, like if you ever want to rewrite or change it or move somewhere else, like you're moving everything. But I mean, I used to do stored procedures back in Oracle. And I would say like the one thing I really loved about stored procedures is that they're just wicked fast. Like they're literally, you're putting your code right next to your data.
Starting point is 00:12:24 And like, it's hard to explain just how fast a stored procedure will run. If you really need the speed, that can be great, but again, they're terrible to maintain. This is 20 years ago, so it was like, there was no Git repo where we versioned everything. It was like, oh, we're just going to
Starting point is 00:12:39 upload this giant SQL file to replace our stored procedures and hope it works. So that's terrifying. Yeah. I wonder if, I'm sure there's people who feel like they've found a good middle ground and I'm most familiar with Postgres. So like writing PL/pgSQL functions and using extensions, but also like keeping that stuff in version-controlled, maybe tested, I don't know, places where you can... Because for me, the distance from my code
Starting point is 00:13:08 to a stored procedure was always the problem. It's like, now I have to connect to a thing over there and then update the stored procedure. And it was always just a weird disconnect there that I felt was going to cause problems, whether it ever did or not. But I'm wondering if people are building... Similar to the folks who took nginx to the limit and like made app frameworks like right there inside nginx
Starting point is 00:13:32 you know, modules and stuff. And I wonder about those people who are like, you know, Postgres or whatever, you know, pick your database, stored procedures for life, you know, just maybe kind of like coding it up. Yeah, I mean, one thing I like about SQLite is that your code is right next to your data. So you're almost writing stored procedures just in the language of your choice. Right. So you don't have that latency to go between the two. But as far as Postgres, I mean, I think I hate writing SQL-based stored procedures. I think it's a pain.
Starting point is 00:14:01 So I just generally move stuff to the application layer. What about other functionalities? So we've had Paul Copplestone on recently at Supabase. And he calls them Postgres Maxis because they really leverage the database. And they're providing all kinds of services on top of it for things like background jobs. I think PubSub, of course, is there in Postgres. They do row-level access control, so really taking advantage of the security things
Starting point is 00:14:31 that you can do inside of there. I'm curious if that kind of stuff intrigues you. Does that sound like you're taking it too far in terms of what you're going to do with your database? I think it's interesting to play around and see where you can maximize, you know, different performance characteristics, I would say. I don't feel like from a usability standpoint, it's always the best. I mean, I think that Postgres is a pain to set up a lot of times to
Starting point is 00:14:54 begin with, but sure. So I think adding more to it just scares me a little bit, but I mean, yeah, I think that's, that's cool. Try new things out. See what sticks. Well, friends, I'm here in the breaks with one of my new friends over at 1Password, Martin Shosh, software developer at 1Password on the SDK team. 1Password now has SDKs as well as their CLI that allows you to build secrets management integrations using Go, JavaScript, or Python, and they're available right now. So, Martin, how can developers use these SDKs today? Give me some examples. Yeah, so the CLI was built more for managing your 1Password account and accessing it from the terminal and writing various scripts for local automations.
Starting point is 00:15:41 But the SDKs really go a step beyond that, where you can build these automations into other pieces of software. You can run them in cloud functions. You can build them into your natively running desktop apps, which now are also able to leverage functionality such as loading data from 1Password, rotating secrets in 1Password, creating new items and more.
Starting point is 00:16:03 Yeah, so in addition to this awesome new functionality, you're going to give developers to leverage 1Password in such unique ways. I think it's also worth noting how you built these SDKs. You have a core Rust library that generates these various SDKs. What's the backstory? When we started the SDK project, one of our goals was to really build the SDKs in a scalable way where a relatively small team can maintain multiple SDKs at the same time. And we can add support for more languages and also add more functionality to them as time goes on. To achieve that level of scalability, we designed the SDKs in a way that they all leverage a shared Rust library that's written once and it has all the features of all the SDKs inside of it. Now to
Starting point is 00:16:55 make this library accessible in each language, we generated a wrapper for that library in each of the supported languages. This wrapper code is automatically generated. So this gives us even more speed, agility when adding new features to the SDKs because we just add the feature to the SDK core library and each of the SDKs automatically gets updated to expose the new functionality in all of the languages. That's so cool. Okay, the next step is to go to 1password.com slash changelogpod.
Starting point is 00:17:29 They've given our listeners an exclusive extended free trial to all the developers out there to use 1Password for 28 days. That's not 14 days, but 28 days. They doubled it. Make sure you go to 1password.com slash changelogpod to get that exclusive signup bonus or head to developer.1password.com to learn about 1Password's new SDKs available right now. Their amazing developer tooling, their CLI, their SSH and Git integrations, their CI/CD integrations, and so much more. Again, 1password.com slash changelogpod or developer.1password.com to learn more. How many different types of databases have you used throughout your career?
Starting point is 00:18:14 I listed off a bunch of them. Have you used all those? I think most of them. I haven't done a ton with graph databases, but I've done document databases. Yeah, I've worked in time series before. I've done columnar, like for analytics. And I think that that, the way I think of it in my head, I think document was an interesting kind of road we went down of you kind of denormalize your data and it makes it a lot faster just to grab one big chunky object
Starting point is 00:18:36 instead of doing one and then n plus one query after that to grab all the children. But denormalization has its own issues as far as you update in one place and it doesn't update in all of them necessarily. So I think that, especially with the work that Postgres and a lot of databases like SQLite have done as far as JSON embedding inside the rows,
Starting point is 00:18:55 I don't see a big need for document databases these days. I think you can do a lot of that stuff inside relational databases. I would tend to agree. I have used MongoDB on a production project. It was probably 10, 15 years ago now. And it was very much because I was convinced by one of their sales demos.
Starting point is 00:19:17 So it was when they showed this layout of Magento, which is a PHP e-commerce framework, very popular then, probably still in use in many places now. And they showed the table structure of that particular piece of software. And it was gnarly in the bad sense of gnarly.
Starting point is 00:19:38 I mean, there were so many relationships, so many tables. This is like, how many joins you have to do to pull together your shopping cart was kind of what this thing was. Wouldn't this all make a lot more sense if it was a single document and they showed what it would look like inside of Mongo in a document-oriented data structure?
Starting point is 00:19:56 Then you just pull out the document for the shopping cart and you're rocking and rolling. I was like, that's pretty compelling. I think it was exacerbated by the fact that Magento's structure was particularly heinous, in my opinion. Probably because it grew over time, as many of these things do. That's how your database tables can get out of control.
Starting point is 00:20:15 And I thought, yeah, that makes a ton of sense. I think for an e-commerce site, a document-oriented database made sense. And so I went for it. I was building an e-commerce thing for a client and went with MongoDB. And it was relatively, I think it fit pretty well. The problem that I came across over time was I didn't really have any MongoDB chops. I have a brand new thing
Starting point is 00:20:36 that I don't really know how to administer. And so that's where I get a little bit fish out of water with something where I feel like this thing fits the data structures and so then my application code becomes simpler but now my operations are either more complicated, more expensive, do I have to pay somebody else to do it? And so in that case, I ended up being kind of upset
Starting point is 00:20:56 that I did it because then I later learned about Postgres' JSON stuff and I was like, oh, I can kind of have the best of both worlds if we just can shove a few non-normal things into an otherwise relational database. And I do agree with you that generally speaking, I think document-oriented, you can probably get away with not going with a document-oriented first solution today.
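To make the "few non-normal things in an otherwise relational database" idea concrete, here is a small sketch. It uses SQLite's JSON functions through Python's sqlite3 module instead of Postgres' JSON types, purely so it can run without a server, and it assumes the SQLite build bundled with Python includes those functions (most recent builds do). The carts table and its columns are made up for the example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Ordinary relational columns, plus one column holding a JSON document.
conn.execute(
    """
    CREATE TABLE carts (
        id             INTEGER PRIMARY KEY,
        customer_email TEXT NOT NULL,
        details        TEXT NOT NULL  -- JSON text, hypothetical shape
    )
    """
)

cart = {
    "items": [{"sku": "mug", "qty": 2}, {"sku": "tee", "qty": 1}],
    "coupon": "FRIENDS",
}
conn.execute(
    "INSERT INTO carts (customer_email, details) VALUES (?, ?)",
    ("a@example.com", json.dumps(cart)),
)

# Query inside the document with SQL while still filtering on normal columns.
row = conn.execute(
    """
    SELECT customer_email,
           json_extract(details, '$.coupon'),
           json_array_length(details, '$.items')
    FROM carts
    WHERE json_extract(details, '$.items[0].sku') = 'mug'
    """
).fetchone()
print(row)
```

The shopping cart comes back as one chunky object, as in the document-database pitch, but the customer data around it can stay normalized.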
Starting point is 00:21:19 You can do a lot of stuff too with materialized views, where you're essentially building a physical table out of a query. So, you know, I think there's, and it automatically updates itself, so I think there's a lot of cool stuff you can kind of play around with, right, to get around that. I mean, the graph database side, though, I think is a little trickier. I think there are extensions for, like, graph language stuff within SQL, but it's always heinous. You can do CTEs, which are common table expressions, and they're recursive, and they're impossible to debug. But they can work. That's an option. But I think that's a fairly rare instance
Starting point is 00:21:54 where you have stuff that's so relational. It is relational. If you need to do six degrees of Kevin Bacon in your application for some reason, then I think a graph database makes sense. Yeah, your typical social network makes sense for a graph database. Because if you think about followers and followings and friendships and these kind of relations, if it's all about that,
Starting point is 00:22:20 then the way I heard it explained is if you have edges and nodes in a system, if the edges are more important than the nodes are, which the edges would be the connecting points, then you're probably well served by a graph database. In the case of a social network, that's pretty much what it's all about, right? It's like who's connected to whom where. Then something like Neo4j or other graph solutions
Starting point is 00:22:42 have made sense. I've never used one of those, and so I can't speak to it personally. Yeah, I haven't really either. I think the funny thing, though, is all these essentially boil down to B plus trees as the actual implementation. So it's kind of all a matter of language at the high level. They're all kind of key value stores underneath. So I find it's interesting, too, because, like, most of it you're optimizing
Starting point is 00:23:05 to minimize the number of queries so you don't have so much latency. So that's why you can send a single query to get a bunch of relationships instead of sending, you know, a ton of queries to get those. Whereas, like, once you actually get to running something locally, like a Bolt or SQLite,
Starting point is 00:23:21 like, you don't have that overhead, so you can make a bunch of different queries and you can essentially kind of have the best of all those worlds. You could basically write a graph database with just a SQLite, since you're so close to the data. Right. So how much of that is file structure on disk, binary blob formats, that whole deal,
Starting point is 00:23:41 layout of the data on disk, and how much of that is client-server? Because it seems like when you say embedded, we're removing that client server connection, which oftentimes is a network connection. It could also be a socket. But that connection is going to be latency, right? Yeah. I mean, generally, there's the physical latency, just going between this box to this box or
Starting point is 00:24:03 this region to this region, which can be significant when you have 100 queries. But then, yeah, like Postgres, you can run locally. You can run just over a Unix socket, and it's quite fast. SQLite runs just in process, so there's not even a process barrier to go between. I mean, you can have a kernel for locking and whatnot, but you're really as close as you can possibly be to the data. So it's pretty fast.
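The "six degrees" style of question can be sketched on top of an in-process SQLite database in two ways: many tiny queries driven from application code, which is cheap when there is no network hop, or a single recursive CTE. The follows table, the sample names, and both function names below are hypothetical; this is an illustration of the idea, not how any particular graph database works.

```python
import sqlite3
from collections import deque

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE follows (src TEXT NOT NULL, dst TEXT NOT NULL, PRIMARY KEY (src, dst))"
)
conn.executemany(
    "INSERT INTO follows VALUES (?, ?)",
    [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("dan", "ann"), ("eve", "ann")],
)


def degrees_bfs(start, goal):
    """Breadth-first search that issues one small query per visited person."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if person == goal:
            return depth
        for (nxt,) in conn.execute("SELECT dst FROM follows WHERE src = ?", (person,)):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return None  # goal is not reachable from start


def degrees_cte(start, goal, max_depth=6):
    """Same question as one recursive CTE; max_depth stops cycles from looping forever."""
    row = conn.execute(
        """
        WITH RECURSIVE reach(person, depth) AS (
            SELECT ?, 0
            UNION
            SELECT f.dst, r.depth + 1
            FROM reach AS r JOIN follows AS f ON f.src = r.person
            WHERE r.depth < ?
        )
        SELECT MIN(depth) FROM reach WHERE person = ?
        """,
        (start, max_depth, goal),
    ).fetchone()
    return row[0]  # None when the goal was never reached


print(degrees_bfs("ann", "dan"), degrees_cte("ann", "dan"))
```

Against a database on another machine, the per-person queries in the first function would each pay a round trip, which is the latency trade-off discussed above.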
Starting point is 00:24:28 And that's what you were building with BoltDB as well, right? This was an embedded key value store for Go. So yeah, the data was actually memory mapped into like a read-only map. So you'd actually interact directly with this memory map, which essentially pulls the data up from disk into the OS page cache. So as long as you have, you know, if you can fit most of your data into memory, at least the hot parts of your data, it's basically like you're just interacting with the speed of the memory, which is, again, quite fast. And what were BoltDB's, like, perfect use cases?
Starting point is 00:24:59 And then where would you get towards, like, that's probably not best for something like this? And using that as a proxy for these embedded key value stores. Sure, yeah. I mean, I think BoltDB was good when you have some kind of simple structure that you're trying to use. And you don't need a lot of things like indexes or anything too complicated. If you're storing just some basic objects or basic rows, you can do a lot with just converting that to JSON and storing that in a blob and then decoding that when you read it out.
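The pattern of encoding an object to JSON, storing it as a blob under a key, and decoding it on read can be sketched as follows. BoltDB itself is a Go library; to keep the example runnable without dependencies, this uses Python's standard dbm.dumb module as a stand-in embedded key-value store, so none of the calls below are BoltDB's API. The put and get helpers and the "user:1" key layout are hypothetical.

```python
import dbm.dumb
import json
import os
import tempfile


def put(db, key, obj):
    """Encode a plain object as JSON bytes and store it under the key."""
    db[key] = json.dumps(obj).encode("utf-8")


def get(db, key):
    """Fetch the raw bytes for a key and decode them, or return None if missing."""
    try:
        raw = db[key]
    except KeyError:
        return None
    return json.loads(raw.decode("utf-8"))


path = os.path.join(tempfile.mkdtemp(), "users")
with dbm.dumb.open(path, "c") as db:  # "c" creates the store if it does not exist
    put(db, "user:1", {"name": "Ben", "langs": ["Go", "SQL"]})
    loaded = get(db, "user:1")
    missing = get(db, "user:2")

print(loaded, missing)
```

There is no schema and no secondary index here: looking a user up by anything other than the key means scanning and decoding every value, which is the kind of limit that pushes people toward SQLite.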
Starting point is 00:25:29 You could use protocol buffers, anything like that to kind of encode your data. It's good if you really want a really super lightweight dependency. If you want it to be pure Go, that goes a long way. I would say most people are probably better served with SQLite. It has an actual schema on top of it.
Starting point is 00:25:44 I know there's a lot of applications now that are actually using it as their file format. I think Audacity is one of them. So you can actually just pop it up and just read, like look at your data with a SQLite, you know, CLI, which is kind of cool. So I would say generally, I would probably lean towards SQLite for most use cases.
Starting point is 00:26:01 While we're talking key value stores, thoughts on Redis? I don't use a ton of Redis. I mean, most of the use cases I see it for are more like a caching layer. And I know it does a bunch of other things too. You can do kind of queues and you can do sets and all kinds of stuff.
Starting point is 00:26:16 I think those are probably fine, but I think it seems like it's rarely used as a primary store. It's more like a memcached kind of thing. Right. So I think it's fine for that use case. I don't know about its durability, you know,
Starting point is 00:26:28 guarantees. I'm not sure about its transactional guarantees. That's another one you get into. And like, the more you learn about like transactional guarantees and what the defaults are on things like Postgres and MySQL, I think they're atrocious.
Starting point is 00:26:40 Like, most people don't understand isolation levels, really. So like when you actually look at what guarantees you actually get, they're pretty limited. That's one reason I do like SQLite. It has a really strong isolation level. And it's the only one you can do. Can you say more about isolation levels? Sure. Isolation levels are around where when you read something
Starting point is 00:27:00 from the database, if other transactions are going on at the same time, sometimes the transaction level will mean if you read it one time and then someone else updates it and you read it again in the same transaction, you may get the same version, you may get the new version. So there's a lot of, like, weird little, um, edge cases where you can get into where you might, you know, maybe fetch a list of objects and then you run a count for however many objects, and those two may differ depending on your isolation level and whatever else is going on. So it can be tricky if you're not using something like really strict, like serializability, or, um, snapshot isolation is another strong one
Starting point is 00:27:37 that's generally pretty good. But I think Postgres uses read committed, if I remember correctly, which is like one of the lower ones, like the least, uh, strong, one of the least strong isolation guarantees. Yeah, what it's called. Yeah. Yeah, and you can use things like SELECT FOR UPDATE as well to give yourself some extra, like, um, some locking around things and whatnot. What's SELECT FOR UPDATE? What's that do? So SELECT FOR UPDATE, if you're going to select a list of data, like do a query, and then you basically want to say, I'm going to update parts of this data after that. It'll actually take a lock on those rows, or maybe even the table, so that they aren't changed from underneath you.
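SELECT FOR UPDATE is a Postgres and MySQL feature and cannot be run against SQLite, so the sketch below shows a coarser analogue of the same idea: take the write lock before reading, then read, then update. In SQLite that is BEGIN IMMEDIATE, which locks the whole database for writing rather than individual rows. The accounts table and the withdraw and open_db functions are hypothetical, and timeout=0 is set only so that a blocked connection fails right away instead of waiting.

```python
import os
import sqlite3
import tempfile


def open_db(path):
    # isolation_level=None leaves transaction control to the explicit
    # BEGIN/COMMIT/ROLLBACK statements below.
    return sqlite3.connect(path, isolation_level=None, timeout=0)


def withdraw(conn, account_id, amount):
    """Read-modify-write with the write lock taken before the read."""
    conn.execute("BEGIN IMMEDIATE")  # lock first, so the balance cannot change under us
    try:
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        if balance < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = ? WHERE id = ?",
            (balance - amount, account_id),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return balance - amount


path = os.path.join(tempfile.mkdtemp(), "bank.db")
a = open_db(path)
a.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
a.execute("INSERT INTO accounts VALUES (1, 100)")
b = open_db(path)

# While connection a holds the write lock, connection b cannot take it.
a.execute("BEGIN IMMEDIATE")
try:
    b.execute("BEGIN IMMEDIATE")
    second_writer_blocked = False
except sqlite3.OperationalError:
    second_writer_blocked = True
a.execute("ROLLBACK")

new_balance = withdraw(b, 1, 30)
print(second_writer_blocked, new_balance)
```

In Postgres the equivalent read inside the transaction would be written with FOR UPDATE appended to the SELECT, and only the selected rows would be locked.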
Starting point is 00:28:14 But it does block other people from using those as well. Gotcha. Where does one go to get that level of knowledge about these different things? I mean, writing databases helps. Well, we don't all have that much time, Ben. No, that's fair. Write a few databases, then you'll understand it.
Starting point is 00:28:31 Yeah, after your fifth database, you're right. I was hoping for a website or something where I could just read a table that says, here's what you should use. There's definitely a lot of people that do blog posts about internals. On the Fly blog that we have, I've written a bunch on
Starting point is 00:28:45 SQLite internals and how that works. I would say, if you look at Kyle Kingsbury, who goes by Aphyr online, he does a bunch of writing on how he basically tests production databases and breaks them. That's kind of what he's known for. So he'll go in
Starting point is 00:29:01 and he'll actually go and find where they may guarantee a certain isolation level or there are certain guarantees and they don't actually hold up. Then he'll write a whole blog post dumping on that. Nice. And then the companies go and fix them. It's great. So, you know, I think that once you get into like distributed systems, especially, it's just it's kind of hard to keep in your head, like all the different clients going on, what their views of data are, and how they interact with each other. So Kyle Kingsbury has a website, Jepsen. That's his software as well. That's the software to test these different databases.
Starting point is 00:29:36 But on there, there's a list of consistency models that he'll show on there. And that's a great resource to kind of dive in and to kind of understand the relationship between them. Because there's kind of two different camps. There's kind of like more traditional databases and their write isolation, more or less,
Starting point is 00:29:51 I would say. That's where you think of like read committed or read uncommitted and snapshot consistency. But then there's also things when you get into like eventually consistent systems about what they can read. And it's more kind of like a read consistency side. So there's a lot to read out there, honestly. And it's kind of, it hurts your brain a lot. I would just say like, if you're not sure,
Starting point is 00:30:12 generally try to have the highest isolation level you can, and you won't get a bunch of weird little bugs later on you can't figure out. And is SQLite's isolation level so good because of its embedded nature or because they've coded it in such a way that it's that good or both? I think it's more of a simplicity. They only allow a single writer, so it basically guarantees serializability
Starting point is 00:30:33 because you can't have other writers at the same time. You also get a consistent view of the data. It's a bummer, though, right? It's a bummer that you can't have more than one writer at the same time, right? I mean, I would say generally it's a bad idea to have long-running writes anyway, regardless of the
Starting point is 00:30:47 system. You can get into deadlocks and you can get into all kind of lock issues. I like the idea. You can generally do writes very, very fast in SQLite, so they seem like they don't feel like you only have a single writer. You can write a bunch. It seems like it's parallel, but it's not.
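A minimal sketch of the single-writer behavior described above, using Python's standard `sqlite3` module. The table name `counter` and the function name are made up for illustration, and `timeout=0` is chosen only so the second writer fails immediately instead of waiting; a real application would normally set a busy timeout and let the second writer wait its turn.

```python
import os
import sqlite3
import tempfile


def second_writer_is_blocked():
    """Return (blocked, final_count) for two connections writing to one file."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None puts the connection in autocommit mode,
    # so BEGIN / COMMIT are issued explicitly below.
    a = sqlite3.connect(path, isolation_level=None)
    b = sqlite3.connect(path, isolation_level=None, timeout=0)

    a.execute("CREATE TABLE counter (n INTEGER)")
    a.execute("INSERT INTO counter VALUES (0)")

    # Connection A takes the write lock up front.
    a.execute("BEGIN IMMEDIATE")
    a.execute("UPDATE counter SET n = n + 1")

    # Connection B cannot start a write transaction while A holds the lock.
    blocked = False
    try:
        b.execute("BEGIN IMMEDIATE")
    except sqlite3.OperationalError:
        blocked = True

    a.execute("COMMIT")

    # Once A has committed, B gets its turn: the writes are serialized.
    b.execute("BEGIN IMMEDIATE")
    b.execute("UPDATE counter SET n = n + 1")
    b.execute("COMMIT")

    final_count = b.execute("SELECT n FROM counter").fetchone()[0]
    a.close()
    b.close()
    return blocked, final_count
```

Because each write transaction is short, the waiting is rarely noticeable in practice, which matches the point made in the conversation that writes feel parallel even though they run one at a time.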
Starting point is 00:31:04 It's just so close to parallel that it rounds to zero kind of a thing. But it's not actually zero, so to speak. But it makes the model much simpler to think about. Simplicity. Is this something that you like in software, Ben? A little bit, yeah. I'm a big fan. There's just so much over-engineering, I feel like, for a lot of this stuff. I think I've railed on this for a while, but like, I feel like people have these extreme ideas of what they need as far as like their uptime or their durability. Like everyone thinks that, hey,
Starting point is 00:31:35 I should never, ever, ever lose data, which, it sounds like you shouldn't, right? But like, there's never a guarantee. Like you could lose your database, then you could lose all your backups, and you could lose this and that. So you're really just adding nines onto your durability over time. And one of my favorite, I bring this up, and I really don't mean to dump on these people. I think they do a great job. But one of my favorite examples is GitLab, where they lost six hours of data famously years ago. Okay, I'm kind of recalling.
Starting point is 00:32:06 And it was very public. But they're a public company. They got through that. It was fine. It wasn't the end of the world. You can't do that with all data. But I don't think people actually think about how impactful some level of data loss is.
Starting point is 00:32:21 I know that sounds weird. And they just try to over-optimize to make sure that they never, ever, ever lose data. Well, certainly the law of diminishing returns comes into effect, right? Yeah, exactly. You continue to exert effort
Starting point is 00:32:34 as you try to get that down to zero and money and time and all the things that effort requires, but you are only now squeezing out very minuscule gains at a certain point, where you can get huge gains to start with. And so what is that happy place where you can say, you know what, six hours? It was similar to a decision that I made a while ago, which I can't
Starting point is 00:32:56 think of it specifically. I remember talking to Adam about it. It had to do with our website. And maybe it was Gerhard as well, when he asked me, like, what happens to Changelog's business if Changelog.com goes down? And I said, well, for how long? Because we could be down for 24 hours and our business is not going to disappear. In fact, our MP3s are served elsewhere. Of course, we couldn't publish new episodes,
Starting point is 00:33:20 but if we're down for 24 hours, we're not going to be happy. I don't want that to happen. But we're not going to die as a business. Now, if our website was down for 30 days, people would wonder if we literally died. So there is a level that you have to define what kind of thresholds matter for us in our use cases. And I think, as weird as it sounds, Ben,
Starting point is 00:33:44 saying some data loss is okay, coming from a database guy. Refreshing, I guess. Makes me feel better. Yeah, and honestly, I wrote a tool called Litestream where you can continuously stream updates up to S3 for SQLite. So you basically have this super small window of data loss
Starting point is 00:34:04 of maybe a second or two. But honestly, that's even overkill for a lot of people. There's even documentation on the website of like, hey, if you just want to use a cron job and back up hourly, here's how to do it. It's simple and it's hard to break.
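A minimal sketch of the simple scheduled-backup idea mentioned above, using the online backup API exposed by Python's standard `sqlite3` module (`Connection.backup`). This is not taken from the tool's documentation; the function names and file paths are hypothetical, and scheduling (for example an hourly cron entry that runs the script) is assumed to happen outside the code.

```python
import os
import sqlite3
import tempfile


def snapshot(src_path, dst_path):
    """Copy a consistent snapshot of the database at src_path to dst_path."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # backup() copies the database page by page and yields a
        # consistent copy even if the source is in use.
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def demo():
    """Create a throwaway database, snapshot it, and read the copy back."""
    workdir = tempfile.mkdtemp()
    src_path = os.path.join(workdir, "app.db")        # hypothetical app database
    dst_path = os.path.join(workdir, "app-backup.db")  # hypothetical backup file

    con = sqlite3.connect(src_path)
    con.execute("CREATE TABLE notes (body TEXT)")
    con.execute("INSERT INTO notes VALUES ('hello')")
    con.commit()
    con.close()

    snapshot(src_path, dst_path)

    check = sqlite3.connect(dst_path)
    rows = check.execute("SELECT body FROM notes").fetchall()
    check.close()
    return rows
```

The trade-off is the one discussed in the conversation: with a periodic snapshot, the worst-case data loss is the interval between runs, in exchange for a setup with very few moving parts.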
Starting point is 00:34:19 I think there's great options when you don't need that really high level of data loss guarantee, I guess. Right. I think it's easy for us all to kind of jump to the maximal side of anything. Because I immediately think, well, that wouldn't work for Amazon. Because every second they're down, they're literally losing hundreds of thousands, if not millions of dollars in sales. And so then I'm like, I'm not building a solution for Amazon.
Starting point is 00:34:45 I'm building it for me. And I don't know why that is that we immediately go to like, maybe it's a purist thing. I think probably, but yeah. I think the funny thing too is when you get into high availability where you have multiple servers
Starting point is 00:34:58 and you want one to fail over or whatnot, a lot of times you can make it so complex that you actually lower your availability where something goes down and it doesn't fail over right or you might even lose some data in there because of how it fails over. So honestly, sometimes it's just like
Starting point is 00:35:13 having a database that dies and then you just bring up a backup might be the best thing. It's probably fine. You might actually save yourself downtime and trouble by having simpler solutions. What's up, friends? Intel Innovation 2024 is right around the corner.
Starting point is 00:35:32 Accelerate the future. Registration is now open, and it takes place September 24th and 25th in San Jose, California. This event is all about you, the developer, the community, and the critical role you play in tackling the toughest challenges across the industry. Ignite your passion for AI and beyond, grow your skills to maximize your impact, and network with your peers as they unleash the next wave of advancements in technology. Here's what you can expect. Understand the emerging innovation and trends in dev tools, languages, frameworks,
Starting point is 00:36:06 and technologies in AI and beyond to empower you and the solutions you're building. Get in-depth technical experience doing hands-on workshops, labs, meetups, and hackathons to collaborate and solve problems in real time. You can explore featured partner and Intel solutions. They have partners there, startups there, customers there. And Intel is showcasing the latest in products, services, and solutions across keynotes, tech sessions, and the show floor to help you meet your development needs. Collaborate with experts, learn and have fun, engage in interactive sessions to connect, get certified, gain unique ideas and perspectives, build long lasting networks, and of course, have fun and get inspired. Hear from leading industry experts, technologists, startup entrepreneurs,
Starting point is 00:36:53 and fellow developers, along with Intel leadership, CEO Pat Gelsinger, and CTO Greg Lavender, as they take you through the latest advancements in technology. Don't miss this chance to be at the forefront of innovation. Take advantage of early bird pricing right now until August 2nd. Register using the link in our show notes. Or to learn more, go to intel.com slash innovation. Once more, that's intel.com slash innovation. Or go to the show notes and click that link.
Starting point is 00:37:29 Well, you'll like this, Ben. I wrote this on Monday for Changelog News. I was covering a story called Why CSV is Still King, which, of course, there's the details in there, but you kind of get the point from the title of their post. And one of the reasons, they went through the history of CSV, one of these interesting accidental standards, like nobody wrote, nobody designed this thing. It was almost like JavaScript 10 days, you know, in a lab and now out it comes. This just became a thing and remains a thing. And this whole point of this post was like, and it ain't going anywhere, basically. But one of the things they said is it's good enough for many situations and it's dead simple to use. It's just dead simple. And so that got me thinking more and more about simplicity. And of course, there's two sides to simplicity. One side is like, it's not clever, you know, it's not impressive. It's simple. Yeah, no one puts CSV
Starting point is 00:38:20 on their resume. Yeah, right. And yeah, I mean, you're not going to get a job because you know how to do CSVs. We even have like a term, simpleton. Like that's a, let's explain somebody who's not very deep, right? They're a simpleton. And so nobody wants to be called that. And I remember Jamis Buck, who was prominent in the Ruby community,
Starting point is 00:38:39 worked at 37signals, was core contributor on Ruby on Rails, and he wrote the Capistrano deployment tool, which turned out wasn't super simple, but I think he wanted it to be. And one time he said, everybody thinks simple is, paraphrasing, not quoting him,
Starting point is 00:38:55 everybody thinks simple is unimpressive because they think it's easy. They think simple is easy, but simple is actually the hardest thing to accomplish in a complex world. And so it looks easy, but the hard part was making it simple so that it actually looks easy. And so it actually is impressive, but it's not impressive. It's one of these weird deals, right? Yeah, it's always weird when you, like, you might have tried a thousand different ways of doing something trying to get
Starting point is 00:39:18 down to that simple essence, and then when you finally get to it and explain to somebody, they're like, oh yeah, duh. Yeah, exactly. You didn't get the whole journey. Right. And the solution was obvious, but it was only obvious once you went through the journey and made it obvious, but to the person you presented to. Anyways, what I wrote was the old saying in real estate, the three things that matter in picking a property is location, location, location. Well, I said the three and most important factors in determining the desirability of a solution,
Starting point is 00:39:48 implying software solution, of course, are simplicity, simplicity, and simplicity. I kind of think that's true. Yeah, I would probably agree with you on that. It might be like the highest thing that you can achieve in software is simplicity, which is probably why you like SQLite. Yeah, no, it's great to debug
Starting point is 00:40:05 and I think that's a lot of it too is like none of these solutions are perfect and when they go wrong like can you just open up a file and edit it and like see oh it's missing a double quotes or something like that. Like you can't do that with protocol buffers. Right.
Starting point is 00:40:18 It just says you're just SOL honestly. Yeah. And to that point, I've been using Postgres for many, many years and I'm proficient with it. But I've never, I know where the data folder is, but I've never
Starting point is 00:40:32 gone in there and poked around. I know lots of people have and so I'm not saying that that particular part of Postgres is complex, maybe it's not. It's just a thing that's been a black box to me and I think that does speak volumes about their abstraction layer. But I've also used SQLite quite a bit.
Starting point is 00:40:48 And I got no problem just opening up a SQLite file in either the SQL command or using an editor or whatever. Obviously, I'm not going to open it up in Z and read it from there. Maybe you do, Ben, but I'm not quite that far into the matrix yet. Everybody needs a hobby. Yeah.
Starting point is 00:41:03 And so there is something about that that's just like just being able to it's just a file on disk. And that goes back to even I think some of the virtues of a Unix philosophy
Starting point is 00:41:12 or maybe it's Linux. Everything's a file. Is that part of the Unix philosophy? I know it speaks to it. Everything's a file. There's a simplicity to that. And of course it has
Starting point is 00:41:21 its drawbacks. You know, like it's not perfect. But it's also kind of nice in a lot of ways to just have that simple mental model around it. So SQLite definitely has that going for it. What are the drawbacks of SQLite, though? I mean, there have to be some. Oh, there definitely are. I mean, I think people that are used to more of a graphical user interface, like, there's not a great way to do that for remote databases. Honestly, that's one of the biggest things I find that people hit.
Starting point is 00:41:51 I always use CLIs, so it never bothers me, but that is definitely a big one. Like you mentioned around concurrency, you can't have multiple writers. There's obviously some solutions around disaster recovery you can do, but essentially it is just a file on a disk. It can be on its own. You can't just replicate it with just simple SQLite. So there's definitely some trade-offs. Right. So in comes Litestream.
Starting point is 00:42:13 You built that for that purpose, right? Yeah, for disaster recovery, yep. Just trying to push it up somewhere so that you can basically run an app on a single server and not worry about it just crapping out, and then you lose all your data. So you can set it up. So you kind of restore immediately and get all your data right back.
Starting point is 00:42:31 Have you spoken with the SQLite folks like Richard Hipp and his team about like, do you think that I would think that something like that would be part of what they offer then to like just completely knock out that particular drawback? I think they, I did talk to them pretty shortly after the Litestream stuff came out.
Starting point is 00:42:48 I got a little conference call with them. Super nice, great people. I think that they tend to have a focus more on embedded devices and single server or single system uses. I think a lot of their, they have the SQLite consortium as well, which is a bunch of companies that pay money in to help support the ecosystem.
Starting point is 00:43:09 I think a lot of it is more like device manufacturers and things like that. I don't think that they have a strong incentive to go outside of that right now. They aren't trying to serve that particular use case, but you want to use it that way. Yeah. I liked writing stuff in Bolt.
Starting point is 00:43:26 It was super fast, but I just wanted a schema and indexes without building those myself inside Bolt. So SQLite was a good in-between. How far do you think SQLite could go in a web server, a dynamic web app scenario? I mean, I think it really depends on your language. I write in Go mostly, and it's really fast. So, I mean, I can serve hundreds, if not thousands,
Starting point is 00:43:51 of requests per second out of a SQLite database on pretty minimal hardware. But I know Ruby and things like that, they tend to go a lot slower and are more CPU-bound. So I'm sure you probably get some limitations around that. But maybe you could just scale up the number of processors. I'm not sure. But, I mean, I think it's probably beyond the scale of 90% of websites out there.
Starting point is 00:44:10 I think you're probably fine. That reminds me of something Brian LeRoux told me on JS Party a couple of months ago about dynamism inside of a web page. And they've done some actual work on this. And I can't remember the exact percentage he gave. We can go back and pull that out if we need to. But something like 90% of all elements on a page are completely inert. It might have been higher than that.
Starting point is 00:44:35 Meaning they're just written once and it's just like, it's the head of your page, it's the footer, it's this. Most of those things, it's just, they're inert. And very few elements are dynamic in any way. And I think probably, you know, 90% of web apps out there
Starting point is 00:44:54 are mostly inert, you know? Like, they're doing stuff. Yeah, probably a lot of it, yeah. But not the way that we design for such scale. Yeah.
Starting point is 00:45:02 And honestly, I really miss, I mean, I know we're kind of going back to server-side rendered applications, which I love, but when all the React stuff came around or whatnot,
Starting point is 00:45:11 every time I went to a webpage and it had some fancy JavaScript stuff going on, I just knew the back button wasn't going to quite work how I wanted it to or some certain little things that just always drove me nuts. So I miss just like basic web apps. That pendulum has begun to swing back
Starting point is 00:45:28 for sure. I know Remix I think does a bunch of server-side and I think React does as well. React themselves are moving server-side as well to provide more of a full-stack solution. That's had a lot of issues because of just the nature of how React
Starting point is 00:45:45 started and what it is and the user base of React. It's been very difficult for them to make that transition. And so I think there's opportunity for newer component libraries that are server-side in nature or full-stack in nature to start with to actually gain
Starting point is 00:46:01 some foothold because simplicity and React at this point are not in the same ballpark. They just aren't. I try to learn React like once every two years. I'm like, eh. Just go back to writing Go. The basics aren't too bad
Starting point is 00:46:17 but things do get complex pretty quickly. But anyways, we were talking SQLite and scaling. You all have put in some work to do some horizontal scaling as well, like moving it around to different regions and having, like if I had a web app
Starting point is 00:46:33 with app servers geographically distributed, aren't you trying to also take my SQLite database and move it around and have it replicated around the world? Yeah, so we have an open source project called LiteFS where we essentially implement a file system layer in FUSE so that we intercept. Basically, it's a pass-through file system, essentially. So all your SQLite writes and whatnot
Starting point is 00:46:59 go straight through to the database, but we can essentially detect where transactions start and end so that we can then kind of wrap up those changes into a separate file and then ship those out to other SQLite or Litestream, sorry, LiteFS, too many lites, LiteFS nodes, and they can then apply those changes. And it's all done kind of at a file system layer and like a physical layer. So you can use any extensions you want on top of that. It's not specific to any of those. So as an app developer, I don't have to necessarily think about it.
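A toy sketch of the replication idea described above: the changes made by each committed transaction are bundled into one unit, shipped to other nodes, and applied there in commit order. This is an illustration under simplifying assumptions, not the project's actual code or file format. The names `ChangeSet`, `Node`, and `commit` are hypothetical, and "pages" are modeled as plain dictionary entries rather than blocks of a real database file.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeSet:
    """The pages written by one committed transaction."""
    txid: int
    pages: dict


@dataclass
class Node:
    """A toy database: page number -> page bytes, plus the last applied txid."""
    pages: dict = field(default_factory=dict)
    txid: int = 0

    def apply(self, change: ChangeSet) -> None:
        # Apply strictly in commit order so every replica passes
        # through the same sequence of states as the primary.
        if change.txid != self.txid + 1:
            raise ValueError(f"expected txid {self.txid + 1}, got {change.txid}")
        self.pages.update(change.pages)
        self.txid = change.txid


def commit(primary: Node, dirty_pages: dict) -> ChangeSet:
    """Bundle one transaction's dirty pages, apply them on the primary,
    and return the change set that would be shipped to replicas."""
    change = ChangeSet(primary.txid + 1, dict(dirty_pages))
    primary.apply(change)
    return change


# Usage: two transactions on the primary, replayed on a replica.
primary, replica = Node(), Node()
first = commit(primary, {1: b"a", 2: b"b"})
second = commit(primary, {2: b"c"})
replica.apply(first)
replica.apply(second)
```

Working at the level of pages rather than SQL statements is what makes the approach independent of extensions, as noted in the conversation: the replica never re-executes queries, it only copies the resulting bytes.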
Starting point is 00:47:31 I just deploy and say, you know, put my, I was going to say dynos, but that's the Heroku thing. What do y'all call it at Fly? Put my machines. Machines over here at Fly, but yeah. Yeah, you can spin up machines in different regions and then they just automatically can connect up to the primary and stream down changes.
Starting point is 00:47:48 Are people using that? Yeah, we've got quite a few people using it. Nice. Yeah, and if it's your use case and you need low latency stuff around the world which can go a long way, then I think it's a good fit for people. It's a lot simpler than setting up Postgres
Starting point is 00:48:02 and a bunch of replicas and things like that in there. So I was checking out the LiteFS repo on GitHub. By the way, of course, everybody knows Fly.io is a sponsor of the changelog. This is not a sponsored episode. We had been on for years before he ever worked at Fly. It just so happens there's lots of crossover and things that we're interested in and are sponsors.
Starting point is 00:48:20 It's a small world, too. Yeah, exactly. There's a disclaimer there. I was looking at LiteFS on the old GitHub there, and it was like, latest commit seven months ago. It's a small kind of worked. And I think getting too fancy with any tooling can cause the sound issues. Yeah, well, you move away from that simplicity model. The other thing that I always think about with you, Ben, is just your willingness to declare something finished, or at least that you're done with it.
Starting point is 00:48:57 Moving on from BoltDB, your strong stances on open source but not open contribution. There's an expectation setting that you do that I really appreciate, and I wonder where that comes from. Like, do you, most people don't have the guts
Starting point is 00:49:13 to just say that kind of stuff. No, I think it's just a lot of burnout, mostly. You're just sick of it. Yeah, I mean, I just realized, like with Bolt especially,
Starting point is 00:49:21 like I just got to a point where I got so burnt out trying to maintain it especially at like a certain scale. Any changes could potentially affect performance characteristics. You just have to do so much testing on it that probably hadn't been set up to the level that I really needed it to be. So every change involved just so much time. And Bolt really thrived in the era of the launching into the stratosphere of the cloud-native stuff, didn't it?
Starting point is 00:49:46 Go, systems, Kubernetes. I'm not sure if it's in Kubernetes, but things around it. It's in etcd, which is in Kubernetes. It's in etcd, yeah, exactly. And the amount of success and money and valuations and money raised and stuff were just going through the roof and like BoltDBs and all of these different things, wasn't it? Yeah, Go kind of went crazy with the cloud native stuff.
Starting point is 00:50:14 Which you probably didn't see coming. No, not at all. Honestly, I wasn't ever trying to write Bolt to be like the Go database. I was mostly just trying to learn about databases when I wrote it. Right. That's how you know so many isolation levels.
Starting point is 00:50:28 So LightFS, in good shape. I haven't played with it, but I'm definitely interested in the concept. Of course, our production app is already Postgres, so waiting for a good use case to try out a geographically distributed SQLite and just see how it all works, because it fascinates me. It seems like, I don't want to say a square peg round hole specifically, because I feel like that's usually a bad idea, but it definitely seems like kind of like a stretching into an area where even the SQLite team, like you said, aren't super keen on it. What was your guys' driving
Starting point is 00:51:01 force behind this move? You know, I think there are a lot of people that are interested in using SQLite. Honestly, the two biggest things for my complaints as far as Litestream were that it didn't have like a failover system. So if you went to do a deploy, you had to take down your app for a second or two and then roll it back up. And then the other one was just read replicas.
Starting point is 00:51:22 So people might want to have read replication out to some distant area and just didn't support that. So that's kind of where the driving force was around that. But it's a fine line, though. I mean, SQLite is kind of known for simplicity. So adding any complexity, definitely. There's a certain level that people find acceptable, and it's kind of a gray area of where that is.
Starting point is 00:51:43 Right. Probably depends on each individual's taste, you know. Yeah, whereas I'm sure like if you built this into some more complex product, people would be like, okay, well, that's fine. But like people are very, very focused on simplicity within the SQLite community. You take something that's simple and make it complex, people are upset. But you take a little bit complex and make it more complex, we'll buy that. Oh, yeah.
Starting point is 00:52:06 Yeah, you can. People buy money or pay money for that. That's funny. What are you working on now then? Doing a lot of stuff at Fly. I mean, still doing some SQLite work. But yeah, I'm VP of product here at Fly now. So kind of stretch my hands out
Starting point is 00:52:21 on different projects and whatnot. Right on. Hung up the nights and weekends open source stuff. Yeah, pretty much, yeah. I don't do as much code these days at Fly, so I've got to find a little side project or something to nerd out on. Right.
Starting point is 00:52:35 Like the rqlite guy. He's a manager at Google. I used to work with him, but he gets all his pent-up engineering energy out by working on rqlite, like a distributed SQLite system. I haven't heard of that, rqlite. Tell me more.
Starting point is 00:52:48 Do you know more about it? No, it's a Raft-based system. It's more of a client-server model than something like LiteFS, which is more like a direct SQLite file system-based. But yeah, he's great. He's a great guy. That is cool.
Starting point is 00:53:04 We didn't actually do SQLite at the time when we worked together, but for some reason we both went out. You're both interested in it. Made distributed SQLite implementations. Yeah. Have you looked into any of the vector stuff? I've only been listening about it. I know there's pgvector.
Starting point is 00:53:19 There's probably SQLite Vector. Yeah, there's SQLite Vector extension as well. I haven't really dug in a ton to that stuff. I kind of researched a little bit when the AI stuff first started coming out, but now I haven't really, I haven't found like a great use case that like I love AI stuff for. But yeah, so I haven't really dug in. I like infra stuff. I like writing infrastructure code, so I think that's kind of, yeah, kind of where I stick to. You're still happy with Go? Yeah, I love Go, yep. I haven't had any wanderlust. I would, I don't know, some things I like, like Zig, I thought was kind of interesting. I wish it was
Starting point is 00:53:55 just more mature, but I like the idea of like specifically allocating your memory very intentionally, I guess. So I find that interesting. I liked Rust, like the language, but the actual like async implementation, I just, I got so infuriated by that I just gave up on Rust. What triggered you the most about it? It's like its own language where like it actually compiles down to like a finite state machine, so you can't actually do like recursive calls in async Rust and a bunch of other weird limitations. And then there's a bunch of weird naming stuff of Pin and Unpin and Sync. And I don't know, there's just way too many everything around Rust.
Starting point is 00:54:35 It felt like so much cognitive load was just remembering all these little rules. And I didn't actually enjoy writing code. Not simple enough for you. Too complex. Yeah. Well, if you were just more clever and wise, Ben, you could handle the complexity. There you go. That's the problem.
Starting point is 00:54:49 I feel like you can get like 95% of the things in Rust with just like, like Go has like the race checker and like there's other ways you can kind of like emulate some of that stuff. It won't get you like the perfect, you know, Rust safety, but I think you get pretty close. The only thing on my
Starting point is 00:55:06 list that we haven't talked about yet is mixing and matching. So oftentimes you're picking when we go back to databases you're picking a database
Starting point is 00:55:15 and I think that it's common to believe that you have to just pick one and go with it and like there's no rule in the
Starting point is 00:55:24 rule book for programming that you just have to have a singular database. I know lots of companies have multivariate or whatever you call it, multiple data stores depending on what they're up to. Pretty common at least to have something like a relational database and then also have something else depending on what you're up to.
Starting point is 00:55:42 Oftentimes that is a key value store often used as a caching layer, but can be used for other things as well. Have you ever gone multi-database in projects of your own? You've seen people do it, I'm sure. Yeah, I mean, or at least projects I've worked on have gone multi-database. I feel like
Starting point is 00:55:57 there's so many consistency issues, though, you tend to hit, just trying to keep everything in sync, and that can be its own headache. So I think unless you have a really good use case for it, or like the performance is like significantly better, or maybe it's just data that like, I don't know how to describe this exactly, but it's like data you don't really care as much about. So like a lot of times, if you have like metrics, for example, yeah, like you can throw those in time series and it doesn't have to sync up with your, your relational data or whatnot. So yeah, I think time series is a great example of something where you can get something
Starting point is 00:56:28 that's, you know, 10 times faster than the relational equivalent. So it makes sense. Although you have, you know, Timescale, which works in Postgres. Everything works in Postgres, right? But, you know, I think there's certain use cases for specific types of databases like that. Yeah. One that comes to mind, just because it's open source and we've spoken a couple times with the creators of it, is Plausible Analytics. And they use Postgres for their standard data, but then the actual analytic data, they use ClickHouse. And so then it's columnar, I believe. Never used ClickHouse. I think that one's open source.
Starting point is 00:57:06 It might be open source-ish. You never know anymore. No, I mean, people like it, but I know there's a company behind it. I'm not sure what parts are open source for you. Yeah, it's just the whole open source project plus business thing has gotten very gray in the last couple of years as the sands of time are shifting underneath us.
Starting point is 00:57:26 Redis was once open source, isn't anymore. Elastic, of course, we haven't even talked about Elastic. But I think ClickHouse still is pure open source and then has probably a hosted service for you. It's my guess. That would make sense. Yeah, I've heard it's good, though. People like it. Well, Ben, anything else about databases that we haven't discussed? I mean, there are lots of other things about databases, I'm sure. But anything that is on your mind or you think would be helpful for folks before we call it a show? I mean, I think your advice of just picking Postgres is probably a good thing. No, it's supposed to be an it depends, Ben. No, I said just start with Postgres.
Starting point is 00:58:00 Start with Postgres, yeah. But I mean, I think there's so many areas you can kind of delve into. And I think there are definitely use cases for people that need something faster or whatnot. I mean, it's interesting, like a lot of these kind of niche databases came because you can get a 10x performance if you relax certain constraints. Like if you don't need certain isolation level,
Starting point is 00:58:19 for example, like you can really go a lot faster. So I think kind of delving in and kind of understanding your data and what the needs are and what constraints you have can really help you kind of pick out which database works for your situation, what performance needs you actually have. Good advice. Good advice indeed. And last question for you, Ben: how do you make software simple? I think it's good to start with a vision of what you're trying to make and then just stick with that. And then instead of trying to figure out, I don't know,
Starting point is 00:58:50 I think I have like an allergy to like writing more docs. So like if there's like weird edge cases and whatnot that it's going to create, I try to avoid those so I don't have to document them. So if I can give someone just the simplest command to just do something that they want and keep the doc simple, then I think that's a great way to go. So do you believe that the simplicity needs to exist at the interface more so than at the code? Meaning like, do you hide the complexity from the user or do you design out the complexity? Like, how do you go about it? I think you have to design
Starting point is 00:59:25 out the complexity, honestly. Like, I think that any kind of weird complexity in your code is going to seep its way out into the UI, because you're going to have to account for it, you know, when things go wrong or whatnot. And how do you go about designing out complexity? Do you take a lot of walks? Do you draw things out? I do take a lot of walks, actually. It helps a lot. Do you have a whiteboard? No, actually, I don't really whiteboard that much. I think I just do a lot of iterations. I tend to write kind of like the domain, like the application domain out
Starting point is 00:59:54 without so much concern of the underlying dependencies, whether it's a database, whether it's like a file system layer or whatnot, and just try to understand kind of what those entities are that I'm working with. And, you know, kind of, it's almost like normalization, like in databases, like figuring out kind of where your tables
Starting point is 01:00:11 split up and how they relate to each other. You kind of start from that and then find ways to simply build in, you know, your persistence or your interface via HTTP or CLI or whatnot. Yeah. I'm going to go back to Jamis Buck one more time because I think he said two things that really stuck with me and I'm going to reference them both in the same show.
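As a rough illustration of the domain-first approach Ben describes just above, here is a minimal Go sketch. The `Guest`, `Appearance`, `AppearanceStore`, and `MemoryStore` names are hypothetical and do not come from the episode. The entities are written first with no database in sight, and persistence is added afterwards behind a small interface.

```go
package main

import (
	"errors"
	"fmt"
)

// Guest and Appearance are hypothetical domain entities, written first
// with no reference to any database, file system, or HTTP layer.
type Guest struct {
	ID   int
	Name string
}

type Appearance struct {
	GuestID int
	Show    string
}

// AppearanceStore is the persistence boundary, introduced only after the
// domain types are settled. A database-backed type could satisfy it later
// without touching the domain code above.
type AppearanceStore interface {
	Add(a Appearance) error
	CountByGuest(guestID int) (int, error)
}

// MemoryStore is an in-memory implementation used purely for illustration.
type MemoryStore struct {
	appearances []Appearance
}

// Add records an appearance, rejecting entries with no show name.
func (s *MemoryStore) Add(a Appearance) error {
	if a.Show == "" {
		return errors.New("show required")
	}
	s.appearances = append(s.appearances, a)
	return nil
}

// CountByGuest returns how many stored appearances belong to guestID.
func (s *MemoryStore) CountByGuest(guestID int) (int, error) {
	n := 0
	for _, a := range s.appearances {
		if a.GuestID == guestID {
			n++
		}
	}
	return n, nil
}

func main() {
	var store AppearanceStore = &MemoryStore{}
	guest := Guest{ID: 1, Name: "Ben"}
	for _, show := range []string{"The Changelog", "Go Time"} {
		if err := store.Add(Appearance{GuestID: guest.ID, Show: show}); err != nil {
			fmt.Println("add failed:", err)
			return
		}
	}
	n, err := store.CountByGuest(guest.ID)
	if err != nil {
		fmt.Println("count failed:", err)
		return
	}
	fmt.Printf("%s has %d appearances\n", guest.Name, n)
}
```

Because callers depend only on the interface, swapping the in-memory store for a real database later should not require changes to the domain types, which is one way to keep complexity from leaking outward.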
Starting point is 01:00:32 The first one was about simplicity that I said earlier. And then the second one he said is that when he designs an API or when he's building an API, and we're talking about not like a HTTP API, but a function name, parameters, a library, etc. That he actually starts with using a fake one that he wants to use. And so he will just call a function that doesn't exist and pass it what he wants to pass it as the user of that thing. And he works backwards from there to create all the things behind it that would actually make that
Starting point is 01:01:05 API exist. And so it's very similar to what you're describing there. Yeah, that's a good way to do it too, for sure. Yeah. So I submit that to you and to our listener as a way of, at least try it out, see if it works. It's similar to TDD in certain ways. I think he was talking about TDD when he said that. Anyways, all right, good stuff. Appreciate it. We have officially, it depends on databases. Turns out, just start with Postgres, but don't necessarily stop there
Starting point is 01:01:32 depending on your particular use case. Anything else, Ben, before I let you go? No, I think that covers it pretty well. Thank you. Cool, man. Well, I appreciate the work you do. Appreciate you coming on the show
Starting point is 01:01:43 10 times now. And I'm going to go write a SQL query to see if you are our most guested person. Maybe. You might be up there, because you've been on so many different shows of ours. I can't think of anybody else who would be hitting double digits. But by the time I ship this, I'll throw it in the outro and we'll see if we can crown you the most frequent Changelog guest. How do you feel? I mean, you're at least in the top five, for sure. Good. Yeah, I like talking to you guys. I mean, I don't think it's that you guys invite me on so much. It's just more that I'm old and I've been
Starting point is 01:02:13 around long enough. Around a long time. Well, there might be some truth to that, because I haven't had you on for a while. Yeah, I remember meeting you guys at, like, I think the first or second GopherCon in, like, 2014, '15. So '14 or '15. Yeah, I remember that as well. So we were all around. Yeah, early days. Yeah. Similar ages. Yeah. So you haven't been on the show since July of last year. You were on the solo gopher with Chris and Ian on Go Time. Your first appearance goes back to The Changelog 170, 2015. It looks like it was just me and you on that show. BoltDB, InfluxDB, and Key-Value Databases.
Starting point is 01:02:54 So there we are, a decade ago. Man. Talking databases, and here we are. That's cool. Still here, yep. All right, good stuff. Appreciate it. And that's all.
Starting point is 01:03:03 We'll talk to you all on the next one. Cool. Thanks for having me. Bye, friends. I did run that query to see who has the most guest appearances on our shows, and Ben is indeed in the top five with this 10th appearance. He's tied at third with Gerhard Lazu, who will certainly pass him up shortly with a next Kaizen, but they're both behind Ron Evans, who has appeared
Starting point is 01:03:32 11 times, and Mat Ryer, who has the most: 14 guest appearances on our pods. Now, Mat is also a Go Time host, but if you think he's cheating, no, hostings don't count. If we were counting all Changelog appearances for all time, well, Adam and I would utterly destroy everyone else. But yeah, that makes sense. Okay. This has been our fourth installment of the It Depends miniseries. If you dig it, let me know in the comments what topic or experienced dev we should feature next. Thanks again to our partners at Fly.io, to our beat freak, the one and only Breakmaster Cylinder, and to our friends at Sentry for hooking our listeners up with a hundred bucks off a team plan by using code CHANGELOG when you sign up. Next week on the Changelog, news on Monday,
Starting point is 01:04:17 Andreas Kling and Chris Wanstrath talking Ladybird on Wednesday. That'll be a good one. And a fresh episode of Changelog and Friends on Friday. Have a great weekend. Share the show with your friends who might dig it. And let's talk again real soon. The answer is it depends. It depends. It depends.
Starting point is 01:04:48 There's a big it depends. I feel like it depends needs its own little theme tune. Problem is it depends on how you view it. I guess it depends, but like, yeah. Well, it depends. It depends on which country and which language. Some people won't work with you either. It really depends on the moment that you're in and what's just happened.
Starting point is 01:05:06 I suppose it depends on the individual. It depends. It depends on how sort of automated you want to be about it. Yes, it depends. Tradeoffs. It kind of depends, right? It depends on the month, I guess. It all depends.
Starting point is 01:05:17 I mean, again, I hate to say like it depends, but I do. I think. Well, it depends. It depends what? I guess it depends on which TikTok you're on. So the answers to my questions are always going to be it depends, right? I kind of figured that there would be an it depends, as there always is. What do you think about that? That's probably kind of an it depends. Yeah, it's very much an it depends. So it depends. It really depends on just, like, what... It depends, sometimes. But, like,
Starting point is 01:05:48 it really depends on what I'm coding. And it depends on the drive size. It depends. I heard that a few times. So it's kind of an it depends all the way down. Depends on the graphic. It depends. Look sexy. Well, I don't know. It depends a little bit. This is why lawyers' favorite phrase is, it depends. But I think it depends if it's a simple... And I guess it depends on the... The obvious answer for everything is it depends, right? Of course. Yeah, I would say it depends.
Starting point is 01:06:15 Honestly, it depends on the... So, it really depends on... I sometimes do. It really depends because... I think, like, it depends. Like, there could be... It just kind of depends. Like...
Starting point is 01:06:25 It depends. I mean... And there it depends, I guess. The answer is, as almost always in engineering, it depends. It depends. It depends. Yeah.
