Software Huddle - SQLite, Turso, and the State of Databases with Glauber Costa

Episode Date: October 22, 2024

Today we have Glauber Costa on the show, who's the CEO and founder at Turso. They provide a managed SQLite service with some really interesting capabilities that are changing the application patterns you can build. He shares a lot of really good technical stuff on Twitter. He worked in the kernel, he worked on high-performance databases at ScyllaDB, and now he's working on Turso. He also has a great and interesting podcast, the Save File, which is about developers and religion. Glauber had some great thoughts on the future of databases, including what the future of NoSQL is like and whether we'll see vector databases as a separate category or as a feature of general-purpose databases. We've seen arguments both ways, but he was the most effective at changing our mind.

Transcript
Starting point is 00:00:00 What's your relationship like with the SQLite team? Are they just kind of off? Especially the main guy. Is he like off on his own? Like sort of not interacting with folks? Yes, pretty much. Do you have a back? Really?
Starting point is 00:00:09 Yeah. We never, like we would welcome a relationship. We never went out of our way to create one. I mean, and they're very, like, so we talked to a lot of people in his outer circle. But it never happened that we had the chance to chat too much. SQLite is having a moment right now, and it seems like you and Pekka and Turso are, like, one of the reasons why. But between Turso and Litestream and, you know, Aaron Francis and Kent C. Dodds and a lot of people just, like, advocating for SQLite,
Starting point is 00:00:42 it's like, DHH is talking a lot about SQLite as well, right? ThePrimeagen and others. So like, I wish I could say we're one of the big reasons. I think we're more of a follower. I think what happened there is that we saw this and then we decided to move the company in that direction, right? When we spoke last, I think we were just pivoting to Turso. Our company also did not start as a database company in a much similar vein as Scylla. How do you all think about
Starting point is 00:01:11 marketing to developers? Because I think you have done a good job of that on a few fronts, I guess. What have you learned in your last couple of years at Turso doing this? Hey, folks, this is Alex. Today, we have Glauber Costa on the show, who's the CEO and founder at Turso. They provide, like, this managed SQLite with some really interesting capabilities. That's, like, changing just sort of what application patterns you can do. So we talk about those.
Starting point is 00:01:34 I love following Glauber on Twitter because he shares a lot of really good technical stuff. He worked in the kernel. He worked on high-performance databases at Scylla. Now he's working at Turso and just has some interesting thoughts. He's got a great podcast too, The Save File, which is about developers and religion,
Starting point is 00:01:49 which I think is really interesting. So check that out. He has some awesome thoughts on, hey, where's NoSQL at? Where are vector databases going to be? And what's that going to look like in the next couple of years? And really changed my mind on some of that stuff.
Starting point is 00:02:01 So I think it's worth listening to that. If you have any comments, if you have any other guests that you want on the show, anything like that, feel free to reach out to me or to Sean. With that, let's get to the show. Glauber, welcome to the show. Well, thanks for having me, Alex. A pleasure to be here. Yeah, absolutely. So you are the CEO and founder at Turso. I'm excited to talk to you. We talked 18 months ago on Software Engineering Daily, and I think you're one of the best follows in the database community, but also the technical dev tool founder type community and things like that. But I guess for people that don't know you, maybe give us a little background on you
Starting point is 00:02:35 and Turso. I'm happy to do that, and glad to be talking to you again. So I, as you mentioned, I am the founder and CEO of Turso. Before that, my career is actually pretty simple because I never job hop too much. So I spent 10 years in the Linux kernel, most of that time at Red Hat, not exclusively, but most of the time at Red Hat. And this is where I met my co-founder. Pekka comes from the same background. He was the maintainer of the memory management subsystem for Linux at the time.
Starting point is 00:03:12 I've done all sorts of things. I've done file systems. I've done storage. I've done virtualization. I've done containers. And we got to be great friends. I later joined as employee number three, and he joined as employee number four, a startup that originally was doing something called the Unikernel, which might have been ahead of its time. There are a couple of companies, one in particular that I recommend
Starting point is 00:03:36 called Unikraft. They're trying to do this again today, more successfully than we've done it in the past. But the product didn't quite work. And after two years, the company pivoted to a database called Scylla, which is in the NoSQL petabyte scale space. So Pekka and I just got catapulted into the database world. And it wasn't something that we did by design. It was just something that happened to go in that direction. And I did have a one-year stint at Datadog.
Starting point is 00:04:10 So after seven years in my case, eight years in his case at Scylla, I spent a year at Datadog. It was great for a variety of reasons. Datadog is a fantastic company. But it was the right time for us and the right opportunity for us to found the company. And here we are at Turso. Yeah, absolutely. I think you told me that story that Scylla started as a unikernel company before and I had forgotten that. Because I know Scylla is great.
Starting point is 00:04:37 Yeah, I had forgotten that they started off as a unikernel company. And I forgot that you were employee three, and four for Pekka, there. So that's super interesting. I also think it's... Technically, I wasn't the third, but the person, one of the people that were there ahead of me quit quite early and was never at Scylla. So I kind of, you know, grandfathered myself in as number three.
Starting point is 00:04:58 That counts, yeah. And I also think it's fascinating to see, like, you're super technical, like, you know, down in the kernel and doing databases, all that stuff, and then pivoting into founder, CEO, which is like, you still have a technical company, but you're doing marketing and biz dev and all that sort of thing. I think that's a transition there. It is a challenge. I think it is one of the biggest challenges that I face on a personal level, right? Because you just have to go into a different... The challenge there is that...
Starting point is 00:05:31 And actually, one of the things that we are very blessed with at Turso is the amazing roster of investors and advisors that we have. We have as investors in our company, for example, Guillermo Rauch, the CEO of Vercel. We have Jordan Tigani, the founder of MotherDuck. We have Joran, the founder of TigerBeetle. All of those folks are investors in the company. And we have one advisor in particular that very early on, he gave me this advice
Starting point is 00:06:00 and it was transformative. It's like a lot of founders who are technical founders, they go into like found technical companies, but then because now you're doing something like you're not a business person, so you end up acting on a caricature of what you believe a business person should be and then you just do everything wrong. Because say, look, this is how business should look like. This is how this should be done.
Starting point is 00:06:35 And you forget that as a developer, you will never in a thousand years buy a product from a person that is behaving like you are now. So this advice was given very early on, not because I was doing that. On the contrary, he was saying, just don't do that because this is the mistake that I see a lot of founders making. And man, what a great advice because we do have this instinct. We understand that, look, this is new. This is uncharted territory. I have to be different. I have to act differently.
Starting point is 00:07:02 But you also, I mean, through this advice, I learned very successfully that you can't forget that you are, at the end of the day, representative as a developer of the product that you're selling. Then, you know, put yourself in the position of this buyer. And like if somebody comes up here with a lot of jargon, as an engineer, would you buy that? No. So don't do that. Right. So it's a good line for us to be walking. Yeah, for sure.
Starting point is 00:07:28 And yeah, I think marketing to developers and also just cracking dev Twitter is a hard thing to do. And I think you've done a good job of that. So I think it's impressive. Well, better than Pekka. Exactly. No, you both, you know, both of you,
Starting point is 00:07:42 in addition, you both get on Twitter, but then also put out some really deep technical stuff, which is like, hey, you're nerd sniping some people with some of that stuff. But then also, like, I think have a pretty good and fun just sort of vibe on Twitter as well. Appreciate the recognition here. Yeah, for sure. So I want to talk about Turso. And it's interesting because, like, SQLite is having a moment right now. And it seems like you and Pekka
Starting point is 00:08:05 and Turso are, like, one of the reasons why. But between Turso and Litestream and, you know, Aaron Francis and Kent C. Dodds and a lot of people just, like, advocating for SQLite, it's like, DHH is talking a lot about SQLite as well, right? ThePrimeagen and others. So, like, I wish I could say we're one of the big reasons. I think we're more of a follower. I think what happened there is that we saw this and then we decided to move the company in that direction, right? When we spoke last, I think we were just pivoting to Turso.
Starting point is 00:08:37 Our company also did not start as a database company in a much similar vein as Scylla. The story, of course, is different in many ways. I mean, it has parallels. Our product, I think, in fact, was doing a lot better than the Unikernel product. We were users of SQLite, but then we noticed, I mean, it was just palpable to us how much more people were talking about SQLite
Starting point is 00:08:59 versus a couple of years ago. And then we started trying to understand why and trying to understand, okay, is there... People would come to our Discord and the thing that they would ask the most, I mean, our product was not SQLite, but it would use SQLite. And the things people would ask the most were about SQLite and the use we were making of SQLite. So it became clear to us that, look, maybe this is a better direction. And we quite successfully built Turso as a result of that. So look, obviously, we always try to talk about
Starting point is 00:09:32 SQLite, but I think we're riding a wave here. We're not creating this wave. This wave was already there. We just identified and trying to follow what the market people will call the secular trend, whatever that means. Yep, yep. Okay. So I guess walk me through the timeline. So you had your previous product. It had SQLite under the hood. And just for, I assume most folks are familiar,
Starting point is 00:09:58 but SQLite, hey, it's basically a file that you just open within your application. But you all forked it into LibSQL, right? Which basically enables more of like a client server model of LibSQL. Is that the best way to describe that? Well, no, that was not even, that was one of the things we wanted to do with LibSQL, but LibSQL was not even that. I mean, LibSQL, you know, if I was starting today, that will be, by the way, as a company builder, this distinction, even the marketing, the naming already is something that we will be better off without.
Starting point is 00:10:35 Because we have to talk about LibSQL and then we have to talk about Turso and those are two different things. But if I was starting today and, of course, knowing the future, I would have just used the same name and called everything Turso. But what happened is that we put this fork of SQLite out, and we didn't necessarily have the intention of creating a product out of this. Because remember, as I said, we were seeing that LibSQL essentially worked and functioned as the last drop in the bucket. Because we were seeing that the things that people liked the most about what we're doing was SQLite. All of the things that people like, all of the characteristics of our product were like the fact that,
Starting point is 00:11:21 wow, it just is a backend that I do an npm install and it works and it's already working. There's nothing to set up. All of those things that people love were stemming directly from SQLite. But we knew that SQLite by itself didn't have all the things we needed. And we wanted SQLite to be more than it was. That's the reality, right? And there were other projects that tried to enhance SQLite. There was LiteFS, there was Litestream, there's a lot of things like that. But all of those projects operate
Starting point is 00:11:57 with the assumption that SQLite cannot be changed. And SQLite cannot be changed because it's in their website. It says we're open source, but not open contribution, right? So we're not going to be able to change this. And so you end up building around SQLite, right? And what we've done is that, look, for us, this is not going to work for the ambition that we have for this project, for this company. This isn't going to work. To get where we need to be, we need to fork this and be very intentional about how we change SQLite. So we forked it and we saw a lot of interest. We saw 1,500 GitHub stars in the first week of the fork. And that is the moment, like, again, those things were in the backdrop.
Starting point is 00:12:38 We're noticing that a lot of the interest was because of SQLite. And then we forked SQLite and it sees a tremendous amount of attention. What that tells us is that, look, there is a lot of interest for SQLite, but there is even more interest for the concept of SQLite and what SQLite can do. And there is space in the market for somebody that thinks about this problem in a different way. So let's do it. And then we started adding to this fork things like replication, the client server,
Starting point is 00:13:08 other things came later, like embedded replication. We do backups to S3 natively. So there's a lot of things that we added to this fork. Some of them we don't use at Turso, but they're there, right? Yep, yep. I guess like how much active development
Starting point is 00:13:21 is there on SQLite proper? Is that, like, something that you see, like, you know, they're updating that all the time? Is that, like, kind of a, you know, steady print? It does get updated a lot. The thing is that SQLite is a fairly small team, right? So just at the moment, our team is bigger than SQLite. Obviously, like, it's an unfair comparison for a variety of reasons. One of them being the fact that we are a cloud service.
Starting point is 00:13:49 So a lot of our effort goes into maintaining the cloud service, not just writing the software. But it's a fairly small team. SQLite is essentially a one-man show. There are other people who contribute as well. But to the best of my knowledge, there are two other people who are recurring contributors. But that
Starting point is 00:14:12 limits, of course, the amount of progress that you can make, but it still sees very active development. What's your relationship like with the SQLite team? Are they just kind of off? Especially the main guy. Is he off on his own?
Starting point is 00:14:27 Sort of not interacting with folks? Yes, pretty much. Do you have a back? Really? Yeah. We never... We would welcome a relationship. We never went out of our way to create one.
Starting point is 00:14:39 So we talked to a lot of people in his outer circle. But it never happened that we had the chance to chat too much. But for us, again, it's more on a personal level. Everything I hear about Richard Hipp is that he's a fantastic person, obviously a fantastic engineer. The result is there. So I think there is a lot for us to gain from that, but it's not something crucial. It's not crucial for a roadmap. In fact, we do intend for our fork to become even more aggressive in the near future.
Starting point is 00:15:17 Just recently, as an example, we added what was so far the most ambitious feature in terms of how much it diverged from SQLite, which was vector search. If you use Turso, for example, today on device, and we have a use case that has been published on our website today about it, from a company called Kin, the website is mykin.ai.
Starting point is 00:15:42 They're doing essentially an AI assistant, fully private on device with doing vector search in a RAG model through LibSQL. Because LibSQL allows you to do this, right? Because we just added vector search natively. And you can do this in SQLite through extensions, but extensions are limited in what we can do. The way we work is that vectors become part of the query syntax. You can just create vectors and it integrates perfectly with your data
Starting point is 00:16:10 without having to install extensions, without having to do any of that. So we want to be able more and more to make those changes. We keep hearing from people that, hey, SQLite is amazing, but the way the schema changes is quite weird.
Starting point is 00:16:26 Can you make that better? So we actually already made better in some ways, but there is a limit to what we can do without making large changes. So we have it in our roadmap to make more and more aggressive changes, essentially to bring, we believe, SQLite to its full potential, the model of a database and a file to its full potential. Yeah. How important is it for you to maintain upstream compatibility versus, like you're saying,
Starting point is 00:16:54 hey, if you can do some certain things better, maybe even schema changes in a way that breaks with upstream? I guess, how do you balance those considerations? So, you know, we have to split that question in pieces, I think, for me to answer competently, because what does it mean to maintain compatibility, right? If it's maintaining compatibility with the file format, absolutely, sure. We want to do this. And, in fact, if you look at our vector search, obviously SQLite cannot read and do vector search on the vectors that we added. But SQLite can still read that file, and those vectors will be shown as blobs. Right. But so SQLite reads.
Starting point is 00:17:40 If we import the SQLite file and you add a column with blobs, again, you can do this and still the same file format. So everything is the same file format. That, I think, is very important to us. The way SQLite is built, though, is, I believe, very weird for modern standards. I mean, it has a place in history and it does have technical advantages. But SQLite, if you look, it is a directory full of C files. And then the process of compiling SQLite actually does not generate a binary, it generates a C file. So what you do is
Starting point is 00:18:14 that you get all of those, they call this the amalgamation, you get all of those C files and you compile that into this one big C file that you can now insert into the build system of whatever tool you have. And this is a very interesting way of consuming software that makes sense for SQLite. Allows, for example, for you to maintain compatibility with every single kind of system out there. But maintaining this build system, for example, for us is not important. Now, maintaining compatibility with the language, again, it's important to support the things that SQLite does, but we want to add more. So it depends on what exactly we're talking about.
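The file-format point above (vectors stay readable by stock SQLite, but only as blobs) can be illustrated with a minimal Python sketch using only the standard library. It is not LibSQL's native vector syntax: the `docs` table, the little-endian float32 packing, and the helper names `pack_f32`, `unpack_f32`, `cosine_distance`, and `nearest` are all assumptions made for illustration, and the similarity search here runs in application code, which is exactly the work a native vector type would move into the database.

```python
import math
import sqlite3
import struct


def pack_f32(values):
    """Serialize floats as little-endian float32 bytes (an assumed layout)."""
    return struct.pack(f"<{len(values)}f", *values)


def unpack_f32(blob):
    """Inverse of pack_f32: turn a BLOB back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, embedding BLOB)"
)
conn.executemany(
    "INSERT INTO docs (title, embedding) VALUES (?, ?)",
    [
        ("cats", pack_f32([1.0, 0.0, 0.0])),
        ("dogs", pack_f32([0.5, 0.5, 0.0])),
        ("cars", pack_f32([0.0, 0.0, 1.0])),
    ],
)

# Plain SQLite has no idea these are vectors: it reports the column as a blob.
storage_types = {
    row[0] for row in conn.execute("SELECT typeof(embedding) FROM docs")
}


def nearest(conn, query):
    """Brute-force nearest neighbour computed outside the database."""
    rows = conn.execute("SELECT title, embedding FROM docs").fetchall()
    return min(rows, key=lambda r: cosine_distance(query, unpack_f32(r[1])))[0]


best = nearest(conn, [1.0, 0.0, 0.0])
```

The design trade-off is the one described in the conversation: keeping the data as a blob preserves the file format for any SQLite reader, while the search itself needs either application code like this or engine-level support.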
Starting point is 00:18:56 Yeah. And if I remember correctly from last time, most of the new features that you all are doing in LibSQL are written in Rust. Is that still the case? Yes. Vector wasn't, right? So, for example, there are some features and we're still seeing like what is the best way to solve this problem. I want to give a, you know, just a preview here. I don't know if you want to add the link later. But my co-founder, Pekka, has a personal
Starting point is 00:19:25 project called Limbo. And the reason it's called Limbo, it's like this state of confusion. So don't read too much into that yet. But Limbo is essentially a from scratch rewrite of SQLite compatible within those constraints that I just told you, fully in Rust. Because, for example, what we always try to do is add those features to the extent possible in Rust, and then you add a couple of hooks into the C library that are just calling into your Rust code. For Vector Search, it was so invasive, this just wasn't possible.
Starting point is 00:20:00 And some of our users, we have a couple of users and design partners, they're starting to ask for things that are quite complex. And it led us into us experimenting. This is very experimental. It's not a direction that we committed to yet. But we are experimenting with the idea of, hey, look, I mean, SQLite is actually not that big, which is why for us the idea, the, the idea of the fork was, was, uh, good to begin with. What if we just rewrite it? Uh, what if we use like a modern architecture? What if we do full rust? And the beautiful thing is that we, this is actually the first time,
Starting point is 00:20:35 Alex, the first time that I'm talking about Limbo. This name is going to change if it becomes an official Turso project, but it may not. But I'm moving to the direction of making it into an official Turso project because this is the first time I'm talking about it in public on a podcast. And Pekka tweeted about it a couple of times, and that's it. And if you go to the GitHub repository in which that is being hosted, it has over 20 contributors doing very deep contributions. One of them we actually hired to work on Turso.
Starting point is 00:21:10 You know, if anything, if nothing else, it already worked to help us find this fantastic person. But it has almost a thousand GitHub stars and over 20 contributors without us never having talked about it, right? So I think there's a lot of,
Starting point is 00:21:23 there is a future, a possible future that is growing in likelihood that this code just becomes like LibSQL. And LibSQL is something that started as a fork but ended as a rewrite. I love the analogy of the ship of
Starting point is 00:21:40 Theseus. You know, Pekka doesn't read because, you know, he's not very smart, so that's how it works. So he did not know who Theseus was, so we couldn't adopt the name. Yeah, yeah. Oh man, that's super interesting. And it'll be interesting to see what happens, even just category- or description-wise, with Turso, you know, if SQLite and Limbo start to diverge a little bit. It's like they're mostly compatible for a while, but now it's like, when does that just become enough of a different thing? Yeah, that's pretty [...] Cassandra that was fully compatible. And over time, it diverged. And you want to maintain compatibility with things that
Starting point is 00:22:31 are coming from that database. But Scylla added many things that were not present in Cassandra. I remember that Cassandra has this big YAML file, and Scylla would still read the YAML configuration file, but it would just ignore a lot of options because it just didn't make sense. Scylla, as an example, had a lot of intelligence to automatically tune the database to your workload. And Cassandra has a lot of those parameters for the size of the heap and the size of this
Starting point is 00:23:00 and the size of that, that a user had to tune. Scylla would just ingest it, but ignore that. And then it started like some of the features were being asked by some of our users that Cassandra just didn't implement. So we implemented first and they never did. So yes, it does diverge. But even today, for example, I suspect a user that is using Cassandra can just go and migrate to Scylla.
Starting point is 00:23:24 And this is something that binds you forever when you embark on a journey to either fork or rewrite or do things like that. And I think our story with SQLite will always be similar. I always want people to be able to get a SQLite file and ingest it in Turso. And I want people to be able to export a Turso database as a SQLite file. I think this is always going to be the case. Once more, if you are using something that SQLite doesn't have, there are limits to how that happens: it gets shown as a blob or the column is disabled, then you have to
Starting point is 00:24:04 go into the specifics depending on what's being done. But the level of compatibility has to be always there. Yep, yep, absolutely. Cool. Okay, so I want to talk even just like patterns you're seeing with Turso, because I see, like, a lot of interesting patterns. I mean, some are just straightforward application patterns, but now you can use SQLite, and some are, like, hey, new patterns that you can't do with other databases. So just in terms of like what you're seeing with folks, I guess, you know, one of the original downsides of SQLite is, hey, if you have multiple compute instances or something that need to access it, it's going to be tough because it's usually going to be on a local machine. But now you have the client server model. How many of your customers are using it that way
Starting point is 00:24:47 as just sort of like a replacement for MySQL or Postgres, but even still have a similar three-tier architecture or something like that? In terms of numbers, so if I go look in terms of number of users, this is the most common use case. We have a very generous free tier, so people show up. We have a hobby plan that costs $9 a month. And in fact, by the end of the year,
Starting point is 00:25:12 again, another announcement here, by the end of the year, we have all the intention. We'll see if we can make it happen, but we actually want to reduce that price even further. And so if you go look at that cohort of users, I think a lot of them are using this as really just a SQL database. And it could be MySQL, it could be Postgres, it could be whatever, but it turns out it's SQLite. So the advantage is there is that it's super simple. You can still do the local testing, right? You still retain that.
Starting point is 00:25:41 You can commit the SQLite file to your Git repository. Whether or not you actually should do this is up to you. But you have this thing that works locally, that you develop locally, that you commit locally, and helps you in your CI tremendously. But then, as you said, it has those disadvantages
Starting point is 00:26:05 that I cannot access over the network. And then Turso fixes those disadvantages because Turso is accessible over HTTP. That's always the case. So you can send an HTTP request, the request comes back and there you go. And then this is backed up by the Turso service. So you have your backups,
Starting point is 00:26:23 you can restore to any point in time in the past. The more you pay us, the more retention you have. So that's a difference in the plans. But even for the free plan, you can always roll back to any instant in time in the last day. So you have that. And you can have many API services connecting to the same database because you're connecting over HTTP. So there is no disadvantage. So there's a lot of users using that.
Starting point is 00:26:51 In terms of numbers, this is the most common use case of Turso. In terms of revenue, that's not the most valuable use case of Turso. That will be a different one. Okay, sounds good. So let me talk about, as I was sort of looking at your site, and I have just like seen your marketing over the while, the second one I think of is like embedded replicas, which is, you know, I have a central sort of SQLite somewhere, but then, you know, maybe each of my users' mobile devices is replicating some subset of that data locally onto their machine, or maybe I just have different instances around in different places that has like a specific customer's data
Starting point is 00:27:28 or something like that. So tell me about embedded replicas and what you're seeing, what that enables there. So embedded replicas, first of all, it was a great description. Congrats. I think it shows that, you know, at least our message is getting across.
Starting point is 00:27:41 There was only one thing incorrect with what you said, and, well, it's not even fully incorrect: that we replicate a subset of data. The embedded replica is a full copy of a database. But the reason this is actually not even completely incorrect is that another pattern that we see a lot in Turso, and it is, in fact, the pattern where we see the most value being derived from Turso, is multi-tenancy. So Turso allows you to create 10,000 databases on the Scaler plan, unlimited databases on the Pro plan. So you can just create as many SQLite databases as your heart desires. And that means that you can, you know, just segregate the data that you have in multiple
Starting point is 00:28:27 databases. And if you think of your data set as the collection of those databases, then yes, like you can have a user database and then you replicate the data for that particular user to their mobile device. So in that, you know, from that vantage point, you are replicating a part of the data. But from the point of view of a database, we replicate the entire database. Gotcha. Mobile device is one of the use cases.
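The database-per-tenant idea described above can be sketched locally with plain SQLite files, one per tenant. This is a rough illustration under stated assumptions, not Turso's API: the file naming scheme, the `notes` schema, and the helpers `tenant_db_path`, `open_tenant`, `add_note`, and `list_notes` are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical per-tenant schema; every tenant database gets the same tables.
SCHEMA = "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"


def tenant_db_path(root: Path, tenant_id: str) -> Path:
    """Map a tenant id to its own database file (assumed naming scheme)."""
    if not tenant_id.isalnum():
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    return root / f"{tenant_id}.db"


def open_tenant(root: Path, tenant_id: str) -> sqlite3.Connection:
    """Open (and lazily create) the database that belongs to one tenant."""
    conn = sqlite3.connect(tenant_db_path(root, tenant_id))
    conn.execute(SCHEMA)
    return conn


def add_note(root: Path, tenant_id: str, body: str) -> None:
    conn = open_tenant(root, tenant_id)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    finally:
        conn.close()


def list_notes(root: Path, tenant_id: str) -> list:
    conn = open_tenant(root, tenant_id)
    try:
        return [r[0] for r in conn.execute("SELECT body FROM notes ORDER BY id")]
    finally:
        conn.close()


root = Path(tempfile.mkdtemp())
add_note(root, "alice", "buy milk")
add_note(root, "bob", "ship release")

alice_notes = list_notes(root, "alice")
bob_notes = list_notes(root, "bob")
db_files = sorted(p.name for p in root.glob("*.db"))
```

Because each tenant's data lives in its own file, replicating "a user's data" to that user's device means copying one whole database, which is how a full-copy replica can still look like a subset of the overall data set.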
Starting point is 00:28:52 The other use case is just like if you do this, for example, you no longer need a Redis cache or anything like that because now we can have a database, a copy of the database inside the server that you're accessing, right? And you can have microsecond level reads into that database. So we see a good mixture of both. I just mentioned in the beginning of the episode, this use case that we were publishing today with Kin, and their use case will be more of a mobile sdk in fact today we are announcing uh we didn't have uh we had conceptually the idea that you could replicate your mobile device and some very stubborn people doing this but one of the things that we're announcing today uh in in our launch week is the mobile sdk so as you see there would
Starting point is 00:29:43 be, we're announcing an iOS native client, an Android native client, and community-contributed React Native and Flutter. So you can build mobile applications like that with SQLite as just a file if you want. That's fine. Or with the Turso sync engine, right? So you can have a replica of the data in your mobile device. Gotcha, gotcha.
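The replica pattern discussed above (writes go to a primary, reads are served from a full local copy) can be approximated with the backup API in Python's standard `sqlite3` module. This is a stand-in for illustration only and says nothing about how Turso's sync engine works internally: `sync_replica`, `read_setting`, and the `settings` table are hypothetical, and a whole-database copy is used in place of incremental replication.

```python
import sqlite3


def sync_replica(primary: sqlite3.Connection, replica: sqlite3.Connection) -> None:
    """Overwrite the replica with a full copy of the primary database."""
    primary.backup(replica)


def read_setting(conn: sqlite3.Connection, key: str):
    """Read one value; fetchall() exhausts the statement so no read stays open."""
    rows = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchall()
    return rows[0][0] if rows else None


# The "primary" stands in for the remote database that accepts writes.
primary = sqlite3.connect(":memory:")
primary.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
primary.execute("INSERT INTO settings VALUES ('theme', 'dark')")
primary.commit()

# The "replica" stands in for the local copy inside the app or API server.
replica = sqlite3.connect(":memory:")
sync_replica(primary, replica)
initial = read_setting(replica, "theme")

# A write lands on the primary; the replica is stale until the next sync.
primary.execute("UPDATE settings SET value = 'light' WHERE key = 'theme'")
primary.commit()
stale = read_setting(replica, "theme")

sync_replica(primary, replica)
fresh = read_setting(replica, "theme")
```

Reads against the local copy never leave the process, which is the property that lets a replica take over the job of a separate cache; the cost is that a reader can see stale data between syncs.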
Starting point is 00:30:04 And so just so I understand, most people are combining that sort of like database per tenant feature with embedded replicas. So they're just replicating their customer, like that specific customer data down? Or do you see people that, hey, maybe it's a smaller database that they need to get and they'll just replicate the whole thing for sort of everything? What patterns are you seeing? I don't know if it's most people. It's certainly not most people because as I said, the majority of users
Starting point is 00:30:29 use Turso in a very simple way. The high-profile use cases tend to mix those things in one way or another. Yeah, yeah. Okay, so the database per tenant one is super interesting.
Starting point is 00:30:43 I remember being really excited about that when it came out. And then I saw a few folks, like Dax, I would say, say, hey, one thing that's tricky about this is if you have 10,000 databases, what happens if you have a schema change or want to do some analytics? As I see it, y'all have some stuff for that. I guess, what's your answer to that? How much did you have to build or support around those use cases?
Starting point is 00:31:05 Definitely an account to follow. Definitely a person to pay attention to. But I don't think he sees the full picture in this case. And when somebody smart doesn't see the full picture, usually what that means is that they're missing context on the use cases. So I'm happy to give that context. For the most interesting cases we have, this is not an issue at all. The reason for that is that in the most interesting cases that we see of multi-tenancy,
Starting point is 00:31:52 my user's user is the one driving the schema and not my user. So let me give you a couple of examples of places where Turso is doing super well. Turso is doing super well with AI application builders. In fact, it's clearer and clearer that Turso is becoming the database of choice for AI builders. And why is that? If you're building an application, and this is just an example, of course, of what you can build with AI. If you're building an application where, in fact, let me refer to a use case that we published yesterday on our blog, on Monday. So on Monday, we published this use case with a company called Adaptive Computing.
Starting point is 00:32:38 And Adaptive Computing is essentially allowing you to generate the full application from a prompt. So the website is ac1.ai. You can prompt something and say, hey, give me an application that does this. And they're not targeting technical people. They're targeting people like my wife, who's a baker and wants to build an application for bakers. She doesn't code. So she might want to prompt the entire application. And look, every single one of those applications is completely different. They don't have the same schema. They don't need to have the same schema. But then
Starting point is 00:33:11 as the user prompts, to be a full stack of a fully functional application, they need a database. And Turso is just providing those databases. And the reason Turso works fantastically with that is that those databases, by being file-based databases that are fully serverless, don't have cold starts and all of that. Look, you're not going to beat the cost of this. Even the databases that are just doing scale to zero, it's hard to compete for databases that get
Starting point is 00:33:43 used sometimes with this idea of, hey, this is a file. If you query, I serve your file. If you don't query, it costs nothing. So we're going more and more in the direction of massive multi-tenancy with the Turso platform. And look, another example: you want Claude or Devin or any of the Terminator prototypes out there to be generating code for you and you're going to execute that code automatically in your shared database. Good luck with that. No way you're going to give Devin direct access to your production
Starting point is 00:34:19 database. But in Turso, you have all of those databases, you have a bunch of databases, the databases are completely split. The schema might be defined, again, by the application. So you let the application do, you let the sub-application do whatever they want. Note that I'm not saying that Dax is wrong. I'm just saying that he misses
Starting point is 00:34:40 the context from that. The thing I would say is, like, this is, like, such, like, a new use case. Like, I would have just never thought even how to do this in just, like, Postgres or MySQL or something like that. So then the fact that this is possible, like, it's possible people are using it for things that you just, like, wouldn't have imagined before,
Starting point is 00:34:58 you know, in some ways, so. Yeah, like, we have people creating, like, databases by the second on Turso. Just, hey, something happens. An event happens here. Here's a SQLite file to back up the database. And a lot of the application code that is connecting to the database, again, is automatically generated application code. It's a problem very similar to big data.
Starting point is 00:35:23 And I remember we talked a lot about it when I was at Scylla. The issue with big data is that the unit of data, like the megabyte of data in big data, is worth nothing. The data is only valuable because you have a lot of it. And this is how big data works. An example of something that people do with big data is session tracking on websites, right? The fact that the user clicked the button doesn't mean anything. I mean, that information has no value. If you lose it, nobody will notice.
Starting point is 00:35:59 But the fact that you have a million users and 70% of them click on that button, that starts to have value. And the use case here is very similar. Each of those databases, we've been calling them more and more like ephemeral databases. Each of those databases that people create on Turso, they're not worth a lot. They're disposable. They're ephemeral. They're going to go to the trash. Most users will never go back to them.
Starting point is 00:36:24 But the fact that you can now have a million of them, that's worth something. So that is a use case that we're seeing a lot. Now, again, as I said, Dax is not wrong. And for SaaS applications that still say, I have a shared schema and I want to just segregate the data per user, either because it's simpler on the query side
Starting point is 00:36:46 or because I want to encrypt this data with different keys and I'm doing like some medical thing or financial data. That happens. And what we try to do for those use cases is build the tooling to make this problem less of a problem. I think this is always going to be a part of the trade-off. A part of the trade-off is like, yeah, now that you split your data into multiple databases, you have this trade-off to make. But we have, for example, schema databases. When you have a
Starting point is 00:37:14 schema database, you can connect child databases to the schema database. Then you change the schema of the schema database, and that is pushed down to all of the child databases of that schema database. That's one way. Now, analytics, we honestly, at the moment, don't have a great solution for. We recommend users to just write twice, which again, is a fairly common pattern, actually, and have a separate database where you do your analytics. This database might not even be Turso.
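As an editorial aside, the idea of pushing one schema change down to many child databases can be sketched with plain SQLite files. This is only an illustration using Python's standard `sqlite3` module; it is not Turso's schema-database API, and the helper names `apply_schema_change` and `column_names`, the tenant names, and the `orders` table are all hypothetical.

```python
import sqlite3


def apply_schema_change(children, ddl):
    """Run one DDL statement against every child database.

    `children` maps a tenant name to an open sqlite3 connection.
    Returns the names of the databases that were updated.
    """
    updated = []
    for name, conn in children.items():
        with conn:  # commits on success, rolls back if the statement fails
            conn.execute(ddl)
        updated.append(name)
    return updated


def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# Three hypothetical per-tenant databases, kept in memory for the example.
tenants = {name: sqlite3.connect(":memory:") for name in ("alice", "bob", "carol")}

# The "parent" schema is expressed as DDL and fanned out to every child.
apply_schema_change(tenants, "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
apply_schema_change(tenants, "ALTER TABLE orders ADD COLUMN quantity INTEGER DEFAULT 1")

for name, conn in tenants.items():
    print(name, column_names(conn, "orders"))
```

A real fan-out over thousands of databases would also need to track which children have already received a given change and retry failures, which this sketch leaves out.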
Starting point is 00:37:41 So in fact, DuckDB is a much better database for analytics. So if you want to keep everything local and trendy, like you can have your data duplicated in both Turso and DuckDB, and then you do your analytics on DuckDB. Yep, gotcha. You mentioned like sort of automatic or like incremental updates to S3. You mentioned something with WebSQL earlier of backups
Starting point is 00:38:02 or something like that to S3. Is there a way to like either get change data capture or those snapshots so that I can feed those into my analytics systems very easily? Or what does that look like? We have some form of change data capture in Turso. It needs more work, I think, until
Starting point is 00:38:17 we can stamp it as a feature that we're behind. But we have a change data capture feature. Yep. Okay. Cool. Okay, cool, cool. Last sort of like use case I want to ask about that I see about and especially saw a while ago is just like, hey, this edge database idea, right? Where I have my database somewhere
Starting point is 00:38:36 or I have users in US East 1, but I also have them in Brazil and in Toronto and Singapore and all these different places. So I replicate them out that way. Is that still something you're seeing a lot of? I guess, where do you stand on that use case? Yeah, we see that a lot. In fact, we don't talk too much about it these days because I think the concept of edge became very confusing in a variety of ways. It really did. Yeah. If you go
Starting point is 00:38:59 look at the definition, again, if you go look at the definition, what we do with embedded replicas is edge, right? Because we're pushing the data all the way to the mobile device and you have those copies of the database that you control. So this is a form of edge, but especially in the JavaScript community, people came to understand edge in a different way, tied to specific JavaScript runtimes. So we toned down the message of edge a little bit. What you're describing is something that today on our platform we call global replication, and global replication is something that we still do. And again, the advantage there is that, with SQLite being so cheap and so easy to operate, those replicas become quite affordable. So you can have those replicas across the globe and we provide them in our infrastructure.
Starting point is 00:39:52 But honestly, we see less and less people using this because in many situations, like if you're using a platform like Cloudflare or Vercel that are serverless platforms, you need something like this. But for the people running these on servers, they actually get more benefit by replicating the data inside their server, which is the last mile instead of almost the last mile. So as we announce embedded replicas
Starting point is 00:40:19 and as we saw more people using this in platforms that are not serverless, I actually think this is becoming, little by little, the preferred way in which people do replication with Turso. But we still have global replication. That's still a use case that we see a lot. Gotcha. Okay, you mentioned earlier that on the Scaler plan, you can do 10,000 databases or something and then unlimited on higher plans. I guess like what are the axes you charge on in terms of like 10,000 databases for $9 a month or whatever? It sounds wild. I guess like what is costly for you?
Starting point is 00:40:54 Is it, it must not be the actual database instances themselves, it's the replication. I guess like what, yeah, what do you sort of cost on? Yeah, so a lot of people, first of all, the cost doesn't matter, right? Because at the end of the day, you need to have a package that people want to buy. And a mistake that I think a lot of people make is think from the costs up and think, oh, this cost me this much. Therefore, I have to put a markup of 30%. This is not how any business should work.
Starting point is 00:41:25 But understanding the cost is obviously important. So neither here nor there. The answer is in the middle. But some people think that, for example, a database costs us nothing because it's just a file. And that is also not true. Because you have, for example, independent backups for those databases, and, as you know, it's all the same bucket but different directories within your S3 bucket, and you need to manage those things. So there is a cost for us to
Starting point is 00:41:56 get a database. But this cost is low enough that when you're talking about the Pro plan, for example, which costs $500 a month and is targeted towards people doing those things in production, we can just say, look, maybe if you create a billion databases, we're going to have issues. But that's fine. If you're creating a billion databases, you're probably using the platform a lot in other metrics as well, and it will work out. But you can create a million databases and that cost is still not a cost that will bug us too much, given the fact that you're already paying a premium by being a subscriber of our program. Then you have storage. And the thing that costs the most is the compute, and that's true across every database, compute and memory, which is why our model works so well.
Starting point is 00:42:46 Because I can respond to that one single request and only account for whatever you use for the one single request. And by the way, in our platform, we don't talk about compute and memory. We talk about rows that you touch, which is something very easy for a user to understand. It's rows that you've written, it's rows that you've read. Reads are way cheaper than writes, reflecting the fact that SQLite is a database that reads a lot better, and reflecting the fact that for every write, if you want to do point-in-time restore, you have to keep those copies and do all of that. So those are priced accordingly, in a way that reads are cheaper. You touch a row in SQLite, you pay for it.
Starting point is 00:43:31 You don't touch it, you don't pay for it. And there you go. Yep, yep, very cool. Okay, I want to, you know, since you've been, you're not only like working for a database provider, but you worked in the database space for a while. I want to talk about just like the market right now and what you're seeing out there. I guess like I want to start off with vector search because you all added that to SQLite. How hard was that
Starting point is 00:43:52 to add? How different is that from like sort of the traditional things that databases do? And yeah. It's not, it wasn't very hard, which again, like my view, for example, taking a step back, I, again, I'm obviously biased with that, but I'm not a big believer in NoSQL anymore. I think that is one of the reasons I left Scylla and started my own journey. It's not that NoSQL doesn't matter. If you are selling a NoSQL database,
Starting point is 00:44:24 you will obviously respond to me with, but my revenue grew this much this quarter and et cetera, which is probably true. But what happened is that 10 years ago, everybody was looking at NoSQL. Whatever the equivalent of early 2010 Dax would be, would be looking at NoSQL. In fact, I think Dax was working with NoSQL in 2010. I discussed this with him the other day. So everybody was looking at NoSQL and there were reasons for that.
Starting point is 00:44:56 And a lot of people who are cynical think the reason is just hype. It's never just hype. If it's just hype, it doesn't go as far. I mean, NoSQL went pretty far. But there were reasons that led people to look into NoSQL. And as a quick summary, I think the reasons were just that hardware just wasn't powerful enough. You had to scale out every single mom-and-pop shop website if you wanted to have any semblance of,
Starting point is 00:45:25 unless you were doing something just for like your local church or club or chess club or et cetera, anything that sold anything online needed some form of scale out. So you would reach out for NoSQL. And that is not the case anymore.
Starting point is 00:45:40 Right. It just, you know, you can do a couple of terabytes of data on SQLite and that's fine. I think it's part of the reason why you're seeing this resurgence of SQLite, which doesn't mean that there aren't use cases that are so huge, or for which a NoSQL database is so much more
Starting point is 00:46:02 specialized that you see large advantages of using that. So I think it becomes a fringe. I view vector search the same way. The other day, one database, by the way, that I love in the vector space is called TurboPuffer. I don't know if you had the chance to meet Simon. Is that like the S3-based one? S3-based and et cetera.
Starting point is 00:46:26 Like they have this this beautiful beautiful architecture. Again, the engineer in me just reads that and smiles. It's a beautiful architecture. It makes perfect sense. Simon is an unreasonably smart person.
Starting point is 00:46:43 I'm seeing that they're succeeding and it's great. And the demand is clearly there. But I think it's going to be similar to what we saw with NoSQL. Most people using vector search, in any kind of application that you're building, are probably better off without a vector database. If vector search was incredibly hard to implement, then vector databases would be justified. Vector search is not very hard to implement. Again, most of our problems were how to make this work with this very unique
Starting point is 00:47:22 and ancient SQLite code base in C. It was not like how to implement vector search per se. So it's the commodity issue. It becomes a commodity. Every single database out there these days has vector search. Having vector search is no longer something that you can differentiate on. But I believe there will still be those use cases for which you have so many vectors and you have such precise and interesting requirements that a vector search database is just fine. Yep. Yep.
Starting point is 00:47:55 Oh, man. OK, you totally sniped me because I was also going to ask about the future of NoSQL, given your work on Scylla. And because I've been seeing sort of the same things, like hardware is just getting so big. There's also just a bunch of distributed SQL options. Yeah, that's interesting. I do want to talk about the vector store. Like, I think you're probably right. And it's not super hard to implement. But what about just the sense that it requires, you know, almost like full text search,
Starting point is 00:48:15 it requires a different set of resource requirements than your traditional OLTP. You don't think so? It's so, like, vector search is way easier than text search. In fact, it's just not that hard. And there's nothing, like, again, the way we implement vector search in libSQL, it's a column where you add the vector. So here's how it works. You can have your SQL columns, right? And then it's a text, it's an integer or whatever,
Starting point is 00:48:52 or it's a vector. That's it, right? And what does vector search mean? It means that you are given a vector. What is a vector? It's something that one of those AI models generated. So the model generates a vector. So here's an example.
Starting point is 00:49:11 You have a database of movie synopses. So for every movie synopsis, you get the synopsis, you give it to OpenAI or Claude or whatever. And Claude or the model will spit a vector back to you. So you will store this in your database: this is the metadata, this is the name of the movie, this is the et cetera, and this is the vector. And then you're going to do this for every single synopsis. Then a user types in a prompt something like, hey, give me a movie that has Scarlett Johansson and she's flying. Whatever ridiculous example you can come up with. That generates a vector.
Starting point is 00:49:53 Now, what you're going to do is that the brute force method is that you're going to go over every single one of those vectors. And using a mathematical function, which is very easy to implement as well, like you do it in a couple of hours, using this mathematical function, you will see which vector is the most similar to the one that you gave. And then you say, okay, so now this is all the metadata that I know about that movie. And now you found the movie and everybody's happy.
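To make the brute-force method concrete, here is a minimal sketch in plain Python. It assumes cosine similarity as the "mathematical function" (the conversation does not name the exact one), and the three-dimensional embeddings and movie titles are made up for illustration; real model embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_search(rows, query, k=1):
    """Score every row against the query and return the k most similar rows."""
    ranked = sorted(
        rows,
        key=lambda row: cosine_similarity(row["embedding"], query),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical rows: metadata plus the vector a model produced for the synopsis.
movies = [
    {"title": "Movie A", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Movie B", "embedding": [0.0, 1.0, 0.2]},
    {"title": "Movie C", "embedding": [0.1, 0.0, 0.95]},
]

# Hypothetical vector generated from the user's prompt.
query_embedding = [0.8, 0.2, 0.1]

best = brute_force_search(movies, query_embedding, k=1)[0]
print(best["title"])
```

Every query touches every row, which is why this is the slow path; an index trades some exactness for not having to scan everything.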
Starting point is 00:50:20 Now this is slow because you're doing a brute force search, but look, what do I mean by slow? You can do like maybe 100,000 of those searches with brute force in 200, 300 milliseconds on libSQL, right? So that already covers so many of the AI applications that we're going to see. But sure, it is slow. So what you do is what you do every time. What do you do, Alex, if you have a column that is not your primary key and you want to search for that column? You create an index, right?
Starting point is 00:50:50 So there is an index that has more or less the space distribution of those vectors. And that index just gives you that in a faster way. So that's it. That's vector search. I mean, there's nothing magical there. But aren't those, I mean, the embeddings themselves, I feel like in some cases they're going to be bigger than the rest of the row altogether, you know? If you're talking about like a 2,000-dimension
Starting point is 00:51:13 vector, that's going to be fairly large. And then also, you know, there are indexing methods, but they're not quite as, I would say, naturally suited to things as a B-tree is for others, you know? Oh yeah, sure. The index is a specialized index, but again, it's not a very hard problem, right? Yeah. But isn't it a memory-intensive, resource-intensive one? I guess my thesis, and again, I'm further from it than you are, was just that it's more like full-text search in terms of the resource requirements, like, hey, it's harder than just a straight-up B-tree, you know? But you don't think it's that. It's like, actually... It is harder. It might be. So I want to add a comment to what you
Starting point is 00:52:02 said. So yes, those columns can be big. Turns out there is a theoretical result that is fantastic. And the Nobel Prize in Physics was recently awarded to people in computer science, which is amazing. And I wish we understood, and at some point we will understand better, why it is that way. But there's a theoretical result that's super interesting. You have those vectors with like 700 dimensions. So 700, what is a vector? It's just a bunch of floats. Every float is something like four bytes. So, yeah, I mean, again, it's not gigantic.
Starting point is 00:52:42 The vectors are not megabytes, but they are decently sized. The interesting result is that if you replace those four bytes with a single bit, it works decently well. Interesting. So all you need, and we actually announced yesterday the GA of our vector search that has been beta so far. And one of the features that made it into the GA that wasn't present before is this. It's called one-bit quantization. So one-bit quantization is amazing. It is a mind-blowing result that essentially tells you that at the end of the day,
Starting point is 00:53:20 all you need is a bit. And the result, if you're storing your vectors as a bit, is decent. But if the model is aware of the fact that you're going to store it as a one-bit quantity, it gets even better. And what does that mean for the database to support one-bit quantization? It means that the function that you use to look for the similarity is not the same function that you would otherwise use to look for similarity.
Starting point is 00:53:52 So it's a different mathematical function that you use to scan and match for similarity. The model, it's best if you use a model that is aware of 1-bit quantization. But if you are using 1-bit quantization, this is actually a super tiny column to begin with. So it's not the end of the world. Okay, that's really good to know, because I feel like I've been trying to figure that out. And I hear people talk about quantization, and I'm just like, well, how do you do it?
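For readers with the same question, here is a minimal sketch of one-bit quantization following the rule given in this conversation (a positive component becomes 1, zero or negative becomes 0). Using Hamming distance as the "different mathematical function" is an assumption on our part; the conversation does not say which function libSQL uses. The helper names and the sample vectors are hypothetical.

```python
def quantize_one_bit(vector):
    """Pack the sign of each component into an integer.

    Bit i is 1 when vector[i] is positive, and 0 when it is zero or negative.
    """
    bits = 0
    for i, value in enumerate(vector):
        if value > 0:
            bits |= 1 << i
    return bits


def hamming_distance(a, b):
    """Number of bit positions where two quantized vectors differ."""
    return bin(a ^ b).count("1")


def float32_size_bytes(dimensions):
    # Four bytes per component, as described for the unquantized vectors.
    return dimensions * 4


def one_bit_size_bytes(dimensions):
    # One bit per component, rounded up to whole bytes.
    return (dimensions + 7) // 8


stored = quantize_one_bit([0.5, -0.2, 0.0, 1.3])   # bits 0 and 3 set
query = quantize_one_bit([0.7, 0.4, -0.1, -0.9])   # bits 0 and 1 set

print(hamming_distance(stored, query))   # smaller means more similar
print(float32_size_bytes(700), one_bit_size_bytes(700))
```

For the 700-dimension vectors mentioned above, that is 2,800 bytes as four-byte floats versus 88 bytes as bits, which is why the quantized column ends up so small.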
Starting point is 00:54:22 How do you actually shrink those in an accurate way? Also, like, how much do you lose from it? Yeah. Quantization is like, go over your vectors. If it's positive, one. If it's zero or negative, zero. That's it. It's that simple. Okay. That's simple. And you lose something, but you don't lose that much, is sort of what you're saying here. And if the model is aware, you lose even less. Again, and more importantly, the brute force scan takes you quite far. You need to get to a couple of millions
Starting point is 00:55:00 of rows before... Until that point, the indexing is actually slower than brute force, right? Especially for small vectors, like the 1-bit quantized vector and things like that. So look, again, I'm not saying that those databases are not useful because again, same as big data, can you store a petabyte of data on SQLite? You can't. But on Postgres, you may be able to. I mean, Postgres is pretty powerful.
Starting point is 00:55:26 But would I recommend you to store a petabyte of data on Postgres? You're not going to even find a petabyte device to store the data on. So like go use Scylla, right? Vector search will be a similar thing, you know, at those very, very large use cases or use cases with very strict requirements of how much you need to take to search for a vector. Like when things are really strict or big or both, a vector database probably is better. But it's the same situation as NoSQL in which, but the difference is that NoSQL had its time in the sun where everybody was reaching for NoSQL, and I think vector search will not.
Starting point is 00:56:09 And the reason for that is just because it's a lot simpler to implement than a time series database, right? The time series are very, very hard to implement. Vector isn't really. Interesting. Okay. And then, so you mentioned the LLM models that are aware that you're doing this quantization. Do you have recommendations we have that we've been working with them for a while. And they generate models that are one bit aware. And again, the recall is pretty good.
Starting point is 00:56:54 And the recall in vector search is essentially like, look, if you do a brute force search, you will find the most precise vector that matches your vector. The indexing is not like a B-tree. This is a difference. Like vector indexing is approximate. So the algorithm is called ANN,
Starting point is 00:57:14 which is approximate nearest neighbors. So you will not necessarily find the best vector, right? So the higher the recall, the higher the chances are that you will find the best vector. So if you have an index with recall at 90, 90% of the time it does find the best vector. And if not, it finds a vector that is within a boundary of the best vector. So this is the metric that people are usually looking for when talking about, which is another thing.
Starting point is 00:57:47 I mean, for a lot of use cases, you don't need a very high recall, right? Just if you have 95% recall, that's okay. So vector databases may be better. They likely are better to get you to the last mile of recall. But as I said, the algorithms are all the same. They don't have proprietary algorithms to do this indexing. Not very hard to implement.
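One common way to measure recall, sketched here in plain Python, is to compare what an approximate index returns against the exact brute-force answer for a batch of queries. The result IDs below are invented for illustration, and the helper names `recall_at_k` and `mean_recall` are hypothetical; with k = 1 this matches the "how often does it find the best vector" reading used above.

```python
def recall_at_k(exact_ids, approx_ids, k):
    """Fraction of the true top-k results that the approximate search returned."""
    truth = set(exact_ids[:k])
    found = set(approx_ids[:k])
    return len(truth & found) / len(truth)


def mean_recall(results, k):
    """Average recall@k over (exact, approximate) result pairs, one per query."""
    return sum(recall_at_k(exact, approx, k) for exact, approx in results) / len(results)


# Ten hypothetical queries: the index finds the true best match for nine of them.
results = [([i], [i]) for i in range(9)]   # approximate answer equals exact answer
results.append(([9], [42]))                # one miss

print(mean_recall(results, k=1))   # fraction of queries where the best vector was found
```

The same harness could compare a brute-force scan over one-bit quantized vectors against a brute-force scan over the full four-byte vectors, which is the comparison raised next in the conversation.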
Starting point is 00:58:14 Yeah, yeah. I'd be curious about the recall on a brute-forced one-bit quantized model as compared to the full, you know, four-byte one. Like, hey, I do a brute force and I get to actually look at every one because it's actually very small, but I've quantized it down to one bit. Am I still getting really high recall? That'd be a cool comparison to look at. Yeah. Yes, yes, yes. And I think the reason for that is that, again, now I'm going to go on a tangent
Starting point is 00:58:50 here. I'm not a specialist in the math of all of this, but the impression I have is that the space of vectors is so sparse, right? And I think this is why 1-bit quantization works, that the space is so sparse that you don't
Starting point is 00:59:07 have a lot of vectors. Think about it. If you had a lot of vectors clustered together in the same point, you would not be able to do 1-bit quantization with the accuracy that we do. So that was my intuition, is that, yeah, if you do 1-bit quantization and you do a brute force, you're going to get pretty awesome results, especially if your model is aware of that. Because probably what the model that is aware of that is doing mathematically is making the space even sparser and making sure that the classification lines are sharper. So, again, not the words of a mathematician here, but... No, that's probably right.
Starting point is 00:59:45 Because when I think of how they do that nearest neighbor stuff, I just think of a two-dimension thing and try to do lat-long. And I'm like, well, if all of them are in New York City and you quantize those, you're going to blur it a lot. But really, it's like, hey, we're talking 700 dimensions. It's like you're way out in space, way somewhere else. It's like you're not even in the same universe as some of these other points. Okay. And some of those models have thousands of dimensions.
Starting point is 01:00:08 I think the biggest dimension that we support is 16,000, right? So it's supported. But most of the... Because remember the machine learning is older than the current version of generative AIs and LLMs. But the models that LLMs use,
Starting point is 01:00:24 which is what everybody's using today, they generate less than a thousand dimensions, if I recall correctly. But you can have vectors with much more than that, like with tens of thousands. And at that point, I think the space is just so sparse that that's why those things work. Yeah, yeah, okay, wow.
Starting point is 01:00:39 This is, I feel like I've been asking and reading around about this for a long time, and this is by far the most convincing argument I've heard that like, hey, this is really going to be within just like general purpose databases rather than specialized stuff. So thank you. Thank you for that.
Starting point is 01:00:54 My pleasure. I think on this though, like this is a, you know, the blog post that y'all had yesterday about the vector search and some of those details around that is like super effective, I think, for like reaching developers and just talking about, I guess like, how do you all think about
Starting point is 01:01:10 marketing to developers? Because I think you have done a good job about that on a few fronts, I guess. Like what have you learned in your last couple of years at Turso doing this? Yeah, I want to call out this person by name, as I said, which is one of our advisors, Adam Franco, probably one of the best startup advisors for developer-focused startups.
Starting point is 01:01:29 And honestly, like I said, the lesson he gave us is just be yourself. And two things that he told us, and those things may make it appear that we don't know what we're doing, but we're actually doing those things by design, because this is coming from advice that we trust a lot, and it turns out that it worked. One of them we went through, but for the people who tune in now and didn't get the beginning of the episode, it's like, look, a lot of founders, they are aware of the fact that they don't know business. They are aware of the fact that they're new to being a CEO, new to managing a company, new to the process of sales. So you start talking in fluffy marketing, you start talking in a language that you as a developer would never buy from,
Starting point is 01:02:17 but you forget this because you're going into this. So just don't forget it and understand it. Just be yourself. And the other thing that Adam told us, and I think this, again, was transformational for us, as simple as it is, and I view it as a corollary of this, is that developers don't like to follow corporate accounts. Ask yourself how many corporate accounts on X or whichever social network you're very excited to follow. Zero or close
Starting point is 01:02:50 to zero, right? Yeah. It's like Supabase, maybe, if you like memes. But mostly not. They're good at that. I'll grant you, Supabase is super... They're the king of memes. And it works. It's great. Again, it does work.
Starting point is 01:03:06 And then you have those McDonald's and Burger King accounts, they're fighting each other as well sometimes, which is fun. It's great, it's okay, but you're not looking forward to that, right? And the reason for that is that it's hard for developers to trust a corporate account. The assumption is that the corporate account is lying, because likely it is. So what you do is, they said, just be yourself, and you go and you talk about the product, and it happens naturally. And I think the reason that this works is that, you know, I have had my Twitter account for a long time. I'm not making an effort now to be more popular or less popular.
Starting point is 01:03:53 It's just that I use it more because I talk about Turso more. But the proportion in which I talk about other stuff is about the same. And so, I mean, it's my account. I talk about a bunch of stuff. I talk about controversial stuff. As you know, I recently started a podcast about religion. I was an atheist for 20 plus years. I no longer see the world this way.
Starting point is 01:04:16 And I posted about it, got like 100,000 views on that tweet, right? This is me. This is not a facade. And because it's my personal face, you know, even for a liar there are limits, I guess, unless you're Sam Bankman-Fried or something like that. But there are limits to which a non-psychopath person will go. So a developer knows that this is true. And then when I talk about Turso, if I am saying something false, this reflects on me personally, right? So I'm honest about it.
Starting point is 01:04:49 And honesty is really what developers are looking for. Like I talk about Turso, I will highlight the good things about Turso. But when Turso has an issue, we'll acknowledge it and we'll talk about it openly because it's me on the line. So I think this is what has been working super well for us. Pekka is on Twitter all the time as well. He usually has threads that are very successful in which he's talking about general concepts, about latency,
Starting point is 01:05:16 and he's talking about things that people want to follow. I am more personal than Pekka. Pekka's presence on Twitter has always been focused on technical subjects. I do a little bit of that as well. But again, I joke on Twitter, you know, I'm funnier than Pekka just because he's a Finn. So, you know, he doesn't have those emotions and things like that, that we do. So I'm joking all the time. You know, he's talking about those technical things, but every now and then,
Starting point is 01:05:45 like, Turso is naturally inserted in those conversations. And I think people just like that. So, you know, take this as free advice that I got for a lot. You're getting this for free.
Starting point is 01:05:58 Yeah, for sure. I would say, be technical. You're just going to nerd snipe a lot of people. If you're marketing to devs, I think they like that. And be a real person. I think that's all
Starting point is 01:06:07 right. You mentioned your podcast, the Save File, which I love. I think it's really good. I just listened to the episode with Josh Siri, and Aaron Francis talking about Hannah, and there were some really good episodes. Strongly recommend that. I guess, what have you learned from hosting that podcast? Well, what have I learned? I learned a lot about people that I knew, about a side of them that I just did not know. A lot of those people, I just did not know where they stand in that line of whether you're a religious person or not. And the reason I did not know is that, you know, it's just not something that matters for our profession. So you don't ask, but life is more than that.
Starting point is 01:06:50 Like you want to have a personal connection with people. So I appreciate having that connection. This episode with Aaron was the one that's done the best, to me, where he is talking about losing his daughter. And, you know, developing this connection with people that I truly, truly appreciate and like and respect has, for me, been the highlight of that. It's super hard to
Starting point is 01:07:12 run, as you know. It's super hard to run the show like that and the logistics and I have a company to run. So it's not, unfortunately, something that I can devote a lot of time to. I've been slowing down the episodes because there's a lot of Turso things for me to do. But I'm not going to say I learned how hard it is because I had already been warned by people. It's just super hard to run. It's a lot of work. Yeah.
Starting point is 01:07:38 Yeah. Very cool. Well, I love the story. I have one personal question to ask you. In the most recent episode, you said that you lived in Moscow for five years. I did. Did you grow up in Brazil? Is that right?
Starting point is 01:07:48 I grew up. I moved around, man. Tell me how you got to these. Was it just like personal preference? I want to go check out these places. Was it a job? I guess, how did you get to these different places? Well, again, I was born in Brazil by birth.
Starting point is 01:08:02 So that was just... I shared this before, so I'm just going to share it again. I had a lot of trauma from that. I'm just not a person that connects with that culture a lot. So my childhood and my teenage years were very, very, very, very rough. And I never really had a true sense of belonging. So I moved to Moscow essentially because I had a job offer,
Starting point is 01:08:26 but I had a job offer because I was looking for ways to move to just somewhere else. And Moscow was just where I got this opportunity. I mean, I looked, I was like, look, I want to go anywhere.
Starting point is 01:08:40 Like I just, just get out of here. So I got this opportunity. I wanted that. That wasn't my first choice, but like I got an opportunity there. So I got a job offer and moved. Uh, and, and, uh, it was great in a lot of ways. I lived there for five years.
Starting point is 01:08:55 And then when the war started in 2014, I decided to leave. Uh, I did not want to go back to my home country for the same reason that I, you know, that led me to leave. So I just at that time, I lived for around six months as a digital nomad, like going from place to place. So we actually spent quite a large amount of time in Paris because my wife used this time to become a pastry chef. She's been to the Cordon Bleu in France. I gained a lot of weight at the time, you know, and we spent some time in South America, in Colombia, in Chile. I became a permanent resident of Panama.
Starting point is 01:09:34 So I lived in, you know, and I applied. I was trying to get a visa to go to the US, but at the same time applying for permanent residency, the equivalent of the green card, in Canada. I expected this process to take a long time. So I was actually setting up shop in Panama. I was going to be in Panama, that was what I was intending, for like a year or so, but we got lucky and the Canadian permanent residency for us was processed in five months, which is insane. I mean, those things, I know people for whom it took years. So we came here, initially to Toronto. In 2020, I moved to a small town called London, which is where I am right now.
Starting point is 01:10:21 Okay. Okay. Sounds good. Yeah, that's a cool story. Yeah, I heard you say that in Moscow. I was like, man, I wonder how he sort of went to all these other places. Yeah. Oh, it was great.
Starting point is 01:10:30 I mean, you have to understand that I was there as... I was never there with the intention of immigrating to Russia forever. If I had that intention, I would probably, I don't know, just go hack some NSA servers because that's one way to immigrate to Moscow, right? So a lot of things are very different when you know you're in a place temporarily versus
Starting point is 01:10:56 permanently. Very cool. Well, Glauber, thanks for coming on. Again, I love following you on Twitter. I think you're a great follow. I've learned a lot and you're quite funny. Congrats on the launch week. You know, Vector Search yesterday, Mobile SDK today. Got a couple other things coming up in the week. If other people want to find out more about you, more about Turso, where should they go?
Starting point is 01:11:17 I'm on Twitter, however you call it, at GLCST. It comes from Glauber Costa. So G-L, the two first letters of Glauber, and C-S-T, the consonants in Costa. And you can find Turso at turso.ai or turso.tech. So just the
Starting point is 01:11:37 turso.ai is easier, shorter and sweeter. So just go get us, go follow us. We also have a Turso account, Turso database, a Discord channel, which is tur.so slash Discord. That's the short link. So, you know, our community is almost 5,000 members strong now. We have a lot of traffic and always happy to chat with anybody. Cool. Thanks for coming on. Best of luck to you and Turso going forward.
Starting point is 01:12:02 Thanks so much for having me, Alex.
