Software at Scale - Software at Scale 49 - State Management with James Cowling

Starting point is 00:00:00 Welcome to Software at Scale, a podcast where we discuss the technical stories behind large software applications. I'm your host, Utsav Shah, and thank you for listening. Hey, welcome to another episode of the Software at Scale podcast. Joining me today is James Cowling, the founder of Convex, a global state management platform for web developers. Welcome. Thank you, Utsav. Nice to be here. Yeah. So before this, we worked together at Dropbox for a little bit, where you did all sorts of stuff. And the one flagship project that I remember, of course, is Magic Pocket, right? Moving all of our data from AWS to Dropbox's own on-prem data centers. And before that, you did a PhD in distributed systems and distributed file

Starting point is 00:00:53 systems and stuff. So can you tell us a little bit about your background, how you got into computer science, and the kind of problems that you worked on earlier in your career? Yeah, absolutely. So I was at Dropbox for eight years and have left there. But before Dropbox, I've always just been interested in building stuff, interested in engineering in general. I don't necessarily mean just software engineering. I just like building things with my hands,

Starting point is 00:01:17 anything practical and pragmatic, and got into computer science somewhat accidentally. It was like working with computers at school. And I think, you know, in Australia, when I got into tech, it wasn't actually very glamorous at the time. You know, there wasn't a lot of demand for computer science degrees. I just went into it because it was fun and then ended up just staying with the program.

Starting point is 00:01:41 So I ended up going to grad school, went to MIT, did a PhD in distributed systems with a woman called Barbara Liskov, who was, I think, the first woman in the world to get a PhD in computer science and one of the all-time great computer scientists, the Turing Award winner. So I learned a lot from Barbara. And I think, really, if I was to think what the number one thing I learned from my time in grad school

Starting point is 00:02:07 and from working with Barbara particularly was the value of abstractions. And I just use this word all the time, but I think an abstraction is something that encapsulates a problem space. So an abstraction is: you take something complicated and you wrap it in a box

Starting point is 00:02:21 and you have an API on top of it, and the abstraction allows you not to have to understand what's going on below the surface. And so a high-quality abstraction allows you to build composable systems, allows you to build relatively complex large-scale systems in ways that are feasible. And I think it's just not readily possible to build large-scale systems without having a good understanding of the value of abstractions. And so if there's anything that's linked my career, it's caring about that area. And I've worked a lot in systems, storage, data centers, databases,

Starting point is 00:02:57 protocols, consensus protocols. I worked a bit in the crypto scene for a while, and now I'm working in web dev to a large extent. And so, you know, you might ask, like, why the career change? But I don't really perceive it as a career change. I see it all as applications of the same ideas, and we can get to more detail there. Yeah, you're basically simplifying maybe complex systems

Starting point is 00:03:24 for somebody else into, like, this box, right? And so many businesses are just based on that idea. Like Stripe, you know, I don't even know, a hundred-billion-dollar business: it's just a simplification of all of these different rules that you have to follow into one API that you have to call, like the famous seven lines of code that you need to integrate with Stripe. Yeah, I mean, what's interesting is when I was in grad school, I did a lot of research work, and in hindsight a lot of that work was overcomplicated. A lot of that work was how to build systems slightly more efficient and slightly faster and slightly more secure and slightly more reliable. And that's great, you know,

Starting point is 00:04:01 there's a lot of interesting deep intellectual challenges in doing that work. But what I find really gratifying is really working on more existential problems. So how can you do some work that makes something possible that wasn't previously possible? How can you allow someone to build a product they couldn't ordinarily build? And we'll go into more about Convex, but that's really our mission here: how can we allow front-end developers, how can we allow product developers, to build dynamic interactive applications without having to have a backend team, without needing to understand what's going on beneath the surface,

Starting point is 00:04:44 through leveraging these high-quality abstractions. And we spoke about how Convex is a global state management platform and how you can basically abstract out a lot of things. So you ensure that I don't have to think about database indexes and stuff like that. Is that what you're saying? Yeah, hopefully.

Starting point is 00:05:02 Well, I don't think I'll go as far as saying everything about indexes, but what I'll go as far as is if you just allow me to rant a little bit. So I think as a community of infrastructure people, we've sometimes failed our users. And because infra people often, I've said this before, but often build solutions that kind of encapsulate the shape of the underlying problems or the underlying implementations and don't necessarily represent the shape of the problems. And you can see this in what's very interesting and partly what motivated us to start Convex. We came out of Dropbox. We started looking at startup ideas and building products of our own. And we actually found it more difficult these days

Starting point is 00:05:46 in some respects to build a company, to build a product, than it used to be, right? There was a time when there was the LAMP stack, and yes, you had to go and deploy a server that ran PHP and MySQL, but there weren't a lot of concepts you had to know. And these days, before you get started, you have to understand Kubernetes or containers or Docker.

Starting point is 00:06:04 And these things are oftentimes horrendously complicated, right? And so, you know, oftentimes you're taking systems that work really well at Google. You know, I don't know how many employees Google has, multi-hundred thousand employees with, you know, huge engineering teams, building systems that work great for their scale, for their level of technical sophistication. But, you know, when someone builds a single-page web app and they have to start thinking about containerization frameworks, it's quite silly, really. And so we started running into problems. You know, we've worked together at Dropbox, and I've built multi-exabyte storage systems and million-query-per-second databases and these things, and I had trouble

Starting point is 00:06:45 getting started building basic applications, because I had to worry about a whole lot of things I didn't want to worry about anymore. And so the observation here really is, you know, how can we make these problems go away? The abstraction that developers, web developers in particular, should be thinking about is not containers, and it's not really, in many respects, thinking about isolation levels, and certainly not thinking about, you know, serialization anomalies or inconsistency or data synchronization, right? So how can we provide a set of abstractions that solve the problems on behalf of web developers? So we have this phrase, global state management. It's a phrase we made up, right?

Starting point is 00:07:26 And so those words there are kind of chosen for a reason, right? So the second one is state. So it's state versus data, right? So oftentimes we've talked about databases and thinking about data, and data often looks like data on a server. How do you interact with that, right? But when you're building an application,

Starting point is 00:07:45 generally what you're thinking about really is the state you have locally in the application. So I have a chat application. The state is the chat data right there. You have a to-do list. You have a task management app. The state is what's rendered to the developer, right? And it's all well and good

Starting point is 00:08:00 if the data is correct on the back end. But in the process of rendering it to the end user, it goes through a cache or a CDN and it's stale and it's inconsistent and the wrong data is shown to the user. It doesn't matter, right? The user experience is poor. It doesn't matter what's going on the back end. So we're saying it's state and it's not data. We care about state.

Starting point is 00:08:20 It's global, right? So we care about not just managing a local session. We want to be able to manage global data across multiple users. You can build fully reactive dynamic applications across multiple users. And it's management. It's not just reading and writing. It's the ability to subscribe to a query, to a complex query. We can go into more details on how Convex works, but able to subscribe to data and have it

Starting point is 00:08:47 just automatically re-render and be correct. And traditionally, you would just use something like a queue system, like Kafka or something, to subscribe to changes, and you'd have to hack that together on your own. And what you're saying is all of that should kind of be abstracted out for you, and you should just be able to call a bunch of library functions. Yeah, it's

Starting point is 00:09:09 really interesting you phrase it that way, because literally the thing that inspired us to start this company is we had various other ideas we were working on in the data space, and we were spending time with customers, you know, figuring out what they needed. And a lot of times customers were like, yeah, this sounds great, sounds great. But what I really want is someone to give me managed Kafka. And so the obvious question is, well, why do you have Kafka at all? And the answer was oftentimes something like,

Starting point is 00:09:38 well, I don't know, we have to queue messages so the database doesn't fall over or we have to have an RPC service. And all of a sudden, it was pretty clear that people were building and operating services they really didn't want to build and operate just to get their job done. And so, I mean, I can go into, if it's a good time now, I can go into a little bit of detail

Starting point is 00:09:53 about what Convex looks like. So Convex is a platform for managing state, and it has three components largely, right? So you have the client libraries, and I should say before I get started, the initial target market for Convex is React web developers. So web developers building apps in React, right?

Starting point is 00:10:14 Dynamic apps in React who want to have a serverless development experience. There's a lot more things in future for Convex, but this is just the initial starting point. So there's the client library, which you pull down via npm and you use to interact with Convex. And then on the back end,

Starting point is 00:10:33 there's a database and an execution platform. So your data is stored in the database, right? But importantly, the queries you use to query that data are TypeScript functions or JavaScript functions, right? So you have, you know, a client application, you write a query, it's in TypeScript, you call that query, right? So it runs on the data on the server and you get the results back. But because it's React, you can bind the results of that query to a hook, like a React hook, so a useQuery hook. So, you know, you can check out the Convex docs, which make this a lot clearer, because it's hard to describe

Starting point is 00:11:08 these things without text. But you can write code that's something like messages equals useQuery fetchMessages. And the fetchMessages query, written in TypeScript, will fetch you all the messages in the chat channel. And then any time those messages change, Convex knows when the data underlying that query has changed and will feed you new results over a WebSocket and re-render those components. So critically, because we're executing those queries, so you write your queries in TypeScript, we execute them in a V8 isolate server side against the database. We know exactly what ranges of data that query reads and writes.
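A minimal sketch of the idea just described, under loud assumptions: this is a toy in-memory store, not Convex's client or server, and every name here (`ReactiveStore`, `ReadHandle`, `fetchMessages`) is hypothetical. It also tracks dependencies per table, whereas the conversation describes tracking the ranges of data a query reads. The point is only the shape of the mechanism: run the query, record what it read, and re-run it when a write touches that data.

```typescript
// Hypothetical in-memory sketch of reactive queries; none of these names are Convex APIs.
type Row = { id: number; channel: string; body: string };
type QueryFn<T> = (db: ReadHandle) => T;

// Read handle that records which tables a query touched while it ran.
class ReadHandle {
  readonly tablesRead = new Set<string>();
  constructor(private tables: Map<string, Row[]>) {}
  table(name: string): Row[] {
    this.tablesRead.add(name);
    return this.tables.get(name) ?? [];
  }
}

interface Subscription {
  deps: Set<string>;
  run: () => void;
}

class ReactiveStore {
  private tables = new Map<string, Row[]>();
  private subs = new Set<Subscription>();

  // Run the query once, remember what it read, and push results to the callback.
  subscribe<T>(query: QueryFn<T>, onResult: (result: T) => void): () => void {
    const sub: Subscription = {
      deps: new Set<string>(),
      run: () => {
        const handle = new ReadHandle(this.tables);
        const result = query(handle);
        sub.deps = handle.tablesRead;
        onResult(result);
      },
    };
    this.subs.add(sub);
    sub.run();
    return () => {
      this.subs.delete(sub);
    };
  }

  // A write re-runs only the subscriptions whose recorded reads include this table.
  insert(table: string, row: Row): void {
    const rows = this.tables.get(table) ?? [];
    this.tables.set(table, [...rows, row]);
    this.subs.forEach((sub) => {
      if (sub.deps.has(table)) sub.run();
    });
  }
}

// Usage: the callback plays the role of a component re-render.
const store = new ReactiveStore();
const renders: string[][] = [];
const fetchMessages: QueryFn<string[]> = (db) =>
  db.table("messages").filter((m) => m.channel === "general").map((m) => m.body);

store.subscribe(fetchMessages, (messages) => renders.push(messages));
store.insert("messages", { id: 1, channel: "general", body: "hello" }); // triggers a re-run
store.insert("users", { id: 2, channel: "", body: "unrelated" }); // does not
```

In this toy version the write to the unrelated table leaves the subscription alone, which is the property that makes automatic invalidation cheap.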

Starting point is 00:11:46 And we know when the result of that query will change. So the very basic example and the example we have in our documentation, if you're writing a chat application, fetch all the messages in the chat channel, right? So all of a sudden your query has a dependency on the last 10 messages in the chat channel, right? And if a new message shows up, we know that query's been invalidated, and we just send you all the new chat messages,

Starting point is 00:12:14 and they automatically refresh in your browser. So we have this development model, basically, where you're writing your queries in TypeScript. You don't have to think about database transactions beyond this. They are executed atomically, they're executed serially. There's no inconsistency. We handle caching automatically for you because we understand when data is stale and when it's not. So as an end developer, you can, in a very short period of time, go from a static application

Starting point is 00:12:40 or a local application to a globally dynamic application just by importing Convex and writing a few functions. I think definitely I don't have to worry about data anymore, as far as I understand. But more importantly, I don't even need to have an AWS account, right? It seems like you're managing all of the server-side code running and all of that for me. Is that correct? Yeah. And what's so interesting?

Starting point is 00:13:09 I mean, look, I have a ton of respect for AWS, right? It's a hugely successful platform. But, you know, anyone who's had the misfortune of having to log into the AWS console sees what is probably 200 services there, and seven of them deal with identity. Anytime you want to deal with anything complex, it's very complicated, and there are people who specialize just in figuring out AWS services and how to use them, right? The first thing you have to do is choose a region to deploy in, and normally you don't know, right? And then you have to worry about replicating

Starting point is 00:13:40 data between regions, and these are all huge barriers to people who just want to get stuff done. They don't want to think about whether they want to use, you know, Postgres or MySQL, Aurora or RDS, and how this all works, right? So with Convex, basically, you pull down the npm package, type, you know, convex init, and you're good to go. You can basically have this done in, you know, three to five minutes. I have a blog post on the Netlify blog on how to transform an app into a global dynamic app. I think I say in 10 minutes or less. Yeah.

Starting point is 00:14:14 I've seen that you can basically do, like, a git push, and then you have Netlify or Vercel on the front end, Convex on the back end, and you're really not needing to manage anything at all, right? These services are both kind of best at what they do, and you don't have to manage your own monitoring and stuff; they all just provide that out of the box for you. So what's maybe confusing to me next is, then, how do I know what the performance is going to look like, right?

Starting point is 00:14:42 Like, it's nice that Convex does, like, everything for me on the back end. Like, it'll refresh the chat messages for me, for example. But how do I know how quickly this is going to happen, or how quickly a change will propagate? Is there a mental model I can develop if I'm using this app? Yeah, and it's an important question, because, you know, I mentioned a little bit earlier the value of abstractions. And I think the abstractions that are successful are both ergonomic and sound. Ergonomic means they're easy to use.

Starting point is 00:15:14 They map to the problems that you have. But sound means they don't fall apart at the edges. So if you have an abstraction that works great, you have a library that works great and makes all your problems disappear, but all of a sudden, when you hit some corner case, you have to understand how it works beneath the covers, then the abstraction has basically disappeared and now you're understanding the implementation details. So it's very important for us that the system scales and just gets the job done. So, you know, the boring answer is that we have a dashboard

Starting point is 00:15:42 where you can go look at graphs and see how things perform. We have indexes rolling out very soon to speed up queries, etc. So there's some kind of boring, ordinary answer that you'd expect. But interestingly, I think what's the most important is the way the system is designed, it's designed to scale. It's designed on principles that allow the system to scale up. And so what kind of design decisions lead to systems that scale?

Starting point is 00:16:13 So there's stuff like the fact that our queries are one-shot queries. So we execute deterministic queries that execute end-to-end without interactivity. And we run these queries using optimistic concurrency control and a snapshot of the database so basically

Starting point is 00:16:30 we can go a little bit into the implementation of Convex. It might be hard to describe, but basically, you know, you write your queries in, like I said, TypeScript, and register them with the service. They're stored in Convex. You call your queries. When the query comes in, we pick a timestamp. We run your transaction as of a snapshot. Once you've completed that transaction,

Starting point is 00:17:00 we check for any conflicts with any concurrent transactions. It's optimistic concurrency control, so we retry as necessary and return the results. But we can do very effective caching. So all this stuff can be distributed quite easily, right? There's a central timestamp oracle; beyond that, all this stuff can be decentralized onto multiple back-end services. But we can also do caching really quite effectively as well, because we know the dependencies for each query.
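The sequence described here (pick a timestamp, run against a snapshot, check for conflicts, retry) can be sketched roughly as follows. This is a single-process toy written under assumptions, not Convex's implementation: `OccStore`, `Tx`, and the integer `clock` standing in for the timestamp oracle are all hypothetical, and the "concurrent" writer is simulated by committing from inside the first attempt.

```typescript
// Hypothetical sketch of optimistic concurrency control over a multi-version store.
type Version = { value: number; commitTs: number };

class Tx {
  readonly readKeys: string[] = [];
  readonly writes = new Map<string, number>();
  constructor(private readAt: (key: string) => number) {}

  // Reads see the transaction's own pending writes first, then the snapshot.
  read(key: string): number {
    this.readKeys.push(key);
    const pending = this.writes.get(key);
    return pending !== undefined ? pending : this.readAt(key);
  }

  write(key: string, value: number): void {
    this.writes.set(key, value);
  }
}

class OccStore {
  private versions = new Map<string, Version[]>();
  private clock = 0; // stands in for the central timestamp oracle

  // Latest value committed at or before `ts`; missing keys read as 0 in this sketch.
  private readAsOf(key: string, ts: number): number {
    let value = 0;
    (this.versions.get(key) ?? []).forEach((v) => {
      if (v.commitTs <= ts) value = v.value;
    });
    return value;
  }

  private lastCommitTs(key: string): number {
    const history = this.versions.get(key) ?? [];
    return history.length > 0 ? history[history.length - 1].commitTs : 0;
  }

  // Returns the number of attempts it took to commit.
  transact(fn: (tx: Tx) => void, maxRetries = 5): number {
    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      const snapshotTs = this.clock;
      const tx = new Tx((key) => this.readAsOf(key, snapshotTs));
      fn(tx);
      // Conflict: something this transaction read was committed after its snapshot.
      const conflict = tx.readKeys.some((key) => this.lastCommitTs(key) > snapshotTs);
      if (conflict) continue; // retry against a fresh snapshot
      const commitTs = ++this.clock;
      tx.writes.forEach((value, key) => {
        const history = this.versions.get(key) ?? [];
        history.push({ value, commitTs });
        this.versions.set(key, history);
      });
      return attempt;
    }
    throw new Error("too many conflicts");
  }

  get(key: string): number {
    return this.readAsOf(key, this.clock);
  }
}

// Usage: a "concurrent" writer commits in the middle of the first attempt.
const occ = new OccStore();
occ.transact((tx) => tx.write("counter", 1));
let interfered = false;
const attempts = occ.transact((tx) => {
  const current = tx.read("counter");
  if (!interfered) {
    interfered = true;
    occ.transact((other) => other.write("counter", 100));
  }
  tx.write("counter", current + 1);
});
const finalValue = occ.get("counter");
```

The first attempt reads the old value, fails validation because the counter was rewritten after its snapshot, and the retry then increments the newer value. Because the transaction function is re-run on conflict, this scheme relies on the functions being deterministic and free of outside side effects, which matches the one-shot queries described earlier.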

Starting point is 00:17:34 we know that it hasn't changed. A lot of those responses just come directly out of cache. And it's probably a little bit more detailed to get into on a call, but then we can distribute those caches really quite close to the user as well. So that's one aspect to performance. Another aspect of performance that I think is sometimes maybe overlooked, oftentimes when people think of performant web applications, performant end user applications, you think a lot about the latency between the client and the server, right? Which matters, right? You don't want to have, you know, hundreds of

Starting point is 00:18:08 milliseconds between the client and the server. But the reality is the speed of light, yes, you know, it takes some time, but it's not that slow, right? And what really does kill application performance is not necessarily a single round trip. It's these request waterfalls, right? It's when you run one query, you get a result back, you look at the answer for that query, and then you do the subsequent query. You can see this in the Chrome, you know, inspect panel, these waterfalls, right? These are the things that can kill performance. And so in platforms like Firebase, for example, when you're talking directly

Starting point is 00:18:46 to the database, there's not a lot of easy ways around these request waterfalls, right? Because you're issuing single queries against the database. In Convex, you're calling a function that could be thousands of lines, right? So you can issue a single request to Convex, fetch the data you need, and render it to the application in one round trip, many times loaded directly out of cache. So you get to skip all of the intermediate work. And even potentially your hooks could be doing something interesting like queuing up a bunch of requests, sending all of that together. And all of that stuff can be managed on either side. Yeah, I mean, saying that, it reminds me that there's quite a little bit of magic beneath the surface, right? And so these are the kind of things we don't, you know, talk

Starting point is 00:19:34 a lot about immediately for developers, because we want this to be a very smooth, easy-to-understand development experience. But part of what is required to make a very, very compelling experience is investing a lot in how this stuff works. You know this from Dropbox. This is something that Dropbox actually did a great job of. When you upload a file to Dropbox, there's a lot of really clever stuff going on to make sure that syncs very fast. There was peer-to-peer sync, and oftentimes you start downloading the file before the first one is finished uploading, etc.
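As a rough illustration of the request-waterfall point made just before the Dropbox aside, the sketch below counts simulated round trips for the same data loaded two ways. Everything in it is an assumption for illustration: `rpc` merely increments a counter in place of a network call, and the project and task data are made up.

```typescript
// Hypothetical sketch: counts client/server round trips; names are illustrative only.
type Project = { id: number; name: string };
type Task = { id: number; projectId: number; title: string };

const projects: Project[] = [{ id: 1, name: "Launch" }];
const tasks: Task[] = [
  { id: 10, projectId: 1, title: "Write docs" },
  { id: 11, projectId: 1, title: "Ship" },
];

let roundTrips = 0;

// Every call through `rpc` stands in for one network round trip.
function rpc<T>(serverSide: () => T): T {
  roundTrips += 1;
  return serverSide();
}

// Waterfall: each query depends on the previous answer, so the trips are sequential.
function loadWithWaterfall(projectName: string): string[] {
  const project = rpc(() => projects.find((p) => p.name === projectName));
  if (!project) return [];
  const ids = rpc(() => tasks.filter((t) => t.projectId === project.id).map((t) => t.id));
  return ids.map((id) => rpc(() => tasks.find((t) => t.id === id)!.title));
}

// Single function: the same dependent lookups run next to the data in one trip.
function loadWithOneCall(projectName: string): string[] {
  return rpc(() => {
    const project = projects.find((p) => p.name === projectName);
    if (!project) return [];
    return tasks.filter((t) => t.projectId === project.id).map((t) => t.title);
  });
}

roundTrips = 0;
const viaWaterfall = loadWithWaterfall("Launch");
const waterfallTrips = roundTrips;

roundTrips = 0;
const viaOneCall = loadWithOneCall("Launch");
const singleTrips = roundTrips;
```

Both paths return the same titles; the difference is that the waterfall's trip count grows with the number of dependent lookups, while the server-side function stays at one regardless of how much logic it contains.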

Starting point is 00:20:07 So there's a lot of clever stuff going on beneath the surface to make it a very streamlined experience that most developers didn't know was going on. There's a lot of similar stuff going on in Convex. So one example of this is if you have multiple components in a session. So to pick an example, let's say you have multiple components in a session. To pick an example, let's say you're making an Asana clone, a task management clone. You have a bunch of tasks over here and a bunch of projects and users, etc.

Starting point is 00:20:41 You want these to load relative to each other. These could be separate requests, right? But you wouldn't want, say, a task to show up without the respective project. And you wouldn't want a task to show up that refers to a user where the user doesn't exist yet locally. And so anytime a developer's writing an app that has multiple interrelated components,

Starting point is 00:21:04 it can be quite complicated making sure that synchronization is done correctly. Because oftentimes, those requests will return in different orders. And you'll see, you know, you'll present either an incorrect view to the end developer or the application will fail because maybe there's a foreign key relation that's dangling because the component it's referring to doesn't exist yet. In Convex, we actually capture all requests from all components over a single WebSocket, and we track all the components that are on the browser at the same time,

Starting point is 00:21:32 and we ensure that as we send results to the client, they're all serialized in a correct serial order across all components. So when the browser's getting these updates streamed over the WebSocket, every single update that's received represents an actual correct snapshot of state on the server.
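One way to picture this guarantee is a server that never sends per-query results independently, and instead sends one message containing the results of every subscribed query evaluated at the same version. The sketch below is a toy under stated assumptions: the `Transition` message shape, `SyncServer`, and the in-memory `send` callback are hypothetical and say nothing about Convex's actual wire protocol.

```typescript
// Hypothetical sketch; message shapes and names are assumptions for illustration.
type TaskRow = { title: string; project: string };
type State = { projects: string[]; tasks: TaskRow[] };
type Query = (s: State) => unknown;

// One message carries results for every subscribed query at a single server version.
type Transition = { version: number; results: Record<string, unknown> };

class SyncServer {
  private state: State = { projects: [], tasks: [] };
  private version = 0;
  private queries: Record<string, Query> = {};

  constructor(private send: (t: Transition) => void) {}

  subscribe(name: string, query: Query): void {
    this.queries[name] = query;
    this.push();
  }

  // Apply a whole mutation, then publish every query result from the same snapshot.
  mutate(change: (s: State) => State): void {
    this.state = change(this.state);
    this.version += 1;
    this.push();
  }

  private push(): void {
    const results: Record<string, unknown> = {};
    Object.keys(this.queries).forEach((name) => {
      results[name] = this.queries[name](this.state);
    });
    this.send({ version: this.version, results });
  }
}

// Usage: two components subscribe; a mutation adds a project and a task together.
const seen: Transition[] = [];
const server = new SyncServer((t) => {
  seen.push(t);
});
server.subscribe("projects", (s) => s.projects);
server.subscribe("tasks", (s) => s.tasks);
server.mutate((s) => ({
  projects: [...s.projects, "Launch"],
  tasks: [...s.tasks, { title: "Ship", project: "Launch" }],
}));

// Check that no received update shows a task whose project is missing.
const consistent = seen.every((t) => {
  const projectList = (t.results["projects"] ?? []) as string[];
  const taskList = (t.results["tasks"] ?? []) as TaskRow[];
  return taskList.every((task) => projectList.indexOf(task.project) !== -1);
});
```

If the two queries were answered by separate, unordered responses, the task list could arrive before the project list; bundling them per version is what rules that out in this sketch.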

Starting point is 00:21:54 So you don't have to think about these problems. And hopefully most of the time developers don't think about these problems. They just know, hey, I can just render these components and Convex just works. But it's investing in little things like this that make it an actual compelling developer experience and thinking through how do you build

Starting point is 00:22:08 really simple models for stuff that's oftentimes quite complex to reason about or to implement behind the scenes. So talking about that, is it fair to say when you build a chat app over Convex, you don't have to worry about out-of-order messages? Convex would handle all of that. Not only do you not have to worry about out-of-order messages,

Starting point is 00:22:29 just say you have a Slack application where you have one query fetching all the chat channels, one query fetching all the chat messages, and one query fetching all the users. You don't have to worry about ever seeing a message for a user that doesn't exist locally yet. You don't ever have to worry about seeing a message for a chat channel that doesn't exist, because

Starting point is 00:22:51 every component is going to be updated in a sensible serial order. And I'm sure, now that I think about Slack, there'll be so many edge cases where they had to think about, oh, a channel that you've been kicked out of, but you're still receiving messages on the WebSocket about it.

Starting point is 00:23:07 There'll be all sorts of edge cases. This is just something I'm really passionate about. And you mentioned earlier the storage system, Magic Pocket, that we worked on. It's a very large distributed storage system at Dropbox. It stores all the files. And this was just a really important design constraint there. We wanted the system to be very, very simple.

Starting point is 00:23:25 And you can look at it and be like, wow, that's all it is? Like if you look at the actual design, you can go, that's it? But it takes a lot of work to make things simple. And yes, and if you don't have those abstractions right and you start chasing the tail of these corner cases, if you have a client-side exception

Starting point is 00:23:43 every time you receive a message for a channel that, you know, doesn't exist yet, the applications get very complicated very fast. Yeah. Speaking of all of this, since you've been talking to customers, you've been learning about what problems they have. You said that so many customers just want managed Kafka. And as you start building out your product and hiring more engineers, what is your longer-term vision shaping up to be for this? What is the end goal? If you achieve world domination 15, 20 years from now with Convex, what would you have built? Yeah, so, I mean, I think ultimately what we want is for people not to have to think about backends in general, right? You know, the ultimate vision is the developer writes their code locally on their laptop,

Starting point is 00:24:29 they test it locally, and they click deploy and they go to bed or they go to lunch. And they know it's going to run exactly the same in production as a globally distributed application as it did locally on their laptop, and that it's going to be performant and it's going to work. This is kind of the dream, and this is the big challenge companies face right now, the difference between your dev and prod, right? And why it's oftentimes so hard to build scalable applications. That's our dream.

Starting point is 00:25:00 You know, right now, our target is web applications, right? Obviously, we want to target mobile, et cetera. And there's a lot of interesting applications of Convex to areas you might consider more back-end as well. And what's actually interesting is when talking to job candidates and potential customers, somewhat maybe surprisingly, the folks who often just latch onto this idea the fastest are often infrastructure people.

Starting point is 00:25:31 It's often people who are quite senior infra people who are not protecting their turf anymore, and they're like, oh my God, I've had to deal with this problem so much. It's such a pain having to build services for my constituent developers in the company. There's so many people wanting us to build stuff for them that we can't, right? And so I think folks who have been on the other side of this, on the back-end side, oftentimes very readily agree that it'd be great to have abstractions just make these problems go away for developers. And speaking of that, right, are you imagining that people will have, like, a local

Starting point is 00:26:10 Convex-type thing running in dev, or does everyone just use the cloud system for both local development and production? I'm trying to understand specifically how that would work. Yeah, I mean, right now there's a local instance of Convex for us when we're developing. We haven't released a local Convex instance. That's something we'll do. But we do want to, as best we can, flatten that distinction between local and production.

Starting point is 00:26:39 So I think, I'm not sure if you've used, say, Netlify or Vercel before. I mean, for listeners who may not have used these, these are serverless hosting platforms, and I've used them a lot, actually, especially with Convex, but also for personal projects. They're awesome, right? So this is basically you have an application. Let's say it's a Next.js web app,

Starting point is 00:26:59 and you just want to deploy it, and you don't want to think about CDNs, and you don't want to think about routing. You don't want to think about application servers. You know, you have a Git repo, like a GitHub repository, for your project. You connect that repo to one of these services, and then, you know, you push to the repo and it's done. It's in production, right? And literally, you can do this in less than a minute. You know, I've gone end to end from taking a Next.js project to having it deployed on one of these services in

Starting point is 00:27:31 less than a minute, worst case 90 seconds, let's say. And that's a great developer experience. And interestingly, they have really good hooks into, say, GitHub, right? So you can have a PR open, and they open up a preview for you. So they give you a URL, like a private URL that you can go to and see what your application will look like when it's deployed.

Starting point is 00:27:55 That's just a really compelling developer experience. It just works really well. I think where the services currently fall short sometimes is when it comes to building truly dynamic interactive applications. Because then you have to deal with the database. You have this amazing developer experience, but all of a sudden it falls apart when you have to deal with schema management or how to talk to AWS and all this kind of stuff.

Starting point is 00:28:21 So I think Convex is really bridging that gap on the data side. And so, yes, these services, you know, Netlify, Vercel, they have a local instance and you can run locally. But also, it's just as easy to deploy into a preview instance

Starting point is 00:28:38 in production. And that's the point we have to get to. You're not really thinking about this distinction, what's local, what's not. It's just what is pushed to my users and what is not pushed to my users. That's the only distinction that matters: is this out there for people to use, or is this something we're developing? Yeah, you just have, like, a git push convex main, and boom, a new production version is running with the new database schema and stuff. And speaking about schemas and just

Starting point is 00:29:05 migration in general, how do you all think about schema and migration? It's clearly something to think about when you're managing data, but what's your perspective? Yeah, this is great. So anyone who's worked at a big company and dealt with databases knows the pain of schema management. And for folks who maybe haven't done this, the issue is, generally you're using MySQL or Postgres, right? And you have these big tables, and all of a sudden you want to make a change to that schema, and it's very difficult because schema migrations often fail. Generally people don't want to run them live on a live table,

Starting point is 00:29:45 so you do this with a database clone, and you do a promotion to promote that clone. And big companies, you know, the Dropboxes and the Airbnbs of the world, these migrations take six months, you know, for a big schema migration. They can take a very, very long time. So schema is a pain, right?

Starting point is 00:30:02 And so I think Convex is a place where we can really make this problem go away. So now, you know, PlanetScale's built a business around a lot of things involved in database management, but in particular, they have their schema branching, etc. It's a big part of their pitch. Now, the difference with Convex is we want to bridge the gap between what it looks like when your company succeeds

Starting point is 00:30:24 and what are the really, you know, the large-scale problems you have and what it looks like on day one when you just want to get your job done, right? And so typically when people are building an application, the first question they're asking is not, oh, what's my schema going to be? What's my primary keys? They don't, you know, no matter how experienced you are with databases, that's just not how you think, right? Because you don't even know what your tables are yet.

Starting point is 00:30:47 Like, okay, I guess I'm going to start with a messages table. Now I'm going to add some channels. Now I'm going to add some users, et cetera, right? So this stuff evolves very quickly. And this is partly why Mongo and these somewhat schemaless platforms took off so fast. And I think we can put maybe Firebase in that category too. These platforms where you basically just dump some data into the database in JSON, whatever, and be done, right? So this kind of

Starting point is 00:31:11 early, schemaless, let's just throw some data in there and have the system adapt approach is an awesome way to get started, right? But then as your company grows or your product grows and gets more complex, you start wanting to codify these schemas because you know that this is going to cause you problems. You start finding data in your tables that doesn't conform to the schema that you think it has, etc. And so you have these kind of two worlds.

Starting point is 00:31:39 You have the getting started world, the kind of zero to one world, and then this large-scale, everything's complicated world. And so part of the pitch Convex is pushing is that we can link these worlds together, all right? So with Convex, it looks schemaless, right? You can just dump your data in on day one. You can build your application. But Convex actually understands the shape of the data in the database. And this is actually one of the areas where we innovated. I'm not sure if other folks have done this. So we have some pretty clever code that, when you're putting data into a table, tracks, quite efficiently, all the data that's in that table,

Starting point is 00:32:26 and figure out the shape of that data. So a shape is basically a union type of all the potential schemas in that table. So you might have a field called name, and it's just a string, and then later on you have a field called first name and a field called last name, and they're both strings, right?

Starting point is 00:32:40 And so we'll know that the shape is either a name or a first name and last name. Or you might have a field called age, and initially you put an integer in there, but then all of a sudden you put 4.5 in. So we'll understand that's a floating point number. But then if you delete all the floating point rows, we'll know it's actually an integer again. So we do dynamic schema tracking. We know exactly the shape of the data in the database. So right now we expose this to developers: you can go to our dashboard and see the shape and understand the data. What we haven't exposed yet, and this is something that's
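To make the shape idea concrete, here is a minimal TypeScript sketch of dynamic shape tracking. It is not Convex's implementation, which is not described in detail in this conversation; the `ShapeTracker` class, its method names, and the string encoding of each variant are all hypothetical. It only illustrates the two behaviours mentioned: the shape is a union of the variants present in the table, and deleting the last row of a variant shrinks the union again.

```typescript
// Hypothetical sketch of shape tracking; names and encoding are assumptions.
type FieldType = "int" | "float" | "string" | "boolean" | "null" | "other";
type Row = Record<string, unknown>;

function fieldType(v: unknown): FieldType {
  if (v === null) return "null";
  if (typeof v === "number") return Number.isInteger(v) ? "int" : "float";
  if (typeof v === "string") return "string";
  if (typeof v === "boolean") return "boolean";
  return "other";
}

// Encode one variant of the union, e.g. "age:int,name:string".
function variantOf(row: Row): string {
  return Object.keys(row)
    .sort()
    .map((k) => `${k}:${fieldType(row[k])}`)
    .join(",");
}

class ShapeTracker {
  // Row count per variant, so that deletes can shrink the shape again.
  private counts = new Map<string, number>();

  insert(row: Row): void {
    const v = variantOf(row);
    this.counts.set(v, (this.counts.get(v) ?? 0) + 1);
  }

  remove(row: Row): void {
    const v = variantOf(row);
    const n = (this.counts.get(v) ?? 0) - 1;
    if (n <= 0) this.counts.delete(v);
    else this.counts.set(v, n);
  }

  // The current shape: the union of all variants with at least one row.
  shape(): string[] {
    return Array.from(this.counts.keys()).sort();
  }
}

const tracker = new ShapeTracker();
tracker.insert({ age: 4 });
tracker.insert({ age: 4.5 });
const withFloat = tracker.shape(); // union of an int variant and a float variant
tracker.remove({ age: 4.5 });
const intOnly = tracker.shape(); // back to the int variant only
```

Counting rows per variant is one simple way to support the "delete all the floating point rows" case without rescanning the table; a real system would need something more compact than one string key per variant.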

Starting point is 00:33:21 coming soon is how to actually codify a schema on top of this. And there's a lot of very interesting opportunities for schema migrations and how to take a shape and codify this, or how to take a shape and apply deterministic transformation functions to turn that into an actual codified schema. This is just something that's not rolled out, but stay tuned. That's a really nice, elegant approach, because most of the time when you're starting off, your schemas are going to be roughly similar,
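As an illustration of what a deterministic transformation function over a shape could look like, here is a small TypeScript sketch. The feature is described above as not yet rolled out, so nothing here reflects a real Convex API; the types `LegacyUser`, `SplitUser`, and the function `toSplitUser` are hypothetical, and the split-on-first-space rule is just one possible choice.

```typescript
// Hypothetical: two variants observed in a users table.
type LegacyUser = { name: string };
type SplitUser = { firstName: string; lastName: string };

// The inferred shape is the union of both variants.
type UserShape = LegacyUser | SplitUser;

// Deterministic transform: the same input row always yields the same output,
// so it can be re-run safely over every row in the table.
function toSplitUser(row: UserShape): SplitUser {
  if ("firstName" in row) return row; // already in the codified form
  const trimmed = row.name.trim();
  const i = trimmed.indexOf(" ");
  return i === -1
    ? { firstName: trimmed, lastName: "" }
    : { firstName: trimmed.slice(0, i), lastName: trimmed.slice(i + 1).trim() };
}

const rows: UserShape[] = [
  { name: "Barbara Liskov" },
  { firstName: "James", lastName: "Cowling" },
];

// After the transform, every row conforms to the single codified schema.
const migrated: SplitUser[] = rows.map(toSplitUser);
```

Once every variant in the union maps onto one target type, the union collapses to that type, which is the point at which a schema could be codified and enforced.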

Starting point is 00:33:51 except for you were testing something and you modified something by mistake. And then eventually you're like, okay, I need to collaborate with 20 people on this database table or collection. Let me codify a schema. And then as you mentioned, let me maybe add an index if I know I'm going to be using this data in a certain way. So I like how it's like a gradual ramp up to like serious schema needs

Starting point is 00:34:12 or like serious use in a sense. One of the things we talk about a lot is wanting to be useful on day two and year two. Like how do we build something that's very compelling on the first couple of days you use it? And in the first couple of days, the value prop is almost an existential one.

Starting point is 00:34:30 Like, oh my God, I couldn't do something previously, now I can. And it's a lot about convenience, ergonomics, developer experience. And then later on, the year two approaches, does this system keep running? Does it scale? Does it provide the guardrails that I need?

Starting point is 00:34:50 Does it provide me the visibility that I need? Does it give me proper indexing support? Do I have schema codification? Do I have all these things? Do I have, you know, distributed caching, etc.? So these are two different pitches. What we don't want to do is go to a developer and say, use Convex, trust us, in two years' time you're going to be really grateful. We could say that, but that's not a very compelling pitch. What we want to do is say, hey, use Convex right now. It's actually going to make you a faster developer. It's going to get your company or your product or your project off the ground, but we don't want to lose you. We want to grow with you through the duration of the

Starting point is 00:35:25 you know, of your project. And we've spoken to a lot of customers who use Firebase, right? And again, I have tremendous respect for Firebase as well. In many respects it's a very successful and very popular product, and the customers you speak to who use Firebase generally love it, but then they all say, oh, I'm going to have to migrate off this thing

Starting point is 00:35:51 because they have request waterfalls because they can't fetch multiple complex queries at once, or they don't have first-class support for relations so they can't model their relations, or they don't have the schema support they need, etc. So we want to have that compelling developer experience but also be the platform you can use for the lifetime of your project. And talking to users, that makes me think of this one question, where I'm

Starting point is 00:36:19 sure you would have had a bunch of ideas of how this could be done better as you all start digging into this ecosystem and thinking about the abstractions you've used across your career. What was one surprising piece of feedback, once you demoed the first version of your app to someone, or the first version of the Convex SDK to someone, that made you maybe rethink how you're doing things? Yeah, let's think about this. So I'm going to answer that in two ways, right? Because the feedback we've got for Convex has actually always been positive, but you don't want to just get positive feedback, right? So, you know, early stage, you start showing people things and they're like,

Starting point is 00:36:58 oh, this is awesome, I love this, this is great, right? But I don't think you really start getting that really incisive feedback of like, oh, I can't really use this, until people depend on your platform, right? And what we're getting now is people starting to depend on Convex, which is great, right? So now we're starting to get more concrete feedback. I think in terms of the things that were surprising, the overall product strategy hasn't really changed, right? So this hypothesis that there's a gap in the market has borne out to be true so far. I think some things that we kind of underestimated are stuff like, you know,

Starting point is 00:37:38 the value of typing and autocomplete. I think the web dev community has received an unfair rap for not caring about types and just using JSON blobs everywhere. I mean, the modern TypeScript developer really cares that everything they do is strongly typed, which I love, right? It makes life a bit tricky sometimes. So we've invested more in, say, codegen and having really good autocomplete. Also the value of IDE support. A lot of folks, in their development journey, start typing stuff and let the autocomplete guide them, right? So we added much better codegen, like I said, and stronger types, so you can get off the ground

Starting point is 00:38:25 and start using Convex in a quite self-guided way. And that's also fed into stuff like our plans for how schema works. We realized pretty quickly everything we do has to be really strongly typed on the client side. And that's something that stood out.

Starting point is 00:38:40 Other stuff is just you realize the things that you think people might care about and they just don't. And one was thinking about reference types. If you're a database person, you might think about weak references and strong references. Weak references meaning ones that can be dangling, strong references that are ref-checked, etc. And these are just things that people haven't cared about. And so we have internal support for these things, but I think as we've developed the product, we have reprioritized some areas around

Starting point is 00:39:10 really focusing on who the target developer is, right? And the target developer, they want a really compelling development experience, right? And that's what we're going after. The other thing that's interesting to me now, working at a much smaller company and seeing how the industry's changed, is this kind of development in the open model, where you don't sit there and wait till everything's finished and then ship it. People want to get on board. They want to be part of a community. So we have people who are part of our community giving feedback, us talking about features that are coming out soon,

Starting point is 00:39:48 releasing them as they happen, and that's been kind of exciting too. It's a somewhat different world for me, but this kind of engaging very directly with your customer as you add new features and using customer feedback to help guide prioritization. Like the hashtag build in public. Yeah. Like Slack communities and stuff. Yeah. And this is something we just need to do.

Starting point is 00:40:12 We've been doing a lot of, but we need to do a lot more of as well. And it's just, it's so gratifying when people like have an opinion, you know, it's so gratifying when someone wants to record a little video about your platform. You know, that's just great.

Starting point is 00:40:28 Yeah, and I'm sure engineers can also be brutal sometimes. Like, oh, how come this doesn't work this way? I've had developers as my customers internally at companies since forever, essentially. And engineers can be pretty brutal sometimes. Speaking of codegen, though, I don't know if you'll have taken a look at Prisma, which is like the de facto standard. That is an amazing development experience. I've played around with it a little bit. We were considering a migration to use Prisma internally. We had some issues with transactions, you know,

Starting point is 00:41:00 that made us move away from it, because we needed to make sure that certain queries and updates happen transactionally, and Prisma didn't have great support for it. But it just looked so amazing. I use it for a personal project of mine. Yeah, Prisma is like the gold standard for this ergonomic, ORM-style, client-side types experience, and they do a great job. You should try it at Convex, but I think we're getting pretty good and getting better every day. And this also leads me to thoughts about GraphQL versus non-GraphQL. And GraphQL is also a very interesting topic. And Convex had

Starting point is 00:41:43 GraphQL support very early on. We actually pulled it out temporarily, at least from the public version. And I think right now we have a development language that people are really quite liking. These are things that we can add back in as needed. I think the real distinction between, say, GraphQL and not GraphQL is oftentimes more like: can I request all my data in one request? Can I avoid waterfalls? Can I do a request from a single function and fetch all the data I need to render in one request? And oftentimes not necessarily the language itself.

Starting point is 00:42:25 And then eventually data exports and stuff, I need to manage all of that. GraphQL is interesting. I don't know if you've seen what Hasura is doing with you just set up a schema for a database and they expose an API automatically for you and that API will just query the database directly through some weird JSON magic.

Starting point is 00:42:44 Yeah, I mean, there's a lot of interesting stuff going on in the space, right? And the web dev world is so fun. It's also so hard to keep on top of because every day something new is out, right? What's hot changes every day. I think in general, the word magic is an interesting one, right? Because, I kind of made this point earlier, but if you have magic, it has to always be magic, right? And there's other platforms, and I've never built anything on Hasura, so I'm not referring to them specifically, other platforms I've used personally where, out of the box, that

Starting point is 00:43:22 GraphQL experience seems great, but just doesn't quite scale, both ergonomically, in terms of strong compartmentalization and strong schemas, etc., and also in terms of performance as your product grows, right? And we want to build magic that is sound, the magic that doesn't stop working as you grow. And so where we're investing in things that you might call magic or like, you know,

Starting point is 00:43:54 probably what I would call is like an ergonomic abstraction is ones where we believe we can do so in a way that's going to grow with a project. I'm just imagining the idea of shapes, which you were talking about, like you automatically infer the shape in some backend convex thing. I can run the convex client generate

Starting point is 00:44:14 and it takes all those shapes and makes it type safe for me to use on the client. So I can just stuff data in there and, yeah, use it in a type-safe way locally. Absolutely. There's a ton of things we could do, and it's interesting because oftentimes I hear you say things and this is stuff we work on, and we have internal versions of all this stuff. For us, the real

Starting point is 00:44:35 key focus right now is building the minimal set of features we can build to allow people to really get off the ground building really compelling dynamic applications, and then we continue to expand that as time goes on. Yeah, and speaking of that, you spoke about indexes, you spoke about showing schemas in the dashboard, specifying schemas. Is there one feature, one next big thing that you're working on that you're really excited about? Like, I can't wait until we ship this. Oh yeah, there are, but I don't know if I'm going to talk about them. I will. I think certainly indexing is really important. I mean, indexing in particular, just being able to fetch data sorted by columns, you know, having

Starting point is 00:45:22 secondary keys is critical, in particular, in conjunction with pagination. And so this is an area where I think we could be doing a slightly better job. And I guess this is an area that I'm excited about is, well, two areas I'll call that. One is optimistic updates and one is pagination. These are both areas where we have a solution,

Starting point is 00:45:42 but I think our solution can get better and will get better, right? So what are optimistic updates? Optimistic updates is when you make a change locally in your application. You want to reflect that change in the DOM immediately and then send the request to the server, the mutation to the server,

Starting point is 00:46:04 and then confirm that change once the confirmation comes back. So in Convex, what that will look like is, when you create a mutation, you can simultaneously bind an optimistic update function to that, to optimistically update. So you type a chat message. You can show the chat message, and then later on when the message comes back over a subscription, the message is confirmed and stable. That's done in such a way right now

Starting point is 00:46:33 that there's actually quite a bit of sophistication going on beneath the covers because we have to know, you know, which fields are optimistically updated, which need to be confirmed. But I think there's areas where the ergonomics can get better over time. And I think the other is pagination.
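The optimistic update flow described here can be sketched in a few lines of TypeScript. This is a generic client-side illustration, not the Convex client API; the `ChatStore` class and the `sendOptimistic`, `confirm`, and `reject` names are hypothetical, and the network layer is left out entirely.

```typescript
type Message = { id: string; body: string };
type Rendered = { body: string; pending: boolean };

class ChatStore {
  private confirmed: Message[] = [];
  // Optimistic messages, keyed by a client-generated temporary id.
  private pending = new Map<string, Message>();

  // Step 1: reflect the change locally right away. The caller would then
  // send the mutation to the server (not shown).
  sendOptimistic(clientId: string, body: string): void {
    this.pending.set(clientId, { id: clientId, body });
  }

  // Step 2a: the server's version arrives (for example over a subscription),
  // so the optimistic copy is dropped and the confirmed one takes its place.
  confirm(clientId: string, serverMessage: Message): void {
    this.pending.delete(clientId);
    this.confirmed.push(serverMessage);
  }

  // Step 2b: the mutation failed, so the optimistic change is rolled back.
  reject(clientId: string): void {
    this.pending.delete(clientId);
  }

  // What the UI renders: confirmed messages first, then pending ones.
  view(): Rendered[] {
    const out: Rendered[] = this.confirmed.map((m) => ({ body: m.body, pending: false }));
    this.pending.forEach((m) => out.push({ body: m.body, pending: true }));
    return out;
  }
}

const store = new ChatStore();
store.sendOptimistic("tmp-1", "hello");
const beforeConfirm = store.view(); // the message is visible but marked pending
store.confirm("tmp-1", { id: "srv-42", body: "hello" });
const afterConfirm = store.view(); // the same message, now confirmed
```

Keeping pending and confirmed state separate is what makes rollback cheap; the harder part alluded to above, tracking which individual fields are optimistic, is beyond this sketch.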

Starting point is 00:46:53 It's a very interesting one. How do you page over large amounts of data? Again, you can do so in Convex. We have examples of this. You can build paginated apps. But I think we can improve our story there. And we have some ideas for how we can have an even more ergonomic experience for how to paginate over this data. Yeah.
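For readers unfamiliar with why pagination and secondary indexes go together, here is a small cursor-based pagination sketch in TypeScript. It does not show how Convex paginates; the `paginate` function, the `Msg` and `Page` types, and the use of `createdAt` as the cursor are assumptions for illustration. A sorted in-memory array stands in for an index on `createdAt`, and values are assumed to be unique.

```typescript
type Msg = { id: number; createdAt: number; body: string };
type Page = { items: Msg[]; nextCursor: number | null };

// `sorted` stands in for a secondary index on createdAt (ascending).
// A real index would seek to the cursor with a tree lookup, not a linear scan.
function paginate(sorted: Msg[], cursor: number | null, pageSize: number): Page {
  let start = 0;
  if (cursor !== null) {
    const after: number = cursor;
    start = sorted.findIndex((m) => m.createdAt > after);
    if (start === -1) return { items: [], nextCursor: null }; // nothing after cursor
  }
  const items = sorted.slice(start, start + pageSize);
  const hasMore = start + pageSize < sorted.length;
  // The cursor is the sort key of the last item returned, not an offset,
  // so inserts before the cursor do not shift later pages.
  return { items, nextCursor: hasMore ? items[items.length - 1].createdAt : null };
}

const messages: Msg[] = [10, 20, 30, 40, 50].map((t, i) => ({
  id: i + 1,
  createdAt: t,
  body: `message ${i + 1}`,
}));

const page1 = paginate(messages, null, 2);
const page2 = paginate(messages, page1.nextCursor, 2);
const page3 = paginate(messages, page2.nextCursor, 2);
```

Using the sort key as the cursor is the reason sorted access by column matters here: without an index on that column, every page would require scanning and sorting the whole table.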

Starting point is 00:47:15 Speaking of optimistic updates, I think I finally understand why it's so important to have this opinion of you're going to work really well with React. Because I can imagine it's really hard to build that in a very generic way. The idea of showing some data before the server has said that it's confirmed this data. You need to tell React so you need to have a convex provider

Starting point is 00:47:37 on top of the app, which needs to know that something changed before the request gets reloaded or you get new data. So it makes sense that you're going so deep. And, you know, there's all these little annoyances, like in JavaScript and TypeScript, object equality is done on a pointer basis. So a bit of work has to go on beneath the covers. What we care about is that these things are ergonomic and composable.
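The point about object equality is easy to demonstrate. In JavaScript and TypeScript, `===` on objects compares references, so two objects with identical contents are not equal. The `shallowEqual` helper below is a hypothetical example of the kind of extra comparison a client library might do so that identical data arriving in a fresh object is not treated as a change; it is not taken from any particular library.

```typescript
const a = { channel: "general", unread: 1 };
const b = { channel: "general", unread: 1 };

// Same contents, different objects: reference comparison says "not equal".
const sameReference = a === b;

// Hypothetical helper: compare own enumerable keys one level deep.
function shallowEqual(x: Record<string, unknown>, y: Record<string, unknown>): boolean {
  const xKeys = Object.keys(x);
  const yKeys = Object.keys(y);
  return (
    xKeys.length === yKeys.length &&
    xKeys.every((k) => Object.prototype.hasOwnProperty.call(y, k) && x[k] === y[k])
  );
}

const sameContents = shallowEqual(a, b);
const differentContents = shallowEqual(a, { channel: "general", unread: 2 });
```

A shallow comparison only goes one level deep, so nested objects would still be compared by reference; a deep comparison or stable object identities would be needed for nested data.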

Starting point is 00:48:08 So yeah, like you said before, if you're building a Slack clone and managing consistency between various components manually, it gets very complicated very quickly. And if you're managing optimistic updates manually, it gets very complicated very quickly, especially if you start putting caches in the way, right? And so I think there's solutions to a lot of these problems that exist client side, but unless you have the full end-to-end

Starting point is 00:48:36 support of the back end, which we do, right? We have a back end that understands what's going on all the way down to the client. We understand what components you have. We understand what data those components depend on, right? We can build a really transparent, seamless end-to-end experience in a way that you can't really solve purely client-side. And we've been talking about chat apps a lot, but the more you describe this, the more it sounds to me like any multiplayer app, like, you know, Google Docs. I just love using that example. It's just my go-to. It's just the second example in our demos.

Starting point is 00:49:09 Yeah, it's a simple thing to think about, yeah. Yeah, but, like, you can just build any app that needs to think about stuff like, you know, like a Figma, right? You draw something. Absolutely. You need to wait for that data to hit the server, but you can render it before it does that.

Starting point is 00:49:23 And you can maybe show that, oh, it couldn't save successfully, and pause the app in case the network request fails. Yeah, any app that requires dynamic updating is a great fit for Convex, right? Anything that involves data that's shared between client and server, or particularly shared between multiple clients. Doing these things manually is a huge hassle, and it's a pain to the point that it just discourages people from doing so. So the big hope with the initial target market for Convex,

Starting point is 00:49:56 the initial product and the developers we're going after, are people, yeah, web developers, writing React, who maybe oftentimes would not build certain types of products because it's just too hard, not necessarily out of their capability, but certainly out of their willingness to do so, right? There's certainly many folks who wouldn't know where to start when it comes to deploying a back end, and there's folks like me who do know where to start but don't want to, right? And what's funny also is inside Convex, watching our developers, who are extraordinarily skilled,

Starting point is 00:50:36 talented developers being oh wow i'm gonna build an app for my book club or i'm gonna build an app for like uh coordinating you know, going to restaurants, or whatever it is, that people all of a sudden are having ideas for personal projects, which I think is a great sign, right? It's like, oh, I never would have bothered to build that thing, because I know that the barrier to entry would have been so high.

Starting point is 00:51:00 And now people are, oh, well, the barrier to entry is low, I'm just gonna do it. Yeah, like, I've always wanted to build a Gmail clone that's like real time because I find the Gmail UI so clunky. I used to pay for Superhuman, but it's just obnoxiously expensive and I couldn't justify it after a certain point. Well, I'll give you a beta key. You can build it on Convex.

Starting point is 00:51:18 Yeah, yeah. And the last thing I want to point out is you mentioned reactive as one of the key phrases, right? And reactive, it's clearly coupled to React, but it's really reactive in the sense of any app that needs to automatically react based on a change. So there's this nice pun there. Absolutely. And yeah, so I mean, React works great right now because, hey, there's a huge community of developers in React, right? But there's absolutely no reason why, you know,

Starting point is 00:51:45 you can't use Convex with Vue and Svelte, et cetera, right? They're just small changes in the client libraries. You know, then mobile support. We've done internal React Native stuff on Convex. But for now, what we want to do is drive initial adoption, get really focused, really incisive feedback, right? So we're going after a certain target market

Starting point is 00:52:09 and just making sure we, you know, my desire right now is to not be a platform that millions of people think is okay, right? I want to be a platform that some subset of people, and I don't know if I should say that, we have many thousands of people signed up for Convex, but I want to be the platform that they think is indispensable, right? And then we can expand from there. Yeah, like a thousand true fans type approach. Yeah, that makes a lot of sense. Well, James, thank you so much for being a guest, and congrats on your Series A, first of all. I forgot to mention that. Oh yeah, thank you. Yeah, and best of luck. I hope I can talk to

Starting point is 00:52:50 you in like a year and you've achieved your next milestone and you have lots of people using and complaining about the platform, probably. Yeah, I would love to. Anyone out there, please go check out Convex. If you want a job, let me know, and we'd love you
