The Data Stack Show - 111: What if Your Code Just Ran in the Cloud for You? Featuring Erik Bernhardsson of Modal Labs

Episode Date: November 2, 2022

Highlights from this week's conversation include:
Erik's background and career journey (2:51)
Managing scale in a rapidly changing environment (6:35)
The people side of hypergrowth (12:36)
Coding competitions (17:50)
Introducing Modal Labs (19:02)
How Erik got into building Modal (21:45)
The employee experience at Modal (28:09)
How a data engineering team would use Modal (31:21)
What it takes to build a platform like Modal (36:27)
What makes Modal different (42:49)
Evolution coming for the data world (45:52)
Untapped areas in the data world (48:46)
Spotify playlists (52:03)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You'll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at rudderstack.com. Welcome back to the Data Stack Show. I am incredibly excited about our guest today. We're going to talk with Erik Bernhardsson, who was one of the early engineers at Spotify, did a ton of things there, including music recommendations, then joined Better, the mortgage company, when they were just a couple of people and
Starting point is 00:00:46 scaled that. He was there when they grew to 10,000 people. So just incredible experience. And he's building something new and fascinating called Modal. So Kostas, this is not going to surprise you or the listeners at all, but I want to hear about the lessons that Erik has learned going through this drastic phase of scale two times over, which I think is pretty rare. And especially to do it at a company like Spotify, which solved a problem that a lot of other companies tried to solve. We tend to think of Spotify as dominant today, but back in, I guess, 2008, there were a lot of other major players in the space, and they were just a small, scrappy startup.
Starting point is 00:01:30 So I'm just really interested to hear that story. How about you? I mean, obviously I'd love to hear about that. But Erik is also building, let's say, the next generation of cloud infrastructure, one that's based on serverless and that is a much more seamless experience for the developer. So I'd love to talk more about that stuff and see what the serverless
Starting point is 00:02:02 experience is, first of all, why it is important, and what you have to do in order to build something that is serverless, right? So yeah, that's what I'm going to focus more on. I'm pretty sure we're going to have surprises today. Oh yeah. There's no question. There is no question. All right.
Starting point is 00:02:24 Well, let's jump in and talk with Erik. Erik, welcome to the Data Stack Show. So many things to cover. So thank you for giving us some of your time. Thanks for hosting me. It's fun. All right. Well, give us your background, kind of what you've done over your career, and
Starting point is 00:02:41 then tell us a little bit about what you're doing today. Yeah, I'll try to condense it down. I'll go in chronological order, it's a little bit easier. I grew up in Sweden. I did a lot of programming competitions when I was little. I was always coding and messing around, you know, this was back in the nineties, writing code and building crappy websites. I ended up meeting a bunch of people in school who were early at Spotify. So when I was done with school, I had started studying physics, but I
Starting point is 00:03:13 ended up joining Spotify in 2008, spent a couple of years there in Stockholm building the music recommendation system. Eventually I ended up moving to New York. I built up a team, did all kinds of other stuff too at Spotify, not just music recs, but that was sort of the prime thing. I also did a lot of random data infrastructure work and ended up open sourcing Luigi, which was an orchestrator, and another thing called Annoy, which is a vector search library. I built up a team of about 30, left in 2015, and took a job as
Starting point is 00:03:47 CTO at a company that was then 10 people. I was there for six years; it was Better, the mortgage company. The company ended up growing to 10,000 people, then went through some challenging times, so it's a little bit smaller now. But I oversaw the technology there. And then I left almost two years ago, and left in order to pursue a bunch of random ideas I had around data, and started thinking about, okay, how can we build better tools? What's missing with data infrastructure today?
Starting point is 00:04:19 What's the tool that I always wanted to have for the last 20 years? So I started working on something, and over the last few years I've built up a small team and raised some money, and, I don't know if I mentioned the name, but yeah, Modal is the company where I'm working right now. Very cool. Well, we want to dig into Modal because it's so exciting, but first, I mean, you have such a unique experience, right? So Spotify 2008, you know, a handful of people on the super early team. What's interesting, actually, I think back on that and, you know, when you
Starting point is 00:04:56 think about streaming, Spotify is the first thing that comes to mind. But back then, you know, there was Rdio and a couple of other major streaming services that were really large, actually, at the time, thinking back on those early days of streaming, which was super interesting. But you have been through this scale arc more than once, right? So, you know, Spotify in 2008 to 2015, hypergrowth, the company growing to thousands of people; Better, you know, 10 people to 10,000 people, just serious exponential growth. And with those unique experiences, I'd love to look at that from two angles. I mean, both personally, I'm just so interested in this,
Starting point is 00:05:41 but I think our listeners as well. Let's talk about Spotify first. So between the time you started and the time you left, the technology landscape went through an unbelievable change. There were things in 2015, I think you said when you left, that didn't exist back in 2008. And so you have this massive change happening in the technology landscape as your company's going through this period of hyper growth. How did you manage that on the technology side? Because it seems like you would constantly be facing decisions around infrastructure, like we can get more performance or leverage this new technology. Like when do we change this?
Starting point is 00:06:30 Why do you change it? When are you forced to change it? So we'd just love to know what sticks out to you as you think back over that experience on the technical side of how you manage through such an interesting time of scale with such a drastically changing tech landscape? Yeah, I mean, I think the most obvious thing is, like, let's see, like, the cloud is different, right? Like, you know, when I started working on,
Starting point is 00:06:55 when we built Spotify, it was all on-prem, right? Like, we'd rent capacity in a data center and put our own racks there. And I think, you know, now almost no one does that. So I think that's probably the biggest thing. But also, you know, you look at the data side, back then it was all Hadoop, you know,
Starting point is 00:07:12 because people were inspired by what Google was doing with MapReduce. And there really wasn't anything else that scaled. And the data that Spotify generated was, you know, relatively large for its time. We didn't want to go the Oracle route, and so Hadoop was the way to go. I don't know, it was kind of the dark ages.
Starting point is 00:07:32 Like, I look back at that and, you know, I do not miss it. It was pretty terrible in many ways. And I think also, being so new to it, there's a certain amount of normalization of, you know, bad things that I didn't realize until much later, like how crappy it was. And so in terms of the technology, I wouldn't say there was a particular day I just woke up and felt like we'd landed in the future, because every day it was a little bit new, you know. So staying up to date on that
Starting point is 00:08:11 was a little bit tricky, but I don't know. I've been doing it so long that I feel like if I look at one new GitHub repo every day, I can sort of, you know, just see what's going on. But yeah, the cloud was always a big thing and I'm very
Starting point is 00:08:29 grateful that the cloud exists. And so I think that's made a huge difference in terms of how I'm operating with
Starting point is 00:08:35 data. Yeah, for sure. How about, you know, so you leave Spotify in 2015.
Starting point is 00:08:46 Were there any, I mean, I'm sure there were, but were there any big lessons that influenced the way you did things at Better, that you took into Better based on that experience? I think so. I mean, I spent 12, 13 years at
Starting point is 00:09:01 consumer companies doing data stuff. And one thing I learned is just how incredibly important it is, as a consumer company, to really have a firm understanding of what your users are doing, what the conversion rates and the friction points are, and, you know, the onboarding flow and everything that happens in it, and what your activation and retention and churn look like, all those things.
Starting point is 00:09:30 And so I think just coming into my second company thinking a lot about those things from scratch was incredibly helpful. I mean, there were many other things, but that stands out on the data side. The other thing is maybe a counter-learning: I also learned that you can't really do data from day one. Early on at Better, we didn't have a data team for two, three years, right? Like, if you don't have any users and you don't have any data, then it doesn't make any sense.
Starting point is 00:10:00 Right. Right. And so I think that was also kind of clear at Spotify early on. You know, I originally joined Spotify in 2008 to do music recommendations, and I realized pretty quickly that we had much bigger problems to solve, and we also didn't have enough data to do it. So I ended up working on a lot of other stuff for the first few years and then got back to it. And then something
Starting point is 00:10:20 similar happened at Better. We didn't really have a data team, you know, the first few years. It was, like, very basic stuff. But then eventually, data became incredibly important for Better. And I think that's a transformation I've seen at a lot of companies, too: how do you start to think in a data-driven way? I think it's more of a technology shift that ends up being kind of a culture shift, too. Yeah. Super interesting.
Starting point is 00:10:42 One specific question there, just out of pure curiosity: what was the point at which you felt like there was enough data to really make meaningful progress on the recommendation side? Yeah, I don't know. I mean, specifically with that, music is, like, the problem I was working on was so unique in a way, I don't know if there's a general lesson to learn there. We had a very big matrix, where the rows were, you know, users and the columns were tracks, and we just wanted enough entries in that matrix
Starting point is 00:11:21 in order to feel like, you know, we could complete the rest of the matrix. So that probably took a couple of years. I mean, early on we only had basic stuff that worked, but it didn't cover a large part of the catalog; it was only good enough for, yeah, the least common denominator, right?
Starting point is 00:11:40 Like, yeah. So getting enough data that it would extend to the full catalog, I think that took many, many years. So for a long time, you know, the recommendation system only covered maybe the top hundred thousand tracks, and then eventually it covered a million tracks, and getting beyond that took a long time. Yep.
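To make the matrix picture concrete, here is a minimal sketch of the completion idea using alternating least squares on a tiny made-up user-by-track play matrix. The numbers, dimensions, and algorithm choice are illustrative only; this is not Spotify's actual system.

```python
import numpy as np

# Toy user-by-track play-count matrix (0 = no observed plays).
# Rows are users, columns are tracks, as Erik describes.
plays = np.array([
    [5, 0, 3, 0],
    [0, 4, 0, 1],
    [2, 0, 0, 4],
], dtype=float)

n_users, n_tracks = plays.shape
k = 2                      # latent dimensions per user / track
lam = 0.1                  # ridge regularization
observed = plays > 0
rng = np.random.default_rng(0)
users = rng.normal(size=(n_users, k))
tracks = rng.normal(size=(n_tracks, k))

# Alternating least squares: fix one factor matrix, solve for the other.
for _ in range(20):
    for u in range(n_users):
        idx = observed[u]
        A = tracks[idx].T @ tracks[idx] + lam * np.eye(k)
        b = tracks[idx].T @ plays[u, idx]
        users[u] = np.linalg.solve(A, b)
    for t in range(n_tracks):
        idx = observed[:, t]
        A = users[idx].T @ users[idx] + lam * np.eye(k)
        b = users[idx].T @ plays[idx, t]
        tracks[t] = np.linalg.solve(A, b)

# "Completing the matrix": predicted affinity for the unobserved cells.
predicted = users @ tracks.T
print(np.round(predicted, 2))
```

The more observed entries per row and column, the better the unobserved cells can be filled in, which is why coverage beyond the most popular tracks took years of accumulated listening data.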
Starting point is 00:11:56 Yeah. Super interesting. And then how about on the people side? So, you know, it sounds like you started out on a very small, maybe even a one-person engineering team. The technology side is one thing, but the people side is another one, which is, you know, arguably a lot more difficult in many ways. Yeah. Yeah.
Starting point is 00:12:31 People, people are tricky. Yeah. And, you know, Spotify was weird, right? Spotify was just almost like anarchy. And in retrospect, I think it was kind of a unique environment in that sense. It sort of taught me that if you just hire a bunch of relatively strong people and throw them into a room and tell them to go build an amazing
Starting point is 00:12:55 service, you can actually make that work if you have the right culture. So to me, that was a very positive experience that I took with me from Spotify, you know, the amount of trust and the amount of self-organization that we had in the early days of Spotify. Which also had negative sides: no one told me who my manager was for the first three years. I didn't even know, like, I didn't have anyone, whatever. And no one told me what to work on either. So I was just sitting around, you know, I started building stuff and
Starting point is 00:13:25 eventually I found some stuff that was useful. And then, hey, I just like data. People got interested in it, you know? So I started working on so much random stuff. But I think that sort of complete anarchy that existed at Spotify, which, you know, might seem scary, actually was a very productive thing in the early days. I don't think it scaled beyond maybe, you know, 20, 30 engineers.
Starting point is 00:13:47 After that, yeah, you have to impose a little bit more structure. But I'm sort of fundamentally a believer in people and, you know, trusting people, and so I took that with me when I built up the second team at Better. Even at a scale of 200 engineers, you know, sort of aspirationally, you can get pretty close to a culture where people just come in every morning and ask themselves, what is the number one thing I should do today that adds the most
Starting point is 00:14:12 business value, right? Like, helping people have that inner voice, that sort of, you know, guiding star of, these are the things that matter today, and let's figure out together how to get there. I'm not naive. I think there's always going to be a need, you know, we're going to need managers and
Starting point is 00:14:28 performance reviews and all this stuff. Sure. But, you know, as a thought experiment, how close can you get to that fully self-organized, you know, platonic ideal of an organization? And I think the truth is you can get pretty close to that, you know, if you instill that culture into people. Can I ask a specific question about Spotify in the early days? Because
Starting point is 00:14:51 you know, when you use the term anarchy to describe company culture, that's a pretty like shocking term, I think for a lot of people, but it's so compelling that, you know, there was self-organization. What do you think enabled that? Because the thing that comes to mind is like, okay, no one told you what to build, but was the mission, like, even if it was very broad and even like somewhat difficult to translate down into like, what code should I write today? Was the mission like clearly and consistently communicated so that people could sort of self-organize and at least prioritize? Like, what were the unique characteristics that you think enabled that? Because a lot of times you
Starting point is 00:15:37 think about, like, aligning around a mission, even with a complete lack of structure, at least people are building towards the same direction. No one sat down and, you know, wrote a bunch of cultural tenets and put them up on the wall. But I feel like it almost ended up being that way, because I think Spotify just had such a unique sort of product. The fact that we had a product that people just loved, you know, and looking back, this was my first job out of school, so I don't think I realized how unique that was in that sense. But, yeah, everyone who worked at Spotify had this, you know, love for their own product. I mean, frankly,
Starting point is 00:16:21 I don't know any other product where, you know, the employees of that company would use their own product so much. Like, maybe, you know, you work at IKEA, maybe you're sitting on IKEA chairs all day. You're like, okay, you know. But I would listen to Spotify 12 hours a day while I was working at Spotify.
Starting point is 00:16:40 So, you know, if you're using your own product 12 hours a day, I think there's so much care. That is fascinating. Yeah, that's such a visceral personal experience, but it's shared among a small group of people, and it can give direction through a high level of care. That is so fascinating. That is such an interesting dynamic. Yeah.
Starting point is 00:17:09 And I think that factor was like tremendously helpful. And I'm not saying it's like, it's like, you know, necessary. Like, I think, I think it's definitely harder if you're working on, I don't know, like, you know, claims processing software for like, you know, corporate, you know, liability insurance, right? Like then it's probably a little bit harder, but I think, you know, there's still ways to sort of, you know, create a little bit of that culture of like, you know, people feeling like they have a power
Starting point is 00:17:38 and an autonomy and like, you know, all those things. Yeah, absolutely. So fascinating. So interesting. One, okay. One more question from your background that I just can't help but ask, but coding competitions,
Starting point is 00:17:51 is there one coding competition that sticks out in your memory from doing those in the nineties, or one challenge? I think it was more the early 2000s. I don't know. Like, there were a lot of them, but, you know, there was a crazy one.
Starting point is 00:18:09 We would go to Hungary every year, and there's this 24-hour programming competition that was really different. It was actually kind of fun. You have all these weird problems, like you have to control, I don't know, a Lego train or something like that, as a one-off. So that was a lot of fun. Plus it was 24 hours.
Starting point is 00:18:30 It's kind of fun to have to manage your own energy. Sure. Yeah. It's like a physical challenge too. Yeah. Yeah. By the, you know, early hours of the second day,
Starting point is 00:18:40 everything sort of slows down. That one used to be fun every year. Yeah, super interesting. All right, well, give us just a brief overview of what Modal is, and then I want to hand the mic off to Kostas because I know he has a bunch of questions about Modal. But yeah, can you just give us a brief overview of what it is? Yes. It is a way to run code in the cloud, in a way that, you know, is primarily focused on data teams and their developer experience. Let me contextualize a little bit. Sorry if this may not be quite as brief as
Starting point is 00:19:15 it seems, but, you know, I always wanted to build better tools for data engineers or data scientists. As a CTO, I saw this very clearly: I think data teams need better tools, right? They're sort of behind, I think, other segments of, you know, software engineering. So I looked at a lot of different parts of the stack. I looked a lot at orchestration to begin with. As I mentioned, I'm the author of an open-source tool, Luigi, which no one really uses these days, but 10 years ago some people used it; it was sort of a precursor, kind of before Airflow, sort of a simpler thing. Sure.
Starting point is 00:19:47 So I started working on that. I was like, you know, this is really cool. An orchestrator kind of sits in the middle, it controls a lot of the other stuff. What if you start there? Then it becomes like a nexus for all the other stuff. Then I realized at some point, cool, you can orchestrate the code, but you have to run it somewhere. And where do you run it? And I started thinking, okay, I have to build this integration layer with,
Starting point is 00:20:07 like, Kubernetes or Docker or Lambda or these tools. And it slowly dawned on me how hard that is. Like, you know, fundamentally, the user experience of an orchestrator will never be better than the user experience of the integration with the underlying substrate where you actually run code. So I was like, why don't I just fix that problem instead? And so I started focusing on this idea:
Starting point is 00:20:31 what if you throw out all the other stuff, right? Like, you look at data teams and how they run code today, there's Kubernetes, Docker, Terraform, you know, Helm, all this stuff, Airflow. What if you started from scratch and built it in a truly cloud-native way, where you don't have to think about resources, you don't have to think about, you know, setting up
Starting point is 00:20:50 instances, installing infrastructure. It just runs in the cloud for you. It scales itself up and down. You can schedule things. What would that look like? And so that's what I started working on two years ago. And, yeah, we have something that, you know, works reasonably well today. We're still in a sort of closed testing stage right now. Kostas, all yours.
Starting point is 00:21:12 Thank you, Eric. Thank you. So Erik, you mentioned Luigi. So let's start from that, because you said you have a passion for building tools for data engineers, and Luigi is probably one of the first that you built, or at least, yeah, the one that got open sourced, right? So tell us a little bit more about Luigi. I'm not asking that much about the technology itself, but how did you get into building it?
Starting point is 00:21:44 Yeah. So just as a quick recap of Luigi: it's an orchestrator, aka a workflow scheduler, you know, the same category as, primarily, Airflow. Now there's also Dagster, Prefect, and Flyte. But when I started working at Spotify, there really wasn't anything in that space, right? And so I ended up having, you know, more and more complex data pipelines for the music
Starting point is 00:22:09 recommendation system. In particular, a lot of Hadoop jobs. So I would have these complicated chains of hundreds of Hadoop jobs that I had to string together and run in a particular order. And I ended up realizing at some point, actually, this is a graph, you know, we should model it as such. And, you know, because I'm an old-school person, my first thought was Makefiles,
Starting point is 00:22:36 like, this is kind of like Make. And so I started looking at Makefiles, because I liked the functional nature of how Make works: you define targets and rules. But the syntax of Makefiles is incredibly arcane and, you know, super annoying. Yeah. It's like, dollar percent is the exit code of the previous, you know,
Starting point is 00:22:58 whatever. I assume because it was built in the seventies, I think, something like that. So anyway, I took the spirit of that idea and started building Luigi, which was actually the third iteration of the same sort of idea, and the first one where I felt, this is good enough, I'm just going to throw it out there on GitHub and see if anyone uses it. And I guess other people had similar problems.
Starting point is 00:23:25 You know, after just a few months, someone at Foursquare reached out and they're like, hey, we're running Luigi, we'd love to collaborate. And then, you know, from there, over the next few years, there were many more companies that started using it more and more. I never thought Luigi was amazing. It's kind of funny in a way, but to me it was just a thing that solved a particular problem well enough that I felt other people would also have the same problem, and I think they did. And so I
Starting point is 00:23:56 think there were a couple of things I'm really happy with in Luigi, like the functional nature of how you express dependencies. But there are other things I don't like about it. In particular, I think the reason people liked Airflow more was that it had a much better web interface and lots of other stuff that people actually wanted. So, I don't know, it was kind of a fun thing. I mean, people still use it, but it's kind of rare today. Yeah. Yeah.
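For readers who have not used it, the make-like "targets and rules" idea Erik describes looks roughly like this in Luigi. The file paths and task logic are invented for illustration; only the Task/requires/output/run structure is Luigi's actual API.

```python
import datetime
import luigi

class Extract(luigi.Task):
    """A 'target and rule' in the Make sense: produce data/raw_<date>.csv."""
    date = luigi.DateParameter()

    def output(self):
        return luigi.LocalTarget(f"data/raw_{self.date}.csv")

    def run(self):
        with self.output().open("w") as f:
            f.write("user,track,plays\n1,42,7\n")

class Aggregate(luigi.Task):
    """Depends on Extract; Luigi only runs upstream tasks whose outputs are missing."""
    date = luigi.DateParameter()

    def requires(self):
        return Extract(date=self.date)

    def output(self):
        return luigi.LocalTarget(f"data/agg_{self.date}.csv")

    def run(self):
        with self.input().open() as src, self.output().open("w") as dst:
            rows = src.readlines()[1:]  # skip header
            dst.write(f"rows,{len(rows)}\n")

if __name__ == "__main__":
    # Builds the dependency graph and runs only what is not already done.
    luigi.build([Aggregate(date=datetime.date(2022, 11, 1))], local_scheduler=True)
```

The functional flavor is that each task declares what it produces and what it requires, and the scheduler works out the order, much like Make resolving targets.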
Starting point is 00:24:20 So what's the current state of the project? Like, is it still, I mean, I don't know. There are, as I mentioned, some companies still using it. But, you know, I haven't touched it for six years. I think Spotify still has a lot of stuff in Luigi, I heard they have a bunch of stuff. And there are some companies that still use it, right? Yeah.
Starting point is 00:24:49 That speaks a little bit to, you know, it's a very sticky product; it's very hard to get out of these workflow tools once you use them, because then you have to rewrite all the code. But yeah, it's not being actively maintained. I always thought about it, like, what if I'd actually kept working on Luigi, what would it look like today? And that's actually kind of what I started working on, you know, I mentioned an orchestrator, before Modal, and I think there are a lot of things you could have done so much better that I still kind of don't see in today's workflow
Starting point is 00:25:16 schedulers. But yeah, I mean, to answer your question, I'm not really interacting with it right now. Okay. Okay. So how does an orchestrator fit in the world of today, with the serverless cloud infrastructure that you have in mind when you're building Modal? Yeah. I mean, Modal is kind of the layer below, right? As I mentioned, I started thinking about orchestrators and I realized, actually, I don't want to
Starting point is 00:25:44 build an orchestrator. I want to build the layer below, right? Because I think that's the harder problem. And I think those are actually two quite distinct layers. And we have a couple of people
Starting point is 00:25:52 who use Modal together with an orchestrator, like Airflow, Prefect, or Dagster. And those,
Starting point is 00:26:02 by the way, an orchestrator, I feel like in a way it's somewhat separate from where the job actually runs; typically people run it on Kubernetes or something else, but you can certainly integrate the two. In general, you know, thinking about where the cloud is going, and this is a function of having seen it kind of firsthand, from the old days of on-prem to now, I think we're still kind of early in the evolution of the cloud.
Starting point is 00:26:31 And I think to me, the natural sort of inevitability of the cloud is to some extent serverless, like why as an engineer should I have to think about these, like, you know, these abstractions, like clusters and instances, managed resources, and like, you know, all this other stuff, right? Like, computers should just do it for me. They're much better at those types of things. And so I'm extremely bullish on serverless for that reason. And I especially think in, you know, data land, that might actually be a better fit for serverless, right?
Starting point is 00:27:07 There are a lot of aspects to that. You know, serverless so far, I think, has been mostly prevalent in front-end and back-end type architectures. And you see a couple of vendors going all in on that, sort of Vercel or Netlify, whatever, doing that kind of stuff. But I think the nature of what developers on data teams do might actually be a better fit. You have these very bursty workloads. You have esoteric, sort of heterogeneous hardware requirements. And you have, you know, very exploratory things that
Starting point is 00:27:34 require you to build things quickly and, you know, try them out, stuff like that. So that's sort of the idea I always had with Modal. I don't know if I'm answering your question, I kind of ended up diverging a little bit, but those are the things I think about a lot these days. Okay. So if I started working with Modal today, what would my experience be? How do I interact with Modal, and how is this different from the cloud as I know it, right?
Starting point is 00:27:58 I mean, I think, you know, this is still very much how it is at most companies, right? You write code, you run it locally as you're developing, you're sort of writing it locally, you kind of test-run it. And then eventually you're like, okay, great, I'm going to deploy this to production. It's time. And then you're like, oh my God, I have to containerize this.
Starting point is 00:28:38 I have to write a bunch of YAML. I have to, you know, ask the ops team to do this thing, deal with permissions. And it ends up being just this inordinate amount of chores, a super annoying process. And one of the things I came to believe is, you know, I used to think that the only way to work with the cloud is to mock all the things locally.
Starting point is 00:29:06 But then I actually realized, what if you bring the cloud much, much earlier into that process? What if you always run things in the cloud? As you're writing code locally, what if, when you run it, it actually runs in the cloud, in the same environment that you're going to eventually deploy it to? And so that's what Modal does. And, you know, the only way to do that is to make it super, super fast. Fast not in the sense of running it fast, because, you know, there are
Starting point is 00:29:33 like upper limits to how fast computers are that we can't do much about. What we can do, however, is start containers super fast. We can take code locally, like you're writing a script and you just want to run it. So what we can do is we take that code, stick it in a container in the cloud, and we launch it in less than a second. And, you know, we'll launch a hundred of those containers
Starting point is 00:29:57 in less than a second. And then, you know, they just print to standard out, so whatever is happening, you see it. And you don't have this annoying, you know, you have to build a container and push it to ECR, then go trigger this interface, and then download the logs, whatever. You don't have these super long feedback loops, right? When I think about developer productivity, to me, developer
Starting point is 00:30:20 productivity is best understood in terms of these feedback loops. And so you have to make these feedback loops fast in order for developers to feel happy and to feel productive. Modal does this with the cloud: you go from writing code locally to actually executing it in the cloud in less than a second, which is something where we had to go very deep, kind of into the guts of containers and file systems and that kind of stuff, to achieve.
Starting point is 00:30:47 Okay. We'll talk about that. But before that, you said your passion is building products, technology, tools that are going to be used by data engineers, data professionals, let's say. How is Modal used by a data engineering team? Like, what is a data engineering team going to use Modal for today? Yeah.
Starting point is 00:31:16 Yeah. So what I think is interesting is, over the last 15, 20 years, there's been all this back and forth: first it was Hadoop, and then the warehouses came, and now, you know, SQL is very dominant for a lot of workloads. I love SQL, but, you know, I think fundamentally,
Starting point is 00:31:38 my belief is that there's always going to be stuff you have to write code for, right? So SQL covers a very large percentage of workloads, but there's always stuff you're going to have to write code for, like doing sharding or doing simulations or whatever. So Modal exclusively focuses on code. And in particular, right now Modal supports Python, right? Eventually we'll probably have support for other languages too. But the benefit of focusing on data teams specifically is that
Starting point is 00:32:02 they'll almost always use Python. Yeah. So what are the problems that Modal primarily helps with? Right now we're targeting what I think of as embarrassingly parallel workloads. So those could be things like, I have a hundred million images and I need to compute embeddings for those, and I have this function that runs on a GPU that takes an image and produces
Starting point is 00:32:24 this vector and I just want to scale it up and run it. That's like one use case. Or I have, you know, a hundred million satellite images. I want to run some computer vision. Or there can also be things like I need to scrape a bunch of websites using a headless browser, you know, Chromium or Playwright or whatever, and take screenshots or like, you know, stuff like that. It could be, you know, financial backtesting.
Starting point is 00:32:42 Monte Carlo type simulations, you know, either pricing financial instruments or, I don't know, sampling from Bayesian models, things like that. So we focus a lot on these sort of embarrassingly parallel use cases; that's the starting point. And then another, slightly smaller bucket of things tends to be just simple cron jobs, like, you know, people have a tiny script and they just
Starting point is 00:33:09 want to run it every hour, or every midnight or something. I guess that's a smaller bucket. But I think there are a lot of, you know, super early stage tech teams that just have a tiny thing, and it doesn't even have to be a tech team. They just want to run something on a schedule.
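To make those two buckets concrete, here is a sketch in the spirit of Modal's Python SDK as Erik describes it. The exact API names (Stub, gpu=, Cron, .map) have shifted across releases, so treat them as approximate, and the embedding function is just a placeholder rather than a real model.

```python
import modal

stub = modal.Stub("parallel-embeddings")

# The environment is declared in code; no Dockerfile or YAML to maintain.
image = modal.Image.debian_slim().pip_install("pillow")

@stub.function(image=image, gpu="any")
def embed(image_url: str) -> list:
    """Placeholder for 'take an image, produce a vector' on a GPU."""
    import hashlib
    digest = hashlib.sha256(image_url.encode()).digest()
    return [b / 255 for b in digest[:8]]  # stand-in for a real embedding

@stub.function(schedule=modal.Cron("0 0 * * *"))
def nightly_job():
    """The 'tiny cron job' bucket: runs every midnight in its own container."""
    print("refreshing the nightly report...")

if __name__ == "__main__":
    with stub.run():
        urls = [f"https://example.com/img/{i}.jpg" for i in range(1000)]
        # Fan out across many containers: the embarrassingly parallel case.
        vectors = list(embed.map(urls))
        print(len(vectors), "embeddings computed")
```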
Starting point is 00:33:33 And how does Modal differ compared to something like AWS Lambda, for example? Architecturally, it's very similar, in the sense that we're all about running containers in a way where you don't have to think about the underlying infrastructure. I think the difference is primarily, and I've thought a lot about this, you know, why has serverless not been more successful? My answer to that is, frankly, the developer experience kind of sucks. I don't know if you've ever productionized Lambda functions, but it's a pretty bad experience.
Starting point is 00:34:00 Like, you have to set up, you know, CloudWatch or whatever it's called, I don't even remember. There's so much stuff you have to do. And you don't have nice feedback loops when you run code in the cloud.
Starting point is 00:34:12 So a big part of what Modal does is really just, you know, offer a much, much better user experience. And I think because the latency between writing code and launching
Starting point is 00:34:25 containers is so critical to that developer experience, we basically had to start over and build our own infrastructure for that. Got it. Okay. And while you were going through the use cases that you have seen with Modal so far, you talked a lot about, let's say, working with unstructured binary data. What about tabular data?
Starting point is 00:34:48 I mean, do you see use cases there too, or is this not, let's say... Yeah. I mean, if you can do things in pure SQL, I think you should do it, right. But let's say you have a lot of tabular data and, you know, you want to fit an XGBoost model on top of that,
Starting point is 00:35:11 that might be hard to do in a database. I mean, I don't know, I know there are crazy Postgres extensions, whatever, to do that today, but typically you want to write code for that, you know, and you want to run it in a normal containerized environment. So I think that would be one example, right? You know, we can take that data, operate on it, and fit models, and then help you deploy those models, because you can also take functions in Modal and deploy them as either REST
Starting point is 00:35:36 endpoints or functions that you can invoke from other apps inside an organization. So you can also use it as sort of a model-serving interface. But yeah, those are some of the things you can do with Modal and tabular data that, you know, you probably don't want to use the database for. But again, to go back, I think there's plenty of stuff where the database is amazing. It definitely starts with the database. Yeah. Makes a lot of sense.
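As a hedged illustration of the tabular case Erik mentions: pull a table into a DataFrame, fit an XGBoost model in plain Python, and expose it behind a small REST endpoint. The column names and the FastAPI wrapper are invented for this sketch; only the xgboost and FastAPI calls themselves are standard usage of those packages.

```python
import pandas as pd
import xgboost as xgb
from fastapi import FastAPI

# Stand-in for a table pulled out of the warehouse.
df = pd.DataFrame({
    "loan_amount": [300_000, 150_000, 420_000, 90_000],
    "credit_score": [720, 680, 790, 640],
    "converted": [1, 0, 1, 0],
})

features = ["loan_amount", "credit_score"]
model = xgb.XGBClassifier(n_estimators=50, max_depth=3)
model.fit(df[features], df["converted"])

# Minimal "model-serving interface": a REST endpoint other apps can call.
app = FastAPI()

@app.get("/score")
def score(loan_amount: float, credit_score: float) -> dict:
    row = pd.DataFrame([{"loan_amount": loan_amount, "credit_score": credit_score}])
    proba = model.predict_proba(row)[0, 1]
    return {"conversion_probability": float(proba)}

# Run locally with: uvicorn this_module:app
```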
Starting point is 00:36:00 And there's another huge conversation about what a serverless database looks like, which is another interesting topic. But let's talk about Modal and the architecture of Modal, right? What does it take to build a platform like Modal, a serverless cloud infrastructure service?
Starting point is 00:36:16 You know, I started looking at this a couple of years ago and immediately ran into this problem: building Docker containers and pushing Docker containers around was kind of slow. So, how to fix that?
Starting point is 00:36:40 One of the things I realized pretty early on is, I'm not going to be able to use stock Docker for this, so we ended up using a much lower-level primitive, runc. And, you know, eventually we might switch to Firecracker to run containers; using that, you can spin up containers in milliseconds or less. The problem is, okay, now you have these big container images. How do you ship around these big container
Starting point is 00:37:07 images? You don't want to fall back on the traditional Docker method of, okay, we're just going to push and pull the whole image. Although it does a little bit of deduplication at the layer level, so it's a little bit faster sometimes, if you have, you know, a container image that's 10 gigabytes large, you really want to avoid sending that back and forth over the network between different nodes in your cluster.
Starting point is 00:37:38 So I basically realized, okay, well, if you look at containers... sorry, this is getting really technical, by the way, so feel free to tell me to shut up if this is too low level. I realized that if you look at container images, first of all, most of the files in a container image are never read. You know, if you look at a bare-bones Linux distribution, it's all this random time zone information for, like, Uzbekistan
Starting point is 00:37:54 and random islands no one lives on, locale information, all this stuff, right? So why send those bytes in the first place? The other thing is,
Starting point is 00:38:03 even for the files on that image that are actually read, most of them tend to be the same. It tends to be, you know, the Python interpreter, a bunch of the Python standard library; when you launch a Python interpreter, it'll read roughly the same 200 files, right? And that doesn't differ much across different containers, it differs across different Python versions. So what we ended up doing was building our own file system that basically stores all these container images in a content-addressed way: we go through the whole image, we compute checksums for every file, and then we store each of those files only once in an underlying network storage.
Starting point is 00:38:50 And then we create this virtual file system with FUSE, which we ended up building in Rust for performance, that exposes that to the container runtime as a Linux root file system. And it actually worked super well. Everyone told me, you're crazy for building a file system, but we ended up doing it and it actually works really well.
Starting point is 00:39:12 We had to build a bunch of other stuff too along those lines. But that's, you know, one of the things we ended up building. It's very technically challenging and kind of complex.
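A minimal sketch of that content-addressing idea: checksum every file in an unpacked image, store each unique blob once, and keep a per-image manifest mapping paths to hashes. Modal's real version is a Rust FUSE filesystem backed by network storage; this toy uses a local directory and ignores permissions and symlinks, purely to show the deduplication.

```python
import hashlib
import json
from pathlib import Path

BLOB_STORE = Path("blobs")        # shared, content-addressed storage
BLOB_STORE.mkdir(exist_ok=True)

def add_image(image_root: str, manifest_path: str) -> None:
    """Walk an unpacked container image and store each unique file once."""
    manifest = {}
    for file in Path(image_root).rglob("*"):
        if not file.is_file():
            continue
        data = file.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        blob = BLOB_STORE / digest
        if not blob.exists():          # identical files across images dedupe here
            blob.write_bytes(data)
        manifest[str(file.relative_to(image_root))] = digest
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))

def read_file(manifest_path: str, relative_path: str) -> bytes:
    """Resolve a path through the manifest, the way the FUSE layer would."""
    manifest = json.loads(Path(manifest_path).read_text())
    return (BLOB_STORE / manifest[relative_path]).read_bytes()
```

Because the Python interpreter and standard library hash to the same blobs in every image that contains them, only the manifest differs per image, which is what makes shipping a "10 gigabyte" image cheap.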
Starting point is 00:39:20 But I think ultimately, that's what delivers the experience to customers that I always wanted: you just write code and it immediately starts running in the cloud. So if I understand correctly, in terms of the technologies used as part of the Modal stack, you use Firecracker, right?
Starting point is 00:39:41 Not yet. We're planning to move to Firecracker. It has some issues with GPUs right now. But that's the plan. Okay. So what are you using now instead of Firecracker? We use runc, just a lower-level primitive. Docker actually uses runc under the hood; it's just simpler.
Starting point is 00:40:04 Okay. But if you look at what a Linux container is, it's basically chroot and namespaces and cgroups, those types of things. And that's essentially
Starting point is 00:40:12 what runc does. It's a Go program that basically wraps those things into a unified, OCI-image-compatible runtime.
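For intuition about "a container is basically chroot plus namespaces plus cgroups", here is a rough sketch that shells out to the standard Linux unshare and chroot tools. It needs root, assumes an unpacked root filesystem at a hypothetical /srv/rootfs, and skips everything a real runtime like runc handles (cgroup limits, mount setup, capabilities, seccomp).

```python
import subprocess

def run_in_container(rootfs: str, command: list) -> None:
    """Very rough illustration of the primitives, not how runc is implemented.

    Creates fresh PID/mount/UTS namespaces, then chroots into the image's
    filesystem and runs the command there. Requires root on a Linux host.
    """
    subprocess.run(
        ["unshare", "--pid", "--mount", "--uts", "--fork",
         "chroot", rootfs, *command],
        check=True,
    )

# Example with a hypothetical unpacked image:
# run_in_container("/srv/rootfs", ["/bin/sh", "-c", "echo hello from inside"])
```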
Starting point is 00:40:20 Yeah. And then you use FUSE to create the file system, which doesn't really store that much data itself, right? That's correct.
Starting point is 00:40:35 Yeah. Yeah. And actually, FUSE makes it pretty easy to build file systems. The first prototype we built actually used the Python bindings for FUSE, which are terrible from a performance point of view, but it's actually kind of nice, because, you know, it was very simple to experiment with. So that's where it started.
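The Python FUSE bindings Erik alludes to are presumably something along the lines of the fusepy package; here is a minimal read-only toy filesystem in that style. It is hypothetical code, not Modal's, and is only meant to show why FUSE is so convenient to prototype with: you implement a few callbacks and the kernel routes file operations to your user-space process.

```python
# pip install fusepy
import errno
import stat

from fuse import FUSE, FuseOSError, Operations

class DictFS(Operations):
    """Read-only filesystem serving an in-memory {name: bytes} mapping."""

    def __init__(self, files):
        self.files = {"/" + name: data for name, data in files.items()}

    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        if path in self.files:
            return {"st_mode": stat.S_IFREG | 0o444, "st_nlink": 1,
                    "st_size": len(self.files[path])}
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", ".."] + [name.lstrip("/") for name in self.files]

    def read(self, path, size, offset, fh):
        return self.files[path][offset:offset + size]

if __name__ == "__main__":
    # Mounts at /tmp/demo-mount (create the directory first); Ctrl-C to unmount.
    FUSE(DictFS({"hello.txt": b"hi from user space\n"}),
         "/tmp/demo-mount", foreground=True, ro=True)
```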
Starting point is 00:40:54 And where do you run that stuff? Are you using bare-metal servers that you have in the cloud? Yeah, exactly. We run it on these bare-metal EC2 instances, and then, you know, we maintain that pool of workers, and we start and stop instances as we need more resources. So from a user's point of view, they never have to think about those things; we do that all the time.
Starting point is 00:41:22 Yep. Okay. The users do not have to think about that stuff, but you have to do it for them, right? So, yeah, what does ops look like for such a product and infrastructure? What does an SRE, or whatever ops title you have at Modal, look like?
Starting point is 00:41:37 All right. This might be a controversial thing, but I actually don't think you should have a separate ops team early on. I think you should have people who are maybe more interested in it, who can go deeper on it. But early on, I actually really think you align incentives really nicely if everyone's part of the firefighting at all times, right?
Starting point is 00:42:05 You know, if something breaks, anyone on that team just jumps in. Ideally the person who, you know, wrote the code, or who understands that part of the code. I mean, we're early, we're six people. So we don't have a dedicated ops team; eventually we will, but not right now. Yep. Yep.
Starting point is 00:42:24 No, I totally understand. I think that's also my experience in building teams. So how is it different from, let's say, someone doing operations at a company that is relying on Kubernetes and other kinds of primitives for their infrastructure? How do operations differ between Modal and a company that is more, let's say, cloud-traditional? You're never going to have to write a single line of YAML. That's the biggest difference. I think you're going to do a very good job of hiring through this
Starting point is 00:43:01 podcast episode, but we just did, so. Yeah, no, I think we have to write all the YAML, you know, once, so that our users never have to write YAML. Ideally, you know, you don't even have to think about it. And this might also be a controversial opinion, but I almost wonder if, long-term, data engineering as such, as it is today, will go away. Because I look at all these startups, right?
Starting point is 00:43:33 Like, you know, every tech company in the world ends up building their own internal data platform, right? And they all kind of look the same. You start out with Kubernetes and, you know, a bunch of stuff, and then someone builds an abstraction to make it easier to launch internal machine learning models or train things in notebooks or whatever. But what if someone just built that once and sold it as a service? I kind of feel like the world would be a better place, instead of, you know, 10,000 companies building it themselves. And so that's logically where I see the world evolving to: a lot more of those things should really just be services that you use in the cloud. And, you know, then you don't even necessarily need to have platform teams internally that do this. Yeah, makes total sense. Cool. All right.
Starting point is 00:44:27 I have a question. While you were talking you mentioned the Docker images and how big these images are in general, and my opinion is that, especially when it comes to infrastructure,
Starting point is 00:44:42 there are still a lot of primitives around that just don't feel like the right primitives yet. Or sometimes what comes to my mind, because we are probably not that far apart in terms of age: do you remember back in the early 2000s when you had to download these stupid, super bloated installers to install something? And you had this really bad experience of, why do I need this thing? Sometimes I get the same kind of feeling, but at an infrastructure level, which obviously is much more complicated, right? But we have, yeah, as you said, time zones from Uzbekistan.
Starting point is 00:45:26 Like, why do I need that to run a serverless Lambda function that, like, calculates one plus one equals two, right? So what kind of evolution do you expect to happen? What are you expecting to see in the industry in terms of new primitives for working with infrastructure? Yeah.
Starting point is 00:45:49 I mean, I think there are a lot of tools that try to wrap underlying layers, but they end up being kind of leaky abstractions, in a way where it doesn't prevent you from actually having to learn about those underlying layers anyway, and now you've almost made the problem worse, because now there's another layer that you have to learn, right? Compare that to a good abstraction: you know, if you build an amazing data platform tool that wraps Kubernetes, then ideally, if you use that tool,
Starting point is 00:46:25 you should never have to learn a single thing about Kubernetes or Docker or like whatever. But like none of those tools really work that way. And I feel like they always like kind of leak through in the end, right? And so that to me, I think, I don't know, like my theory is like the first generation tools,
Starting point is 00:46:42 they're all about enabling you to do something. You know, they solve a hard technical problem in a way that no tool before solved, and people then have to use those tools by necessity. I think what ends up happening is the second generation then actually preserves the functionality, but rethinks the abstraction and presents it in a much better way, where fundamentally the enablement is the same,
Starting point is 00:47:16 but you no longer have to jump through all these hoops to get it installed. I don't know, look at machine learning, right? I started doing deep learning around 2014, those days. And it was incredibly painful back then. You had to, you know, install Theano, install a bunch of random CUDA drivers or whatever. I guess that's kind of still very much a problem, but to some extent, now you can also just, you know, go to
Starting point is 00:47:43 Hugging Face, download a model, and, you know, now you suddenly have a model and you can actually do very cool things. And I think there's a similar story to be told in many different fields, you know, sort of packaging things in a way that suddenly becomes a lot more accessible and makes them a lot easier and a lot more fun to use. Yeah.
Starting point is 00:48:04 Yeah. Yeah. I totally agree. All right. Last question for me, and then I'll hand the microphone back to Eric. So what kind of opportunities do you see out there, like business opportunities in building new experiences over the cloud? You mentioned at some point that we are still very early on in the cloud, so obviously you believe in that. That's why you built Modal. But what else is out there, in your opinion, interesting problems that could also be interesting business opportunities?
Starting point is 00:48:45 Yeah. I mean, here's one thing I've been ranting about on Twitter: CI. CI is this super janky experience today, right? And I've been like, you know, I want someone else to build it, but I'm just ranting about it on Twitter. If no one builds it in the next year, I might just do it myself. I have an idea that you could even use Modal for some of it, but ideally I don't have to do it. But, you know, what's crazy about CI is, first of all, I don't know, take GitHub Actions or whatever,
Starting point is 00:49:16 I actually think it's a really cool product in itself. And I think it sort of is the same story: it enabled me to do things in a new way. But, you know, today when I use GitHub Actions, something breaks, and then I have these super slow feedback cycles where you have to commit something to git and then wait like five minutes and then look at the logs. So you have these slow feedback loops, which is the most torturous
Starting point is 00:49:41 thing for a developer, debugging problems with slow feedback loops. And then, you know, you think about this extreme amount of wasted resources. You have, like, ten containers each pulling down the same libraries over and over again and installing them. And I also think a lot about unit tests. They're the ultimate parallelizable thing. Why can't I just... let's say I have this large project.
Starting point is 00:50:12 And, you know, I have a thousand unit tests. Can I just stick every unit test in a Lambda function and run all of them at the same time? Because, you know, you have a human sitting there waiting for the tests. That's very expensive time. And another thing I think about is this super annoying lack of parity between the CI environment and local testing. Think about it from that point of view: why do developers even run things locally? It's because of the CI environment itself. But if the CI environment was so good that the experience was as good as running things locally,
Starting point is 00:50:48 you wouldn't even run the tests locally. You would develop code and just launch a thousand tests, each running in their own Lambda environment, Lambda container or whatever. And then you would immediately just see the failing tests. What if you could build that? I think that would be a fantastic world.
Starting point is 00:51:06 There are very few things where, if someone said, would you spend $100,000 making your engineers not have to wait for CI? I'd be like, take my money. So, you know, I think it's a huge opportunity to do this. But if no one else does it in the next few years, maybe I'll have to do it myself. I kind of don't want to.
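To make the fan-out idea concrete, here is a rough local sketch of "run every test at the same time." It assumes a pytest project with a tests/ directory, and a thread pool stands in for the per-test cloud containers Erik describes; none of this is Modal's or GitHub's actual implementation, just an illustration of the shape of the idea:

```python
# Rough sketch: run each test file in its own worker so wall-clock time is
# bounded by the slowest file, not the sum of all files. A local pool of
# subprocesses stands in here for per-test remote containers.
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

TEST_DIR = Path("tests")   # assumed project layout
MAX_WORKERS = 16           # stand-in for "a thousand Lambda containers"

def run_one(test_file: Path):
    # Run a single test file in an isolated subprocess and capture its output.
    proc = subprocess.run(
        [sys.executable, "-m", "pytest", str(test_file), "-q"],
        capture_output=True,
        text=True,
    )
    return test_file, proc.returncode, proc.stdout + proc.stderr

def main() -> None:
    test_files = sorted(TEST_DIR.glob("test_*.py"))
    failures = []
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        futures = [pool.submit(run_one, f) for f in test_files]
        for future in as_completed(futures):
            test_file, code, output = future.result()
            if code != 0:
                failures.append(test_file)
                # Surface failing tests as soon as they finish, not at the end.
                print(f"FAILED {test_file}\n{output}")
    print(f"{len(test_files) - len(failures)} passed, {len(failures)} failed")

if __name__ == "__main__":
    main()
```

In a real remote version of this, the subprocess call would become a remote function invocation and the container image would bake in the project's dependencies, so nothing gets reinstalled per run and local/CI parity comes for free.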
Starting point is 00:51:26 Sounds good. All right. Eric, it's all yours. Yes. This has been so fascinating. I'm interested to know, have you built anything, even if it's just for you or your small team, to address some of those CI issues? Or are you still just using a process that you largely hate? I generally tend to build things when I don't like something, but yeah, so far we haven't, not for CI.
Starting point is 00:51:50 Yeah. Yeah. Super interesting. Okay. Well, we're really close to time here, and so I have a question that is unrelated to technology at all, but I want to know: do you still have any of your Spotify playlists from back in 2008, and if so, what music is on them? I mean, I don't know.
Starting point is 00:52:17 I do. I have a lot of playlists. I'm sure you can find them if you Google my name and browse a little bit. I don't know, I'm just a degenerate Detroit techno fan; I grew up in Europe and lived in Berlin at some point, so my taste tends to skew toward those sorts of styles. But I don't know, I grew up listening to music. I listened to everything: jazz, hip hop, and, you know, classical
Starting point is 00:52:47 music, whatever. I do like music overall. And that was actually a fantastic experience working at Spotify. It's like, I almost wonder if I built the music recommendation system purely for selfish reasons. I discovered a lot of music through my own system, which, you know, kind of gave me some sort of pleasure. Yeah, for sure. That's super interesting.
Starting point is 00:53:11 I mean, as a Spotify user, I remember, I can't remember where I listened to it, but I listened to an interview with Daniel Ek and he was talking about how, you know, at some point you figured out that if you could recommend something new that a user liked every week, then they would essentially stay with Spotify forever, right? Because you're providing this discovery experience.
Starting point is 00:53:36 Yeah, that's been very true. Like exposing people to different things. And, you know, that wasn't me personally; it was definitely people on my team who came up with that idea of Discover Weekly. I think Edward and Chris and a few other people
Starting point is 00:53:49 on my team. Fantastic idea that, you know, I still use every week. It's a good product. Yeah.
Starting point is 00:53:57 Super interesting. All right, well, Eric, this has been such a great conversation. So much more to cover, so we'd love to have you back on the show.
Starting point is 00:54:04 Anytime. Anytime. This is fun. I really appreciate it. Costas, this is going to be, this may sound like an interesting takeaway, but I remember being so interested. I think I mentioned on the show that I'd listened to an interview with Daniel Ek, one of the founders of Spotify.
Starting point is 00:54:20 And this was years ago, but for some reason, that interview has really stuck with me. And I'm going to draw a connection that Eric didn't draw, but that I think is interesting. In that interview, he talked about how early on they had to figure out ways to almost create a user experience that made users okay with latency, because speed was a really big deal when you were trying to deliver a large file, like an mp3, over the internet for streaming, right? And it was really interesting to me that he used a lot of very similar language when talking about Modal and the developer experience, right? There's a latency challenge and friction points. It's fascinating to me to think about the similar nature of those two problems. So yeah, that's my big takeaway.
Starting point is 00:55:21 I think that's really interesting. Oh yeah, that's a great point you're making here. It's very interesting how it connects the developer experience with the user experience in the end. Because, yeah, I think we tend to consider them as very different, but they have some things in common; they just manifest themselves in a different way. Obviously, a developer experiences latency in different things
Starting point is 00:55:49 than someone who listens to music. Sure. But yeah, that's an excellent point, actually. I mean, I don't know, I think every part of the conversation with Eric was great. I loved hearing about the stuff that they're building and how they build it. And what I will keep is that it's great to hear that you can have startups today that, in order to operate, have to create their own file systems.
Starting point is 00:56:20 Yeah. That's, I think, a great indication of the progress that has happened over all these years in terms of the infrastructure we have and the primitives we have out there to build upon. And yeah, people just shouldn't be scared to even go and build their own file system if they have to. So that's what I keep. And yeah, hopefully
Starting point is 00:56:48 we're going to have him back soon. Yeah, I agree. Awesome. Well, thank you for listening to the Data Stack Show. Great conversation with Erik Bernhardsson. And we'll catch you on the next one. We hope you enjoyed this episode of the Data Stack Show. Be sure to
Starting point is 00:57:04 subscribe on your favorite podcast app to get notified about new episodes every week. We'd also love your feedback. You can email me, Eric Dodds, at eric at datastackshow.com. That's E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at rudderstack.com.
