The Data Stack Show - 172: How WebAssembly is Enabling the Third Wave of Cloud Compute with Matt Butcher of Fermyon Technologies

Episode Date: January 10, 2024

Highlights from this week’s conversation include:
Matt’s background and journey with Fermyon (2:32)
WebAssembly and enhanced security models (3:43)
The IoT Startup and Google Acquisition (10:49)
Google's Early Containers (11:50)
Scaling and anticipating requests (20:22)
Introduction to WebAssembly and its importance (23:32)
The Benefits of WebAssembly (30:57)
Comparison of Virtual Machines, Containers, and Micro VMs (33:12)
The Importance of Fast Startup Times in WebAssembly (37:39)
Metaphysics and software development (42:12)
The importance of effective communication in code development (43:18)
The challenges and progress of WebAssembly (47:40)
Requirements of different teams and different jobs (52:17)
Final thoughts and takeaway (53:14)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You'll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at rudderstack.com. We are here with Matt Butcher from Fermyon. Matt, welcome to the Data Stack Show. We're thrilled to have you as a guest.
Starting point is 00:00:33 Yeah, thanks for having me. I'm looking forward to this. Oh man, so much to talk about. So give us a quick background on sort of where you came from and what you're doing at Fermyon. Yep. So if we were to rewind to my high school career time, I would have told you that when I grew up, I wanted to be a philosopher. So when I started college, that's what I was setting out to do. But I had sort of gotten a job on the side doing some computer stuff. And philosophy degrees are expensive, especially when you're going to do
Starting point is 00:01:05 a bachelor's, then a master's, then a PhD. And so I ended up kind of paying my way through by writing software and doing stuff like that. And at some point, I realized that software was a lot more fun than philosophy and kind of switched career tracks. Of course, after I'd incurred lots of debt, I really went from there. And I first got interested in content management systems and did a lot of work in the Drupal ecosystem. By that point, I'd learned like Java and PHP and languages like that. Then I got working at HP Cloud originally to do their documentation. And as soon as I got kind of a taste of cloud technologies and what was going to be popular or possible and what, you know, I kind of had one of those moments where I saw a glimpse of the future and I was like, I want to be part of that. And that really shifted
Starting point is 00:01:48 my career. And I've gone on from there into, you know, through Microsoft, through Google and on into starting up Fermyon a couple of weeks ago, a couple of years ago. Yeah. It's been a fast couple of years. A fast couple of years. Yeah. Very cool. And give us just a quick overview of what Fermyon is. Yeah, we set out to build what we saw as the next wave of cloud computing. And we thought that the foundation of that was going to be a technology developed for the browser, but that we thought was better applied on the cloud or on the server side. And that was WebAssembly.
Starting point is 00:02:23 So we've been kind of doing the thing that we do best. And that's building an open source tool and toolkit that developers can use to get started. And then building a hosted cloud platform and a server-side, Kubernetes-style application platform where people can run these things in their cloud. That's amazing, Matt. I'd love to get more into this
Starting point is 00:02:45 because like WebAssembly has been around for a while now. We've heard many things about it, like many different use cases.
Starting point is 00:02:54 It has been used in some like cases also like as part of like products and stuff like that. But we still get like this
Starting point is 00:03:04 feeling that we're still not there with, like, WebAssembly. Like, there's a lot of promise and we're still, like, looking to see how it gets delivered, right? So one of the things that I definitely want to go through during our, uh, conversation is about that. And I'm sure that you're going to help us understand what's going on with the ecosystem today. But what about you? What are a couple of things that you're looking forward to talking about during our recording? Yeah, that one. I mean, you just hit one of my favorite topics, which is, I think WebAssembly, you know, it has shown promise in a lot of different areas.
Starting point is 00:03:46 But until, I don't know, maybe a couple of weeks ago, some of the most exciting pieces of WebAssembly were not yet accessible to the general developer. It was all very R&D and a little rough around the edges. And now with the component model landing and being supported, suddenly we've got a whole bunch of new and interesting things that we can build with WebAssembly. And to me, the future that opens up out of WebAssembly and the component model is just so exciting. There are so many interesting things we'll be able to do
Starting point is 00:04:12 from true polyglot programming to being able to overlay security models and things like that in ways we've never been able to do before. So I'm looking forward to talking about this. I think it's going to be a lot of fun. Yeah, 100%. Let's go and do it. Matt, welcome to the Data Stack Show. Yeah, thanks for having me.
Starting point is 00:04:32 We love covering new subjects that we haven't covered before. And I guess, I mean, gosh, are we over 160 episodes now? I don't think we've talked about some really key topics like WebAssembly. And so there's a lot to cover, but we're going to start at the very beginning as we always do. So give us your background. How did you get into the world of data and engineering? And then give us an overview of what you're doing today at Fermyon. Yeah, sure. You know, when I was young, I wanted to be a philosopher. And part of the reason behind that was that I was very much interested in systems, like the world is this very elaborate system that seems to be governed by scientific laws that we're still
Starting point is 00:05:16 just discovering. And, you know, it's a world that's simultaneously mysterious and yet predictable enough that we can survive pretty well for about 80 some years on average. And that was, you know, even when I was in high school, I was really enamored with that. And so when I went off to college, I went off intending to study philosophy. But along the way, I happened to get a job doing some IT stuff and then software development and then early web development. And I used that as a way to pay my way through school. And as I got going, you know, I advanced my philosophy career up until I got a PhD. I wrote the dissertation, I taught for a little while, but all the while I was still doing
Starting point is 00:05:59 software development on the side, you know, with, you know, Java and Perl and stuff like that, then moving on into Python and JavaScript as Node.js got popular and on into newer languages. And at some point I had one of those mornings where I woke up and went, one of these two careers, you know, is really lighting me up every morning. And the other one is moving really slowly. And it's time to make a choice. And I kind of said, okay, I'm going to reserve philosophy for the weekend passion projects and I'm going to go all in on software development. So I really got going in content management systems at that point.
Starting point is 00:06:35 Drupal was kind of like the new hotness at the time. I really liked it. I knew PHP and so could do a lot of development there. And I spent years just building websites with Drupal working at various places. And at one point I was offered a job at Hewlett Packard. HP was just getting into the cloud space and cloud was just at that point starting to get really popular, right? I'm going to tell it like it is, right? Here's HP, you know, one of the tried and true, you know, original Silicon Valley powerhouse technology companies watching a bookstore take over the cloud world and going, wait, Amazon can't win this battle. We're HP. And so they started a group called HP Cloud, and I joined that group to do the CMS systems that were going to, that shared documentation and all the marketing pages and all of that.
Starting point is 00:07:30 And I was building that in Drupal. And how's that, help orient just a little bit, sorry to interrupt. So can you just give us a timeline here? When is HP realizing this? Just to sort of orient us, because we live in the days of Prime, or maybe post-Prime, when one day means three days. But, you know. This must have been like, what, 2011-ish, I want to say. Somewhere around there. Maybe 2010, 2011.
Starting point is 00:07:55 Time, man, really starts to blur together. But yeah, that would have been the time frame. Pre-cloud warehouse. Yeah. Okay. Yeah. And OpenStack had just sort of come on the scene. Right.
Starting point is 00:08:08 So up until that point, it was sort of like Amazon had built their thing, which was entirely proprietary. Microsoft had built their thing, which is entirely proprietary. And then out of, like, Rackspace and NASA, you know, an unlikely alliance, comes OpenStack,
Starting point is 00:08:24 which promises first compute. And then, you know, object storage and other forms of storage come after that, networking. And it was a fun, fun time to be involved in that ecosystem because it's like every morning you'd wake up and brand new features had dropped. There were so many developers working on it. It was all open source. It was happening very quickly. We were maturing very rapidly. And it was just, it was a really fun time to be in the cloud ecosystem. We were all just kind of starting to understand exactly how big this thing was going to be.
Starting point is 00:08:53 You know, it's all pre-containers. So Docker hadn't yet come around. It was very heady times, right? And at HP, I mean, this kind of vision that, you know, this was a new area and we could just build something that would be unrivaled, right? And catch up and then pass everybody. And we had this firm vision for where we were going. It was very exciting. So when I was doing the website development for HP, I asked, you know, can I switch teams? Can I start working more on the compute side of things, the core OpenStack side
Starting point is 00:09:26 of things, and gradually sort of finagled my way over from documentation and running this big Drupal site and writing a lot of PHP to then writing some JavaScript to do JavaScript, Node.js bindings into this kind of thing, and then worked my way over into the platform as a service and ran the platform as a service team. And it was just, it was kind of a fun, like, you know, those kind of, you know, those sprinklers that are, you know, and you hear the clicks as they switch. That's how I felt like my career was doing. I was just clicking through a bunch of different roles until I got to the one I wanted, which was leading the platform as a service team there. And it was so much fun. But along the way, it sort of set in. There were some internal hiccups. The VP that I worked for, whom I absolutely loved, had departed HP. We kind of lost our vision. And it was starting to look unclear where we were going, how we were going to get there. And I hit this point where I was just sort of depressed.
Starting point is 00:10:26 And I guess I must've been moping around the house a lot because my wife was like, maybe you should look at a different job. Wise man to listen to her. I'm assuming. Yeah, I did. Yeah. Well, yeah, she went and actually she said, maybe you should look at a different job. I've been job hunting for you. Here's my list of several. What a woman.
Starting point is 00:10:51 I know. Right. She was amazing. And not only that, but she picked the job that as soon as I saw it, I'm like, oh, I want to do this. She had found an IoT, consumer IoT startup in Boulder. We lived in Chicago at the time, found one in Boulder that was looking for a head of cloud, somebody to really help them take this thing from an early POC into a product, which was exactly the kind of work that I thought would be a really rejuvenating experience after sort of feeling ground down and worn out. And it was in Boulder, which was closer to family and also closer to the mountains. And so I flew out here, interviewed, took the job, moved the family out here,
Starting point is 00:11:27 and started working on this IoT backend, a very awesome cloud system, met some amazing engineers. We worked really hard on this kind of virtual machine-driven platform that was a backend for IoT. A lot of fun. And we were having so much fun that we attracted the attention of Google who acquired us. So I went and spent some time inside of Google, worked inside of the Nest team there. That was a really eye-opening experience because Google's infrastructure is just so much bigger than anything I'd seen before, even compared to what we had at HP.
Starting point is 00:12:02 And they were using this kind of the early containers. And I had been dabbling with containers on the side. And when I saw the way they were doing Borg, I thought, oh, this is just mind-boggling and awesome at the same time. So that was- Sort of like the, you know, Docker is like now a thing, you know,
Starting point is 00:12:23 where's the core? But then you sort of got to go into the heart of the beast and see how Google's doing. Right. Yeah. Yeah. Cause I think, so Google had been using LXC containers, which are sort of like one of the early analogs to Docker. Docker had just kind of come on the scene. They were building some interesting but not quite production-ready containers at that point. But Google was on the opposite side. They had this big giant container ecosystem that the user wasn't really exposed to directly. You couldn't upload a container there. You would write App Engine software and it would be deployed
Starting point is 00:13:02 in containers. The Wizard of Oz. Yeah, exactly. Yeah, exactly. No peeks behind this curtain. But the awesome thing, and the thing that you did get to see if you looked behind the curtain, was they had this orchestrator called Borg that knew how to take all of these containers and shuffle them around and put them on the most reasonable compute platform. And so Borg will come into this
Starting point is 00:13:25 story a little bit later, but that was my first peek at Borg there. I made it at Google for a while and then I got this hankering to go back to startup life. And in particular, I wanted to do the container thing because now that I understood it, I was really excited about it. And I wanted to do more of, like, the PaaS, platform as a service, Heroku-style thing. Again, you know, run the infrastructure behind something like that, like what I was doing at HP. And so I found another startup in Boulder called Deis. And Deis was building an open source Heroku competitor based on containers. And they were looking for somebody to do sort of the architectural work behind this.
Starting point is 00:14:01 And I'm like, this is the perfect job for me. So I joined Deis and about, I don't know, maybe six or so months into working at Deis, Google did something that really surprised me. They dropped an open source equivalent or version of Borg called Kubernetes. And it was like 1.0, 1.1. It was held together by toothpicks and marshmallows, but it was like, I saw this and I'm like, oh yes. This is like, it's open source now. We can build all kinds of things on top of it.
Starting point is 00:14:33 So the CTO and I convinced the rest of Deis, I was an architect there. So the CTO and I convinced the rest of Deis that we should replatform our PaaS on top of Kubernetes. And that, you know, it was another like little sprinkler kind of thing in my career, because what I didn't realize was Kubernetes was on the cusp of really exploding. And we were starting to build key pieces of Kubernetes. So we built Helm, the package manager for Kubernetes. We were building a whole bunch
Starting point is 00:14:58 of other projects for Kubernetes. And once more... Building, like, Helm and other stuff, you were doing that inside of Deis? At Deis, yeah. We built Helm as part of what we thought was going to be, you know, the long-term Deis offering. Realize now, holy cow. Okay. Yeah, yeah, yeah. I mean, okay, so Helm came out of a hackathon project. So we did this all-hands meeting. I'll tell you this story really quickly because it's kind of funny. Sorry, let's stop. We're diverting here. Sorry, bro. That's right.
Starting point is 00:15:27 Yeah. Nothing is linear with me. So, you know, Gabe and I, the CTO and I, had basically said, okay, we think the right move is to switch over to Kubernetes and sort of replatform on Kubernetes. And Gabe said, you know, we're doing this all-hands meeting. I really want you to, you know, come up with some things we can do to get people going on Kubernetes. And so we decided we'd do a hackathon.
Starting point is 00:15:48 We decided we'd do a session on, you know, what Kubernetes is. And we lined this all up. And so for the hackathon, the idea was we'd kind of challenge people, hey, build something fun and cool that's sort of in this new cloud ecosystem. And the winner, the winning team will get a $75 Amazon gift card. And the average team was three people in size. Jack, Rimas, and I were the three who worked on this. And so we sat down and did some brainstorming. I was telling him, you know, we're trying to figure out how to install our new Deis PaaS on top of Kubernetes. And we
Starting point is 00:16:22 ended up talking about NPM and package management. And we decided we'd build a package manager for Kubernetes. So it's called K8s Place, K-8-S Place. And it was coffee shop themed. And so it was all, you know, we had this whole, like, K8s Place is this nice coffee shop where you go and you get little shots of Kubernetes installs and stuff. And we just, we skipped the team dinner. We worked all night. We,
Starting point is 00:16:45 you know, worked the next day, built this little demo of K8s Place, the package manager for Kubernetes. And we demoed it the next day and we won the $75 gift card for Amazon. I blew my 25 on coffee. So the offsite ended, we all went back to our homes. And the next day I got a call from the CEO and CTO of Deis. And they're going, so you know that package manager thing? They're like, yeah, that's a really good idea. I think package manager for Kubernetes, that's an idea that's got some momentum behind it. We should do that. I think you should start building that as your full-time job.
Starting point is 00:17:24 And we'll give you a team. You can pick a couple of people to be on the team and get started building that. I mean, this is like, this is what we all dream of when we do these hackathon projects. It's like, hey, if I could invent my own day job, I'd do this. And here I was basically getting, you know, carte blanche to do my little idea. And it was fantastic, but they said, there's just one thing. And I said, yeah, what's that? They said, we really hate the coffee shop theme. So, I don't know, like, of all the things to be, you know, devoted to, the name was not one of them. I'd rather build the software I want to build. Yeah. So Jack Francis and I sat down with another one of the other people on the hackathon team, sat down with a nautical dictionary and started reading it out loud to each other, trying to come up with a name.
Starting point is 00:18:06 And that's where we came up with Helm. That's where we came up with calling the packages charts. It was all just sitting there reading this little dictionary. What a story. That's right. So the next time you get an opportunity to do a hackathon, do it. Yeah, totally. Okay, well, take us.
Starting point is 00:18:23 I mean, that was an amazing detour, but so take us from that point. And then how did you get to Fermyon and tell us what Fermyon is? For sure. Yeah. So, you know, Helm and the other things we were doing in Kubernetes land attracted the attention of Microsoft, who was trying to, you know, Brendan Burns, who created Kubernetes, left Google and went to Microsoft and started building a team. And part of that effort was them acquiring Deis and rolling us into the Azure part of Microsoft. I had a fantastic job there.
Starting point is 00:18:57 My job there was I got an open source team and my mandate was, you know, find what's missing in the container and virtual machine ecosystem and build it and open source it and contribute it up to the CNCF, the Cloud Native Computing Foundation, the governing group for Kubernetes and the like. And it was fun and we had a lot of fun, but one of the coolest things about a job like that is that you're always out there asking questions of people, you know, customers, other teams inside of Microsoft and so on. What are your big problems, right? What can you not do?
Starting point is 00:19:28 Where are you struggling? What are the roadblocks that are preventing you from migrating workloads to Kubernetes or questions like that? And then you get these challenges back and you just try and build solutions to them. And some of them, it's fairly straightforward and you build solutions like OAM or like Brigade that just kind of answer people's questions. But some of the problems were really vexing.
Starting point is 00:19:51 And really, we could not figure out good solutions. One of them was we really wanted to be able to scale workloads to zero. So when you're dealing with huge amounts of compute, during peak time, you might be consuming like nine different virtual machines. And during low times, you might be consuming none, right? You might have no traffic in the middle of the night. So you should really be able to scale from zero on your workload up to being able to handle tens of thousands and as close to instantly as possible.
Starting point is 00:20:22 But scaling is bound to the problem that when requests come in, you either have to be able to start up really fast, or you somehow have to anticipate ahead of time when the requests are going to come in and scale up before the traffic starts to go up. If we were all good at predicting the future, you know, stock market would be no fun and neither would gambling. So we took the approach that we needed to come up with a faster way to do startups. Another problem that we ran into around the same time was a lot of developers were telling us that building Docker containers was cool, except they had to know ahead of time what the operating
Starting point is 00:21:03 system and architecture of the target environment was going to be. And then oftentimes they had to do really ugly cross-compilation steps if it was different than theirs. So if I'm writing code on Windows running an Intel machine, running on an Intel architecture, and I'm deploying to Linux on an ARM architecture, my deployment life is going to be kind of hard. And so we were looking for what Java promised at the beginning, a compile-once-run-anywhere style of thing for cloud workloads. So those are a couple of the examples of things that we were working on that we just couldn't figure out. And so at one point, we started saying,
Starting point is 00:21:39 well, we can't do this with virtual machines. And we also can't do this with containers. And we've been trying this for months, if not years. Maybe we should open ourselves up to the possibility that there's a third kind of cloud compute that nobody has started using yet. What would the characteristics be? Well, it would have to have a cold start time that was like 10 milliseconds or under so that we could rapidly scale up when load came in and we could rapidly scale down without worry when load left. We want it to be cross-platform and cross-architecture. Of course, it has to have a really good security sandbox model because that's essentially what a
Starting point is 00:22:17 cloud runtime has to guarantee for you, that you as the operator can run untrusted code from anybody else who's willing to pay the subscription fees. And you can do it without risk to yourself or risk that they can attack other tenants in your environment. And so we approached the problem this way and began looking at potential technologies that could solve it. And that's kind of what led us up to, first of all, discovering WebAssembly, which was originally a browser technology. And then second of all, going, wait, we've got an idea here. And we have pretty much a team of amazing experts in this field. Maybe we should do the startup thing.
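
A rough sketch of why the cold-start number matters for scale to zero, written in Python. This is a toy single-instance model, not how Fermyon or any real platform schedules work. The names `simulate` and `Outcome`, the 5 ms handler time, the arrival times, and the 2,000 ms comparison figure are all assumptions made up for illustration; only the roughly 10 ms cold-start target comes from the conversation.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    cold_starts: int          # how many times an instance had to be started
    worst_latency_ms: float   # slowest request, measured from arrival to finish


def simulate(arrivals_ms, cold_start_ms, handler_ms, idle_timeout_ms=0.0):
    """Toy model (assumption): one instance that shuts down once it has been
    idle for idle_timeout_ms, and must cold-start again for the next request.
    Requests that arrive while the instance is starting wait for it; queuing
    between concurrent requests is ignored."""
    cold_starts = 0
    worst = 0.0
    ready_at = None        # time at which the current instance became ready
    idle_deadline = None   # time after which the instance is gone
    for t in sorted(arrivals_ms):
        if idle_deadline is None or t > idle_deadline:
            # Nothing is running: pay the cold start.
            cold_starts += 1
            ready_at = t + cold_start_ms
        start = max(t, ready_at)
        finish = start + handler_ms
        worst = max(worst, finish - t)
        idle_deadline = finish + idle_timeout_ms
    return Outcome(cold_starts, worst)


# Hypothetical traffic: two bursts separated by a long quiet period.
arrivals = [0, 50, 5000, 5020]

fast = simulate(arrivals, cold_start_ms=10, handler_ms=5)
slow = simulate(arrivals, cold_start_ms=2000, handler_ms=5)
print(fast)
print(slow)
```

In this model, with a 10 ms cold start every request can start its own instance and still finish in 15 ms, so shutting down immediately costs almost nothing. With the hypothetical 2,000 ms cold start, the first request of each burst waits about two seconds, which is the situation that pushes operators toward keeping idle capacity running or trying to predict traffic.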
Starting point is 00:22:55 And so a couple of years ago, we started Fermyon Technologies with the idea that we could build this next wave of cloud computing using WebAssembly as the platform. I love it. Okay. Can you, let's start with a couple of definitions. I know Kostas is chomping at the bit with a bunch of questions, but let's just do a couple of definitions before I hand the mic off. WebAssembly, what is it?
Starting point is 00:23:20 Break it down for us. We actually, I don't think we've talked about this on the show before. So this is like a first sort of definition, which is exciting. So yeah, that is exciting. A lot of pressure too. Yeah. Yeah. No pressure. This is just a conversation. What is WebAssembly and why is it important? Yeah. We'll give the most boring definition of it. And then out of that kind of unpack why it's actually pretty exciting. The most boring definition of it is that WebAssembly is a binary format that you can compile different languages to. So, you know, if you're compiling natively on Linux, you're compiling to the ELF format, right? And you've got separate
Starting point is 00:24:03 compilation targets for every, well, probably every operating system out there, but at least the big three or four, I suppose others probably borrow. So we're going, okay, so that compilation process is part of what introduces the cross-platform, cross-architecture problem that we had seen. But if you find a binary format
Starting point is 00:24:23 that could run on any architecture and any operating system, and it had the right security sandbox, then those were two of the really big checkboxes on the list. So WebAssembly happens to also have a couple of other virtues. I should back up and say, what was WebAssembly originally designed for? Because once we understand that, then we start to see why this story is so interesting. WebAssembly was originally designed to run in a web browser. And the original intent of WebAssembly, if you go back to 2015 when Luke Wagner and a group of people at Mozilla started it, the stated goal was we want to build a platform-neutral binary format that can run inside the web browser and that different languages can compile to so that in the browser we can run other languages
Starting point is 00:25:13 side by side with JavaScript. So you can imagine some of these use cases, right? I've got this crufty C library that's been around since before I was born. I don't want to have to rewrite this in JavaScript, but I also know that it does something important. Wouldn't it be cool if I could compile it to something that I could run in the browser and make function calls from JavaScript into this C library? Yeah. Those are the kinds of cases that were
Starting point is 00:25:38 in the original scope of WebAssembly. Yeah. Figma, in fact, if you've ever used Figma and some of the other, Adobe, I think also, they use WebAssembly in browser to be able to, they write code in C++, compile it to WebAssembly, and then use JavaScript to kind of call into it. And that's how they get such great performance on all their vector drawing is because some of that's going through C++, not through JavaScript. Fascinating. So that was sort of like a transformative experience. If you transition between the web app and the desktop app, which are, you know, obviously under the hood, like, yeah, it's really the same thing.
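
To make the "binary format that you can compile different languages to" definition a little more concrete, here is a small Python sketch that reads the outer structure of a WebAssembly binary. The 4-byte magic `\0asm`, the version field, and the numbered sections are part of the WebAssembly binary format. The names `list_sections`, `read_uleb128`, and `ADD_MODULE` are made up for this example, and the hand-assembled module (a single exported `add` function over two 32-bit integers) is an illustration, not something discussed in the episode.

```python
import struct

WASM_MAGIC = b"\x00asm"   # every WebAssembly binary starts with these 4 bytes

SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "data count",
}


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(module):
    """Return [(section_name, size_in_bytes), ...] for a WebAssembly binary."""
    if module[:4] != WASM_MAGIC:
        raise ValueError("not a WebAssembly binary")
    (version,) = struct.unpack_from("<I", module, 4)  # little-endian u32
    if version != 1:
        raise ValueError(f"unsupported version {version}")
    sections = []
    pos = 8
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_uleb128(module, pos + 1)
        sections.append((SECTION_NAMES.get(section_id, "unknown"), size))
        pos += size  # skip the section body
    return sections


# Hand-assembled example module (illustrative): exports add(i32, i32) -> i32.
ADD_MODULE = bytes([
    0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,        # magic + version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01, 0x7F,  # type: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                                # function 0 uses type 0
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,  # export "add" = function 0
    0x0A, 0x09, 0x01, 0x07, 0x00,                          # code: one body, no locals
    0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B,                    # local.get 0, local.get 1, i32.add, end
])

print(list_sections(ADD_MODULE))
```

Nothing in those bytes names an operating system or a CPU architecture, which is the property the conversation keeps coming back to: the same file can be handed to a browser or a server-side runtime on any host.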
Starting point is 00:26:17 It is pretty wild. Like, it's pretty anyone who's used design software, which I'm not a designer, but Brooks knows that I will get into some design files, much to my design team's chagrin. But that's actually the thing that I noticed the most that is absolutely unbelievable, is that it is a seamless experience, and it's so fast. Like, it's dealing with some pretty large files.
Starting point is 00:26:45 Yeah, and some pretty complex on-the-fly calculations too, because you can drag, resize things very quickly and not have any kind of lag like we used to see in sort of the olden days of the web. Sure. But when you think about how then something like those Figma libraries would have to run in a browser, particularly if you're thinking sort of generically about this and not in
Starting point is 00:27:07 the case of one particular application, there are about four features that you would really want. The first one is a sandbox. You'd want a very strict security sandbox because again, you know, the browser is running binary code that it has not inspected inside of an environment. So not only do you kind of have to be able to protect the system from getting rooted by gnarly binaries that you downloaded, but you also have to protect the JavaScript sandbox because that's an attack vector.
Starting point is 00:27:34 So the sandbox that you have to design for WebAssembly ends up having to be very good and very reliable. So, and which of course, one of the check boxes for the cloud, we want that same level of reliability. Another one is we are notoriously impatient when it comes to waiting for web pages to load on the internet, right? We want them to be snappy. Some of the research suggests that at a hundred milliseconds, one piece of research I've read said in within 10 milliseconds, people's attention actually starts to dwindle, which is remarkable because that's way before we're aware of our attention starting to drift. But that's how impatient we as human beings are. So the WebAssembly sandbox had to be very fast.
Starting point is 00:28:16 Maybe that's more reflection on society than the technology, but we'll save that because I want you and Kostas to discuss some philosophy. Kostas is a philosopher and that's your training. There you go. There you go. We'll save that for later. So yeah, we'll spare the societal. Yeah. Yeah.
Starting point is 00:28:35 So we got two more on WebAssembly. It has to be cross-platform and cross-architecture too, because we want to be able to run. You can't have it where Figma works on one operating system. And then I open my MacBook M1 and it's like, sorry, this processor is not supported. That'd be a horrible experience. So the binary format also had to be cross-platform. And then the last one was really the most audacious of all of them. And that was that the format was designed so that any language could in theory be compiled to it. And that's pretty wild because essentially
Starting point is 00:29:05 what the precondition for success of WebAssembly was is that they would be able to rally enough language communities that we would actually get WebAssembly support in languages from C and C++ to Rust and Zig and Go to Java and .NET and Python and Ruby and all of that, right? And it's remarkable. We bought into that. We bought into that early in Fermyon. But that was also identified as our first major risk: if that didn't really take off, then we would be in trouble.
Starting point is 00:29:36 And I think Costas was talking a little when we chatted beforehand about how WebAssembly has sort of seemed to have fits and starts as it's gotten going. And one of those has been, you know, early buzz was not fulfilled when there weren't enough languages, when you could really only write in C and Rust, it wasn't terribly compelling. And in the last year, in the calendar year 2023, we have gone through language after language adding support. .NET has piloted support for all of the .NET languages. Python and Ruby have added support. Dart and Kotlin are coming along.
Starting point is 00:30:13 You know, and languages like Rust and C++ and Zig and all of those continue to mature. Swift is moving along. It's like, whoa, the most ambitious part of all of WebAssembly is actually happening this year. And that's been really exciting. So you can kind of see there were four little attributes there that were designed for the browser. All four of those ended up being really important in satisfying the conditions we were looking
Starting point is 00:30:36 for in a cloud runtime. And in particular, we did kind of skip over this. The workloads that I was most interested in when we were looking for this third wave of cloud compute were what we would call serverless or FaaS or functions as a service. The kind where we wanted to do a discrete step, start it up, run it to completion, and shut it down as fast as possible. So the most simple way we can think about this is, hey, a user makes an HTTP request. We answer the request, send back a response and shut back down right away. And then we're not running any long running processes. So that whole scale to zero thing
Starting point is 00:31:10 just sort of automatically falls out, right? When load is coming in, we might have 10,000 WebAssembly functions firing off, answering all these requests. When, you know, 2am rolls around and everybody's asleep, we can scale, you know, there's nothing running and essentially we're not paying a compute bill. So that was kind of one of the workloads that we had really targeted as being perfect for this third kind of cloud computing, which WebAssembly then turned out to be a pretty good example of. Yeah, that's great. But quick question here, because, okay, I think one of the, what makes WebAssembly a little bit confusing to people out there who haven't been active in WebAssembly itself is there are so many different use cases,
Starting point is 00:31:54 right? Like from someone who listens about all the stuff, we have the security, we have the serverless model, we have the polyglot part of it. And we have web also. But let me ask you the following question. One of the ways that you position it as part of computing in general is next to containers. Fixing some of the problems that containers traditionally had, right?
Starting point is 00:32:28 And we have a couple of different primitives here. We have containers. Before that, we had virtual machines. Now we have WebAssembly. And we also have micro VMs, right? So we have systems like, for example, Firecracker, which gives you the opportunity to solve some of the problems of the cold start problem, like fast systems and all that
Starting point is 00:32:57 stuff. What are the differences between all of them? And how do they fit, let's say, in the infrastructure world? Do they, let's say, compete or complement each other? That is, I think that right there at the end is a fascinating part of this whole thing, right? We're building this big cloud world. And every time we introduce a new technology, it kind of competes and it kind of complements. And I think if we look at virtual machines, right?
Starting point is 00:33:26 So what does a virtual machine do? What is it for? A virtual machine runs an entire operating system from the kernel and the drivers all the way up through the libraries and the utilities and on into your user land code. And you package all that stuff up and you ship it off to somebody else's hardware and you execute it there in the cloud. And so you're really thinking like soup to nuts, the entire operating system. Now that's great for a number of workloads. Some of them are where you're running large-scale databases, things where being able to
Starting point is 00:34:03 tune up the kernel parameters or the driver parameters is really important. You can use these things and be highly effective. But I, as a developer, and I think many developers out there, regardless of what, you know, domain you're working in are going to go, yeah, but they're no fun to build. They're actually really hard for a developer to build because it requires a tremendous amount of operational knowledge to assemble them. And then they're very hard to maintain. So really, as a primitive, they've worked very well for platform engineering and DevOps and teams like that who are focused on the operation of a system.
Starting point is 00:34:36 But they weren't as popular for developers. And that's where containers came in. So a container does not have a kernel or low-level drivers, right? A container is just sort of like a little pie-sliced version of an operating system. It has just the part of the file system your application needs, just the supporting files it needs, just the system libraries it needs, and your binary. And it's great for long-running processes that perhaps don't sort of need that low-level access to the kernel and don't really optimize at a low level.
Starting point is 00:35:08 So you can think, you know, web servers, microservices, those kinds of things work great in containers. And developers, we like them because they are a lot easier to build. You write a Dockerfile that just plonks your binary file inside of one of these images and it packages it up and then you can ship, you know, instead of a six gig or 20 gig virtual machine image, usually you're talking about maybe a hundred meg of slices of operating systems that you're pushing and moving around. And those are really good for long running server processes.
Starting point is 00:35:41 It was the next class of computing, that serverless one that I was talking about, where really you don't want anything long running. You want a process that gets started up when a request comes in, handles the request, returns a response, and then shuts back down. The typical container takes about a dozen seconds to start. The typical virtual machine takes a couple of minutes to start. So you can't really effectively start up, handle a request, and shut down when that's the characteristic of your underlying runtime. So the way this was solved in sort of like serverless V1 worlds, right, with early Lambda and all of that, with Lambda today, Azure functions, Google Cloud functions, things like that, is you essentially pre-warm virtual machines and keep a huge queue of virtual
Starting point is 00:36:24 machines around. And then as requests come in, you drop a workload on a pre-warmed virtual machine, execute it and tear the whole thing down. So it's inefficient and it's actually fairly expensive to operate. And that was, you know, seeing how this worked behind the hood in Azure was one of the reasons why we identified this
Starting point is 00:36:44 as an interesting problem to solve. Because anytime we can reduce the amount of energy consumed and drive down prices and free up computing resources to do other things, you know, from the perspective of someone like Azure or Google or AWS, this translates directly to not just cost savings, but actually being able to do more with the compute power they have available. So essentially, you can sell more faster if you can do this kind of thing. For us as consumers, right, it's really about the fact that we're only paying for traffic
Starting point is 00:37:14 when the workload is actually happening, right? When there's traffic coming in, then we're watching our function start up, run to completion, and shut down. When there's not traffic coming in, we're not paying anything. And so it's compelling really on both sides of that story. Micro VMs are another attempt to solve a similar problem here, playing on this idea that maybe you can strip down a virtual machine to the point where it starts up in just several hundred milliseconds. A lot of that is very promising. And for some kinds of workloads, I'm pretty excited about that. And we use it a little bit here and there. But if you compare, so a typical AWS Lambda function
Starting point is 00:37:50 takes about 200 to 500 milliseconds to cold start. And then that's the amount of time it takes from when the request comes in to when your code starts to execute. It's all warming, right? That's fast compared to several seconds for a container, but it's slow if you're talking about a user request, right? Google starts to ding you on your page rank if you exceed 100 milliseconds before delivering your first byte. If it takes 200 to 500 milliseconds just to cold start before you're even doing your processing, you can't build the kind of high-performing system that you want for user-facing web applications. So when we looked at WebAssembly,
Starting point is 00:38:31 one of the key things there was, can we get it to start up really fast? And right now, you know, originally we were at 10 milliseconds. Then when we released Spin 1.0, we were at one millisecond. When we released Spin 2.0 last week, we were down at half a millisecond or less to cold start.
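The arithmetic behind these numbers can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration that assumes the figures quoted in the conversation (a 100 millisecond first-byte threshold and the cold-start times mentioned for Lambda and Spin); the constant and function names are made up for the example and do not come from any tool.

```python
# Back-of-the-envelope budget check using the figures quoted in the conversation.
# FIRST_BYTE_BUDGET_MS and the dictionary labels are illustrative names only.
FIRST_BYTE_BUDGET_MS = 100.0

COLD_START_MS = {
    "AWS Lambda (low end)": 200.0,
    "AWS Lambda (high end)": 500.0,
    "Fermyon, originally": 10.0,
    "Spin 1.0": 1.0,
    "Spin 2.0": 0.5,
}


def remaining_budget_ms(cold_start_ms, budget_ms=FIRST_BYTE_BUDGET_MS):
    """Time left for the handler after the cold start; negative means over budget."""
    return budget_ms - cold_start_ms


for name, cold_start in COLD_START_MS.items():
    left = remaining_budget_ms(cold_start)
    status = "within budget" if left > 0 else "over budget"
    print(f"{name}: {left:+.1f} ms left ({status})")
```

With a half-millisecond cold start, nearly the whole budget is left for the application's own work, while a 200 millisecond cold start has already spent the budget twice over before any user code runs.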
Starting point is 00:38:46 That is, the time it takes from when the request comes in to your code being executed was under half a millisecond. And that gives you, the developer, about 100 milliseconds to try and get those first bytes back to Google and score very high for responsiveness. If you're doing anything like streaming
Starting point is 00:39:04 or things like that, where it really matters. This is a big deal. This is a very big deal. That's amazing. Okay. I want you now to put on your philosopher hat and actually give an answer as a philosopher to engineers. And the question is: how much abstraction over the hardware is too much abstraction? Because we've talked about virtualization at so many different levels, right? And I wonder at what point... maybe there's no point, right? Maybe abstraction is eternally
Starting point is 00:39:46 ad infinitum, something that we should be doing, right? But I want the answer from not the angle of the engineer here, because as engineers we thrive in abstraction, right? That's how... We are lazy. We want to abstract, so
Starting point is 00:40:02 we can build one thing and apply it to many things, so we don't have to do it again and again. But from a philosopher's point of view, right, how would you tell your engineer side to stop abstracting? Yeah. So abstraction comes with a cognitive cost, and that's the most important thing for us all to remember, right? And so if you look at, so the discipline in philosophy that most deals with trying to understand the structure of the world is called metaphysics, right? And if you rewind history all the way back to the very earliest philosophers, you know, Plato and Aristotle did, both of them worked very much in this field of metaphysics. What kinds of things is the world composed of? In fact, Aristotle coined the term
Starting point is 00:40:50 metaphysics because he said it meant what must come before physics. What do we need to understand about the world before we can understand how the pieces of the world are interacting? And he said, you know, what we need to understand is what the actual structure of the world is. What kind of stuff is the world composed of and how complex are the sets of rules? And what is computer science, if not applied metaphysics, right? Here we have this ability to build systems that are based on the way we think about the architecture of things. What is a shopping cart? And what is an online store? What are the components I need in this? And then we start building the rule systems around them and how they work together. So in a way, your question is perfect because the history of
Starting point is 00:41:34 philosophy can inform exactly what we're trying to do in computer science. And what you see from Plato onward is metaphysics going through these cycles of getting increasingly complicated and then getting to the point where they're out of touch with reality. That is, they're so impenetrable that it's hard to even test whether you're describing reality anymore or not. And then after that, you know, you start to see them retract again and you get movements like empiricism or stoicism or even skepticism, the idea that all metaphysics is doomed. We might as well just live life as it is and doubt that we actually know anything. All of these movements are kind of reactions against the fact that metaphysics can lead
Starting point is 00:42:17 to systems that are so complicated and so hard to even test whether they are actually describing the world that they become essentially either useless or vacuous, right? Either there's nothing we can do with them that's productive, or they're so difficult to explain that by the time we're in that sort of like enlightened cogitation about them, we're not really talking about anything people care about. I think that particular play in philosophy that we've seen now over thousands and thousands of years should inform the way that we build systems and software. Because to your point, what is an abstraction for? It is to, well, a programming language, right? The nuts and bolts of what we are doing as software developers is attempting to build a language or languages
Starting point is 00:43:02 that help me describe to you what I'm thinking and help both of us describe to a computer, a pure deductive logical system, how to execute things in a step-by-step way. So we've got kind of dueling objectives here. On one side, it's how do we make sure that we are explaining it at the level of terseness that the computer needs to be able to execute it? And that's what compilers are for. But it's also, you know, part of the reason why some of our languages have peculiar concepts like the borrow checker in Rust or type systems in languages like Java. But the other thing is you and I have to be able to communicate effectively on our code, right? If you and I are working together on a code base, if I write code that you don't understand, I'm making a mistake. And likewise, if the two of us get together and start building these grand edifices that use all kinds of specialized terminology and we build lots and lots of layers of abstraction, and then Eric comes in and looks at this and is like, I don't even know where to start, right? This is so complicated. I have no idea.
Starting point is 00:44:09 Then we've failed as software engineers, right? So that's the framing for the answer. The next question you really have to ask is, well, the thing we can do is we can solve this problem by introducing abstractions and specializations. And that's what I think has happened, right? We have data stacks that are designed for data processing. We have web stacks that are designed for web developers. We have IoT ones for IoT developers. And we've managed to do a reasonable job of carving up our day jobs such that we can have some divergences in there in terminology. We can use terms like node and every one of us thinks a different thing when we hear that because we applied it in different ways.
Starting point is 00:44:48 And we can have some success there. And we can actually look at science and see that science has been relatively successful where it started out as a unified discipline and has since broken out into sub-disciplines like physics and biology and stuff, and then broken out into further sub-disciplines like astrophysics and things like that. And there's been some success in doing things that way. But each time we do that, we introduce a new level of complexity, which we have to acknowledge when we do it. When I introduce this new level of complexity, I'm essentially saying either there's going to be a new specialization that comes out of this, or I'm going to end up, you know, making this too complicated for a person. So I don't know if there's a strict answer to your question, but at least there's kind of a framework for thinking about it. No, and I think it's a great framework. I'd like to ask my last question, which goes back to the WebAssembly context again.
Starting point is 00:45:50 So when I tried, and it's been a while, I have to admit, to play around. Okay. I experienced a lot of, I love the word that you said, the cognitive cost that I had to pay to go and figure out what I can do with this thing. The pitch was great. All these things of like, oh, now maybe I can take Python code, for example, and run it as part of my Rust code or vice versa. That's great. That's great.
Starting point is 00:46:23 I'd love to be able to do that, especially for me as a person who comes from the data infrastructure. And I've seen how big of a moat for old systems has been the fact that a lot of code has been written in legacy systems, right? And we cannot just move it easily to a new one, right? So just moving UDFs from Hive to Spark would be amazing. Yeah, it would be like... I don't think people realize how many millions of dollars would be saved by doing that, right? Yeah, yeah.
Starting point is 00:47:02 But when I started playing around, I got lost. And then I gave up. And the reason I'm telling my personal story is because I think I'm not the only one out there. And I would like to ask you why this happened with WebAssembly. Why did we have this process where the promise was really big, people were really eager about getting into that, but it feels almost like we're still waiting to see the outcomes of that, to see them applied, right? What was missing? And if it's not missing anymore, what happened? I think the answer to what was missing is typical of many systems, and maybe WebAssembly got a little more hyped than we thought it would, a little bit faster than we thought it would. But the early tooling for WebAssembly was actually fairly difficult to use.
Starting point is 00:47:57 And you might follow a set of instructions. It was like, download this library, put it in this place, download this tool. We'll tell you later what this tool does. Trust us, download it, install it. You'll need it. You know, that kind of thing where you're like, okay, step 15, install the WebAssembly compiler. Now I can start writing my first piece of code. You know, that's an experience that was non-ideal. And that was the way things were when I started working. When we started Fermyon, that was the way things were. And one of the first things that Fermyon did, we said, okay, our first user story has got to be: as a developer, I can go from blinking cursor to
Starting point is 00:48:29 deployed application in two minutes or less. And it was exactly because of the problem you described that we came up with that user story first. The first thing we have to prove is that this is easy and that the developer doesn't actually have to understand the WebAssembly bytecode format or what a runtime does or which tools are used to assemble a thing this way. They just needed to be able to write code in kind of their usual way. And WebAssembly is still there. Some of the standards are still in flight. So for example, networking is not fully baked yet. So there are some things we know are still going to be a little hiccupy for users as they get going. But for the most part, by taking that perspective, you know, we spent most
Starting point is 00:49:06 of 2022 and part of 2023 going, we just need to get a developer experience where you can do, you know, just a couple of commands. So for us, it's like spin, spin is our open source tool for developing WebAssembly serverless applications. And so you can do spin new, tell it what language and give it a name and it'll scaffold out a project for you. So spin new Rust, you know, foo. And then spin build will compile it for you. So you don't have to know all the compiling commands for each and every language. And then spin up will allow you to test it locally and spin deploy will allow you to push it somewhere else and run it.
Starting point is 00:49:37 And we thought if we can build an experience that's that simple, then developers can trust us that we're not going to just overly burden them with a whole bunch of new things they have to learn. So I think we've made good progress on that in 2023. The component model is one of the things we are most excited about. You alluded to it there, and it's new in Spin 2, which came out only, well, first week of November is when Spin 2.0 came out. And the component model is the first step against the trend you described. We waste huge amounts of time in this discipline re-implementing the same thing in lots of different places and lots of different languages. And the component model allows two WebAssembly binaries to talk to each other, or more specifically, it allows a binary to say, these are the
Starting point is 00:50:22 functions I export and these are the functions I need to import. And then you can start negotiating how you put these together, right? So essentially, WebAssembly binaries can work like libraries. So I can say, hey, I need to import this thing that provides the YAML parser, I'm going to use it. I don't care what language it was written in. So suddenly, we start saying, well, it doesn't matter if my library is in Python or Rust or JavaScript, I can still use it from my Dart program or something like that. And that's the world that we want to get to because then we can start reusing code instead of having to rewrite code. And then instead of having nine different YAML parsers, every one with different divergences from the spec, every one with different bugs, we can concentrate on writing one really
Starting point is 00:51:05 good one in a language that's well-suited for it, like say Rust. And then when it comes to AI libraries, we can use all the stuff in the Python ecosystem, even if I'm writing code in JavaScript or TypeScript. And that I think is a step away from complexity. And we just now, literally within the last few weeks, have gotten past that milestone. So I think from here forward, my hope is that as you start looking at this tooling, as it evolves over the next several months, this stuff is going to get easier and easier. We're not quite at easy yet. It's easy to build your first WebAssembly application. Components are still a little bit hard to assemble.
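The export/import idea described here can be illustrated with a toy Python sketch. This is not the actual WebAssembly component model, which defines typed interfaces rather than bare names; it only assumes that each component declares a set of exported and imported function names, and every class, function, and component name below is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A toy stand-in for a WebAssembly component: a name plus its declared interface."""
    name: str
    source_language: str  # never consulted when linking, which is the point
    exports: set = field(default_factory=set)
    imports: set = field(default_factory=set)


def unresolved_imports(components):
    """Return {component name: imports that no other component exports}."""
    missing = {}
    for comp in components:
        provided = set()
        for other in components:
            if other is not comp:
                provided |= other.exports
        gap = comp.imports - provided
        if gap:
            missing[comp.name] = gap
    return missing


# Hypothetical components: a Rust YAML parser used from a Dart application.
yaml_parser = Component("yaml-parser", "Rust", exports={"parse-yaml"})
app = Component("app", "Dart", exports={"handle-request"}, imports={"parse-yaml"})
report = Component("report", "Python", imports={"render-pdf"})

print(unresolved_imports([yaml_parser, app]))          # every import is satisfied
print(unresolved_imports([yaml_parser, app, report]))  # report still needs render-pdf
```

The `source_language` field is stored but never read by `unresolved_imports`, which mirrors the claim in the conversation: whether the pieces fit together depends only on what each one exports and imports, not on the language it was written in.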
Starting point is 00:51:42 So the next thing will be, how do we make it easy to build applications out of components. And then at that point, I think we can start telling a very compelling story that we can build a less wasteful, more fun way of kind of building applications based on, you know, WebAssembly component binaries, instead of lots and lots of different languages, and lots of different libraries. Yeah, fascinating. I mean, I think that this is a really good story around how consolidation needs to happen at a lower level in the stack
Starting point is 00:52:13 because that requirement of different teams and different jobs, to your point, is that, well, something may need to be written in Rust, right? But something else may need to be written in JavaScript, right? In terms of the runtime, that really needs to be the layer where sort of everything comes
Starting point is 00:52:38 together, which is fascinating. Yeah. We're at the buzzer here, Matt. I do have a personal question for you, which I've been, you know, waiting to answer. We're waiting to ask because I want to hear your answer. So in high school, you wanted to be a philosophy professor, which is fascinating to me, for sure, because you were interested in how the world operated. My question is, why did you choose philosophy instead of sort of what we would call the harder sciences, right? Because software developer, I probably would have put my money on you going with more of
Starting point is 00:53:18 a mathematics degree or biology or chemistry, because those are concrete ways to describe how the world works, but you know, with philosophy. Yeah. And every philosopher would have been offended by your question because the philosopher would say, but where do you think science came from? It came from philosophy. Right. And that, I guess, was part of it to me was like, there was that, there's this sort of like the rudimentary part, right? I wanted to see how far back I could push it. And I didn't know, I didn't understand a lot of this in high school and in ways I got lucky that my naivety about things led me into a discipline that really did help me think through this. But, you know,
Starting point is 00:54:00 that we were talking about the difference between physics and metaphysics, and that was sort of the thing for me, right? Like, I don't want to know how a mechanism in the world works. I want to know how the way that you see that in Plato and the dialogues of Socrates, wisdom kind of comes across as that ability to ask questions and admit that I don't know the answers and be open to kind of hearing the answers, contrasted with knowledge, which is when you do know the answers and it's about applying the answers. There was something about that definition of being wise as being, you know, as a description of being a continual seeker, right? Someone who's continually asking questions and collecting little tidbits and trying to evolve their own view. That was very enticing to me as a young person. And it's still,
Starting point is 00:55:00 even today, that's the kind of thing that gets me excited about philosophy as a discipline. Love it. Matt, it's been so great to have you on the show. We learned so much. Thank you for introducing us to a new topic that we haven't covered. Thanks. And thank you for a couple of philosophical questions from us all the while. Yeah, that was a lot of fun. Thanks for doing that. I had a fantastic time. We hope you enjoyed this episode of the Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week. We'd also love your feedback.
Starting point is 00:55:33 You can email me, Eric Dodds, at eric at datastackshow.com. That's E-R-I-C at datastackshow.com. The show is brought to you by Rudderstack, the CDP for developers. Learn how to build a CDP on your data warehouse at rudderstack.com.
