The Data Stack Show - 172: How WebAssembly is Enabling the Third Wave of Cloud Compute with Matt Butcher of Fermyon Technologies
Episode Date: January 10, 2024
Highlights from this week's conversation include: Matt's background and journey with Fermyon (2:32); WebAssembly and enhanced security models (3:43); the IoT startup and Google acquisition (10:49); Google's early containers (11:50); scaling and anticipating requests (20:22); introduction to WebAssembly and its importance (23:32); the benefits of WebAssembly (30:57); comparison of virtual machines, containers, and micro VMs (33:12); the importance of fast startup times in WebAssembly (37:39); metaphysics and software development (42:12); the importance of effective communication in code development (43:18); the challenges and progress of WebAssembly (47:40); requirements of different teams and different jobs (52:17); and final thoughts and takeaways (53:14).
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Welcome to the Data Stack Show.
Each week we explore the world of data by talking to the people shaping its future.
You'll learn about new data technology and trends and how data teams and processes are run at top companies.
The Data Stack Show is brought to you by Rudderstack, the CDP for developers.
You can learn more at rudderstack.com.
We are here with Matt Butcher from Fermyon.
Matt, welcome to the Data Stack Show.
We're thrilled to have you as a guest.
Yeah, thanks for having me.
I'm looking forward to this.
Oh man, so much to talk about.
So give us a quick background on sort of where you came from and what you're doing at Fermyon.
Yep. So if we were to rewind to my high school career time, I would have told you that when I
grew up, I wanted to be a philosopher. So when I started college, that's what I was setting out to
do. But I had sort of gotten a job on the side doing some computer stuff. And philosophy degrees
are expensive, especially when you're going to do
a bachelor's, then a master's, then a PhD. And so I ended up kind of paying my way through by
writing software and doing stuff like that. And at some point, I realized that software was a lot
more fun than philosophy and kind of switched career tracks. Of course, after I'd incurred lots
of debt, I really went from there. And I first got interested in content management systems and
did a lot of work in the Drupal ecosystem. By that point, I'd learned like Java and PHP and
languages like that. Then I got working at HP Cloud originally to do their documentation.
And as soon as I got kind of a taste of cloud technologies and what was going to be popular
or possible and what, you know, I kind of had one of those moments where I saw a glimpse of the future and I was like, I want to be part of that. And that really shifted
my career. And I've gone on from there into, you know, through Microsoft, through Google and on
into starting up Fermyon a couple of years ago.
Yeah. It's been a fast couple of years. A fast couple of years.
Yeah. Very cool. And give us just a quick overview of what Fermyon is.
Yeah, we set out to build what we saw as the next wave of cloud computing.
And we thought that the foundation of that was going to be a technology developed for
the browser, but that we thought was better applied on the cloud or on the server side.
And that was WebAssembly.
So we've been kind of doing the thing that we do best.
And that's building an open source tool and toolkit
that developers can use to get started.
And then building a hosted cloud platform
and a server side Kubernetes style application platform
where people can run these things in their cloud.
That's amazing, Matt.
I'd love to get more into this because, like, WebAssembly has been around for a while now. We've heard many things about it, like many different use cases. It has been used in some cases, also as part of products and stuff like that.
But we still get this feeling that we're not quite there with WebAssembly. There's a lot of promise, and we're still looking to see how it gets delivered, right? So one of the things that I definitely want to go through during our conversation is about that, and I'm sure that you're going to help us understand what's going on with the ecosystem today. But what about you? What are a couple of things that you're looking forward to talking about during our recording?
Yeah, that one, I mean, you just hit one of my favorite topics, which is, I think WebAssembly, you know, has shown promise in a lot of different areas.
But until, I don't know, maybe a couple of weeks ago, some of the most exciting pieces
of WebAssembly were not yet accessible to the general developer.
It was all very R&D and a little rough around the edges.
And now with the component model landing and being supported, suddenly we've got a whole
bunch of new and interesting things that we can build with WebAssembly.
And to me, the future that opens up out of WebAssembly
and the component model is just so exciting.
There are so many interesting things we'll be able to do
from true polyglot programming
to being able to overlay security models
and things like that in ways we've never been able to do before.
So I'm looking forward to talking about this.
I think it's going to be a lot of fun.
Yeah, 100%. Let's go and do it.
Matt, welcome to the Data Stack Show.
Yeah, thanks for having me.
We love covering new subjects that we haven't covered before.
And I guess, I mean, gosh, are we over 160 episodes now?
I don't think we've talked about some really key topics like WebAssembly. And so
there's a lot to cover, but we're going to start at the very beginning as we always do. So give us
your background. How did you get into the world of data and engineering? And then give us an
overview of what you're doing today at Fermyon. Yeah, sure. You know, when I was young, I wanted to be a philosopher.
And part of the reason behind that was that I was very much interested in systems like the world is
this very elaborate system that seems to be governed by scientific laws that we're still
just discovering. And, you know, it's a world that's simultaneously mysterious and yet predictable
enough that we can survive pretty well
for about 80 some years on average. And that was, you know, even when I was in high school,
I was really enamored with that. And so when I went off to college, I went off intending to study
philosophy. But along the way, I happened to get a job doing some IT stuff and then software
development and then early web development.
And that I use that as a way to pay my way through, through school.
And as I got going, you know, I advanced my philosophy career up until I got a PhD. I wrote the dissertation, I taught for a little while, but all the while I was still doing software development on the side, you know, with Java and Perl and stuff like
that, then moving on into Python
and JavaScript as Node.js got popular and on into newer languages. And at some point I had one of
those, one of those mornings where I woke up and went, one of these two careers, you know, is really
lighting me up every morning. And the other one is moving really slowly. And I, it's time to make
a choice. And I kind of said, okay, I'm going to reserve philosophy for the weekend passion projects
and I'm going to go all in on software development.
So I really got going in content management systems at that point.
Drupal was kind of like the new hotness at the time.
I really liked it.
Knew PHP and so could do a lot of development there.
And I spent years just building websites
with Drupal working at various places. And at one point I was offered a job at Hewlett Packard. HP
was just getting into the cloud space and cloud was just at that point starting to get
really popular, right? I'm going to tell it like it is: here's HP, you know, one of the tried and true, original Silicon Valley powerhouse technology companies, watching a bookstore take over the cloud world and going, wait, Amazon can't win this battle. We're HP. And so they started a group called HP Cloud, and I joined that group to do the CMS systems that shared documentation and all the marketing pages and all of that. And I was building that in Drupal.
Hey, sorry to interrupt, but can you help orient us just a little bit? Can you give us a timeline here? When is HP realizing this? Just to sort of orient us, because
we live in the days of prime or maybe post-prime when
one day means three days.
But, you know.
This must have been like, what, 2011-ish, I want to say.
Somewhere around there.
Maybe 2010, 2011.
Time, man, really starts to blur together.
But yeah, that would have been the time frame.
Pre-cloud warehouse.
Yeah.
Okay.
Yeah.
And OpenStack had just sort of come on the scene.
Right.
So up until that point,
it was sort of like Amazon had built their thing,
which was entirely proprietary.
Microsoft had built their thing,
which is entirely proprietary.
And then out of, like, Rackspace and NASA, you know, an unlikely alliance, comes OpenStack, which promises first compute, and then, you know, object storage and other forms of storage come after that, and networking.
And it was a fun, fun time to be involved in that ecosystem because it's like every morning you'd wake up and brand new features had dropped.
There were so many developers working on it.
It was all open source.
It was happening very quickly.
We were maturing very rapidly.
And it was just, it was a really fun time to be in the cloud ecosystem.
We were all just kind of starting to understand exactly how big this thing was going to be.
You know, it's all pre-containers.
So Docker hadn't yet come around.
It was very heady times, right?
And at HP, I mean, we had this kind of vision that, you know, this was a new area and we could just build something that would be unrivaled, right? And catch up and then pass everybody.
And we had this firm vision for where we were going. It was very exciting. So when I, when I
was doing the website development for HP, I asked, you know, can I switch teams? Can I start working
more on the compute side of things, the core open stack side
of things, and gradually sort of finagled my way over from documentation and running
this big Drupal site and writing a lot of PHP to then writing some JavaScript to do
Node.js bindings into this kind of thing, and then worked my way over
into the platform as a service and ran the platform as a service team.
And it was just, it was kind of a fun, like, you know, those kind of, you know, those sprinklers
that are, you know, and you hear the clicks as they switch. That's how I felt like my career
was doing. I was just clicking through a bunch of different roles until I got to the one I wanted,
which was leading the platform as a service team there. And it was so much fun. But along the way, it sort of set in. There were some internal hiccups. The VP that I worked for, whom I absolutely loved, had departed HP. We kind of lost our vision. And it was starting to look unclear where we were going, how we were going to get there. And I hit this point where I was just sort of depressed.
And I guess I'm,
guess I must've been moping around the house a lot because my wife was like,
maybe you should look at a different job.
Wise man to listen to her. I'm assuming.
Yeah, I did. Yeah. Well, yeah, I, she went and actually she said,
maybe you should look at a different job.
I've been job hunting for you. Here's my list of several.
What a woman.
I know. Right. She was amazing. And not only that, but she picked the job that as soon as I saw it, I'm like, oh, I want to do this.
She had found a consumer IoT startup in Boulder.
We lived in Chicago at the time, found one in Boulder that was looking for
a head of cloud, somebody to really help them take this thing from an early POC into a product,
which was exactly the kind of work that I thought would be a really rejuvenating experience after
sort of feeling ground down and worn out. And it was in Boulder, which was closer to family and
also closer to the mountains. And so I flew out here, interviewed, took the job,
moved the family out here,
and started working on this IoT backend,
a very awesome cloud system, met some amazing engineers.
We worked really hard on this kind of virtual machine-driven platform
that was a backend for IoT.
A lot of fun. And we were having so much fun that we attracted
the attention of Google who acquired us. So I went and spent some time inside of Google, worked
inside of the Nest team there. That was a really eye-opening experience because Google's infrastructure
is just so much bigger than anything I'd seen before, even compared to what we had at HP.
And they were using this kind of the early containers.
And I had been dabbling with containers on the side.
And when I saw the way they were doing Borg,
I thought, oh, this is just mind-boggling
and awesome at the same time.
So that was-
Sort of like the, you know,
Docker is like now a thing, you know,
where's the core?
But then you sort of got to go into
the heart of the beast and see how Google's doing it. Right. Yeah. Because, I think, so Google had been using LXC containers, which are sort of like one of the early analogs to Docker. Docker had just kind of come on the scene; they were building some interesting but not quite production-ready containers at that point. But Google was on the opposite side. They had this big, giant container ecosystem that the user wasn't really exposed to directly. You couldn't upload a container there; you would write App Engine software and it would be deployed in containers.
The Wizard of Oz.
Yeah, exactly.
Yeah, exactly.
No, no peeks behind this curtain.
But the awesome thing and the thing that you did get to see if you look behind the curtain was they had this orchestrator called Borg that knew how to take all of these containers
and shuffle them around and put them on the most reasonable compute platform.
And so Borg will come into this
story a little bit later, but that was my first peek at Borg there. I made it at Google for a
while and then I got this hankering to go back to startup life. And in particular, I wanted to do
the container thing because now that I understood it, I was really excited about it. And I wanted
to do more of like the PaaS platform as a service Heroku style thing.
Again, you know, run the infrastructure behind something like that, like what I was doing at HP.
And so I found another startup in Boulder called Deis.
And Deis was building an open source Heroku competitor based on containers.
And they were looking for somebody to do sort of the architectural work behind this.
And I'm like, this is the perfect job for me. So I joined Deis and about, I don't know, maybe six or so months into working at Deis,
Google did something that really surprised me.
They dropped an open source equivalent or version of Borg called Kubernetes.
And it was like 1.0, 1.1.
It was held together by toothpicks and marshmallows,
but it was like, I saw this and I'm like, oh yes.
This is like, it's open source now.
We can build all kinds of things on top of it.
So the CTO and I convinced the rest of Deis (I was an architect there) that we should replatform our PaaS on top of Kubernetes.
And that, you know, it was another
like little sprinkler kind of thing in my career, because what I didn't realize was Kubernetes was
on the cusp of really exploding. And we were starting to build key pieces of Kubernetes.
So we built Helm, the package manager for Kubernetes. We were building a whole bunch
of other projects for Kubernetes. And once more...
Building like Helm and other stuff, you were doing that inside of Deis?
At Deis, yeah. We built Helm as part of what we thought was going to be, you know, the long-term Deis offering.
I realize now, holy cow. Okay.
Yeah, yeah, yeah. I mean, okay, so Helm came out of a hackathon project. So we did this all-hands meeting. I'll tell you this story really quickly because it's kind of funny.
Sorry, let's stop. We're diverting here.
Sorry, bro.
That's right.
Yeah.
Nothing is linear with me.
So, you know, Gabe and I, the CTO and I, had basically said, okay, we
think the right move is to switch over to Kubernetes and sort of replatform on Kubernetes.
And Gabe said, you know, we're doing this all hands meeting.
I really want you to, you know, come up with some things we can do to get people going
on Kubernetes.
And so we decided we'd do a hackathon.
We decided we'd do a session on, you know, what Kubernetes is.
And we lined this all up.
And so that the hackathon, the idea was we'd kind of challenge people, hey, build something
fun and cool that's sort of in this new cloud ecosystem.
And the winner, the winning team will get a $75
Amazon gift card. And the average team was three people in size. Rimas, Jack, and I were the
three who worked on this. And so we sat down and did some brainstorming. I was telling him, you
know, we're trying to figure out how to install our new Deus paths on top of Kubernetes. And we
ended up talking about NPM and package management.
And we decided we'd build a package manager for Kubernetes.
So it's called Kate's Place, K-8-S Place.
And it was coffee shop themed.
And so it was all, you know, we had this whole like Kate's Place is this nice coffee shop
where you go and you get little shots of Kubernetes installs and stuff.
And we just, we skipped the team dinner.
We did, we worked all night. We,
you know, worked the next day, built this little demo of Kate's place, the package manager for
Kubernetes. And we demoed it the next day and we won the $75 gift card for Amazon. I blew my $25 on coffee. So the offsite ended, we all went back to our homes. And the next day I got a call from the CEO and CTO of Deis.
And they're going, so you know that package manager thing?
They're like, yeah, that's a really good idea.
I think package manager for Kubernetes, that's an idea that's got some momentum behind it.
We should do that.
I think you should start building that as your full-time job.
And we'll give you a team. You can pick a couple of people to be on the team and get started
building that. I mean, this is like, this is what we all dream of when we do these hackathon
projects. It's like, Hey, if I could invent my own day job, I'd do this. And here I was basically
getting, you know, carte blanche to do my little idea. And it was fantastic, but they said, this
is just one thing. And I said, yeah, what's that? They said, we really hate the coffee shop theme. And I don't know, of all the things to be, you know, devoted to, the name was not one of them. I'd rather build the software I want to build. Yeah.
So Jack, yeah, so Jack Francis and I sat down with another one of the people on the hackathon team, sat down with a nautical dictionary and started reading it out loud to each other, trying to come up with a name.
And that's where we came up with Helm.
That's where we came up with calling the packages charts.
It was all just sitting there reading this little dictionary.
What a story.
That's right.
So the next time you get an opportunity to do a hackathon, do it.
Yeah, totally.
Okay, well, take us.
I mean, that was an amazing detour,
but so take us from that point. And then how did you get to Fermyon and tell us what Fermyon is?
For sure. Yeah. So, you know, Helm and the other things we were doing in Kubernetes land attracted
the attention of Microsoft, who was trying to, you know... Brendan Burns, who created Kubernetes, left Google and went to Microsoft and started building a team. And part of that effort was them acquiring Deis and rolling us into the Azure part of Microsoft.
I had a fantastic job there.
My job there was I got an open source team and my mandate was, you know, find what's
missing in the container and virtual machine ecosystem and build it, open source it, and contribute it up to the CNCF, the Cloud Native Computing Foundation, the governing group for Kubernetes and
the like, and it was fun and we had a lot of fun, but one of the coolest things
about a job like that is that you're always out there asking questions of
people, you know, customers, other teams inside of Microsoft and so on.
What are your big problems, right?
What can you not do?
Where are you struggling?
What are the roadblocks that are preventing you
from migrating workloads to Kubernetes or questions like that?
And then you get these challenges back
and you just try and build solutions to them.
And some of them, it's fairly straightforward
and you build solutions like OAM or like Brigade that just kind of answer people's questions.
But some of the problems were really vexing.
And really, we could not figure out good solutions.
One of them was we really wanted to be able to scale workloads to zero.
So when you're dealing with huge amounts of compute, during peak time, you might be consuming like
nine different virtual machines.
And during low times, you might be consuming none, right?
You might have no traffic in the middle of the night.
So you should really be able to scale from zero on your workload up to being able to
handle tens of thousands and as close to instantly as possible.
But scaling is bound to the problem that
when requests come in, you either have to be able to start up really fast, or you somehow have to
anticipate ahead of time when the requests are going to come in and scale up before the
traffic starts to go up. If we were all good at predicting the future, you know, stock market
would be no fun and neither would gambling.
So we took the approach that we needed to come up with a faster way to do startups.
Another problem that we ran into around the same time was a lot of developers were telling us
that building Docker containers was cool, except they had to know ahead of time what the operating
system and architecture of the target environment was going to be.
And then oftentimes they had to do really ugly cross-compilation steps if it was different than theirs.
So if I'm writing code on Windows running an Intel machine, running on an Intel architecture, and I'm deploying to Linux on an ARM architecture, my deployment life is going to be kind of hard.
And so we were looking for what Java promised at the beginning,
a compile-once-run-anywhere style of thing for cloud workloads.
So those are a couple of the examples of things that we were working on
that we just couldn't figure out.
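To make the cross-compilation problem concrete, here is a minimal sketch, not taken from the conversation, of what the build matrix looks like in Rust; the target triples in the comments are standard Rust targets, and the point is simply that a WebAssembly target collapses the per-OS, per-architecture matrix into a single artifact.

```rust
// A trivial program; the interesting part is the build matrix in the comments.
fn main() {
    println!("hello from wherever this ends up running");
}

// Without a portable binary format, you build one artifact per OS/architecture
// pair you deploy to, for example:
//
//   cargo build --release --target x86_64-pc-windows-msvc
//   cargo build --release --target x86_64-unknown-linux-gnu
//   cargo build --release --target aarch64-unknown-linux-gnu
//
// Targeting WebAssembly (here the WASI target used by server-side runtimes),
// one build produces a single .wasm module that a runtime can execute on any
// OS and CPU it supports:
//
//   cargo build --release --target wasm32-wasi
```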
And so at one point, we started saying,
well, we can't do this with virtual machines.
And we also can't do this with containers.
And we've been trying this for months, if not years.
Maybe we should open ourselves up to the possibility that there's a third kind of cloud compute that nobody has started using yet.
What would the characteristics be?
Well, it would have to have a cold start time that was like 10 milliseconds or under so that we could rapidly scale up when load came in and we could rapidly
scale down without worry when load left. We want it to be cross-platform and cross-architecture.
Of course, it has to have a really good security sandbox model because that's essentially what a
cloud runtime has to guarantee for you that you can run, that you as the operator can run untrusted
code from anybody else who's willing to pay the subscription
fees. And you can do it without risk to yourself or risk that they can attack other tenants
in your environment. And so we approached this problem this way and began looking at
potential technologies that can solve it. And that's kind of what led us up to,
first of all, discovering WebAssembly, which was originally a browser technology. And then second of all, going, wait, we've got an idea here.
And we have pretty much a team of amazing experts in this field.
Maybe we should do the startup thing.
And so a couple of years ago, we started Fermyon Technologies with the idea that we could build
this next wave of cloud computing using WebAssembly as the platform.
I love it.
Okay.
Can you, let's start with a couple of definitions.
I know Costas is chomping at the bit with a bunch of questions,
but let's just do a couple of definitions before I hand the mic off.
WebAssembly, what is it?
Break it down for us.
We actually, I don't think we've talked about this on the show before.
So this is like a first sort of definition, which is exciting. So yeah, that is exciting.
A lot of pressure too. Yeah. Yeah. No pressure. This is just a conversation.
What is WebAssembly and why is it important? Yeah. We'll give the most boring definition of it. And then out of that
kind of unpack why it's actually pretty exciting. The most boring definition of it is that WebAssembly is a binary format that you can compile different languages to. So, you know, if you're
compiling natively on Linux, you're compiling to the ELF format, right? And you've got separate
compilation targets
for every, well, probably every operating system out there,
but at least the big three or four,
I suppose others probably borrow.
So we're going, okay, so that compilation process
is part of what introduces the cross-platform,
cross-architecture problem that we had seen.
But if you find a binary format
that could run on any architecture and any operating
system, and it had the right security sandbox, then those were two of the really big checkboxes
on the list. So WebAssembly happens to also have a couple of other virtues. I should back up and say,
what was WebAssembly originally designed for? Because once we understand that, then we start to see why this story is so interesting.
WebAssembly was originally designed to run in a web browser.
And the original intent of WebAssembly, if you go back to 2015 when Luke Wagner and a group of people at Mozilla started it,
the stated goal was we want to build a platform-neutral binary format that can run inside the web browser
and that different languages can compile to so that in the browser we can run other languages
side by side with JavaScript. So you can imagine some of these use cases, right? I've got this crufty C library that's been around since before I was born. I don't want to have to rewrite this in JavaScript, but I also know that it does something important.
Wouldn't it be cool if I could compile it to something that I could run in the browser
and make function calls from JavaScript into this C library?
Yeah.
Those are the kinds of cases that were
in the original scope of WebAssembly.
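As a rough illustration of that original browser use case, the sketch below exports one function from Rust rather than C, to keep the examples here in one language; built as a cdylib for the wasm32-unknown-unknown target, the resulting module can be instantiated from browser JavaScript, shown only as a comment.

```rust
// Build (with crate-type = ["cdylib"] in Cargo.toml) via:
//   cargo build --release --target wasm32-unknown-unknown

/// Stand-in for the "crufty but important" library routine being wrapped.
#[no_mangle]
pub extern "C" fn checksum(a: u32, b: u32) -> u32 {
    a.wrapping_add(b).rotate_left(7)
}

// In the browser, JavaScript loads the module and calls straight into it:
//
//   const { instance } = await WebAssembly.instantiate(wasmBytes);
//   console.log(instance.exports.checksum(2, 3));
```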
Yeah.
Figma, in fact, if you've ever used Figma and some of the other,
Adobe, I think also, they use WebAssembly in browser to be able to, they write code in C++,
compile it to WebAssembly, and then use JavaScript to kind of call into it. And that's how they get
such great performance on all their vector drawing is because some of that's going through C++,
not through JavaScript. Fascinating. So that was sort of like a transformative experience.
If you transition between the web app and the desktop app, which are, you know, obviously under the hood, like, yeah, it's really the same thing.
It is pretty wild.
Like, it's pretty anyone who's used design software, which I'm not a designer, but Brooks knows that I will get into some design files,
much to my design team's chagrin.
But that's actually the thing that I noticed the most
that is absolutely unbelievable,
is that it is a seamless experience,
and it's so fast.
Like, it's dealing with some pretty large files.
Yeah, and some pretty complex on-the-fly calculations too,
because you can drag, resize things very quickly
and not have any kind of lag like we used to see
in sort of the olden days of the web.
Sure.
But when you think about how then something like those Figma libraries
would have to run in a browser,
particularly if you're thinking sort of generically about this and not in
the case of one particular application, there are about four features that you would really
want.
The first one is a sandbox.
You'd want a very strict security sandbox because again, you know, the browser is running
binary code that it has not inspected inside of an environment.
So not only do you kind of have to be able to protect the system from getting rooted
by gnarly binaries that you downloaded, but you also have to protect the JavaScript sandbox
because that's an attack vector.
So the sandbox that you have to design for WebAssembly ends up having to be very good
and very reliable.
Which, of course, is one of the checkboxes for the cloud. We want that same
level of reliability. Another one is we are notoriously impatient when it comes to waiting
for web pages to load on the internet, right? We want them to be snappy. Some of the research
suggests that at a hundred milliseconds, and one piece of research I've read said within 10
milliseconds, people's attention actually starts to dwindle, which is remarkable because that's way before we're aware of our attention starting to drift.
But that's how impatient we as human beings are. So the WebAssembly sandbox had to be very fast.
Maybe that's more reflection on society than the technology, but we'll save that because I want you
and Costas to discuss some philosophy. Costas is a philosopher and that's your training.
There you go.
There you go.
We'll save that for later.
So yeah, we'll spare the societal.
Yeah.
Yeah.
So we got two more on WebAssembly.
It has to be cross-platform and cross-architecture too, because we want to be able to run anywhere.
You can't have it where Figma works on one operating system.
And then I open my MacBook M1 and it's like, sorry, this processor is not supported.
That'd be a horrible experience. So the binary format also had to be cross-platform.
And then the last one was really the most audacious of all of them. And that was that
the format was designed so that any language could in theory be compiled to it.
And that's pretty wild because essentially
what the precondition for success of WebAssembly was is that they would be able to rally enough
language communities that we would actually get WebAssembly support in languages from
C and C++ to Rust and Zig and Go to Java and .NET and Python and Ruby and all of that, right?
And it's remarkable.
We bought into that.
We bought into that early at Fermyon.
But we were also, that was identified as our first major risk, is that if that didn't really
take off, then we would be in trouble.
And I think Costas was talking a little when we chatted beforehand about how WebAssembly
has sort of seemed to have fits and
starts as it's gotten going. And one of those has been, you know, early buzz was not fulfilled when
there weren't enough languages, when you could really only write in C and Rust, it wasn't
terribly compelling. And in the last year, in the calendar year 2023, we have gone through
language after language adding support. .NET has piloted support for all of the .NET languages.
Python and Ruby have added support.
Dart and Kotlin are coming along.
You know, and languages like Rust and C++ and Zig and all of those continue to mature.
Swift is moving along.
It's like, whoa, the most ambitious part of all of WebAssembly is
actually happening this year.
And that's been really exciting.
So you can kind of see there were four little attributes there that were designed for the
browser.
All four of those ended up being really important in satisfying the conditions we were looking
for a cloud runtime.
And in particular, we did kind of skip over this.
The workloads that I was most interested in when we were looking for this third wave of
cloud compute were what we would call serverless or FaaS or functions as a service.
The kind where we wanted to do a discrete step, start it up, run it to completion, and shut it down as fast as possible.
So the most simple way we can think about this is, hey, a user makes an HTTP request.
We answer the request, send back a response and shut back down right away.
And then we're not running any long running processes. So that whole scale to zero thing
just sort of automatically falls out, right? When load is coming in, we might have 10,000
WebAssembly functions firing off, answering all these requests. When, you know, 2am rolls around
and everybody's asleep, we can scale, you know, there's nothing running and essentially we're not
paying a compute bill. So that was kind of one of the workloads that we had really targeted
as being perfect for this third kind of cloud computing, which WebAssembly then turned out to be
a pretty good example of. Yeah, that's great. But quick question here, because,
okay, I think one of the, what makes WebAssembly a little bit confusing to people out there
who haven't been active in WebAssembly itself is there are so many different use cases,
right?
Like from someone who listens about all the stuff, we have the security, we have the serverless
model, we have the polyglot part of it.
And we have web also.
But let me ask you the following question.
One of the ways that you position it as part of computing in general is next to containers.
Fixing some of the problems
that containers traditionally had, right?
And we have a couple of different primitives here.
We have containers.
Before that, we had virtual machines.
Now we have WebAssembly.
And we also have micro VMs, right?
So we have systems like, for example, Firecracker,
which gives you the opportunity
to solve some of the problems, like the cold start problem, fast startup times and all that
stuff.
What are the differences between all of them?
And how do they fit, let's say, in the infrastructure world?
Do they, let's say, compete or complement each other?
That is, I think that right there at the end is a fascinating part of this whole thing, right?
We're building this big cloud world.
And every time we introduce a new technology, it kind of competes and it kind of complements.
And I think if we look at virtual machines, right?
So what does a virtual machine do?
What is it for?
A virtual machine runs an entire operating system from the kernel and the drivers all
the way up through the libraries and the utilities and on into your user land code.
And it packages, you package all that stuff up and you ship it off to somebody else's hardware and you execute it there in the cloud.
And so you're really thinking like soup to nuts, the entire operating system.
Now that's great for a number of workloads.
Some of them are ones where you're running large-scale databases, things where being able to
tune up the kernel parameters or the
driver parameters is really important. You can use these things and be highly effective.
But I, as a developer, and I think many developers out there, regardless of what,
you know, domain you're working in are going to go, yeah, but they're no fun to build.
They're actually really hard for a developer to build because it requires a tremendous amount
of operational knowledge to assemble them.
And then they're very hard to maintain.
So really, as a primitive, they've worked very well for platform engineering and DevOps and teams like that who are focused on the operation of a system.
But they weren't as popular for developers.
And that's where containers came in.
So a container does not have a kernel or low-level drivers, right? A container is just
sort of like a little pie-sliced version of an operating system. It has just the part of the
file system your application needs, just the supporting files it needs, just the system
libraries it needs, and your binary. And it's great for long-running processes that perhaps
don't sort of need that low-level access to the kernel and don't really optimize at a low level.
So you can think, you know, web servers, microservices, those kinds of things work
great in containers and developers.
We like them because they are a lot easier to build.
You write a Dockerfile that just plonks your binary file inside of one of these images
and it packages it up and then you can ship, you know, instead of a six gig or 20 gig virtual machine image,
usually you're talking about maybe a hundred meg of slices of operating systems that you're
pushing and moving around.
And those are really good for long running server processes.
It was the next class of computing, that serverless one that I was talking about, where really you don't want anything long running.
You want a process that gets started up when a request comes in, handles the request, returns a response, and then shuts back down.
The typical container takes about a dozen seconds to start.
The typical virtual machine takes a couple of minutes to start. So you can't really effectively start up, handle a request, and shut down when that's
the characteristic of your underlying runtime.
So the way this was solved in sort of like serverless V1 worlds, right, with early Lambda
and all of that, with Lambda today, Azure functions, Google Cloud functions, things
like that, is you essentially pre-warm virtual machines and keep a huge queue of virtual
machines around.
And then as requests come in,
you drop a workload on a pre-warmed virtual machine,
execute it and tear the whole thing down.
So it's inefficient and it's actually fairly expensive to operate.
And that was, you know,
seeing how this worked behind the hood in Azure
was one of the reasons why we identified this
as an interesting problem
to solve. Because anytime we can reduce the amount of energy consumed and drive down prices and free
up computing resources to do other things, you know, from the perspective of someone like Azure
or Google or AWS, this translates directly to not just cost savings, but actually being able to do
more with the compute power
they have available.
So essentially, you can sell more faster if you can do this kind of thing.
For us as consumers, right, it's really about the fact that we're only paying for traffic
when the workload is actually happening, right?
When there's traffic coming in, then we're watching our function start up, run to completion, and shut down.
When there's not traffic coming in, we're not paying anything.
And so it's compelling really on both sides of that story. Micro VMs are another attempt to
solve a similar problem here, playing on this idea that maybe you can strip down a virtual
machine to the point where it starts up in just several hundred milliseconds. A lot of that is
very promising. And for some kinds of workloads, I'm pretty excited about that. And we use it a little bit here and there. But if you compare, so a typical AWS Lambda function
takes about 200 to 500 milliseconds to cold start. And then that's the amount of time it takes from
when the request comes in to when your code starts to execute. It's all warming, right? That's fast compared to several seconds for a container,
but it's slow if you're talking about a user request, right? Google starts to ding you on
your page rank if you exceed 100 milliseconds before delivering your first byte. If it takes
200 to 500 milliseconds just to cold start before you're even doing your processing,
you can't build the kind of high-performing system
that you want for user-facing web applications.
So when we looked at WebAssembly,
one of the key things there was,
can we get it to start up really fast?
And right now, you know,
originally we were at 10 milliseconds.
Then when we released Spin 1.0,
we were at one millisecond.
When we released Spin 2.0 last week,
we were down at half a millisecond or less to cold start.
That is, the time it takes from the moment
when the request comes in to your code being executed
was under half a millisecond.
And that gives you, the developer, about 100 milliseconds to try and get those first bytes back to Google and score high on page ranking, very high for responsiveness.
If you're doing anything like streaming
or things like that, where it really matters. This is a big deal. This is a very big
deal. That's amazing. Okay. I want you now to put on your philosopher hat and actually give an answer as a philosopher to engineers. And the question is: how much abstraction over the hardware is too much abstraction? Because we've talked about virtualization at so many different levels, right? And I wonder at what point, or maybe there's no point, right? Maybe abstraction is something that we should be doing eternally, ad infinitum, right? But I want the answer not from the angle of the engineer here, because as engineers we thrive on abstraction, right? That's how... We are lazy. We want to abstract so we can build one thing and apply it to many things, so we don't have to do it again. But from a philosopher's point of view, how would you tell your engineer side to stop abstracting?
Yeah, so abstraction
comes with a cognitive cost and that's the most important thing for us all to remember, right?
And so if you look at, so the discipline in philosophy that most deals with trying to understand the structure of the world is called metaphysics, right?
And if you rewind history all the way back to the very earliest philosophers, you know, Plato and Aristotle did, both of them worked very much in this field of
metaphysics. What kinds of things is the world composed of? In fact, Aristotle coined the term
metaphysics because he said it meant what must come before physics. What do we need to understand
about the world before we can understand how the pieces of the world are interacting? And he said,
you know, what we need to understand is what the actual structure of the world is. What kind of stuff is the world composed of and how complex are the sets of rules?
And what is computer science, if not applied metaphysics, right?
Here we have this ability to build systems that are based on the way we think about the architecture of things.
What is a shopping cart? And what is an online
store? What are the components I need in this? And then we start building the rule systems around
them and how they work together. So in a way, your question is perfect because the history of
philosophy can inform exactly what we're trying to do in computer science. And what you see from
Plato onward is metaphysics going through these cycles of getting increasingly complicated
and then getting to the point where they're out of touch with reality. That is,
they're so impenetrable that it's hard to even test whether you're describing reality anymore
or not. And then after that, you know, you start to see them retract again and you get
movements like empiricism or stoicism or even skepticism, the idea that all metaphysics is doomed.
We might as well just live life as it is and doubt that we actually know anything.
All of these movements are kind of reactions against the fact that metaphysics can lead
to systems that are so complicated and so hard to even test whether they are actually
describing the world that they become essentially either useless or vacuous, right? Either there's nothing we can
do with them that's productive, or they're so difficult to explain that by the time we're in
that sort of like enlightened cogitation about them, we're not really talking about anything
people care about. I think that particular play in philosophy that we've seen now over thousands and thousands
of years should inform the way that we build systems and software. Because to your point,
what is an abstraction for? It is to, well, a programming language, right? The nuts and bolts
of what we are doing as software developers is attempting to build a language or languages
that help me describe to you what I'm thinking
and help both of us describe to a computer, a pure deductive logical system, how to execute
things in a step-by-step way. So we've got kind of dueling objectives here. On one side, it's
how do we make sure that we are explaining it at the level of terseness that the computer needs to be able to execute it?
And that's what compilers are for.
But it's also, you know, part of the reason why some of our languages have peculiar concepts like the borrow checker in Rust or type systems in languages like Java.
But the other thing is you and I have to be able to communicate effectively on our code, right? If you and I are working together on a code base, if I write code that you don't understand, I'm making a mistake.
And likewise, if the two of us get together and start building these grand edifices that use all kinds of specialized terminology and we build lots and lots of layers of abstraction, and then Eric comes in and looks at this and is like, I don't even know where to start, right? This is so complicated. I have no idea.
Then we've failed as software engineers, right? So that's the framing for the answer. The next
question you really have to ask is, well, the thing we can do is we can solve this problem
by introducing abstractions and specializations. And that's what I think has happened, right?
We have data stacks that are designed for data processing. We have web stacks
that are designed for web developers. We have IoT ones for IoT developers. And we've managed to do
a reasonable job of carving up our day jobs such that we can have some divergences in there in
terminology. We can use terms like node and every one of us
thinks a different thing when we hear that because we applied it in different ways.
And we can have some success there. And we can actually look at science and see that science
has been relatively successful where it started out as a unified discipline and has since broken
out into sub-disciplines like physics and biology and stuff, and then broken out into further sub-disciplines like astrophysics and things like that. And there's been some success in doing
things that way. But at each time we do that, we introduce a new level of complexity, which
we have to acknowledge when we do it. When I introduce this new level of complexity,
I'm essentially saying either there's going to be a new specialization that comes out of this, or I'm going to end up, you know, making this too complicated for a person. So I don't know if there's a strict answer to your question, but at least there's kind of a framework for thinking about it.
No, and I think it's a great framework to ask my last question, which has to do with going back to the WebAssembly context again.
So when I tried, and it's been a while, I have to admit, to play around with it, I experienced a lot of, I love the word that you used, the cognitive cost. I had to go and figure out what I can do with this thing.
The pitch was great.
All these things of like, oh, now maybe I can take Python code, for example, and run it as part of my Rust code or vice versa.
That's great.
That's great.
I'd love to be able to do that, especially for me as a
person who comes from the data infrastructure. And I've seen how big of a moat for old systems
has been the fact that a lot of code has been written in legacy systems, right? And we cannot just move it easily to a new one, right?
So just moving UDFs from Hive to Spark would be amazing.
Yeah, it would be like...
I don't think people realize how many millions of dollars
would be saved by doing that, right?
Yeah, yeah.
But when I started playing around, I got lost.
And then I gave up. And the reason I'm saying my personal story is because I think I'm not the only one out there. And
I would like to ask you why this happened with WebAssembly. Why we had this process
where the promise was really big like people were like
really eager about like getting into that but it feels almost like we're still waiting to see the
outcomes of that like to see them like applied right like what was missing and if it's not
missing anymore like what happened and i i think the answer to what was missing was it's typical of many systems and maybe WebAssembly got a little more hyped than we thought it would, a little bit faster than we thought it would.
But the early tooling for WebAssembly was actually fairly difficult to use.
And you might follow a set of instructions.
It was like, download this library, put it in this place, download this tool.
We'll tell you later what this tool does.
Trust us, download it, install it. You'll need it. You know, that kind of thing where you're like,
okay, step 15, install the WebAssembly compiler. Now I can start writing my first piece of code.
You know, that was a, that's an experience that was non-ideal. And that was the way things were
when I started working. When we started Fermyon, that was the way things were. And one of the first things that Fermyon did, we said, okay, our first user story has got to be: as a developer, I can go from blinking cursor to deployed application in two minutes or less. And it was exactly because of the problem you described that we came up with that user story first. The first thing we have to prove is that this is easy, and that the developer doesn't actually have to understand the WebAssembly bytecode format or what a runtime does or which tools are used to assemble a thing this way. They just needed to be able to write code in kind of
their usual way. And WebAssembly is still there. Some of the standards are still in flight. So for
example, networking is not fully baked yet. So there are some things we know are still going to
be a little hiccupy for users as they get going. But for the most part, by taking that perspective, you know, we spent most
of 2022 and part of 2023 going, we just need to get a developer experience where you can do,
you know, just a couple of commands. So for us, it's Spin. Spin is our open source tool for
developing WebAssembly serverless applications. And so you can do spin new, tell it what language
and give it a name and it'll scaffold out a project for you. So spin new Rust, you know, foo.
And then spin build will compile it for you.
So you don't have to know all the compiling commands for each and every language.
And then spin up will allow you to test it locally and spin deploy will allow you to
push it somewhere else and run it.
And we thought if we can build an experience that's that simple, then developers can trust
us that we're not going to just overly burden them with a whole bunch of new things they have to learn. So I think we've made good progress on that in 2023.
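To ground that workflow, the sketch below lists the four spin commands Matt names and then shows the rough shape of a request handler in Rust; the template argument and the spin_sdk item names are assumptions based on the Spin 2.x SDK and may differ from the current tooling, so treat this as an illustration rather than a copy-paste recipe.

```rust
// The developer loop described above, roughly:
//
//   spin new http-rust hello   # scaffold a project (template name is an assumption)
//   spin build                 # compile it to WebAssembly
//   spin up                    # run it locally
//   spin deploy                # push it to a hosted platform
//
// A handler then has this shape: it starts when a request arrives, runs to
// completion, and returns; there is no long-running server loop in your code.
// (SDK paths below are assumed from the Spin 2.x Rust SDK.)

use spin_sdk::http::{IntoResponse, Request, Response};
use spin_sdk::http_component;

#[http_component]
fn handle(_req: Request) -> anyhow::Result<impl IntoResponse> {
    Ok(Response::builder()
        .status(200)
        .header("content-type", "text/plain")
        .body("hello from a function that cold-started in well under a millisecond")
        .build())
}
```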
The component model is one of the things we are most excited about. You alluded to it there,
and it's new in Spin 2, which came out only, well, first week of November is when Spin 2.0 came out.
And the component model is the first step against the trend you described. We waste huge amounts of time in this discipline re-implementing the same thing in lots of different places and lots of different languages, and the component model allows two WebAssembly binaries to talk to each other. Or, more specifically, it allows a binary to say, these are the functions I export and these are the functions I need to import. And then you can start negotiating how you put these together, right? So essentially, WebAssembly binaries can work like libraries. So I can say, hey, I need to import this thing that provides the YAML parser, and I'm going to use it. I don't care what language it was written in. So suddenly we start saying, it doesn't matter if my library is in Python or
Rust or JavaScript, I can still use it from my Dart program or something like that. And that's
the world that we want to get to because then we can start reusing code instead of having to rewrite
code. And then instead of having nine different YAML parsers, every one with different divergences from the spec, every one with different bugs, we can concentrate on writing one really
good one in a language that's well-suited for it, like say Rust. And then when it comes to AI
libraries, we can use all the stuff in the Python ecosystem, even if I'm writing code in JavaScript
or TypeScript. And that I think is a step away from complexity. And we just now, literally within
the last few weeks, have gotten past that milestone.
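For readers who want the shape of that idea, here is a conceptual sketch; the WIT fragment in the comment is illustrative only, and the Rust below simply models the calling side as a plain trait to show that the consumer depends on an interface, not on the implementation language behind it.

```rust
// The component model lets a WebAssembly binary declare, in a language-neutral
// interface description (WIT), what it exports and imports; illustratively:
//
//   interface yaml-parser {
//     parse: func(input: string) -> string;
//   }
//
// Modeled in plain Rust, the consumer only sees that contract:

/// The contract a component exposes. The implementation behind it could have
/// been compiled from Python, JavaScript, Rust, or anything else.
pub trait YamlParser {
    fn parse(&self, input: &str) -> String;
}

/// Application code written against the interface, with no knowledge of the
/// language on the other side of the component boundary.
pub fn load_config(parser: &impl YamlParser, raw: &str) -> String {
    parser.parse(raw)
}
```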
So I think from here forward, my hope is that as you start looking at this tooling, as it evolves over the next several months, this stuff is going to get easier and easier.
We're not quite at easy yet.
It's easy to build your first WebAssembly application.
Components are still a little bit hard to assemble.
So the next thing will be, how do we make it easy to build applications out of components. And then at that point, I think we can start telling a very
compelling story that we can build a less wasteful, more fun way of kind of building applications
based on, you know, WebAssembly component binaries, instead of lots and lots of different languages,
and lots of different libraries.
Yeah, fascinating. I mean, I think that this is a really good story around how consolidation needs to happen at a lower level in the stack, because that requirement of different teams and different jobs, to your point, is that, well, something may need to be written in Rust, right? But something else may need to be written in JavaScript, right?
In terms of the runtime, that really needs to be the layer where sort of everything comes
together, which is fascinating.
Yeah.
We're at the buzzer here, Matt.
I do have a personal question for you, which I've been, you know, waiting to ask because I want to hear your answer.
So in high school, you wanted to be a philosophy professor, which is fascinating to me, for sure, because you were interested in how the world operated. My question is, why did you choose philosophy instead of sort of what we would call the
harder sciences, right?
Because software developer, I probably would have put my money on you going with more of
a mathematics degree or biology or chemistry, because those are concrete ways to describe
how the world works, but you know, with philosophy. Yeah. And every philosopher would have been
offended by your question because the philosopher would say, but where do you think science came
from? It came from philosophy. Right. And that, I guess, was part of it to me. There's this sort of like the rudimentary part, right? I wanted to see how far back I could push it. And I didn't
know, I didn't understand a lot of this in high school and in ways I got lucky that my naivety
about things led me into a discipline that really did help me think through this. But, you know,
that we were talking about the difference between physics and metaphysics, and that was sort of the
thing for me, right? Like, I don't want to just know how a mechanism in the world works. And the way that you see that in Plato and the dialogues of Socrates, wisdom kind of comes across as that ability to ask questions and admit that I don't
know the answers and be open to kind of hearing the answers, contrasted with knowledge, which is
when you do know the answers and it's about applying the answers. There was
something about that definition of being wise as being, you know, as a description of being a
continual seeker, right? Someone who's continually asking questions and collecting little tidbits and
trying to evolve their own view. That was very enticing to me as a young person. And it's still,
even today, that's the kind of thing that gets me excited about philosophy as a discipline.
Love it. Matt, it's been so great to have you on the show. We learned so much.
Thank you for introducing us to a new topic that we haven't covered.
Thanks.
And thank you for a couple of philosophical questions from us all the while.
Yeah, that was a lot of fun. Thanks for doing that. I had a fantastic time.
We hope you enjoyed this episode of the Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week.
We'd also love your feedback.
You can email me, Eric Dodds, at eric@datastackshow.com.
That's E-R-I-C at datastackshow.com.
The show is brought to you by Rudderstack,
the CDP for developers.
Learn how to build a CDP on your data warehouse
at rudderstack.com.