The Data Stack Show - 111: What if Your Code Just Ran in the Cloud for You? Featuring Erik Bernhardsson of Modal Labs
Episode Date: November 2, 2022

Highlights from this week's conversation include:
- Erik's background and career journey (2:51)
- Managing scale in a rapidly changing environment (6:35)
- The people side of hypergrowth (12:36)
- Coding competitions (17:50)
- Introducing Modal Labs (19:02)
- How Erik got into building Modal (21:45)
- The employee experience at Modal (28:09)
- How a data engineering team would use Modal (31:21)
- What it takes to build a platform like Modal (36:27)
- What makes Modal different (42:49)
- Evolution coming for the data world (45:52)
- Untapped areas in the data world (48:46)
- Spotify playlists (52:03)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Welcome to the Data Stack Show.
Each week we explore the world of data by talking to the people shaping its future.
You'll learn about new data technology and trends and how data teams and processes are run at top companies.
The Data Stack Show is brought to you by Rudderstack, the CDP for developers.
You can learn more at rudderstack.com.
Welcome back to the Data Stack Show. I am incredibly excited about our guest today. We're going to talk with Erik Bernhardsson,
who was one of the early engineers at Spotify, did a ton of things there, including music
recommendations, then joined Better, the mortgage company, when they were just a couple of people and
scaled that. He was there when they grew to 10,000 people. So just incredible experience.
And he's building something new and fascinating called Modal. So Kostas, this is not going to
surprise you or the listeners at all, but I want to hear about the lessons that Eric has learned going through this drastic phase
of scale two times over, which I think is pretty rare. And especially to do it at a company like
Spotify, which I think solves a problem that a lot of other companies tried to solve. We tend
to think of Spotify as dominant today, but back in, you know, I guess 2008, there were
a lot of other major players in the space, you know, and they were just a small, scrappy
startup.
So I'm just really interested to hear that story.
How about you?
I mean, obviously, like, I'd love to hear about that.
But Erik is also building, let's say, the next
generation of cloud infrastructure.
It's based on serverless, and it's a much more seamless
experience for the developer.
So I'd love to talk more about that stuff and see what the serverless
experience is, first of all, why it is important, and
what you have to do in order to build something that is serverless, right?
So yeah, that's what I'd like to focus more on.
I'm pretty sure we're going to have surprises today.
Oh yeah.
There's no question.
There is no question.
All right.
Well, let's jump in and talk with Eric.
Erik, welcome to the Data Stack Show.
So many things to cover.
So thank you for giving us some of your time.
Thanks for hosting me.
It's fun.
All right.
Well, give us, give us your background, kind of what you've done over your career, and
then tell us a little bit about what you're doing today.
Yeah, I'll try to condense it down.
I'll just do it in chronological order, it's a little bit easier.
I grew up in Sweden.
I did a lot of programming competitions when I was little.
I was always coding and like messing around with, you know, this was back in
the nineties, running code and building crappy websites, ended up meeting a bunch of people in school who were early at Spotify.
So when I was done with school (I had started studying physics, but never finished it), I
ended up joining Spotify in 2008, spent a couple of
years there in Stockholm building the music recommendation system.
Eventually ended up moving to New York.
I built up a team, did all kinds of other stuff too at Spotify, not just
music recs, but that was sort of the prime thing, but I also did a lot of
like random sort of data infrastructure, and ended up open sourcing
Luigi, which was an orchestrator, and another thing called Annoy, which
is a vector database. Built up a team of about 30, left in 2015 and took a job as
CTO at a company that was then 10 people. I was there for six years.
It was Better, the mortgage company.
The company ended up growing to 10,000 people, then went through some challenging
times, so it's a little bit smaller now.
But I oversaw the technology there.
And then I left almost two years ago and left in order to pursue a bunch of random ideas I had around data and started thinking about, okay, like, you know,
how can we build better tools?
What's missing with data infrastructure today?
What's the tool that I always wanted to have for the last 20 years?
So I started working on something and over the last few years I've
built up a small team and raised some money. And, I don't know if I
mentioned the name, but yeah, Modal is the company where I'm working right now.
Very cool.
Well, we want to dig into Modal cause it's so exciting, but first I'd love,
I mean, you have such a unique experience, right? So Spotify 2008, you know, a handful of people on the super early team.
I think, you know, what's interesting, actually, thinking back on that: when you
think about streaming, Spotify is the first thing that comes to mind.
But back then, you know, there was Rdio and, like, a couple of other major streaming services, really large, actually, at the time. Thinking back on those early days of streaming, it was super interesting.
But you have been through this scale arc more than once, right? So like, you know, Spotify in 2008 to 2015, hyper growth, the company growing to thousands of people,
better, you know, 10 people to 10,000 people,
just serious exponential growth.
And with those unique experiences,
I'd love to look at that from two angles.
I mean, both personally, I'm just so interested in this,
but I think our listeners as well.
Let's talk about Spotify first. So between the time you started and the time you left,
the technology landscape went through an unbelievable change. There were things in
2015, I think you said when you left, that didn't exist back in 2008. And so you have this massive change happening in the
technology landscape as your company's going through this period of hyper growth. How did you
manage that on the technology side? Because it seems like you would constantly be facing
decisions around infrastructure, like we can get more performance or leverage this new technology.
Like when do we change this?
Why do you change it?
When are you forced to change it?
So we'd just love to know what sticks out to you as you think back over that experience
on the technical side of how you manage through such an interesting time of scale with such
a drastically changing tech landscape?
Yeah, I mean, I think the most obvious thing is,
like, let's see, like, the cloud is different, right?
Like, you know, when I started working on,
when we built Spotify, it was all on prem, right?
Like, we'd rent, like, capacity in a data center
and, like, put our own racks there.
Like, you know, we literally had to set that up ourselves.
And I think, you know, now like almost no one does that.
So I think that's probably the biggest thing.
But also, you know, you look at like the data side,
like back then it was all like Hadoop, you know,
because people were inspired by, you know,
what Google was doing with MapReduce.
And there really wasn't anything that scaled.
And the data that Spotify generated was, you know,
relatively large for its time.
We didn't, you know, want to go, like, the Oracle route.
And so Hadoop was the way to go.
I don't know, it was kind of just the dark ages.
Like, I look back at that and, you know, I do not miss it.
It was pretty terrible in many ways.
And I think also, being, like, so new to it, there's
a certain amount of normalization of, you know, bad things that I didn't realize
until much later, like how crappy it was. And so in terms of the technology, I
wouldn't say there was, like, a particular day I just woke up and, you know, felt like
we'd landed in the future, because every
day it was, like, a little bit new, you know. So staying up to date
on all of that was a little bit tricky, but I don't know.
Like, I've been doing it so long that I feel like if you look at, like, one
GitHub repo every day, I can sort of, you know, just see what's
going on.
But yeah, the cloud was always a big thing, and I'm very grateful that the cloud exists. And so I think that's made a huge difference in terms of how I'm operating with data.
Yeah, for sure. How about, you know, so you leave Spotify in 2015.
Did that, I mean, I'm sure it did, but did that influence the way that you did things at Better? Like, were there any, you know, sort of big lessons that you took into Better based on that experience?
I think so. I mean, I spent 12, 13 years at consumer companies doing data stuff.
And I, I mean, one thing I learned is just, like, you know, how incredibly
important it is as a consumer company
to really have a firm understanding of what your users are
doing, like conversion rates and the friction points and, you know,
the onboarding flow and everything that happens in that, and, like, you know, what's your activation and
retention and churn and all those things.
And so I think just coming into my second company, thinking a lot about those
things from scratch, I think was incredibly helpful.
I mean, there were many other things, but that stands out on the data
side. The other thing, maybe like a counter-learning, is that I also learned
that you can't really do data too early. Like, at Better, we didn't
have a data team for, like, two, three years, right?
Like, you know, if you don't have any users and you don't have any data,
then it doesn't make any sense.
Right.
Right.
And so I think that was also kind of clear at Spotify early on. Like,
you know, I originally joined Spotify in 2008 to do music recommendations.
And I kind of realized pretty quickly that we had much bigger problems to solve,
and, like, we also didn't have enough data to do so.
And so I ended up really not, you know, I ended up working on a lot of other
stuff for the first few years and then got back to it. And then something
similar happened at Better. Like, we didn't really have a data team, you know, the
first few years. It was, like, very basic stuff. But then eventually, data became, like, incredibly important for Better.
And then I think that's sort of a transformation I've seen at a lot of
companies, too, you know: like, how do you start to think in a data-driven way?
Like, I think it's more just, like, a technology shift that ends up being kind
of a culture shift, too.
Yeah.
Super interesting.
One specific question there, just out of pure curiosity:
when did you, like, what was the point at which you felt like there was enough data to really
make meaningful progress on the recommendation side?
Yeah, I don't know. I mean, specifically with that, like, you know, music is,
like, the problem I was working on was so unique in a way.
Like, I don't know if there's a general lesson to learn there.
We had a very big matrix, like, the rows were, you know, users and
the columns were tracks, and we just wanted enough entries in that matrix.
Yeah.
In order to feel like, you know, we could complete the rest of the matrix.
So that probably took a couple of years.
I mean, early on we only had, like, basic stuff that worked, but it
didn't cover a large part of the catalog.
We could only make it good enough for, like, you know...
Yeah, like, the least common denominator, right?
Like, yeah.
So getting enough data that it would extend to, like, the full
tail of the catalog, I think that took many, many years.
So for a long time, you know, the recommendation system only
covered maybe the top hundred thousand tracks, and then eventually it covered a
million tracks, and so on. That took a long time.
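The user-by-track matrix Erik describes is the classic collaborative-filtering setup: factor the sparse play matrix into low-rank user and track vectors, then use their dot products to fill in the unobserved entries. A toy sketch of that idea (illustrative only, with made-up data and plain SGD, not Spotify's actual system):

```python
import random

random.seed(0)

# Toy user-by-track matrix; None marks the unobserved entries we want
# to "complete". (Made-up data, purely for illustration.)
R = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]
n_users, n_tracks, k = len(R), len(R[0]), 2

# Low-rank factors: each user and each track gets a k-dimensional vector.
U = [[random.random() for _ in range(k)] for _ in range(n_users)]
V = [[random.random() for _ in range(k)] for _ in range(n_tracks)]

def predict(u, t):
    """Predicted affinity = dot product of user and track vectors."""
    return sum(U[u][f] * V[t][f] for f in range(k))

# Stochastic gradient descent on squared error over observed entries only.
lr, reg = 0.01, 0.02
for _ in range(3000):
    for u in range(n_users):
        for t in range(n_tracks):
            if R[u][t] is None:
                continue
            err = R[u][t] - predict(u, t)
            for f in range(k):
                uf, vf = U[u][f], V[t][f]
                U[u][f] += lr * (err * vf - reg * uf)
                V[t][f] += lr * (err * uf - reg * vf)

# Once trained, the dot products "fill in" the blanks of the matrix.
print(round(predict(0, 2), 2))  # model's guess for user 0, track 2
```

The "enough entries" point Erik mentions is exactly when the observed cells constrain the factors well enough that the filled-in cells become meaningful.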
Yep.
Yeah.
Super interesting.
And then how about on the people side?
So, you know, it sounds like you were, you know,
sort of on a very small, or maybe even, like, a one-person engineering team. And the technology side is one thing, but the people
side is another one, which is, you know, arguably, like, a lot more difficult in many ways.
Yeah.
Yeah.
People, people are tricky.
I, yeah.
And, and, you know, Spotify was weird, right?
Like Spotify was just kind of almost like anarchy.
And I think there's, you know, in retrospect, I think it's kind of a unique environment
in that sense that, you know, sort of, sort of told me that like, if
you just hire a bunch of like, you know, relatively strong people and just like
throw them into a room and like, you know, tell them to go like build an amazing
service, like you can actually sort of make that work if you have the right culture.
So I think, to me, that was a very positive experience that I took, you know,
with me from Spotify: the amount of trust and the amount of, like,
self-organization that we had in the early days of Spotify. Which also has,
like, you know, negative sides. Like, no one told me who my manager was the first
three years. Like, I didn't even know, you know, whether I had one.
And, like, no one told me what to work on either.
So I was just, like, sitting around, you know, I started building stuff, and
eventually I found some stuff that was useful.
And then people got interested in it, you know?
So I started working on so much, like, random stuff.
But I think that sort of complete anarchy that existed at Spotify, which, you
know, might seem scary to people, actually was a very productive thing in the
early days.
I don't think it, like, scaled beyond, like, maybe, you know, 20, 30 engineers.
After that, you know, yeah, you have to impose a little bit more structure.
So, but I'm sort of fundamentally, like, a believer in people and, you know, trusting
people. And so I took that with me when I built up the second team, at
Better, venture scale, with 200 engineers, where, you know, sort of
aspirationally, you can get pretty close
to a culture where people just come in every morning
and just ask themselves, like, what is the number one thing
I should do today that adds the most
business value, right? Like, helping
people have that inner voice, like that sort of,
you know, guiding, you know, star of, like, you know,
today, these are the things that matter
and, like, you know, let's figure out together
how to get there.
I'm not naive. Like, I think we're always going to need, you know, managers and,
like, performance reviews and all this stuff.
Sure.
But like, you know, I think it's like, you know, as like a thought experiment, like how
close can you get to that, you know, that fully self-organized, you know, platonic ideal
of an organization.
And I think the truth is you can get pretty close to that.
You know, if you instill that culture into people.
Can I ask a specific question about Spotify in the early days? Because
you know, when you use the term anarchy to describe company culture, that's a pretty
like shocking term, I think for a lot of people, but it's so compelling that,
you know, there was self-organization.
What do you think enabled that? Because the thing that comes to mind is like, okay,
no one told you what to build, but was the mission, like, even if it was very broad and even like
somewhat difficult to translate down into like, what code should I write today? Was the mission like clearly and consistently
communicated so that people could sort of self-organize and at least prioritize? Like,
what were the unique characteristics that you think enabled that? Because a lot of times you
think about like aligning around a mission, even with a complete lack of structure, at least people
are building towards the same like direction.
No one sat down and like, you know, wrote a bunch of cultural tenets and put it up on the wall.
But I feel like it almost like kind of ended up being that way because I think Spotify just had
such a unique sort of product. We had a product that, like, people just loved, you know.
And I think that was, you know, looking back, and this was my first job, you know, out of school, so, like, I don't think I realized how
unique that was in that sense. But, like, yeah, you know how cool it is when everyone who
works at a company has this, like, you know, love for their own product. I mean, like, frankly,
I don't know any other product, like, you know, where the employees of that company
would use, you know, their own
product that much. Like, I don't know,
like, maybe, you know, you work at IKEA,
like, maybe you're sitting in IKEA chairs all day.
You're like, okay, like, you know. But, like,
I would listen to Spotify 12 hours
a day while I was working at Spotify.
So, like, you know, if you're, like,
using your own product, like, 12 hours a day,
I think there's so much care.
That is fascinating.
Yeah, that's such a visceral personal experience, but one that's actually shared among a small group of people, and that can give direction through a high level of care.
That is so fascinating.
That is such an interesting dynamic.
Yeah.
And I think that factor was like tremendously helpful.
And I'm not saying it's like, it's like, you know, necessary.
Like, I think, I think it's definitely harder if you're working on, I don't know, like,
you know, claims processing software for like, you know, corporate, you know, liability insurance, right?
Like then it's probably a little bit harder,
but I think, you know, there's still ways to sort of,
you know, create a little bit of that culture of like,
you know, people feeling like they have a power
and an autonomy and like, you know, all those things.
Yeah, absolutely.
So fascinating.
So interesting.
One, okay.
One more question from your background
that I just can't help but ask,
but coding competitions,
is there like one coding competition
that sticks out in your memory
from doing those in the nineties
or like one challenge?
I think it was actually the early 2000s.
I don't know.
I think, yeah, I don't know.
Like, there were a lot of them, but, you know, there was, like, a crazy one.
We would go to, like, Hungary every year, and there's this 24-hour programming
competition that was, like, really different,
and it was actually kind of fun.
It's like, you know, you have all these, like, weird programming competition
problems, like, you have to control, I don't know, like, a Lego
train or something like that.
That was a lot of fun.
Plus it was, like, 24 hours.
It's kind of fun to, like, manage your own energy.
Like, it's kind of... Sure.
Yeah.
It's like a, a physical challenge too.
Yeah.
Yeah.
Yeah.
At the end, you know, in the early hours of the second day, everybody's kind of out of it and just sits down.
That one used to be, like, fun every year.
Yeah, super interesting.
All right, well, tell us, give us just a brief overview of what Modal is, and then I want to
hand the mic off to Kostas, because I know he has a bunch of questions about Modal. But yeah, can you
just give us a brief overview of what it is?
Yes. It's a way to run code in the cloud, you know,
primarily focused on the data team experience.
Let me contextualize a little bit.
So, I'm sorry if this may not be quite as brief as
it could be, but, you know, I always wanted to build better tools for data
engineers, or data scientists.
As a CTO, I saw this very
clearly: I think data teams need better tools, right? They're sort of behind, I think, other segments of, you know, software engineering.
And so I looked at a lot of different parts of the stack. Like, I looked a lot at orchestration to begin with.
As I mentioned, I'm the author of an open source tool called Luigi, which no one really uses today, but, like, 10 years ago some people used it. It's sort of a precursor, kind of before Airflow, like, sort of a simpler thing.
Sure.
So I started working on that.
I was like, you know, this is really cool.
Like, you know, like an orchestrator kind of sits in the middle.
It controls a lot of the other stuff.
You know, what if you start there?
You can sort of, you know, then it becomes like Nexus for all the other stuff.
Then I realized at some point, like, cool, you can orchestrate the code, but you have to run it somewhere.
And, like, where do you run it? And I started thinking, okay, I have to build this integration layer with, like,
Kubernetes or Docker or, like, Lambda or things like these.
And it just slowly dawned on me
how hard that is.
Like, you know, fundamentally, the user experience of building an
orchestrator will never be better than the sort of user experience of
the integration with the underlying substrate where you actually run code.
So I was like, why don't I just fix that problem instead?
And so I started focusing on this idea that like,
what if it's like throw out all the other stuff, right?
Like, you know, you look at like, you know,
data teams and, like, how they run code today.
Like, there's, like, Kubernetes, Docker, Terraform,
you know, Helm, like, all this stuff, Airflow.
And then, like, what if you started from scratch
and built it in a truly cloud-native way, where, like, you don't have to think
about resources, you don't have to think about, like, you know, setting up
instances, installing infrastructure.
It just, like, runs in the cloud for you.
It scales itself up and down.
You can schedule things. Like, what would that look like?
And so that's what I started working on two years ago.
And, yeah, we have something that, you know, works reasonably well today.
We're still in a sort of closed testing stage right now.
Kostas, all yours.
Thank you, Eric.
Thank you.
So Erik, you mentioned Luigi.
So let's start from that, because you said, like, you have a passion for
building tools for data engineers, and Luigi is probably one of the first that you
built, maybe, or at least, yeah, the one that got open sourced, right?
So tell us a little bit more about Luigi and I'm not talking about that much
about like the technology itself, but like how, how did you get into building it?
Yeah.
So just kind of a recap of Luigi:
it's an orchestrator, aka workflow scheduler. You know, that
category today is primarily Airflow.
Now there's also Dagster, Prefect, and Flyte.
So, but like when I started working on Spotify, there really wasn't
anything in that space, right?
And so I ended up having, you know, more and more complex data pipelines for the music
recommendation system.
And, like, you know, in particular, a lot of Hadoop jobs.
Like, I would have these, you know, complicated chains of hundreds of Hadoop
jobs that I had to chain together and run in a particular order.
And I ended up like realizing at some point, like, actually, this is a graph,
you know, we should model it as such.
And, you know, because I'm, like, an old school person, blah, blah,
my first sort of, you know, thought was, like, Makefiles.
Like, this is kind of like Make.
And so I started looking at Makefiles, because
I liked the sort of functional nature of how Make works.
You define, like, targets and rules.
I mean, the syntax of Makefiles is incredibly
arcane and, like, you know, super annoying.
Yeah.
It was like, dollar percent is, like, the exit code of the previous, you know,
whatever. I assume it's because it was built in the
seventies, I think, something like that.
So anyway, but I liked the sort of idea of it.
So I started building Luigi on, you know, sort of that model, and then it
kind of evolved. It was actually the third iteration of
the same sort of idea,
and the first one where I felt like, this is good enough, I'm just going to throw
it out there on GitHub and see if anyone uses it. And I guess other people had similar problems.
Like, you know, after just a few months, someone at Foursquare reached out
and they're like, hey, we're running Luigi.
We're, you know, we'd love to collaborate.
And then, you know, from there, like, you know, over the next few years, like
there were many more companies that started using it more and more.
I never, like, thought Luigi was, like, amazing, which is kind of funny in a way. But, you know,
to me it was just, like, a thing that solved a particular problem, that was, like, good enough
where I felt like other people would also have the same problem. And I think they did. And so I
think there were a couple of things that I'm really happy with in Luigi, like, it sort of had this,
like, functional nature of, like, you know, how you express the dependencies. But there are other things I also don't like about it.
In particular, I think the reason why people liked Airflow more was, like,
it had a much better web interface and lots of other stuff that people actually
wanted. So, I don't know, it was kind of a fun thing.
I mean, people still use it, but it's kind of rare today.
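The make-like idea Erik describes (tasks that declare targets and rules, chained into a dependency graph, with already-built targets skipped on re-runs) can be sketched in a few lines of plain Python. This is an illustrative toy, not Luigi's actual API; the task names and the in-memory `done` store are made up for the example:

```python
# Tasks declare the "target" they produce and the tasks they require,
# make-style; the runner walks the dependency graph and only runs tasks
# whose target does not exist yet (so re-runs are idempotent).

done = {}  # target name -> produced value (stands in for files on disk)

class Task:
    requires = []   # upstream Task classes
    target = None   # name of the artifact this task produces

    def run(self, inputs):
        raise NotImplementedError

def build(task_cls):
    if task_cls.target in done:                          # target exists: skip
        return done[task_cls.target]
    inputs = [build(dep) for dep in task_cls.requires]   # recurse upstream
    done[task_cls.target] = task_cls().run(inputs)
    return done[task_cls.target]

class ExtractPlays(Task):
    target = "plays"
    def run(self, inputs):
        return [("alice", "track1"), ("bob", "track1"), ("alice", "track2")]

class CountPlays(Task):
    requires = [ExtractPlays]
    target = "counts"
    def run(self, inputs):
        (plays,) = inputs
        counts = {}
        for user, track in plays:
            counts[track] = counts.get(track, 0) + 1
        return counts

print(build(CountPlays))  # -> {'track1': 2, 'track2': 1}
```

Luigi expresses the same shape with `requires()`, `output()` targets, and `run()` methods on task classes; the functional "targets and rules" structure Erik borrowed from Make is the part this sketch keeps.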
Yeah.
Yeah.
So what's, like, the current state of the project?
Like, is there still... I mean, I don't know. Like, as I
mentioned, there are some companies still using it. But, you know, I haven't
touched it for six years. But I think Spotify still has, like, a lot of stuff in Luigi. I heard they still had a bunch of stuff in it.
Yeah.
That's got to, like, speak a little bit to, you know, it's a very
sticky product. It's very hard to get out of, you know, these, like, workflow
tools once you, like, use them, you know, then you have to rewrite all the code.
Um, but yeah, it's not being, like, actively maintained.
And I think for a good reason. I mean, I always, like, thought
about it, like, what if I'd actually kept working on Luigi, what
would it look like today?
And I think, you know, that's actually kind of what I started working on. You know, I mentioned, like, the orchestrator
I worked on before Modal. And I think there's a lot of things you could have done, like,
so much better, and that I still kind of don't see in, like, today's workflow
schedulers. But yeah.
To answer your question, it's really not something I'm interacting with right now.
Okay.
So how does an orchestrator fit in the world of today, with the serverless cloud infrastructure
that you have in mind when you're building Modal?
Yeah.
I mean, we're sort of the layer below, right?
As I mentioned, I started thinking about orchestrators and I realized, actually, like, I don't want to build an orchestrator.
I want to build the layer below, right?
Like, because I think those are actually two quite different layers.
And we have a couple of people who use Modal together with an orchestrator, like Airflow, Prefect, or Dagster.
And those orchestrators, by the way, like, I feel like, in a way, there's always someone else that gets to decide where the job actually runs. Like, typically, people run it on Kubernetes or something else. But you can certainly integrate the two.
Like, I feel like, in general, you know, thinking about where the cloud is going,
and this is a function of having seen this kind of firsthand, like, all the old days of, like, on-prem to, like, now the cloud, I think
we're still kind of early in the cloud, you know, in the evolution of the cloud.
And I think to me, the natural sort of inevitability of the cloud is
to some extent serverless, like why as an engineer should I have to think
about these, like, you know, these abstractions, like clusters and instances,
managed resources, and like, you know, all this other stuff, right?
Like, computers should just do it for me.
They're much better at those types of things.
And so I'm extremely bullish on serverless for that reason.
And I especially think in, you know, data land, that might actually be a better fit for serverless, right?
There's a lot of like aspects, you know, serverless so far, I think has been
like mostly prevalent in like, you know, front-end and back-end type architectures.
And you see a couple of vendors going all in on, like, sort of serverless, like Vercel
or, like, Netlify, whatever, like, doing that kind of stuff. But I think, like,
the sort of nature of what developers in data teams do might actually be a better fit.
You have this like very like bursty workflows.
You have like esoteric sort of heterogeneous hardware requirements.
And you have, you know, sort of like very like exploratory things that
require you to like, you know, build things quickly and like, you know,
try it out and stuff like that.
So that's sort of, you know, the idea that I always had with Modal.
Like, you know, I don't know if I'm answering your
question, I kind of, you know, ended up diverging a little bit,
but those are the things that I think about a lot these days.
Okay.
Okay. So, like, if I started, like, working with Modal today, what would my experience be? Like, how do I interact with Modal, and how is this, like, different than the cloud, yes or no?
Right. I mean, like, I think, you know, this still very much works like it does at most companies, right? Like, you write code, you run it locally as you're developing code,
you're sort of writing it locally.
You're like kind of test running it.
And then like, eventually you're like, okay, great.
I'm going to like deploy this to production.
It's time.
And then you're like, oh my God, I have to, like, containerize this. I have to, like, write all this YAML. I have to, like, you know, whatever, ask the ops team to do this thing, you know, permission stuff. And, like, you know, there ends up being just this inordinate amount of, like, chore,
like, just like, you know, it's like a super annoying process.
And so one of the things that I fundamentally believe in is, like, you know — I used to think that the only way to work with the cloud locally was to mock all the things locally. But then I actually realized, what if you actually bring in the cloud much, much earlier in that process? What if you always run things on the cloud? As you're writing code locally, what if when you run it, it actually runs in the cloud in the same environment that you're going to eventually deploy it to?
And so that's what Modal does.
And I think, you know, the only way to do that is to make it super, super fast. And fast, not in the sense of, like, running it fast, because, you know, there are, like, upper limits to how fast computers are that we can't do much about. However, what we can do is we can start containers super fast.
We can take code locally, like you're writing a script,
you know, I just wanted to run it.
And so what we can do is we take that code,
stick it in a container in the cloud,
we launch it in less than a second.
And, you know, we'll launch a hundred of those containers
in less than a second.
And then, you know, they just like print to, you know,
standard out, like whatever is happening, like you see it.
And like, you don't have this like annoying, you know, you have to build a
container and push it to ECR, then go trigger this interface, and then you'll
like download the logs, you know, whatever.
Like you don't have this like super long feedback loops, right?
Like when I think about developer productivity, to me, I think developer
productivity is like best understood in terms of these like feedback loops.
And so you have to make these feedback loops fast
in order for developers to feel happy and to feel productive.
Modal does this with the cloud: like, you go from, like, writing code locally to actually executing it in the cloud in less than a second, which is something we had to go very deep, kind of into the guts of containers and file systems and that kind of stuff, to deliver.
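The "take your local code and run it in the cloud" loop Erik describes is hard to show without Modal itself, but the shape of it can be sketched with nothing but the standard library: serialize a call, hand it to a fresh Python process (standing in for a cloud container), and read its stdout back. Everything here (the `job` function, the name-based dispatch) is illustrative, not Modal's actual API or implementation:

```python
import pickle
import subprocess
import sys
import textwrap

def job(n: int) -> int:
    """The 'local code' we want to run elsewhere."""
    return n * n

# Serialize the call (function name + argument), ship the bytes to a
# fresh Python process -- a stand-in for a cloud container -- and read
# the result back from its stdout.
payload = pickle.dumps(("job", 7))

worker_src = textwrap.dedent("""
    import pickle, sys
    name, arg = pickle.load(sys.stdin.buffer)
    # A real system would ship and import the user's code; here we
    # re-declare the same function to keep the sketch self-contained.
    def job(n): return n * n
    print({"job": job}[name](arg))
""")

out = subprocess.run([sys.executable, "-c", worker_src],
                     input=payload, capture_output=True, check=True)
print(out.stdout.decode().strip())  # the worker's stdout streams back: 49
```

The point of the sketch is the feedback loop: the "remote" result comes straight back to the terminal, with no build-push-trigger-download cycle in between.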
Okay.
We'll talk about that.
But before that, so you said that like model is I mean, no, before, before that,
your passion, like building products, like technology, like that's tools that's going
to be used by data engineers, like data, let's say professionals, how is model used for that
by a data engineering team?
Like what is like a data engineering team going to use model for today?
Yeah, yeah.
So what I think is interesting is, like, over the last 20 years or whatever, 15 years, there's been all this, like, back and forth between, like, you know — first we were seeing Hadoop, and then the warehouses came, and now, you know, SQL is, like, very dominant for a lot of workloads. I love SQL, but, you know, I think fundamentally my belief is that there's always going to be stuff you have to write code for, right? So SQL covers a very large percentage of workloads, but there's always stuff you're going to have to write code for, like, you know, doing sharding or doing simulations or whatever.
So Modal exclusively focuses on code. And in particular right now, Modal supports Python, right?
Like eventually we'll probably have support for other languages too.
But the benefit of focusing on data teams specifically is like
they'll almost always use Python.
Yeah.
So, so what are the problems that model primarily helps with?
Like right now we're targeting what I call, what I think of as like
embarrassingly parallel workflows.
So those could be things like I have a hundred million images and I need
to compute embeddings for those.
And I have this function that runs on a GPU that takes an image and produces
this vector and I just want to scale it up and run it.
That's like one use case.
Or I have, you know, a hundred million satellite images.
I want to run some computer vision.
Or there can also be things like I need to scrape a bunch of websites using a headless
browser, you know, Chromium or Playwright or whatever, and take screenshots or like,
you know, stuff like that.
It could be, you know, financial backtesting.
It can be like various types of simulations.
Monte Carlo simulations, you know, like, either, you know, pricing financial instruments or, I don't know, like, you know, sampling from posteriors, you know, things like that.
So we focus a lot on these types of sort of embarrassingly parallel use cases. That's the sort of, you know, starting point.
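As a rough local sketch of what "embarrassingly parallel" means here: the same map-a-function-over-inputs shape, run on a thread pool instead of a fleet of cloud containers. The `embed` function below is a made-up stand-in for the GPU embedding function, not anything from Modal:

```python
from concurrent.futures import ThreadPoolExecutor

def embed(image_id: int) -> list:
    """Hypothetical stand-in for a GPU function: image in, vector out."""
    return [float(image_id), float(image_id) % 7.0]

image_ids = range(1000)  # imagine 100 million of these

# Locally this fans out over a thread pool; on a serverless platform the
# same .map() shape fans out across many cloud containers instead.
with ThreadPoolExecutor(max_workers=32) as pool:
    vectors = list(pool.map(embed, image_ids))

print(len(vectors))  # 1000 embeddings, computed in parallel
```

The workload qualifies as embarrassingly parallel because each call is independent: no shared state, no ordering constraints, so throughput scales with however many workers you can launch.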
But, and then another slightly smaller bucket of things tend to be just like simple
cron jobs, like, you know, people have like a tiny script and they just, you
know, they just want to run it every hour, like every, every midnight or something.
I guess that's like a smaller bucket.
Um, but I think there's a lot of, you know, super early stage tech teams and
like, they just have like a tiny thing.
It doesn't even have to be a tech team.
They just want to run something on a schedule.
And how does, like, Modal differ compared to something like AWS Lambda, for example?
Architecturally, it's very similar in the sense that we're
all about like running containers in a way where you don't have to think
about the underlying infrastructure.
I think the difference is primarily, and I thought a lot about this, like, you know,
why has serverless not been more successful?
My answer to that is frankly, the developer experience kind of sucks.
Like, I don't know if you're like, you know, production as like Lambda functions,
but it's like a pretty bad experience.
Like, you have to set up, like, you know, CloudWatch or, like, whatever it's called. I don't even remember. Like, there's so much stuff you have to do. And you don't have, like, you know, nice feedback loops as you run code in the cloud.
So a big part of what Modal does is really just, you know, offer a much, much better user experience. And I think because the latency between, like, writing code and launching containers is so critical to that developer experience, we basically had to start over and build our own infrastructure for that.
Got it.
Okay.
And while you were going through, like, the use cases that you have seen in Modal so far, you talked a lot about, like, let's say, working with unstructured binary data. What about tabular data? I mean, do you see use cases also there, or is this not, let's say...
Yeah.
I mean, I think there's like always stuff that, you know, I mean, like, you know,
like there's always stuff that you can, if you can do things in pure SQL,
like I think you should do it.
Right.
But it's like, let's say you have like a lot of like tabular data and, you
know, you want to fit, you know, an XGBoost on top of that, like, you know,
that might be hard to do in a database.
I mean, I don't know.
Like, I know there's like crazy, like Postgres extensions, whatever, to do
that today, but like typically you want to write code for that, you know,
and you want to run it in like a normal like containerized environment.
And so I think that would be like one example, right?
Like, you know, we can take that data, operate on it, and fit models, and then help you deploy those models, because you can also take functions in Modal and deploy them as either REST endpoints or Python functions, and call them from other apps inside of an organization. So you could also use Modal as sort of, you know, kind of a model-serving interface.
But yeah, those are some of the things you can do with Modal and tabular data that, you know, you probably don't want to use the database for. But again, like, you know, to go back, I think there's plenty of stuff where the database is amazing. Definitely start with the database.
Yeah.
Makes a lot of sense.
And there's another huge conversation
about what a serverless database looks like,
which is another interesting topic.
But let's talk about Modal
and the architecture of Modal, right?
What does it take to build a platform like Modal,
like a serverless cloud infrastructure service?
Yeah, you know, so I started looking at this a couple of years ago and immediately, like, kind of ran into this problem.
It's like, you know, building Docker containers and like pushing Docker
containers around was kind of slow.
And so how to fix that.
And so, so one of the things I realized pretty easy, pretty, pretty early on
is like, I'm not going to be able to use StarCraft for this, and so we ended up
using a much lower level primitive, RumC.
And, you know, I mentioned, I think we could have switched to Firecracker
and to run containers, which, which, you know, using that, you can spin up
containers in like milliseconds or less.
The problem is, like, okay, now you have these big container images. How do you, like, ship around these big container images? You don't want to fall back on this, like, you know, sort of traditional Docker method of, okay, we're just going to push and pull the whole image. Although it does a little bit of de-duplication at the layer level, so it's a little bit faster sometimes, if you have, you know, a container image that's 10 gigabytes large, you really want to avoid, like, you know, sending that back and forth over the network between different nodes in your cluster.
So I basically realized, okay, well, if you look at containers, sorry, this is getting really technical, by the way.
So feel free to tell me to shut up if this is like too low level.
I realized, like, if you look at container images, first of all, most of the files in a container image are, like, never read. You know, if you look at, like, a bare-bones Linux distribution, it's, like, all this, like, random time zone information for, like, Uzbekistan and, like, random islands, like, no one lives on. Locale information, like, all this stuff, right? Like, so, why send those bytes in the first place?
The other thing is, like,
even for the files on the Linux distribution, or on that image that are actually read, most of them tend to be the same.
It tends to be, you know, the Python interpreter, a bunch of Python standard library, you know, a bunch of stuff is like very, you know, like when you launch a Python interpreter, like it'll read like the same 200 files roughly.
Right.
Like, you know, and that doesn't differ much across like the different containers, it differs across different Python versions.
So what we ended up doing was, like, we ended up building our own file system that basically stores all these container images in a content-addressed way, where, you know, we go through the whole image, we compute checksums for every file, and then we basically store those files only once in an underlying network storage. And then we, like, kind of create this virtual file system in FUSE — we ended up building it in Rust for performance — that then exposes that to the container runtime as, like, a Linux root file system.
And it actually worked super well. Like, everyone told me, like, you're crazy for building a file system. But we ended up doing it, and it actually works really well. So we had to build a bunch of other stuff too. But that's, like, you know, kind of one of the things that we ended up building. It's, like, very technically challenging and kind of complex. But I think, ultimately, that's what delivers this experience to users that I always wanted: it's, like, you know, just write code and it immediately starts in the cloud.
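The content-addressed idea Erik describes (checksum every file, store each unique file body exactly once, treat an image as a map from paths to checksums) can be sketched in a few lines of Python. This is an illustration of the scheme, not Modal's actual file system:

```python
import hashlib

def sha256(data: bytes) -> str:
    """Content address of a file body."""
    return hashlib.sha256(data).hexdigest()

# A content-addressed store: every unique file body is stored exactly once,
# keyed by its checksum; an "image" is then just a manifest mapping paths
# to those keys.
store = {}

def add_image(files):
    manifest = {}
    for path, body in files.items():
        key = sha256(body)
        store.setdefault(key, body)   # dedupes across all images
        manifest[path] = key
    return manifest

image_a = add_image({"/usr/bin/python": b"interpreter", "/app/a.py": b"print('a')"})
image_b = add_image({"/usr/bin/python": b"interpreter", "/app/b.py": b"print('b')"})

# The shared interpreter blob is stored once even though it appears in
# both images, so shipping image_b only requires the one new blob.
print(len(store))  # 3 unique blobs backing 4 file paths
```

In the real system, a FUSE layer exposes such a manifest as a normal root file system and fetches blobs lazily, which is why files that are never read (time zones, locales) never cross the network at all.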
So if I understand correctly, in terms of, like, the technologies that are being used as part of, like, the Modal stack, you use Firecracker, right?
Not yet.
Ah, we're planning to move to Firecracker.
It has some issues, and this is where GPUs are some of the issues.
So, but that's the plan.
Okay.
So what are you using now instead of like Firecracker?
We use runc, just, like, a lower-level primitive. Docker actually uses runc under the hood, too; it's just, like, simpler.
Okay.
But if you look at what a Linux container is, it's, like, basically, like, chroot and, like, seccomp and, like, you know, cgroups and those types of things. And that's essentially what runc does. It's, like, a thousand-line Go program that basically, you know, wraps those things into kind of a unified OCI-image-compatible runtime.
Yeah.
And then you use, like, FUSE to create the file system, but you don't actually store that much data there, right?
Right. So that's correct.
Yeah.
Yeah.
And actually, FUSE, you know, makes it pretty easy to build file systems. We actually ended up building the first prototype in Python. There are, like, Python bindings for FUSE, which is terrible for the performance part of it, but it's actually kind of nice, because, you know, it was very simple to experiment with. So we started from that.
And where do you run that stuff?
Is it, like, you're using, like, bare metal servers that you have, like, in the cloud?
Yeah, exactly.
So we run it on these bare-metal EC2 instances, and then, you know, we maintain that pool of workers, and we start and stop instances as we, you know, need more resources. So, from a user's point of view, they never have to think about those things; we do that all the time.
Yep.
And okay, the users do not have to think about that stuff, but you have to do it for them, right? So, yeah. That's, like, the ops, right, of such a product and infrastructure. What does it look like? What does, like, an SRE, or whatever ops title you have at Modal, look like?
All right.
This might be, like, a controversial thing, but, like, I actually don't think you should have, like, a separate ops team early on. I think you should have, like, people who are maybe more interested in it who can, like, take point on it. But, you know, early on, I actually really think you align incentives really nicely if everyone's part of the firefighting at all times, right? You know, if something breaks, anyone on that team just, like, jumps in. Ideally the person who, you know, wrote the code, or, like, understands that part of the code. I mean, we're early, you know, we're six people. So we don't have, like, a dedicated ops person. You know, eventually we will, but not right
now.
Yep.
Yep.
No, I totally understand. I think that's also my experience, like, in building teams. So how is it, let's say, different from what someone doing operations does in a company that is, you know, relying on, like, Kubernetes and other kinds of primitives for its infrastructure? How are operations different between, like, Modal and a company that is more cloud-traditional, let's say?
You're never going to have to write a single line of YAML. That's the biggest difference.
I think you are going to do like a very good job in like hiring through this
podcast episode, but we just did, so.
Yeah, no, I think we have to write all the YAML, you know, once, so that our users never have to write YAML.
Yeah, I think, I mean, like, you know, ideally, like, you know, you don't even have to think about it. So I think, in the end, you wouldn't even have to think about data engineering, right? Like, you know, and this might also be a controversial opinion, but I almost wonder if, like, long-term, data engineering as such, you know, as it is today, will go away.
Because I look at like all these startups, right?
Like, you know, every tech company in the world ends up building their own internal
data platform, right?
And they all kind of look the same.
It's like, you know, you start out with like Kubernetes and, you know, a bunch of stuff.
And then like someone builds an abstraction to make it easier to launch internal like machine learning models or train things in notebooks or whatever.
And then, you know, but like, you know, what if like, you know, someone just built that, you know, and then like sold that as a service?
Like I almost, I kind of feel like the world would be a better place than like, you know, instead of like, you know, to that, you know, 10,000 companies like building it themselves. And so that's sort of where like, you know, logically where I see the world evolving to is like, you know, a lot of more of those things should really just be like services that you use in the cloud. And, you know, so you don't even need to have necessarily a platform teams internally that does this. Yeah, makes total sense. Cool.
All right.
I have like a question.
While you were, like, talking and you were mentioning, like, the Docker images there, and, like, how big these images are in general: my opinion is that, especially when it comes, like, to infrastructure, there are still, like, a lot of primitives around that just feel like not the right primitives yet. They're, like... yeah.
Or sometimes what comes to my mind, because we are probably not that far apart in terms of age: do you remember, back in the early 2000s, when you had to download these stupid, super-bloated installers to install something? And you had this really bad experience, like, why do I need this thing? Sometimes I get the same kind of feeling, but on an infrastructure level, which obviously is much more complicated, right? But we have, like, yeah, as you said, like, okay:
Time zones from Uzbekistan.
Like, why do I need that?
Like, why do I need that to run, like, a serverless Lambda function that just, like, calculates one plus one equals two? Right.
So what kind of evolution do you expect to happen? What are you expecting to see in the industry, like, as new primitives that we can use to work with the infrastructure in the end?
Yeah. I mean, I think, I feel like there are a lot of tools that, like, you know, try to wrap underlying layers, but they end up being kind of leaky abstractions, in a way where it doesn't prevent you from actually having to learn about those underlying layers anyway. And now you've almost, like, kind of made the problem worse, because now there's, like, another layer that you have to learn, right?
Maybe like a good abstraction.
Like, you know, if you build like an amazing data platform tool that wraps
Kubernetes and you know, then like ideally, you know, if you use that tool,
you should never have to learn a single thing
about Kubernetes or Docker or like whatever.
But like none of those tools really work that way.
And I feel like they always like
kind of leak through in the end, right?
And so that to me, I think,
I don't know, like my theory is like
the first generation tools,
they're all about like,
they enable you to do something.
Like they, you know, they solve like a hard problem, like a hard technical problem in
a way that like, you know, there was no tool before that solved.
And like people then have to use those tools by necessity.
I think what ends up happening is the second generation then actually, like, kind of preserves the functionality, but, like, rewrites the abstraction and actually presents it in a much better way, where fundamentally the technical enablement is the same, but you no longer have to jump through all these hoops to get it installed.
And instead, like, I don't know, look at, you know, machine learning, right? Like, I started doing deep learning, like, 2014, you know, those days. And it was, like, incredibly painful back then. Like, you had to, you know, install Theano, like, install a bunch of, like, random CUDA drivers or whatever. Yeah, I guess that's kind of still very much a problem, but, like, you know, to some extent, now you can also just, like, you know, go to Hugging Face, download a model, and, like, you know, now you suddenly have a model, and you can actually do, like, very cool things. And I think, you know, there's a similar story to be told in, like, many different fields. Like, you know, it sort of packages things in a way that suddenly becomes a lot more accessible and, you know, makes it a lot easier and a lot more fun to use.
Yeah.
Yeah. Yeah. I totally agree. All right. Last question for me, and then I'll hand, like, the microphone to Eric again. So what kind of, like, opportunities do you see out there, like, business opportunities in building new experiences over, like, the cloud?
You mentioned at some point that we are still like very early on the clouds.
So obviously you believe in that.
That's why you built Modal.
But what else is out there in your opinion, like interesting problems that
can be also like interesting business opportunities?
Yeah. I mean, like here's like one thing I've been like ranting about on Twitter is CI.
Like CI is like this like super janky experience today. Right. And like, I've been like, you know,
I want someone else to build it, but I'm like, you know, I'm just like ranting about it on Twitter.
It's like, if no one builds it in, you know, in the next year, like I might just do it myself.
It's like, I have like an idea that you can even like use modal for some of it, but I
don't, ideally I don't have to do it.
But like, you know, what's like crazy about CI is like, you know, first of all, like,
you know, I don't know, use like GitHub Actions or whatever, like then you have actually,
I actually think it's like a really cool product in itself.
And I think it like, it sort of is the same story of like, it enabled me to do things
in a new way.
But, you know, today when I work with GitHub Actions, like, I push something and then, you know, something breaks, and then I have these, like, super slow feedback cycles where you have to, like, commit something to git and then, like, wait, like, five minutes and then, like, see it in the log, you know. So you have these, like, slow feedback loops, which is, like, the most torturous thing for a developer: debugging problems with slow feedback loops.
And then, you know, you think about all this like extreme amount of like
resource, like wasted resources.
Like you have like, you know, 10 containers each pulling down the same
libraries over and over again and installing them, like, and like, and
you also think about, you know, I also think a lot about this, you know, unit tests.
Like, they're the ultimate, like, parallelizable thing.
Like, why can't I just, like, let's say I have this, like, large, you know, project.
And, you know, I have a thousand unit tests.
Like, can I just, like, stick every unit test in a Lambda function and just, like, run all of them at the same time? Like, you know, because, like, you have a human sitting there waiting for a test. You know, that's very expensive time.
And also another thing I think about is you have this super annoying lack of parity between the CI environment and local testing.
And think about it from that point of view.
Why do developers even run things locally?
It's because of the CI environment itself. But if the CI environment was so good that the experience was as good as running things locally,
you wouldn't even run the tests locally.
You would develop code and just launch a thousand tests
and each running in their own Lambda environment,
Lambda container or whatever.
And then you would immediately just see the failing tests.
What if you can build that?
I think that would be a fantastic world.
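The "one test per Lambda-style container" idea can be sketched locally: enumerate every individual unit test, fan them out to a pool, and collect pass/fail per test. The thread pool here is a stand-in for per-test cloud containers, and the `MathTests` suite is invented purely for the sketch:

```python
import unittest
from concurrent.futures import ThreadPoolExecutor

class MathTests(unittest.TestCase):
    """A toy suite standing in for a project's thousand unit tests."""
    def test_add(self): self.assertEqual(1 + 1, 2)
    def test_mul(self): self.assertEqual(2 * 3, 6)
    def test_mod(self): self.assertEqual(7 % 4, 3)

def run_one(test):
    """Run a single test case in isolation and report (name, passed)."""
    result = unittest.TestResult()
    test.run(result)
    return (test.id(), result.wasSuccessful())

# Collect every individual test and fan them out in parallel; swap the
# thread pool for one-container-per-test on a serverless platform and
# the shape of the code stays the same.
tests = list(unittest.TestLoader().loadTestsFromTestCase(MathTests))
with ThreadPoolExecutor() as pool:
    outcomes = dict(pool.map(run_one, tests))

print(all(outcomes.values()))  # True: every test passed
```

Because unit tests are independent by design, total wall-clock time collapses to roughly the slowest single test, which is the point Erik is making about keeping a human's feedback loop short.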
I was at Better for six years.
There's very few things where if someone said,
would you spend $100,000 making your engineers
not have to wait for CI?
I'd be like, take my money.
So, you know, I think it's a huge opportunity to do this.
But, you know, if no one else does it in the next few years,
like maybe I'll have to do it myself.
I kind of don't want to.
Sounds good.
All right.
Eric, it's all yours.
Yes.
This has been so fascinating.
I'm interested to know, have you, like, have you built anything, even if it's just for you or your small team to like address some of those CI issues?
Or are you still just using, like, a process that you largely hate?
I generally tend to build things when I don't like them, but, yeah, no, so far we haven't built anything for CI.
Yeah. Yeah. Super interesting. Okay. Well, we're really close to time here.
And so I have a, I have a question that is unrelated to technology at all, but I want to know if you still have
any of your Spotify playlists
back in
2008, and if so,
what music is on
them?
I mean, I don't know.
I do.
I do a lot of playlists.
I'm sure you can find it if you Google my name
and browse it a little bit.
I don't know. I'm just a degenerate Detroit techno fan,
growing up in Europe. And I lived in Berlin at some point. So I tend to,
you know, my taste tends to skew to those sort of styles. But I don't know. I always grew up
listening to music. I listened to everything like jazz, hip hop, and like, you know, classical
music, like whatever. Like I, I do like music overall. And that was actually a fantastic
experience working at Spotify. And, you know, I almost, like, wonder if I built the music recommendation system purely for, like, selfish reasons. I discovered a lot of music through my own system, which, you know, kind of gave me some sort of pleasure.
Yeah, for sure.
That's super interesting.
I mean, as a Spotify user, I remember,
I can't remember where I listened to it,
but I listened to an interview with Daniel Ek
and he was talking about how, you know,
at some point you figured out that if you
could recommend something new that a user liked every week, then they would, you know, essentially sort of stay with Spotify forever, right?
Because you're sort of providing this like discovery experience.
Yeah.
Yeah, that's been very true. Like, exposing people to, like, lots of different things. And, you know, that wasn't me personally; it was definitely people on my team who came up with that idea of Discover Weekly. I think Edward and Chris and a few other people on my team. Fantastic idea that, you know, I still use every week. It's, you know, it's a good product.
Super interesting. All right, well, Eric, this has been such a great conversation. So much more to cover, so we'd love to have you back on the show.
Anytime. Anytime. This is fun. I really appreciate it.
Costas, this may sound like an interesting takeaway, but I remember being so interested. I think I mentioned on the show that I'd listened to an interview with Daniel Ek, one of the founders of Spotify. And this was years ago, but for some reason, that interview has really stuck with me. And I'm going to draw a connection that Eric didn't draw, but that I think is interesting. In that interview, Daniel talked about how early Spotify had to figure out ways to almost, like, create a user experience that made users okay with latency,
because speed was a really big deal when you were trying to deliver a large file, like, you know, an MP3, over the internet, streaming, right? And it was really interesting to me that he used a lot of very similar language when talking about Modal and the developer experience, right?
There's like a latency challenge and friction points.
And it was, it's fascinating to me to think about the similar
nature of those two problems.
So yeah, that's, that's my big takeaway.
I think that's really interesting.
Oh yeah.
That's, like, a great point that you're making here.
It's very interesting, like how it connects the developer experience with
like the user experience at the end.
Because, yeah, I think we tend, like, to consider them as, like, very different, but some of the assumptions are common; they just manifest themselves in a different way. Obviously, a developer experiences latency in different things
than someone who listens to music.
Sure.
But yeah, that's an excellent point, actually.
I mean, I don't know.
I think every part of the conversation with Eric was great.
I loved hearing about the stuff that they're building and how they build them.
And what I will keep is that it's great to hear that you can have like startups
today that in order like to operate, they have like to create their own file systems.
Yeah.
That's like, I think like a great indication of like the progress
that has happened like in all these years in terms of like the infrastructure that
we have and the primitives that we have out there to build upon and yeah, it's
like people like just shouldn't be scared to even like go and build their own file
system if they have to.
So that's what I keep.
And yeah, hopefully
we're going to have him back soon.
Yeah, I agree.
Awesome. Well, thank you for
listening to the Data Stack Show.
Great conversation with Erik Bernhardsson.
And we'll catch you on the next one.
We hope you enjoyed this episode
of the Data Stack Show. Be sure to
subscribe on your favorite podcast app
to get notified about new episodes every week.
We'd also love your feedback.
You can email me, Eric Dodds, at eric@datastackshow.com.
That's E-R-I-C at datastackshow.com.
The show is brought to you by Rudderstack,
the CDP for developers.
Learn how to build a CDP on your data warehouse at rudderstack.com.