Screaming in the Cloud - It’s Not a Data Science Problem, It’s a Data Engineering Problem with Laurie Voss
Episode Date: April 15, 2021
About Laurie
Laurie has been a web developer for 25 years and cares deeply about making the web bigger and better for everyone. He previously co-founded awe.sm and npm, and is currently a Senior Data Analyst at Netlify.
Links:
Netlify: https://www.netlify.com/
Twitter: https://twitter.com/seldo
Personal website: https://seldo.com/
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the
Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
Join me on April 22nd at 1 p.m. Eastern Time or 10 a.m. in the one true Pacific Coast time zone
for a webcast on cloud and Kubernetes failures, like there's another kind, and successes.
Apparently there are other kinds,
in a multi-everything world. Oh god, what are they making me do now? I'll be joined by Fairwinds
President Kendall Miller. Oh, that explains it. And their solution architect, Ivan Fetch,
will discuss the importance of gaining visibility into this multi-everything cloud native world, and I will make fun of them relentlessly.
For more info and to register, visit www.fairwinds.com slash Corey. Oh God, it makes it look
like I work there now. That's C-O-R-E-Y, the E is critical, and tell them exactly what you think of
them because I sure will. Talk to you on April 22nd at 10am in the One True
Pacific time zone. The Apps on Cloud Summit, hosted by Turbonomic, is a new action-packed,
not a conference, happening May 11th through 13th online. It's for everyone who makes applications
in the cloud run screaming, from IT leaders to DevOps pros to you folks, whoever you might be. Take a break from
screaming into the cloudy void with me to learn from some of the best of people who actually know
what they're doing, like Kelsey Hightower, AWS blogger John Meyer, and also me, because apparently
they didn't listen to me saying I had no idea what I was doing. Register now at turbonomic.com slash screaming. There's
a swag box ready to ship for the first 2,000 registrants, so you don't want to miss this.
Thanks again to Turbonomic for sponsoring this ridiculous podcast.
Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Laurie Voss,
who is currently a senior data analyst at
a company called Netlify. Laurie, thank you for joining me. Thanks for inviting me.
So let's start at the very beginning. What is Netlify?
Netlify is a single cohesive build chain for websites. A lot of people don't think of it that
way. I think a lot of people think of Netlify as a web host, but really where people are getting value from Netlify is
you build your website, you upload your website, you deploy your website, you host your website,
you test your website, you monitor your website. And, you know, that can be five or six different
services, like a CI service and a hosting service and a Git service and all of those things. And Netlify just joins that entire build chain into a single tool where you
just hook up a Git repo, hit commit, and it goes out into the world. And it's incredibly fast and
convenient. And that's really where people get value out of it. Perhaps somewhat uncharitably,
I would almost think of that as Heroku for this decade. I mean, I would consider that pretty charitable to us and somewhat uncharitable to Heroku,
who are still around and chugging.
Oh, absolutely.
I'm a big fan of things like that, where it's take this code, whatever it looks like.
Maybe it's a repository.
Maybe it's some, I don't know, some files I email over, God forbid.
And then go ahead and deploy it into something that at least pretends to be able to scale.
I often hear Netlify brought up in the context of Jamstack, which seems to be this whole area of cloud computing that I don't tend to spend a whole lot of time in, at least not knowingly.
What is it?
So Jamstack originally stood for JavaScript, APIs, and Markup, sometimes also referred to...
But I hate all of those things. Please continue.
It's sometimes also referred to as static websites, which is a term I tend to avoid
simply because it's not really very accurate. Like a static website is one of the things that
you can deploy on the JAMstack, certainly, but it's certainly not the only thing you can deploy.
I would say that it is an architecture that lends itself to pre-rendering as much content as is possible,
and then caching all of that stuff at the edge, and then pulling in only the bare minimum of
dynamic content to improve both scalability and performance. Those are the things that
people like about JAMstack websites, is that they tend to be extremely fast.
So that makes intuitive sense to me. And you, of course, became fairly broadly known as
one of the people behind NPM. But now you're a senior data analyst, which feels like it's a
departure from the things you were doing to the things you're doing now. Help me either validate
that or tell me what obvious thing I'm missing or highlight something clever for me?
Because right now I feel like there's a missing link in my chain of events here.
No, that's a totally fair question. So I started NPM as the CTO and hired an excellent engineering
team underneath me. In fact, one of our very first hires was a lady called CJ Silverio,
who is just a staggeringly good engineer. And it became very
obvious very early on in the life of the company that we really had two people of CTO caliber and
we didn't need two of them. But what we did need was somebody to run the operational side of the
business. So relatively early on in the life of the company, we promoted CJ to CTO and I moved
my title to COO, you know, obviously still with a technical bent,
but my job, you know, as a COO is to do operational things. So I was in charge of
running the financials and, you know, making sure that marketing and sales weren't going
massively over budget or, you know, under quota, those sorts of things. And that's fundamentally a,
you know, keep-the-lights-on data analysis job. So while I was CTO, I was like sharing fun stats about
NPM's internals. While I was COO, I was doing a lot of analysis of our financials, but like the
common factor was analysis and I was doing more and more of it. So towards the end of my time at
NPM, I became the chief data officer where I basically specialized down into doing just data
things, some financial, some technical, and doing a lot of outward-facing
presentations about that kind of thing. So that was where my job ended up being. And literally,
how I pitched my way into Netlify was like, what if I did that thing that I was doing for NPM for
you? And they were like, great. You can't be a C, though, because you just got here. I was like,
well, of course. You all have to start somewhere. Humility. It took me a couple of years to
unofficially run AWS marketing.
My God.
Yeah, have some humility as you step through this process.
Was it a big barrier to you once you arrived at Netlify convincing them to buy you the Excel license?
You obviously need to do all this data analysis.
Or alternately, are there better tools for it than the one that we've all been using anyway?
Honestly, I've always been a Google
Sheets partisan. I know that the really hardcore financial types will complain about the functions
that are missing from Google Sheets versus Excel. Oh, will they ever? But I'm not that person.
But we have a pretty great stack that I like quite a lot at Netlify these days. We have
a variety of sort of, you know, older tools lying around, not all of which we've
migrated away from. But the core of the new class is this company called Databricks,
who are basically Spark clusters as a service. So you can just throw essentially arbitrarily
large amounts of log data onto S3 buckets on AWS, and it can query them as if they were databases, which is truly beautiful.
And on top of them, we have a system called Mode Analytics, which is a sort of general platform
for data analysis and presentation, draws graphs, that kind of thing, has an SQL interface. And
between those two, we've got a new open source project, or relatively new to me anyway, called dbt, which is this very
organized, clever way of sort of codifying your best practices around data. So like you've probably
heard of like extract, transform, load jobs. It's basically a way of codifying chains of extract,
transform, and load jobs such that they're always tested and always running, and you know what the
dependencies are between them, and everything is documented.
Okay, well, I'm in the process of getting everyone in trouble on things. What is your take on machine learning for things like this? Because
it seems that whenever you talk about data, it's inevitable that someone, usually with a crap ton
of VC backing, will immediately jump in, because they're clearly getting bonused
every time they manage to fit the phrase machine learning into basically anything.
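As an aside on the chained extract, transform, and load jobs described a moment ago, here is a minimal sketch of the idea in plain Python. It is not dbt itself (dbt models are written in SQL and configured in YAML); the step names, the sample events, and the `run_pipeline` helper are all hypothetical, chosen only to show declared dependencies, ordered execution, and a test after every step.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical raw input standing in for log data; not from the episode.
RAW_EVENTS = [
    {"site": "a.example", "status": 200},
    {"site": "a.example", "status": 500},
    {"site": "b.example", "status": 200},
]


def clean_events(inputs):
    """Transform step: keep only successful requests."""
    return [e for e in inputs["raw_events"] if e["status"] == 200]


def hits_per_site(inputs):
    """Transform step: count the cleaned events per site."""
    counts = {}
    for event in inputs["clean_events"]:
        counts[event["site"]] = counts.get(event["site"], 0) + 1
    return counts


# Each step declares (dependencies, transform, test), so the chain is explicit.
STEPS = {
    "raw_events": ((), lambda inputs: RAW_EVENTS, lambda out: len(out) > 0),
    "clean_events": (
        ("raw_events",),
        clean_events,
        lambda out: all(e["status"] == 200 for e in out),
    ),
    "hits_per_site": (
        ("clean_events",),
        hits_per_site,
        lambda out: all(n > 0 for n in out.values()),
    ),
}


def run_pipeline(steps):
    """Run every step after its dependencies and test each output."""
    graph = {name: set(deps) for name, (deps, _, _) in steps.items()}
    results, order = {}, []
    for name in TopologicalSorter(graph).static_order():
        deps, transform, check = steps[name]
        output = transform({dep: results[dep] for dep in deps})
        if not check(output):
            raise ValueError(f"test failed for step {name!r}")
        results[name] = output
        order.append(name)
    return order, results


ORDER, RESULTS = run_pipeline(STEPS)
print(ORDER)
print(RESULTS["hits_per_site"])
```

Because each step names its inputs, the run order falls out of the dependency graph rather than out of someone's memory, which is roughly the property being praised in the conversation.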
So I would step back a bit and say that before I joined Netlify, I interviewed at a couple other
companies just to see what the space was like for basically the same job at other companies.
And there was a really interesting pattern that I noticed, which is that it is quite a common pattern for an early stage startup to say, oh, we have a data
problem. We must hire a data scientist. And they go and find somebody staggeringly qualified with
a PhD in data science. And they hire that person. And that person immediately runs into trouble
because that is not actually the problem that
they have.
They don't have a data science problem.
They have a data engineering problem.
They have like mounds of data lying everywhere and it's not organized.
Nobody knows where it is.
Nobody can query it efficiently.
Like a data scientist is like at earliest, like your fifth hire in your data team.
The first five people are people who have to do an enormous amount of plumbing and engineering
to be able to just get the data from all of the places
that it's lying around, all of the piles that it's accumulating in,
into any kind of a reasonable format
that you can query it and figure out what it does.
You have to forgive my cynicism on some level,
because I've been in the ops space for, I guess, entirely too long,
where I've been dealing, particularly in the context of AWS bills, with making arguments against data science teams
who are insisting that the Apache logs from 2012 that are taking petabytes of space are the key to
unlocking the mysteries of the business. They're not sure how yet, but one day they're going to become super valuable, so I'm never allowed to delete anything. And on some level,
it just almost seems like it's a big make-work conspiracy for data scientists amongst each other,
which, hey, respect. Counter-argument, what sorts of insights can you glean from these vast
quantities of data? Because everyone else I've talked to about this generally works for a big data-oriented company. I got to be honest with you, it feels like they're
selling pickaxes into a gold rush because, oh, it's very important you keep all your data so
then we can sell you things to go through it. You're on the other side of that. You're buy side.
So what is the value that this giant data hoard winds up providing?
Well, I will say that my initial inclination
is to agree with you.
There's definitely a lot of pickaxes being sold to miners
who have no idea what they're doing.
I think about 10 years ago,
there was a huge industry-wide pile into big data.
People were like, you need Hadoop
and you need gigantic data processing clusters
and huge data and massive amounts of processing
and buy this enterprise contract for $100,000 a year. And then everybody did those
things and was like, and now what? And they were like, oh, we don't know. Maybe you can count it up.
How many hits did you get? That's not, you know, useful analysis. Having all of your data
queryable is not per se a useful thing to be able to do. And I think in the 10 years since then, people have
got smarter about that. They've realized like, you know, medium and small data are actually often
quite useful. It's more about how you analyze it and, you know, can you present it to people and
can you make sense of it? But there was a sort of second gold rush into the ML space. There are
certainly use cases where you have enough data and a problem that
is amenable to being solved by applying ML to it in some way. Those are a minority of cases there,
like maybe, you know, 5% of all data problems are big enough that you can use ML in the first place.
And also an answer that ML can help you with would be helpful. And the other 95%,
it's just, you know, plumbing and engineering.
Once upon a time, it felt like the way to address all this data was the, honestly,
the result of a prank perpetrated many moons ago by what felt like Google in a white paper
that Yahoo went for hook, line, and sinker for MapReduce, which then led to Hadoop and a bunch of other stuff.
I maintain this was a Google April Fool's prank
that everyone took way too seriously and went way too far.
These days, it feels like stream processing, as that data comes in,
is sort of the preferred approach.
Yes? No? Or am I completely misunderstanding most of the point?
Or all of the above?
I would say definitely the industry
has moved away from the batch processing that Hadoop did. I actually worked at Yahoo at the
time when they were inventing Hadoop. Oh, you fell for it too. Great. I was, we were selling the Kool-Aid
as opposed to drinking it. Oh, if you're going to be involved in a Kool-Aid transaction, that is
absolutely the side of it you want to be on. Let's be very clear here. So yeah, streaming processing,
but like semi real-time processing of things
as opposed to giant batch jobs
is certainly where stuff has mostly gone.
Although people who are end consumers of data,
as an analyst, if I ask you, you know,
how fresh does this data need to be?
They will always say real-time,
like that will always be their first answer.
And I'll be like, what if it was 24 hours delayed?
And they're like, oh yeah, well, obviously yesterday's data is fine. Like, I'm not going to care about what happened
at noon today when it's 2pm. And then you're like, well, yes, well, then it's a batch job.
And it's like an order of magnitude cheaper to provide you. So let's do that. Batch jobs are
still very cost efficient. And so we do a lot of batch processing. It's just we don't make a big
song and dance about it anymore, because it's no longer the new shiny thing.
On some level, it feels like that is the nature of things,
where something gets announced, and it's super complicated and hard,
and people scale to the peaks of complexity,
and they make good money doing it.
I mean, in the original dot-com boom,
firewall engineer was a quarter million dollars a year if you could swing it.
Now it's just assumed that basically anyone who touches the network should be able to configure firewall rules.
Things get simpler with time.
It feels on some level like an awful lot of the data world is undergoing some of that consolidation as well,
where we're starting to find tools and methods and ways to extract
meaning from giant piles of data without the part where, you know, you go and drop $5 million here
on a data science team. Well, you've sort of arrived at my favorite pet topic, which is
the stack. The stack is this sort of abstraction that I wrote about at the beginning of last year. It's the idea that the ever-increasing complexity of technical fields means that we are constantly
inventing, adopting, and then forgetting about abstractions.
As you said, we're constantly chasing after the new shiny thing.
We make a big song and dance about it.
It's very complicated.
People make enormous amounts of money doing it in the early days. And then somebody eventually invents some kind of tool
or open source framework or possibly like a SaaS that makes it one click to do. And it's not any
less complicated or any less magical than it was before. It's just you think about it much less,
right? Like I mentioned Databricks. Every time I run a query, Databricks is taking my SQL,
converting my SQL into a giant
MapReduce, running it on a huge cluster of machines of arbitrary size. I don't know what
size it is because I don't need to care anymore. And then pointing it at AWS, where it's pulling
in every single piece of data in every bucket that I put in there. And all of that 10 years ago
would have been of a complexity that only Google or
Yahoo could do it. And now it's literally, we spin them up by clicking a button and we don't even
remember that it's happening. Like all of that complexity is still happening. All of that magic
is still happening, but now it's just a commodity and we're doing that across the tech space.
So we've certainly done it in data. A bunch of stuff that used to be very complicated,
used to be the thing that you would hire me to do,
is now just like the tool that I use.
And the thing that I do is the analysis,
which is a more useful use of someone's time, really.
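To make the "SQL turned into a giant MapReduce" remark a little more concrete, here is a toy, single-machine sketch of how a grouped aggregate can be phrased as map, shuffle, and reduce steps. It is only an illustration of the general pattern, not a description of how Databricks or Spark actually plans queries; the partition data and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical log rows, split into partitions the way files might be
# spread across storage buckets.
PARTITIONS = [
    [{"path": "/", "bytes": 100}, {"path": "/about", "bytes": 50}],
    [{"path": "/", "bytes": 300}],
]


def map_phase(rows):
    """Emit one (key, value) pair per row; runs independently per partition."""
    return [(row["path"], row["bytes"]) for row in rows]


def shuffle(mapped_partitions):
    """Bring all values for the same key together."""
    grouped = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def total_bytes_by_path(partitions):
    """Roughly: SELECT path, SUM(bytes) FROM logs GROUP BY path."""
    mapped = [map_phase(rows) for rows in partitions]
    return reduce_phase(shuffle(mapped))


print(total_bytes_by_path(PARTITIONS))
```

On a real cluster the map step runs on many machines at once and the shuffle moves data over the network; the point being made in the conversation is that none of that is visible to the person typing the query.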
One would like to hope so.
But I do feel like there's a story,
and we see it across the board.
This is one of the things I really enjoy about Netlify.
Once upon a time, to put a website on the internet, you had to know a whole bunch of
different things all at the same time. It was how to build a web server, how to maintain and
patch that web server so it didn't become an attack spam cannon, how to get files into a
format the web server could understand, how to put that out there, how to get DNS to work,
how to handle SSL if that was even a glimmer in your eye at that point, and so on and so forth. Now it really requires click a button.
And Netlify has made this way easier. Because I tend to look at this from the exact opposite side
in the industry, where I come from an ops background. Building all the infrastructure
to handle these things is relatively straightforward to me. But then I get to the other side, cool, now all that's done. Build the web app, and my response is, uh, what? Yeah, I can write bad HTML by hand, sort of.
And that's as far as I generally tend to go. Whereas it feels like the Jamstack story in
general, and Netlify in particular, are aimed at folks, in many ways, coming from the other side of the world,
where it's, I picked up JavaScript,
I picked up a framework or two,
I understand front-end,
I understand how web applications get built.
What's the deal with this whole infrastructure piece?
And thanks to the miracle of stacks
collapsing in upon themselves in many respects,
you don't have to know about that or care,
and you live in this blissful world
where the term Kubernetes never crosses your desk.
Is that a fair summation of the state of the industry?
Am I dramatically misunderstanding what Netlify does and for whom?
No, I think that's pretty much how it goes.
Like, one of the reasons that I wrote this blog post about the stack, it was almost exactly a year ago,
is because about a year ago is when I joined Netlify, and I was suddenly immersed in the things that Netlify does. It became more
clear to me that I was seeing a sort of fundamental shift happening. I was like, oh, we are
obeying some kind of natural law here, right? We are taking things that used to be people's whole
jobs and turning them into things that are so simple that you don't even think about them
happening anymore. Like I've definitely met and worked with people in my life whose whole job was like
managing SSL certificates. And now it's literally a checkbox, you know, and it's on by default.
It's like, would you like your site to be secured by SSL? Yes, obviously. I don't know why I would
turn that off. And it just comes as part of deploying your website. Way in the background,
Let's Encrypt is
doing it. And like, there's a whole bunch of song and dance about refreshing certs every 90 days.
And it all just happens completely automatically without you caring even a little bit. And that's
what Netlify is doing. It's taking like, you know, things that used to be five or six companies and
squishing them down into a single layer that you call your deploy service. And you're like, great,
my deploy service does all of those things. And I don't need those other five companies anymore.
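For a sense of the "song and dance" that now happens invisibly, here is a small sketch of the bookkeeping behind certificate renewal. The 90-day figure comes from the conversation; the 30-day safety margin and the function names are assumptions made for illustration, and a real deploy service would also have to request and install the new certificate.

```python
from datetime import date, timedelta

CERT_LIFETIME = timedelta(days=90)  # interval mentioned in the conversation
RENEW_BEFORE = timedelta(days=30)   # assumed safety margin, not from the episode


def expiry_date(issued_on):
    """Date on which a certificate issued on `issued_on` stops being valid."""
    return issued_on + CERT_LIFETIME


def needs_renewal(issued_on, today):
    """True once `today` falls inside the renewal window before expiry."""
    return today >= expiry_date(issued_on) - RENEW_BEFORE


issued = date(2021, 1, 1)
print(expiry_date(issued))
print(needs_renewal(issued, date(2021, 2, 15)))
print(needs_renewal(issued, date(2021, 3, 2)))
```

A scheduler that runs a check like this every day for every site is the sort of thing that used to be somebody's whole job and is now a default checkbox.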
Now, if you're one of those five companies, that becomes something of a problem. But again,
that's the pace of innovation. That is the world continuing to evolve.
Nobody wants to be commoditized. But on the other hand, like the company that gets to do
the commoditizing tends to run away with it, right? Like that's kind of the AWS story, right? It's like there used to be lots and lots of companies that would sell you a server in a rack and then
take 24 hours to set it up and you'd pay with a credit card. And AWS was like, what if that was
one button? And everyone was like, yes, I would love that to be one button. I never want to care
about what rack it's in anymore or whether or not it has enough power or whether or not the cable in the back has got jiggly. Just virtualize it all away from me,
thank you. And then AWS completely ran away with it. Oh, yes. And it's AWS. So it was,
what if that button was hidden in a console that doesn't work super well? And then we give that
button a terrible name. People are like, I'll risk it. I mean, the absurd behavior of the industry is that we love the terrible console.
Oh, absolutely.
Everyone talks about infrastructure as code, which is basically a polite way of saying,
I use the console and then lie about it on conference talks.
Indeed.
This episode is sponsored by ExtraHop.
ExtraHop provides threat detection and response for the enterprise, not the starship.
On-prem security doesn't translate well to cloud or multi-cloud environments, and that's not even
counting IoT. ExtraHop automatically discovers everything inside the perimeter, including your
cloud workloads and IoT devices, detects these threats up to 35% faster, and helps you act immediately.
Ask for a free trial of detection and response for AWS today at extrahop.com slash trial.
So since you brought up AWS, terrific. It's time for me to do my whole conspiracy theory
approach here and accuse you of basically war crimes. So you were big into the NPM space for a long time,
which is great. I mean, I accept the fact that that is a thing that happens. package.json and
package-lock.json are basically artifacts of you folks. Now, AWS has launched their Amazon CodeGuru
machine learning, wink, wink, nudge, nudge, powered code review. And of course,
because it's AWS, they charge based upon lines of code in a pull request, which tells me that
you're a deep plant for many years now, planning for the day where this one day supports JavaScript,
which it doesn't today. And all someone has to do is check in the package lock and the package
JSON files once, and suddenly the entire scheme pays off
handsomely. True, false, or I'm not supposed to talk about that in public. It's true. I'm part
of a global cabal whose purpose is to make node modules infinitely deep until the gravity well
sucks in all of programming and we don't have computers anymore. On a slightly more serious
note, I do want to talk a little bit about package management in the context of programming languages, as opposed to package management in the context of Linux distributions, because world. And I'm not a JavaScript programmer,
except when forced to be, and it's usually editing something as small scale as humanly
possible and backing away slowly. But my general consensus, looking at it across the board, is that
there is no consensus, that there is no clear one right way to do things. Invariably, dependencies
always become a challenge. Getting something to a
reproducible build while also being secure is a problem. And no matter what stack you pick,
what language you pick, there's always a, for Hello World, there's a step one of setting up
your local environment to resemble what the environment of the person writing the document looks like.
Is that accurate?
Is there some magic tool out there that somehow I'm just unaware of
that solves all of this for me?
Well, there's definitely not a single tool that gets it completely right,
but I would say that there is a commonality between the things that work
that I don't know that everyone appreciates.
So I'm going to draw a parallel between package.json and Kubernetes right now,
so bear with me. Basically, the thing that people often don't like about NPM and the thing that
people don't like about package.json is that it says all of your dependencies must live here,
in your tree. I don't care how many JavaScript projects are on your computer, I am going to
have one copy of every module right here where I can see it, and I'm going to use those and only those.
It tends to make JavaScript programs a little bit easier to debug because you know that
the code that is at fault can't possibly be anywhere else.
It can't be sitting in /usr/lib unexpectedly or in some additional libraries folder, or
it can't have been blown away by somebody installing something else.
It has to be the one that's sitting in your tree. And that's one of the things
that made Node so popular in the beginning
and NPM so popular at the same time
was that it was very easy to deal with.
And in particular, it made it work on Windows,
which didn't have any of those things anyway.
And like Node's popularity as a development environment
where you could write code on Windows
and it would work perfectly in a Linux environment
because all of the dependencies were JavaScript
and that ran the same on both of those computers is understated.
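The "dependencies live in your tree" idea can be sketched as a lookup that starts in the project and walks upward looking for a `node_modules` folder. This is a simplified illustration written in Python, not Node's actual resolver (which also handles files, `package.json` entry points, and more); `resolve_package`, `demo`, and the package names used here are hypothetical.

```python
from pathlib import Path
import tempfile


def resolve_package(start_dir, name):
    """Walk up from start_dir and return the first node_modules/<name> found."""
    current = Path(start_dir).resolve()
    for directory in [current, *current.parents]:
        candidate = directory / "node_modules" / name
        if candidate.is_dir():
            return candidate
    return None  # nothing in the tree: the package is simply not installed


def demo():
    """Build a throwaway project layout and resolve against it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        project = root / "project"
        (project / "node_modules" / "left-pad").mkdir(parents=True)
        (project / "src").mkdir()
        found = resolve_package(project / "src", "left-pad")
        missing = resolve_package(project / "src", "hypothetical-missing-pkg")
        return found.relative_to(root).as_posix(), missing


print(demo())
```

Because the lookup only consults folders on the path from the source file upward, the copy that gets loaded is the one sitting in the project, which is the debugging property described above.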
And that's essentially the Kubernetes story.
Kubernetes is saying this thing where we have libraries all over the place,
where we have dependencies all over the place,
they lie all over the operating system.
It's too late to fix that.
What if we packaged up the entire operating system and said that that's the package? And
that's what Kubernetes is, right? It's like, it's creating a package.json of your entire computer.
And then you run that. It sure beats the old approach of, oh, it works on your machine.
Great. Well, back up your email, Slappy, because your laptop's going into production.
Exactly right. It's basically, you've packaged up the entire world
and people are like, well, this is very wasteful.
And we're like, yes, it's very wasteful, but it works.
And like the other-
It's less wasteful than, that's right,
a whole bunch of engineering time spent fixing things.
Well, that's not the most optimal way of doing it,
say people who seem to consistently mistake
their time for being free.
Exactly.
No, and it makes perfect sense.
I love the fact that I can use
at least some semblance of what
other people are using and get it to work.
The counter-argument to it is that
it's very, how do I put this,
disconcerting when I'm working in a Python project,
but I'm using a framework or so
that generally installs via NPM,
and now my Python project has a package.json in there, and I get very confused at first. And all right, then I run
NPM install in there, and then I'm way more confused. And I mostly just look at this, and I
struggle to make sense of it before the penny drops. Oh, that's right. It's because I'm bad
at computers. I wish people would not keep letting me forget that part.
Is your objection that you can't launch a website these days without JavaScript anymore? Because
a lot of people are angry about that and they send me email more often than you would imagine.
Well, I assume it's your personal fault, right?
I mean, absolutely. Like, again, the secret cabal. We're trying to inflate all of your
applications with as much extraneous code,
with as many security vulnerabilities as we can possibly manage, because I work for the people who, you know, sell storage and virus scanning, obviously.
Emailing you about the world requiring JavaScript is evocative of an old story where some town
manager angrily emailed the CentOS project maintainers because someone installed a web server in his
environment. He pulled it up and this isn't our town's website. It's the default. Welcome to CentOS.
If you're seeing this page, you've successfully installed Apache. Read these docs to configure it
and accuse them of hacking his website. It seems roughly the same level of technical nuance,
blaming you for the proliferation of something in society. I don't know. I mean, I certainly spent five years cheerleading it.
So I feel like people who are like,
you helped make this popular.
I'm like, oh, thank you.
I'm so glad you think I made a difference.
But really, it probably would have happened on its own.
Like I was running after a snowball
that was already running very quickly downhill
and engulfing villages as it went.
Absolutely.
And I do want to talk to you about that in particular,
because as people on this podcast often hear,
I talk about this podcast,
I talk about the AWS Morning Brief, my other podcast,
and I talk about lastweekinaws.com,
where my newsletter lives.
I don't urge people to follow me on Twitter.
I don't talk about the Facebook page I don't have.
And the reason behind all of those things is that I have built an audience on open standards and open platforms so that no one company is able to directly impact what I do and how I do it.
Do you think this is naive? Do you think that the open web was a nice idea and now we're just going
to see increasingly walled gardens as time goes on? I think the openness of your website is,
or your web app, or your sort of technical strategy in general, is always going to be a hybrid. Like AWS is, it's not
rolling your own. You're using a service. If AWS decides that they don't support your service
anymore, which they never do as far as I can tell, but theoretically they could, you would have to
stop doing that, right? You are to some extent locked into AWS, but I don't think that a website
hosted on AWS is like not part of the open web.
I would agree wholeheartedly on that point.
Absolutely.
Right.
I think at that point, you've adopted a tool that works for you and you can move elsewhere.
So there are people who say using JavaScript frameworks, that's not the open web.
You should have been writing your own.
You're dependent on Facebook continuing to maintain React.
And I'm like, well, kind of, but not really.
Like, you don't have to be.
You could write your own website if you wanted to.
This way it's just faster,
in the same way that hosting it on AWS
is faster than spinning up your own machines.
I take it a step further beyond that.
I pay WP Engine, which they manage WordPress for me,
so I don't have to.
And the reason for that is I've managed WordPress in the past,
and I will not go down that path again for love or money. But then, as a fun artifact of that, lastweekinaws.com does in fact live on GCP.
Nice.
But it's WordPress. Worst case, WP Engine shuts down or charges me 80 times more or decides that
now, nope, everything has to move to a new framework, I can migrate it elsewhere.
And the fact that I have that strategic exodus means that I don't need to sit here on everything I build and agonize over, do I go all in on my current hosting provider or not? It's something
that I can migrate with me. And I try and maintain at least that theoretical exodus path. I can
repoint domains to other places. I own the domains myself, and that has
been enough for the way that I view the world. But increasingly, I'm starting to feel like a relic.
Oh, follow me on Instagram. Follow me on TikTok. And if these platforms pull a MySpace and vanish,
then you've got to rebuild your audience from scratch. Whereas email's been with us longer
than I've been alive, and it'll be here long after I'm dead, I can carry that audience with
me regardless of what any particular provider has.
I just wish I didn't feel like such a Captain Edge Case
or someone stuck in the past whenever I articulate that to some folks.
Well, I've been in the industry a long time,
so I think if you're going to sort of say, you know, I've got this opinion,
I'm going to be like, me too, I am also extremely old.
And then we'll talk about the Great War. Wasn't it amazing? And here we are. The browser wars of 97. I was 10. Yes, we make
eternal September references all week. Oh my god. See, we're literally doing that thing that I was
just joking we were going to do. We absolutely are. Yeah, I think you have to pick your battles.
I think sort of the one that I personally struggle most with is databases. You know, I spent
a good chunk of my career as a DBA. I definitely know how to install and configure databases. I
don't want to, you know, like using one of the fancy databases and services where you're just
like, you know, it has an SQL interface and it's got apparently infinite storage and infinite
processor and I don't need to worry about it anymore. Exactly. It has those things because what it also has is someone else's credit card. Done.
Right. It's great. But like to some extent, I'm definitely locking myself into that database
service, right? Like to some extent, I have to find an equally capable service if I ever wanted
to migrate away. So am I still open or am I locked in then? Like I don't think anybody can call
themselves truly independent, or truly open.
So from your perspective of like, you know, what platform am I on?
As long as you're not only on that platform, as long as it's not your only bet, I think,
you know, sure, pile into the Facebook page.
Why not?
Yeah, I have separate problems with that that we need not get into here.
That'll be a whole separate episode there.
So as you look across the past, I don't know, let's call it eight decades that you and I have been in tech together, what are the themes you've seen continue to emerge that people should be
paying attention to moving forward? I think one of the most common mistakes that I see in
technologists who've been in the industry a long time is I can tell
that they're doing it because they start ranting about the fundamentals. And it is my firmly held
conviction, and no one will sway me from it, there is no such thing as the fundamentals. Everybody
comes into the industry at a certain time when a certain set of tools were considered commodities
that you don't need to think about. A certain set of tools were considered commodities that you don't need to think about a certain set of tools were considered like the complicated thing that you need to learn
and a certain set of tools were considered like fluff on top that are bonus but those things are
always drifting downwards right like yesterday's fluff is today's bedrock and the new fluff is
stuff that wasn't invented before and then they start going well you should be able to understand
http and roll your own javascript framework because those are the fundamentals. And I'm like, only to you,
because you came into the industry when that was the complicated thing. The fundamentals to somebody
who started 20 years before you did are like, you need to know about power management and like how
to configure a firewall. Like you were saying at the beginning of this thing, like everybody's
fundamentals are somebody else's fluff. You want to learn how Linux works? Step one, I see this in classes all the time:
learn how Vim works. Right. How about not doing that, and focusing on the differentiated part? My God,
the bizarre cargo culting of Vim. I'm like, you know why the people who are good at Vim are good
at Vim? It's because they've been doing it for 30 years. If you do any tool for 30 years, you're
going to be really good at it. See, you say that, but then you look at me with databases and, I don't know, I might be able
to fool you on that one. Use any tool for 30 years and you'll be so good at it that the switching
cost is too high to go to anything else. But if you're just starting in the industry, you could
start with any editor that you wanted and it would be fine. And by the time you've been using it for
30 years, you'll be like a goddamn wizard at it. Absolutely. So that's
what I tell people is like, the things that you learn now, you're going to have to expect that
they get commoditized. The stack that you live on today will get crushed down to nothing. And you
have to be constantly climbing the stack to what the new thing is. I want to thank you for taking
so much time to speak with me today. If people want to hear more about what you have to say and how you wish to say it, where can they find you? I am most active and responsive on Twitter. My
username is Seldo, and I also own seldo.com, where I blog much less frequently than I would like to.
And we will, of course, put links to both of those into the show notes. Thank you so much for taking
the time to speak with me. I really appreciate it. Thanks for the invitation. It's been a lot of fun.
Really has. Laurie Voss, Senior Data Analyst at Netlify. I'm cloud economist Corey Quinn,
and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star
review on your podcast platform of choice. Whereas if you've hated this podcast, please
leave a five-star review on your podcast platform of choice and an entirely insulting, rambling comment complaining about how I talked about all these different package management systems for different languages and never once mentioned Rust.
If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duckbill Group.
We help companies fix their AWS bill
by making it smaller and less horrifying.
The Duckbill Group works for you, not AWS.
We tailor recommendations to your business,
and we get to the point.
Visit duckbillgroup.com to get started.
This has been a HumblePod production.
Stay humble.