Screaming in the Cloud - Making Machine Learning Invisible with Randall Hunt
Episode Date: March 23, 2021
About Randall
Randall is a Software Engineer and Open Source Developer Advocate at Facebook. Previously of AWS, SpaceX, MongoDB, and NASA.
Links:
Totes Not Amazon: totes-not-amazon.com
Twitter: h...ttps://twitter.com/jrhunt
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, cloud economist Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
and that's not even counting IoT. ExtraHop automatically discovers everything inside the perimeter,
including your cloud workloads and IoT devices,
detects these threats up to 35% faster, and helps you act immediately.
Ask for a free trial of detection and response for AWS today at extrahop.com.
If your mean time to WTF for a security alert is more than a minute,
it's time to look at Lacework. Lacework will help you get your security act together for everything
from compliance service configurations to container app relationships, all without the
need for PhDs in AWS to write the rules. If you're building a secure business on AWS with compliance requirements,
you don't really have time to choose between antivirus or firewall companies to help you
secure your stack. That's why Lacework is built from the ground up for the cloud.
Low effort, high visibility, and detection. To learn more, visit lacework.com.
Welcome to Screaming in the Cloud. I'm Corey Quinn. I've been waiting for this one for a while.
I'm joined this week by Randall Hunt, who is currently a developer advocate at Facebook,
but for years was also a developer advocate slash evangelist slash gadfly slash I can't believe he
hasn't been thrown from the building yet at AWS. Randall, welcome to the show.
Hi, Corey. Thanks for having me.
So I've been trying to get you onto the podcast for a long time, and to their credit, AWS PR
did everything in their power to never respond when I explicitly requested you by name, which
makes perfect sense, because you put the two of us together, we feed off of one another, and suddenly
now it's an incident.
It's a parasitic relationship, symbiotic. I don't know. One of those two.
It's a negative feedback loop, I suspect. But now you don't work there anymore, and I never have,
so the gloves get to come off. But first, before we dive into what was, let's talk a little bit
about what you're doing now. You're a developer advocate at Facebook, as mentioned, with an emphasis on AI and more specifically something called PyTorch.
To my understanding, PyTorch is like TensorFlow, except Google didn't create it.
I'm betting there's more nuance there.
Tell me more.
Well, interestingly, I would say TensorFlow is more like PyTorch now.
So PyTorch is the culmination of years and years of open source projects across a couple of different platforms and ideas.
And it's really, if you've ever heard of a framework called NumPy, it does a lot of kind of statistical analysis and other kinds of data manipulation techniques.
What PyTorch does is it's an open source framework that does everything NumPy does and more, but with GPU-based acceleration and a lot of kinds of runtime improvements
that you couldn't really think about doing.
And the really amusing part is TensorFlow kind of realized the flexibility of PyTorch
back, I would say, like 2018 or even earlier.
And TensorFlow 2 uses the same kind of API that PyTorch does. And I think almost all
of these frameworks have started to move towards the execution methodology that PyTorch enables.
And it's a giant open source project, really huge community going around it. And I'm pretty
proud of all the work that everyone has done on it from, you know, professors at Cornell to kids sitting in their
garage, like blasting techno music, you know, it's just a pretty huge breadth of people.
One of the things that I always appreciated about your talks that you gave when you were at AWS,
and, you know, also back in that era when people went places and sat face to face in rooms with no
masks on. What a concept. I don't remember this at
all. Right? You would get on stage and you would give talks. And you were basically everything I
swore I would never be on stage. You would write code live and see how it went. Yeah, you were
someone who went deep into technical weeds and did live coding rather than contrived demos,
the way that I prefer to do it,
because the demo gods are indeed spiteful.
And sometimes it worked, sometimes it didn't.
You always turned it into a story.
But more to the point,
you were one of the shining lights at AWS
when it came to not just telling me
how to use a given service, but why.
The value it got from this,
because you would start out by writing some silly bot to do something,
and your stated purpose at the beginning was,
let's build a bot to do X,
and now suddenly I am coming on board with you,
I'm going down that road.
Even if it's patently ridiculous,
you painted a picture of what was possible,
and that's something that I don't think that a lot of
AWS storytelling focuses on, to its own detriment.
Well, thank you for saying that. Yeah, I always
really enjoyed doing live coding. To be honest, I got my start. We won this TechCrunch Disrupt
nonsense with a team back in, I think it was 2011 or something, maybe even 2012. And I had never
live coded on stage before, but our demo broke while we were speaking. So I started live coding
and the audience just went crazy and laughed. And I was kind of joking as I did it. And I got
addicted to the standup comedy-esque thrill of trying to code live on stage.
And I've done it ever since, I guess.
I feel very similar to you, too.
But the thing that really was revelatory for me, and it sounds like it is for you, too,
it's the wrong framing to view it as stand-up comedy.
Because I thought that's what I did.
Turns out, absolutely not.
Stand-up comics rehearse and repeat and prepare and write constantly.
Whereas what I do is show up unprepared,
which is properly known as improv.
So it's stand-up style,
but it's a lot less work slash planning
slash caring about the outcome, apparently.
Well, I think there's a rehearsal component to it.
So I found the more successful talk-
Oh, here's where we diverge.
You actually prepare for things.
Oops.
Yes.
I had to learn that the hard way.
So I didn't start out doing that in my, in my early twenties, that was not the case, but
I found that the best talks are the ones that are a mix of extemporaneous improvisation. So
even audience prompted. So the audience feels like they have a stake in the game. So if you
take a poll of the audience and you say, Hey, what should we build today? And you can kind of
take suggestions and
craft them into something you already have mentally prepared in your head. So it seems
more extemporaneous than it really is. So I'm kind of revealing my trick here. If you ever see me
go do a live coding demo, almost all of them involve some form of audience participation,
voting. So going and saying, oh, let's vote on it, Vim or Emacs or
something like that. You know, these are things that the audience can get very involved in.
And I definitely have every possible way of building a voting app memorized at this point.
So it's cheating a little bit. You know, I'm not coding completely from scratch. It's in my head.
I've done it before most of the time.
So help me recapture that old magic,
I guess. My position on AI slash machine learning is it's a cleverly executed scam across the board
because all the cloud providers are saying up and down that you need to do machine learning to
remain competitive. And okay, great. I would expect that when you scratch beneath the surface
and you
look at what machine learning requires, which is basically a whole bunch of compute and a whole
bunch of storage. So yeah, I can get why the big cloud providers who sell both of those things
would be interested in advocating for this. Dollar dollar bills. Exactly. But when you look at the
stories of people using machine learning or AI in the real world, they always have the ring of either something
that is extraordinarily use case specific to one company or is patently ridiculous. I mean,
the classic example was WeWork used advanced machine learning algorithms to learn that there
was a bottleneck in several of their facilities in certain times of the morning, and they alleviated
that by hiring a second barista.
Not kidding. The response is, wait, you spent how much on data science and machine learning to figure out that people like to drink coffee in the morning? It just seemed insane from that
perspective. So change my mind. You've been a big advocate of it for a long time. What is the
business value other than it makes you sound more expensive and thus can command a higher salary,
which respect.
Okay. You mentioned WeWork. And I just want to say, if anybody from WeWork is
listening to this, the WeWork across the street from my house has had a light shining into my
face for literally a year. Please turn it off. Okay. Now let's focus on machine learning. So
when I think about machine learning, I think about it as, you know, on the x-axis,
you will have, say, compute, and on the y-axis, you will have the amount of data that's available
to you. And there's a great graphic, maybe I can send it to you later, that the SciPy team produced,
which says which machine learning techniques or deep learning techniques should be used for
which sets of data and things like that. Now, traditional machine learning,
it's everywhere in the world around us. So the voice activation filter on the microphone that
you're using, that is likely using machine learning now instead of using some hardware
pop filter or something. The noise subtraction that you hear in audio engineering will be using
deep learning now instead of trying to do it with frequency
analysis tools. The ad bidding, all of this other stuff, the typical specific use cases,
those are all still using machine learning as well. But it's a little bit of a misconception
that you need to build these bigger and bigger and grander and grander models. There's definitely a
part of the industry that's moving that way. So there's a part of the industry that says, hey, let's have these terabyte
models that have billions and billions of parameters. There's a great talk by Geoffrey
Hinton back in, I don't know, I can't remember the date, but there's a great talk by Geoffrey
Hinton that goes over the number of fixations that a human brain makes, right? So a human brain is
what a lot of artificial intelligence
machine learning is based on, like my entire Twitter account, for instance. And if you take
the number of fixations that a human brain makes, it's like 10 billion fixations over the course of
its lifetime, where we can outperform human brains with way fewer parameters than that. So the ways machine learning is being applied
and the things that are being done these days
are ubiquitous.
I mean, it really is everywhere,
but the things that are working
and the things that are easy
are the ones you don't see.
So machine learning, when it's done correctly,
you don't even realize it's happening in the background.
You don't see what algorithms are going on and doing these predictions. And so you might not see it, but it's pretty much
totesnotamazon.com. That's totes-not-amazon.com. And it uses a
Markov generator every time someone visits the page that is trained on the corpus of all, I think,
what is it, 12,000 AWS service announcements dating back to 2004. And some are hilarious,
some make no sense, and some are hilarious and make no sense. And the funniest ones, of course, are where you don't actually know if it was lifted wholesale from an actual
release announcement or not.
Yeah, I think that's a funny thing. These models now, like GPT-3 and
BERT and other things, where you're starting to wonder if a human wrote this as satire,
or if it's actually AI.
Well, to be fair, that's some of the release announcements that humans apparently do write, or I assume it's humans. It could just be machine learning
experiments. My personal favorite is whenever they come out with an announcement that I wind
up reading. And at the end of it, it was, well, those are certainly a bunch of words,
none of which I understood in the context they were used.
I just went to this site and the headline that I got was Amazon Polly adds bilingual Indian English Hindi language support for AWS CodePipeline AWS Config adds support for non-RFC 1918 address ranges.
That's amazing.
That's the sort of thing we're talking about.
It's a blast.
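For anyone curious what a Markov generator like the one described above might look like, here is a minimal sketch. It is not the code behind the site. The sample headlines are made-up placeholders standing in for the real corpus of announcements, and the names build_chain and generate are hypothetical.

```python
import random
from collections import defaultdict

# Made-up placeholder headlines (an assumption for illustration); the site is
# described as training on thousands of real announcements instead.
SAMPLE_HEADLINES = [
    "Amazon Polly adds support for new voices",
    "AWS Config adds support for new resource types",
    "AWS CodePipeline adds support for new regions",
    "Amazon Polly now available in new regions",
]

START, END = "<s>", "</s>"


def build_chain(headlines):
    """Map each word to the list of words observed to follow it."""
    chain = defaultdict(list)
    for headline in headlines:
        words = [START] + headline.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng, max_words=30):
    """Walk the chain from START until END is drawn or max_words is reached."""
    word, output = START, []
    while len(output) < max_words:
        word = rng.choice(chain[word])
        if word == END:
            break
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    chain = build_chain(SAMPLE_HEADLINES)
    print(generate(chain, random.Random()))
```

Because each next word depends only on the current one, the output can stitch the front of one headline onto the back of another, which is where the plausible-sounding nonsense comes from.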
I did some experimentation with GPT-2 and that turned into some interesting stuff. But for that, I had a bit more fun and turned it loose on effectively,
not the release announcements, but all of the blog posts that Jeff Barr wrote dating back to
the launch of AWS. And it's fun because Jeff has a personality and it would capture entire sentences
that I thought were hilarious. And that's ridiculous. What a human sounding bot this is.
And then I would check the corpus and that exact entire sentence was used. And it was a lot more sensible in the context
where it was originally found. It just becomes ridiculous when you surround it with other
unrelated things. So honestly, having talked to people who are good at this, it turns out I don't
fully understand how training works and how to tune it to get the best results possible. I'm
mostly punting until GPT-3 becomes more available.
Yeah, well, there are even things
beyond GPT-3 already as well.
You know, I don't know if you remember back in 2018,
I did something similar
as I scraped all of the blog posts.
I used to write for the AWS blog.
I think I wrote 50 or so posts over the years.
And I scraped not only my posts,
but all the posts from all the authors on all
AWS blogs. And I created a model that would generate sort of like this Totes Not Amazon
site, it would generate fake blog posts. And I did this all live on Twitch. So that kind of goes
back to the live coding is, I think there's this general idea in the industry that machine learning
is some secret. And I think the machine learning engineers are kind of incentivized to keep it that way because they can earn the big bucks as long as it's a specialty.
But the reality is most of the machine learning techniques and training techniques are all sort of automated.
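To make the building blocks named in this part of the conversation concrete (a fully connected layer, an activation function, a softmax), here is a minimal pure-Python sketch. Frameworks provide all of these ready-made; the function names and the hand-picked numbers below are assumptions for illustration, and a real model would learn its weights from data.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """A common activation function: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical hand-picked inputs and weights, not learned values.
    x = [1.0, 2.0]
    hidden = relu(dense(x, [[0.5, -1.0], [1.0, 1.0]], [0.0, 0.0]))
    probs = softmax(dense(hidden, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
    print(probs)
```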
So, you know, you're saying, you know, do a softmax here.
OK, do an activation function, you know, do this layer, do this fully connected
layer, whatever. All of that stuff is completely automated at this point. You don't really need
to think about it. You can tune it, you can play with it if you want, but the gross majority of
machine learning these days is really just in data preparation. And if you look at the Twitch video
that I did a couple of years ago, we spent more time writing a scraper to download all of the blog posts than we did training and deploying the model on SageMaker. So it's interesting to see
all this focus from all the different industries and blog posts talking about how
ML is this huge, big thing, but it's data science again. That's where everything's at,
is kind of doing feature engineering and data science. And I wish I would see more focus in the industry on that side of
things. That's the challenge though, is the entire industry is expanding and it's getting bigger all
the time, which reminds me of a lot of things from yesteryear and really the reasons years ago,
I wanted to have you on to have some of these discussions. So let's do it now, pivoting away from the machine learning morass for the moment. Let's talk about big cloud
providers and their growth and what we're seeing in the market. So let's begin with a provocative
thing that'll start a fight. If you could change one thing at AWS, what would it be?
I've tweeted about this.
I would love for there to be within AWS an S-team level goal,
an Andy Jassy level goal that says
all of our APIs that we release
and all of our developer experience that we release
will be gatekept
until this number of internal developers agree that it's a pleasant experience.
I can rant more about this, but if I had my druthers, if I could change anything, I think
the single most important change that could happen would be to have developer experience
as a core goal for every single service team, except WorkDocs.
Well, I love the idea in principle, but it sounds like it could turn into a few things that are
terrible. One being that then you wind up with the perfect becoming the enemy of the good and
never shipping anything. And the other is that it's become very apparent to me over the years,
watching AWS releases and having a launch day product that is, how to put it,
basically crap. And it learns from its customers. And two years later, you look at that product
again, and it is worlds better. I don't know that you'd be able to get it better if you had, A,
launched later in the process without that customer perspective, or if you had not built things
that were reflected by the use cases customers put them to?
I think that's a fair call out. And I'll just kind of say, yeah, but AWS is pretty well established
at this point, they can afford to take an extra second to get things done correctly or to do a
finesse and polish check. I don't know if you remember the Elastic Kubernetes Service launch
back in, say, I can't remember, was it 2017? That was essentially at the time not a real service.
It was here, run this AMI, and that was the entirety of it. That's not a pleasant
developer experience. So everyone who had access to the service in preview was sitting there
complaining about the service and the complete really like lack of a coherent presentation layer.
They weren't asking for new features because they were too busy complaining about how
frustrating the API and that side of things were.
So I think there's a balance, obviously, and businesses want to move fast.
But here's the thing.
The kind of thinking that got AWS to where they are right now is not the kind of thinking that's going to take them into the future, because the problems that they've created
with this onslaught of new services and new information are impossible for an
individual developer to overcome with the current developer experience. And I can go on at length
about this, but, you know, the particular services that I would call out. But if I hearken back to
the early days of AWS, when EC2 came out, for instance, one of the things that made me go to
work for AWS in the first place was they never had a run instance API.
So you think about launching an EC2 instance, right?
It's like, okay, cool, run instance.
It wasn't run instance.
It was always run instances.
It was plural from the beginning.
And that seems obvious to us in 2021.
In 2011 or 2010, people weren't thinking about launching multiple instances simultaneously.
I mean, some people were, obviously, but it wasn't a common theme back then.
So they were thinking ahead about the developer experience,
and they weren't painting themselves into a corner that's impossible to escape from.
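The plural-from-the-start design point can be sketched in a few lines. This is a hypothetical toy, not the AWS SDK: the run_instances function, the Instance type, the capacity parameter, and the ID format are all invented for illustration, loosely modeled on the way the real EC2 RunInstances call takes a minimum and maximum count. The single-instance case is simply a list of length one, so no second API is needed later.

```python
from dataclasses import dataclass
from itertools import count

_next_id = count(1)  # toy ID source; real instance IDs look different


@dataclass(frozen=True)
class Instance:
    instance_id: str
    instance_type: str


def run_instances(instance_type, min_count=1, max_count=1, capacity=10):
    """Hypothetical plural launch call that always returns a list.

    Launches up to max_count instances, limited by the pretend capacity,
    and refuses to launch anything if min_count cannot be satisfied.
    """
    if not 1 <= min_count <= max_count:
        raise ValueError("need 1 <= min_count <= max_count")
    if capacity < min_count:
        raise RuntimeError("insufficient capacity for min_count")
    launched = min(max_count, capacity)
    return [
        Instance(f"i-{next(_next_id):04d}", instance_type)
        for _ in range(launched)
    ]


if __name__ == "__main__":
    print(run_instances("t3.nano"))  # one instance, still a list
    print(run_instances("t3.nano", min_count=2, max_count=5))
```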
This is why you have different versions of different APIs that are expressed across different services now.
And you'll see different things sort of reinventing themselves and relaunching as new services
because that lets them internally get more buy-in and get more effort kind of devoted to this new
shiny thing. And the real value, the things that would drive tremendous value for customers, even if they're not necessarily
asking for it, would be to focus on the developer experience, the hands-on keyboard experience of
using AWS.
The challenge that I think we're seeing too is, if you go back to its inception,
AWS was very clearly aimed at engineers slash builders slash certainly sysadmins,
folks who for one reason or another were working with the computers an awful lot.
And it's at the time of this recording turned them into what is basically a $46 billion a
year business. And it's gotten them super far and it's carried them super well.
The challenge, though, is that their next $46 billion a year is not going to come from the same places that the first one did.
I mean, we talk about a $2 trillion IT industry and growing.
A lot of the folks that are still running on-premises are now not building net new on top of cloud, but they're migrating in.
They have a different philosophy. They have a different approach. And when you say developer experience, their response quite reasonably is, yeah, we don't hire developers. We're just trying
to run our corporate IT somewhere. And there's a lot wrong with that, but that is their position.
And I understand that, but I think it's a little myopic to have that view. I want to look at
Microsoft for a second. So Microsoft has made a pretty
ingenious acquisition of GitHub. Microsoft also runs VS Code. The gross majority of new developers
are learning and getting started with VS Code as the platform. That is their IDE. That is what they
spend all their time in from high school and on. That is what they know. So if you hire someone who's 21 today,
it's very likely the only IDE they've ever used is VS Code.
So Microsoft now owns the code where it lives.
They own, with GitHub Actions, the CI/CD of the code.
They own the IDE, which is the developer experience.
And admittedly, thankfully, this is all open source.
But where is AWS in that developer experience story?
Because the acquisition of developers is the funnel that drives your long-term business success. I agree that you have these great efforts that can go on in parallel for
enterprise sales and inside sales and all this other go-to-market activities with these much
larger customers. However, think about Twilio. Twilio was started by an AWS or an Amazon product
manager. Twilio became what they are today by focusing and obsessing over the developer
experience. They have by far one of the best developer experiences.
You know, they set the industry standard there.
As long as we're not talking about their SendGrid division,
I would wholeheartedly agree with you.
Yes, no comment on SendGrid.
I haven't used it in probably a decade.
I used to.
I love what it does as far as an email cannon goes,
but the API is kind of sad.
If you're listening to this at SendGrid, please reach out.
So I don't see these two efforts as mutually exclusive.
Improving developer experience will improve your sales across the board because companies
have to hire new people.
You know, even if they're running on-prem, they're still hiring new people if they're
growing business.
And if they're not a growing business, does it really matter if you're getting their business right?
I mean, the economy is going to move on.
You don't necessarily need to capture
these dying companies.
You want to capture the companies
that are going to grow along with your business.
And the companies that are growing
are hiring new developers.
Those new developers have a very
Microsoft-oriented worldview
with VS Code and GitHub
and all of these other things.
Where's AWS in that story?
And that's what I would love to see change at AWS is this huge focus on developer experience.
I think it would transform the company.
Incidents happen fast, but they don't come
out of nowhere like AWS bills do. If they're watching, your team can catch the sudden shifts
in performance, but who has time to
constantly check thousands of hosts, services, and containers? That's where New Relic Lookout
comes in. Part of full-stack observability, it compares current performance to past performance
just like you're not supposed to do in the stock market, then displays it in an estate-wide view
of your entire system. Sign up for free at newrelic.com and start moving faster than ever.
I would agree with you.
Part of the challenge is that AWS is fantastic at building the plumbing,
and they have trouble with the porcelain.
Sure, you can view that through whatever toilet analogy lens you want,
but they're great at building the blocks you can use to construct something.
What I think that they abdicate almost entirely,
and we talked about this with the approach you take to telling stories,
is that they don't tell the story about what you can do with those things that they're giving for you.
Sure, you give me a bunch of bricks and talk to me about building a house,
but you haven't demonstrated to me what that house might look like.
Help me out.
And sure, your customers can get on stage, but I kind of want something that
stands somewhere in between the two continuum endpoints of Hello World and Netflix.
Yes, I agree. And it's not like all of AWS suffers from this, right? It's just a majority. I would say
there are two projects, maybe three projects that I'm a huge fan of right now. One is AWS Amplify.
So Amplify is an open source project.
It's a service and it's a console.
And a breakfast cereal too, I kid.
Oh, I could use a breakfast cereal.
So Amplify has really just been completely focused on the developer experience.
And you can tell, you can see it
in their documentation. You can see it in the way that their advocates go out and are speaking to
developers and telling stories. And they're hitting a whole new generation of developers.
So a lot of developers these days, I don't know if you see all this Twitter controversy that goes on.
Somebody tweeted about like how they would never hire a front-end developer or front-end developers are only junior.
Did you see that?
Yes, I did.
And, in fact, that Twitter rando was the CEO of Shopify, which is a bit of a challenge.
I understand aspects of the sentiment, which is, for example, you will become a better engineer by understanding other aspects than just the area that you're focusing on. But front-end
is just for junior crappy engineers is absolutely not a helpful sentiment. I'm also not entirely
convinced that that was the point he was trying to make. But there is the nuance, of course,
that I'm not here to do communications, PR, spin, marketing, messaging, et cetera, for the CEO of Shopify. He can clarify
his own statement. The way he put it was tone deaf and bad.
Right. And I see where that stigma or
whatever is coming from. But the reality of the situation is front-end engineers today are by necessity
some of the best engineers out there.
And it requires a complete understanding of a complex set of services.
If I go right now and I want to build a backend
that can scale to millions and millions of users,
billions of messages per second, whatever,
with AWS, as long as I'm blindly swiping my credit card,
that is not a problem.
I can make that work, right? On the front end, you have so much more nuance, you have so many
more complications. And one of the things that AWS Amplify is doing is they're making that front end
more accessible to more developers. And they're taking that front end skillset and gently
introducing the cloud skillset in addition to it. And I don't think people fully grok how good of an onboarding tool that is,
because while developers might start with AWS Amplify, a couple years later, they're not just
using Amplify, they're using a slew of AWS services. And when they go to work at other places,
and they want to rapidly prototype something, they're not just going to pick Amplify, they're
going to pick all of AWS. And, you know, Elastic Beanstalk did something similar to this back in, I would say, 2014-ish. It never
really lived up to the vision, I would say. I think Elastic Beanstalk is awesome. It's my baby.
I love it. But if I'm being perfectly blunt, of course, no, it didn't do what we wanted. And
there was this other service. I don't know if you remember. Do you remember CodeStar?
Of course I do. You have to understand, I'm a walking encyclopedia of everything that
AWS has ever done. It was sort of their unifying approach to take an opinionated build of,
I want to set up a new project. Well, it's going to spin up a CodeCommit repository. It's going
to set up a CodeDeploy pipeline. It's going to use CodeBuild to wind up building these things.
Effectively, it was a,
I almost want to use the term Potemkin Village, of all the build tools that AWS offers, which you
only use if you don't have the option of using things that are way better at each of the individual
things that they do.
Yeah, pretty much. And CodeStar was supposed to be this onboarding tool.
But what CodeStar focused on was kind of the console experience. They didn't focus on the command line tool or the developer experience or the APIs. So like the only real way to access
CodeStar in the beginning was through the console. So what Amplify has done is they have you using
the AWS Amplify SDK before you ever even create an AWS account. That's a huge, huge difference
in developer experience, especially in onboarding.
But even going beyond the kind of beginner side of things, Amplify is able to help experienced AWS developers take pain points with things like Cognito and make them much simpler to deal with.
And there's also this concept of evolving. So there's this thing called Lightsail that AWS launched a couple of years ago
to make it easier for developers to go
and launch stuff without worrying
about specific costs and overruns
because they would just charge you
this flat monthly fee.
Lightsail is great, but again,
it's not Heroku.
It's not a command line
that you're doing to deploy your app.
It was more complicated than that.
So Amplify is one project. I'm a huge fan of it.
I can target LinkedIn better than what I think they're doing
right. They're really focused on developer experience.
The other thing that I bet
this is one you'll disagree with me on
is CDK.
Oh, I'm thrilled to have that particular debate
with you. In fact, as of the time of this recording,
roughly a week ago, I did an article
on building a toy app to build
out a bot that counts my
Twitter followers and writes it to Dynamo as an excuse to play with the CDK. My experience was,
is after an hour, I gave up and went back to SAM CLI.
So first of all, I love SAM. Chris Munns and
I have spent years working together and presenting together at various conferences. Chris Munns is an
amazing speaker and the whole serverless AWS developer advocate team is a group of really amazing and talented
and very, very eloquent storytellers. I think that SAM is really playing catch up to a lot of
other frameworks. So even is doing more faster than SAM is. And part of that is because SAM is based on CloudFormation and CloudFormation transforms and all of these other different CloudFormation techniques are being developed in parallel to the things that SAM should be doing.
For a long time, just doing something as simple as S3 event notifications in SAM was phenomenally difficult.
And it only really got addressed, I would say, in 2019. So it took three or four years for it to
really become a solved problem. What I like about CDK is, well, let's take a step back.
When you define infrastructure, what do you think about?
Usually, I think of, first, I have a workload, usually,
that I want to put on that infrastructure that's already built or defined somehow, or some form of
code I want out there. I think of infrastructure deployment usually being more of a one-and-then-done
approach, as opposed to continuing to iterate on the application that lives in that infrastructure.
Now, that's a bit of an outmoded way of thinking in some respects,
but every time I push code, I don't necessarily want to reprovision the database
that holds the data that code talks to, for instance.
Gotcha. And if you were to walk back that thought to, say, the mid-2000s,
how do you think about provisioning hardware?
In the mid-2000s, my entire approach, because I was provisioning hardware then,
was you had to do a lot more capacity planning. There was a six-week lead time instead of a
six-second lead time. There was a lot of building excess capacity in, and you were just getting
around to this idea of virtualizing things, because it was way faster to spin up a new VM, even at slow
speeds, than it was to provision, wipe, and reinstall something on bare metal. Right. So even that, it was
starting to get in that direction. Now I love that everything's an API call away, and you can have more
compute power than, like, the entire 20th century of humanity. Yes. It's amazing to me that a T2 nano or a T3 nano has something like one and that was a little bit of a foreign concept to a lot of people.
But I think code is, one, more expressive and, two, better suited to cloud deployments.
Now, I have lots of reasons for this, but I do want to issue one point of caution because a lot of times, this is something I've
even seen very recently in some folks that I've been working with. There are different kinds of
engineers who approach problems in different ways. So a person who is primarily a software engineer,
they will most likely follow a principle of DRY, so, don't repeat yourself. And that becomes
something that they want to take into their infrastructure deployment
and into their continuous integration deployments and things like that as well.
My strong suggestion, and of course, this is not always true, is that don't repeat yourself
is somewhat the enemy of a lot of infrastructure deployment.
Because when you try to get too clever with CDK,
when you try and make everything, you know, oh, let me add one aspect to this one stack and have
everything deploy magically, you know, with one line of code, you've probably over-engineered
the problem. It is perfectly okay to repeat yourself a few times
when defining CDK code. I have a cool story about a CDK project, which is what brought me onto it.
But originally, I was kind of like you. I was like super, super skeptical. I didn't really
buy into the value of it. And then I used it in a real-world production project.
I love the concept at a high level of what the CDK offers.
And it's better than it was when I first played with it a while back.
But there are problems with it.
There is on some level, in many cases, a distance between the developer and the infrastructure.
That's a different philosophy.
And it takes time for companies to get used to that and cultures to shift.
Let's skip past that because eventually you're right.
It's going to be unified. The idea of having all of my infrastructure
defined in my code base for the application means that suddenly I have to integrate CI/CD
in a much more meaningful, thoughtful way for all of the application workloads. It means
that my code and the structure of my code projects in its repository is dictated by how the infrastructure looks.
It opens up a number of cans of worms.
It does absolutely speed iteration and make this more accessible to developers.
That's no small thing.
But there is a challenge to that.
It requires a different way of thinking.
And it's very challenging in my experience to wind up mapping that to something that isn't greenfield.
I would agree with that, actually. So I had tried port wanted to kind of move into CDK. And I found it challenging just
because I had too much kind of random things going on. And that side of the infrastructure,
surprisingly, was the most reliable, but the least iterable, if that makes any sense.
It was very reliable as long as you don't touch it.
Whereas with the CDK project that I built, so this is, I don't know if you saw Amazon Connect chat. They released a couple of things around re:Invent timeframe. And one of the things that
they released is built on top of CDK. So that project was kind of my baby. It was something
that I worked on pretty aggressively and it was a greenfield project.
And I started with CDK because there was a little bit of a push internally to explore it for new use cases and stuff.
And I was like, you know, I was kind of hesitant.
I was like, I don't like this project structure.
I don't like any of this.
But over time, you know, that's the same way I felt about DynamoDB in the beginning, to be honest.
Do you remember single table design?
I do indeed. I remember a lot of use cases where it solves problems and twice as many use cases
where it gets even worse. Yeah. And it's a different way of thinking about problems
that once you grok it, once you kind of have that mental model in place, you can see the value of it.
You can see where it can be applied. So I'm not saying CDK is perfect for
everything. What I'm saying is if you do have this greenfield project and you can kind of work around
some of the new folder structures and where you're keeping things and how you're thinking about the
construction of your stack and your CI/CD, I was able to take a project from literally nothing to deployed in multiple regions,
running production workloads over the course of about three hours.
So in three hours, that one stack was working.
And then as I wanted to implement more things,
let's say I wanted to put IAM permission boundaries in there.
Traditionally, with CloudFormation or Terraform or something like that, I'd have to write
a whole section that would go and apply either a transform or something else that would go
and apply to all of these different resources that were being provisioned.
On the CDK side, I applied one aspect to the stack and those IAM permission boundaries
were applied across the board.
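To make the aspect idea concrete, here is a minimal sketch in plain Python. It is a toy model, not the CDK library: the `Construct` class, the `apply_aspect` helper, the resource names, and the policy ARN are all hypothetical and exist only for illustration. In the real CDK the equivalent hook is, as far as I know, `Aspects.of(scope).add(...)`, with the aspect exposing a `visit` method. The point is only that one visitor walks the whole tree, so a single line at the top covers every role underneath.

```python
from dataclasses import dataclass, field


@dataclass
class Construct:
    """Toy stand-in for a node in an infrastructure tree (not the real CDK class)."""
    name: str
    kind: str = "Group"
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, child: "Construct") -> "Construct":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


class PermissionsBoundaryAspect:
    """Visitor that stamps a boundary policy onto every IAM role it sees."""

    def __init__(self, boundary_arn: str):
        self.boundary_arn = boundary_arn

    def visit(self, node: Construct) -> None:
        # Only roles get the boundary; every other resource is left alone.
        if node.kind == "IAM::Role":
            node.props["PermissionsBoundary"] = self.boundary_arn


def apply_aspect(root: Construct, aspect) -> None:
    """Run the aspect once over the whole tree."""
    for node in root.walk():
        aspect.visit(node)


# Hypothetical stack: two roles nested at different depths, plus a bucket.
stack = Construct("ChatStack")
api = stack.add(Construct("Api"))
api.add(Construct("HandlerRole", kind="IAM::Role"))
stack.add(Construct("WorkerRole", kind="IAM::Role"))
stack.add(Construct("Uploads", kind="S3::Bucket"))

# The single "apply it across the board" line.
apply_aspect(stack, PermissionsBoundaryAspect("arn:aws:iam::123456789012:policy/ExampleBoundary"))

for node in stack.walk():
    print(node.name, node.props)
```

Adding a third role anywhere in the tree would need no extra boundary code, which is the contrast with hand-editing each resource in a template.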
Let's say I want to tag things.
This is easier now in CloudFormation than it was. But again, I can have programmatically generated tags
as the stack is being deployed. It can look up what region it's in. It can look up what account
it's in. It can roll up. It can do all kinds of good things with organizations and cost reporting.
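As a rough illustration of programmatically generated tags, here is a small plain-Python sketch. It does not use the CDK itself (where, as far as I know, the usual entry point is `Tags.of(scope).add(key, value)`); the function names, the tag keys, the `CostCenter` naming scheme, the account ID, and the resources are all assumptions made up for the example.

```python
def build_tags(account_id: str, region: str, project: str) -> dict:
    """Derive cost-reporting tags from where the stack is being deployed."""
    return {
        "Project": project,
        "Account": account_id,
        "Region": region,
        "CostCenter": f"{project}-{region}",
    }


def tag_all(resources: list, tags: dict) -> list:
    """Return copies of the resources with the shared tags merged in.

    Tags already set on a resource win over the generated ones.
    """
    return [{**r, "Tags": {**tags, **r.get("Tags", {})}} for r in resources]


# Hypothetical resources and deployment targets, for illustration only.
resources = [
    {"Name": "Uploads", "Type": "S3::Bucket"},
    {"Name": "Sessions", "Type": "DynamoDB::Table", "Tags": {"CostCenter": "shared"}},
]

for region in ("us-east-1", "eu-west-1"):
    tagged = tag_all(resources, build_tags("123456789012", region, "chat"))
    for r in tagged:
        print(region, r["Name"], r["Tags"])
```

Because the tags are computed from the account and region at deploy time, the same definition yields per-region cost tags without any per-resource edits.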
I've found the speed of iteration with CDK and the reliability of that iteration to be drastically improved over, say, traditional
CloudFormation or Terraform or anything like that. And that was the huge win. And that's why I think
CDK is such a valuable project for greenfield especially. I will meet you in the middle and
agree to suspend judgment pending further explorations with it then. Yeah, we should
chat about this sometime. I can give you a guided introduction,
a no-nonsense tour.
I think you'd like it a lot.
You would be the third person to have done so
if we were to go down that path.
But I will keep my mind open
and maybe we'll even live stream it for fun.
That would be fun.
I want to thank you for taking the time to speak with me
now that AWS couldn't keep the fire
and the gunpowder keg separate anymore.
Yeah. And hey, I'll say this. I feel like there's a little bit of a sentiment that I dislike AWS or that I have bad feelings towards them. And that's not the case at all. I'm actually
a pretty huge AWS fan. I joined AWS because I was a customer of AWS first, and I was obsessed
with the product. I loved it. I met Jeff Barr in my
interview and he was like, okay, well, let's fix this blog post. Here's how you log into Emacs
and stuff. And that was probably my first month of work there. And I got to spend the next several
years working on a series of really, really exciting projects. But the thing that I loved
most about my time at AWS, if I could just take everything in its entirety, the customers that I
met and the amount of the community and just going to re:Invent every year. Holy smokes. I actually
left AWS for a year and came back and I made the decision to come back my first day at re:Invent as
a customer. I guess it's a palpable energy and I was sad to miss that this year. I really hope that there's a better story at re:Invent this year than there was last year
in terms of getting people together. But I also want to keep that strong online component because
it's so much more accessible to folks who can't take a week to fly to Las Vegas and
not do work for that entire timeframe and pay $2,000 for a ticket.
Yep.
So combine the best of both, I think.
I desperately need to get out of my home office and meet people again.
And there's another aspect of that is that every demo and every talk I've ever made
was built on a train or a plane or in the back of a van
driving through the floods of Bangkok
on my way to my next meetup or something. I need that kind of forcing function of the travel to
help me think creatively and build new compelling stories to tell developers.
Yeah, I think we could absolutely come up with something terrifying in the somewhat near future.
More to come on that as it unfolds. Randall,
thank you so much for taking the time to speak with me today. If people want to hear more about
what you're up to, what ridiculous ideas you have, or mostly just want to see you kick people in the
shins, where can they find you? I am pretty much just on Twitter. So twitter.com/jrhunt
is my handle. And I post a lot about AWS and a lot about machine learning and occasionally about
sci-fi books. Excellent. We will, of course, put links to that in the show notes. Thanks so much.
I appreciate your time. Thanks for having me. Randall Hunt, developer advocate at Facebook.
I'm cloud economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this
podcast, please leave a five-star review on Apple Podcasts.
Whereas if you've hated this podcast,
please leave a five-star review on Apple Podcasts
or your podcast platform of choice,
along with a comment that is entirely generated
by machine learning.
This has been this week's episode
of Screaming in the Cloud.
You can also find more Corey
at screaminginthecloud.com
or wherever fine snark is sold.
This has been a HumblePod production.
Stay humble.