Screaming in the Cloud - Opening the Managed NAT Gateway with Malith Rajapakse
Episode Date: May 15, 2025

How does one manage to simplify the complexities of the NAT Gateway? In this episode of "Screaming in the Cloud," Corey Quinn interviews Malith Rajapakse, a DevOps engineer who has recently received acclaim for his blog post discussing the Managed NAT Gateway. Where AWS lacks in its documentation, Malith is a NATural at breaking things down. He’s so great at it that Corey had to invite him on the show! Malith shares the story behind his popular post, his creative process, and his use of interactive diagrams and engaging content. He and Corey also discuss the challenges of documentation and making technical subjects more appealing. Thankfully, Malith has already done that in written form, so enjoy this episode as he speaks it into the world!

Show Highlights
(0:00) Intro
(1:24) The Duckbill Group sponsor read
(1:58) Malith's background before his blog post
(4:21) Why Malith wrote about the Managed NAT Gateway
(5:38) Corey's problems with Managed NAT Gateway and why Malith's blog post impressed him
(10:05) The interactive elements of Malith's blog post and how they were made
(12:21) Malith's front-end experience
(14:47) Transitioning from front-end to DevOps through JavaScript
(16:20) The juxtaposition of Malith's blog post vs. AWS's official documentation
(18:05) How AWS's documentation of the managed NAT gateway isn't user-friendly
(22:27) Why Malith went all out for his first blog post
(23:17) Corey's constructive feedback for Malith
(26:05) Where you can find more from Malith

About Malith Rajapakse
Malith is a DevOps engineer creating visualisations at https://malithr.com/.

Links
Malith’s blog: https://malithr.com/
Interactive AWS NAT Gateway: https://malithr.com/aws/natgateway/
LinkedIn: https://www.linkedin.com/in/malith-rajapakse/
Bluesky: https://bsky.app/profile/malithr.com
Twitter: https://x.com/malithraj
Reddit: https://www.reddit.com/user/mdilraj/
Sam Rose’s blog: https://samwho.dev/
Benjamin Dicken’s blog post on IO devices and latency: https://planetscale.com/blog/io-devices-and-latency
Josh W Comeau’s blog: https://www.joshwcomeau.com/
Killed By Google: https://killedbygoogle.com/

Sponsor
The Duckbill Group: duckbillgroup.com
Transcript
I wouldn't read it.
That's the first reason.
So if I wouldn't read it,
I can't expect others to read it either.
So that was the first reason.
The second reason is I think for me,
sitting down and writing is the hard part.
And like I said earlier,
I started with the diagrams, right?
I started with the diagrams
and with the intention that once I create the diagrams,
I'll be forced to write the content.
It did happen.
I was dragging my feet, trying to like put words on the page.
Welcome to Screaming in the Cloud. I'm Corey Quinn and my guest today is
not typical of most guests I have on the show and I don't even slightly care because I am tickled pink to have him.
Malith Rajapakse is a DevOps engineer who self-describes as making interesting
visualizations and okay, great, that can go in a lot of different directions, but
what first came to my notice is a blog post that he put out, oh, let's call
it three weeks or so before the recording of this show,
that purports to explain how the managed NAT gateway works.
Insert obvious "it runs on money" joke here,
but it had some embedded visualizations
that I think did a better job of explaining
how this thing actually works
than anything AWS has ever done.
Malith, thank you for speaking to me.
Yeah, great to be here.
This episode is sponsored
in part by my day job, The Duckbill Group. Do you have a horrifying AWS bill? That can mean a lot of
things: predicting what it's going to be, determining what it should be, negotiating your next long-term
contract with AWS, or just figuring out why it increasingly resembles a phone number, but nobody seems to quite know why that is.
To learn more, visit duckbillgroup.com. Remember, you can't duck the Duckbill bill.
And my CEO informs me that is absolutely not our slogan.
So I have to ask, where did you come from?
This is the sort of thing that usually someone puts out after a long period of time having worked in the space,
having done a bunch of more traditional blog posts
that fit in Markdown and you can just slap
into some third party system as they are
and then they wind up partnering with some sort of designer
to build something like this in my case
because I don't have that skill set.
And then it kind of comes out and usually lands with a thud. And you have basically done absolutely none of that. Where did this
idea come from?
Yeah, I guess we need to go a couple months back. So this was around November that I'm
talking about. And my story is quite common these days. I was made redundant around November and had all this time, right?
And for the first two months, it was okay.
I really enjoyed being, you know, like very slothful doing nothing.
But then you had January come in, and a lot of these people from LinkedIn, you know,
coming out of the woodwork telling you how they're going to be very productive during
the year, sort of made me feel bad. And I was thinking, okay, what do I do, right? I have all this time on my hands.
And I was thinking, maybe let's look at some sort of infra project that I could do. But it turns out
that, at least for me, they weren't that interesting. Imagine you spin up all this infra,
then you just have to bring it back down once you're
done with it.
And so I didn't really want to do any infrastructure project. Ditto on backend projects.
Couldn't think of anything I really wanted to.
And then finally it was front end.
I did a bit of front end before, like in my own personal time.
So I definitely could do some front end project.
I just did not know what.
Then one day I was just making circles around the corridor,
trying to get in my steps. And then I remembered this blog by Josh Comeau. If you don't know who
Josh Comeau is, he just makes some of the most incredible front-end content. It's very interactive.
It's sort of like going to a digital carnival. At least that's what it feels like to me. I thought
I could do something like that, something really interactive, but for the infrastructure. I started creating
this around the end of Jan. And I think I spent around two months and then published it on the
20th of March on Reddit. And then you saw it.
I have to ask, first, there's a lot to unpack there.
So I'm going to get there in a minute.
But first I want to ask of all the topics
you could have picked to write about,
why pick something as truly cursed
as the Managed NAT Gateway?
There's so much out there in the AWS space.
This is one of the first deep dive treatments
I've seen on this thing.
I mean, I tore apart its economics in a post a few years back, but this is something else entirely.
Yeah, so there's multiple reasons.
The first reason is I thought it was a bit easier than the first topic that I was looking at, which is transit gateways.
I know if it was transit gateways, it'd be much, much longer and I couldn't give it the treatment that it deserves with the skills that I had at the time.
Oh, that's fun. Go for Cloud WAN.
And then at that point, you have an entire career
building your blog post over that.
Yeah, I looked at that as well, I think, not yet.
And then I also knew that I needed to write about technology
that someone might be interested in.
And here's the thing, I knew, right?
I knew that you are in the Reddit, in the subreddit.
And so this is sort of engineered in a way.
I knew that if I wrote about NAT gateways,
it'll catch your eye.
I know how much you love NAT gateways.
And so, yeah, yeah.
And the gambit paid off.
It worked.
It was, I tend to cast a pretty wide net.
I caught your blog post in my intake system in three different places.
The first one I saw was my Reddit filter, which, great, that's all
centralized to one place.
So I popped over to Reddit and wound up commenting,
"This is amazing."
I also caught it on Hacker News,
and it was also posted on Bluesky, which I wound up finding slightly later
because I haven't really finalized that search pattern yet, which is neither here nor there. And it is spectacular because first off, for those who
aren't familiar, I spend a lot of time making fun of the Managed NAT Gateway because its pricing is
terrible on both the high and the low end. On the low end, it costs four and a half cents per hour
whenever you're running it, and it likes to spin itself up as a part of other stacks. So it costs
about 35 bucks a month
and there is no free tier for this.
So if you're building something for funsies
in a free account as you view it,
you're suddenly getting a $35 bill surprise
just for the absolute hell of it.
And at large company scale,
you're also charged four and a half cents per gigabyte
that passes through it,
which does not generally discount with volume
and also can be significant.
I've talked to companies that are spending
high tens of thousands of dollars every month on this
and collectively hundreds of thousands of dollars.
When I'm doing a client project,
one of the first things I look at around NAT gateways
where the bill for them is high is the ratio.
What are the instance hours compared
to the traffic passing through them?
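For readers following along in text, the two charges and the ratio described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hourly rate is the four and a half cents quoted in the episode, the per-gigabyte rate is read as the same four and a half cents, the 730-hour month is an averaging assumption, and the function name `nat_gateway_monthly_cost` is hypothetical. Actual rates vary by region, so check a real bill before relying on these numbers.

```python
# Rates as quoted in the episode; real prices vary by region.
HOURLY_RATE_USD = 0.045   # per NAT gateway-hour
PER_GB_RATE_USD = 0.045   # per GB processed (assumed reading of the quoted figure)
HOURS_PER_MONTH = 730     # assumption: average number of hours in a month


def nat_gateway_monthly_cost(gateways: int, gb_processed: float) -> dict:
    """Split an estimated monthly NAT gateway bill into its two components."""
    hourly = gateways * HOURS_PER_MONTH * HOURLY_RATE_USD
    data = gb_processed * PER_GB_RATE_USD
    total = hourly + data
    return {
        "hourly_usd": round(hourly, 2),
        "data_usd": round(data, 2),
        "total_usd": round(total, 2),
        # Share of the bill driven by traffic rather than by uptime.
        "data_share": round(data / total, 3) if total else 0.0,
    }


if __name__ == "__main__":
    # An idle gateway in a hobby account: the bill is all hourly charges.
    print(nat_gateway_monthly_cost(gateways=1, gb_processed=0))
    # Three gateways moving 100 TB a month: the bill is almost all data processing.
    print(nat_gateway_monthly_cost(gateways=3, gb_processed=100_000))
```

A `data_share` near zero suggests gateways that are mostly sitting idle, while a value near one suggests the traffic itself is the thing to look at.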
And both ways people get very upset
to the point where Ben Whaley over at Chime Financial
created alterNAT, which is effectively using one of these
as a fallback option
when you're running a NAT instance in front of it.
And there's the aptly named NAT project,
fck-nat, which I think has a bit more adoption
in the industry, but no one likes these things.
They serve a valuable
purpose, which is letting traffic in and out of private subnets, but they charge for it
like they're made out of the most precious substance on earth. And "Well,
we built this based on customer requests" was the response a distinguished engineer at AWS gave
publicly years ago, to which someone responded, not me,
that, well, apparently you built the wrong thing then
because this is not what I want or need
for this very simple use case.
So it is maligned, but it is also a phenomenal feat
of technical engineering.
It leverages a bunch of internal AWS constructs
like Hyperplane, and everyone tends to use them,
but not a lot of folks know how they work.
And what I really appreciated about
this is I didn't know what your background was, but I had the sneaking suspicion, please correct me if
I'm wrong, that it wasn't based around networking.
No.
The reason I ask is that,
when people start to dive deep on things, they tend to focus on the areas historically, the smart
ones anyway, they tend to focus on the areas that they have not spent a lot of time on.
I got into networks during the Great Recession
for that exact reason.
It was always the thing,
it's like, and then you put whatever a subnet mask
is supposed to be over here,
don't know what it does,
but if you get the numbers wrong,
half your stuff doesn't work.
And that was basically a way of understanding
what's going on under the hood.
And the reason I suspected you didn't have
a networking background is because this strikes me
as someone trying to fill in a gap in their own knowledge
because no one else cares enough to do stuff like this,
to be perfectly honest with you.
It was masterfully done.
Thank you.
Yeah, like you said, I mean, I know basics of networking.
I would not, I don't know what a spanning tree is.
I've heard of it.
It's the reason you plug into a network
and for some reason for 30 seconds, nothing works.
So you're convinced you didn't plug it in correctly.
And then there's rapid spanning tree,
which is only what, five to 15 seconds.
It causes the same thing,
but they basically serve the same purpose
is to aggravate the hell out of network people.
Right, right.
But yeah, this was also learning.
I was learning quite a bit as well
while I was doing this,
but mainly, as you saw, what I was talking about
is the basics of networking,
but I was trying to focus more on the cost aspects
and trying to find a way to make it interesting
because cost is quite a boring subject for most people,
at least I think.
Believe it or not, I used to as well,
but please continue.
Yeah.
So that's why I had those interactive elements. So I tried to sort of grab the attention of the
reader by having them do that exercise of dragging the NAT gateway and the elastic IP. And so that
would sort of make the reader more interested to go to the more dry stuff. At least that's how I planned it.
It's what I like about, for those,
fortunately for most people listening to this,
it is an audio podcast.
So I can't just assume everyone's watching a video
and throw up a screenshot of this thing,
but it is absolutely worth visiting.
And it will be featured prominently in the show notes
when I put this thing out.
But it's a blog post that explains
how the managed NAT gateway works and what it is.
The baseline stuff that more or less you can see
in a bunch of other places,
increasingly some of it generated by AI
and giving factually incorrect information
along the way for funsies.
I wouldn't be raving about it
if that were all that this was.
Additionally, on top of it,
there are a number of diagrams popping around the thing
that you can, it's not necessarily obvious,
but you can pick it up and drag and drop
with a managed NAT gateway.
The first one is here's a public subnet,
here's a private subnet, drag and drop the NAT gateway
to the place that it lives in.
Does it live in the public subnet or the private subnet?
And if you get it right, it winds up
doing the typical thing that you would expect.
And then you iterate through it.
OK, now attach where the elastic IP goes, and so on and so
forth.
And that is, it effectively takes you through the steps
in a highly visual, highly interactive way
that, frankly, I'm just stuck marveling at here.
Yeah, thanks.
Yeah, so I wanna talk about actually creating it.
In my opinion, creating the diagrams was not the hard part.
In fact, I started with the diagrams first.
It was actually writing the content around it
was the tough part, at least for me,
especially because this was my first blog post and trying to make sure that everything flows, everything makes sense. That
took a lot of time.
It does, and it works super well. What I find fascinating is that
the interface is very clean. It is well designed and visually appealing with the obvious exception
for the crap ass icons that AWS uses for their services. It got bad enough that for a brief stint,
I had a designer do some in-house icons
for the more commonly used services
in the Last Week in AWS icon style.
A bucket that looks like it's something
Billie the Platypus might carry around, for example.
But it makes sense.
It flows, it works super cleanly,
and I have to ask, what is your front end background that got you here?
I studied chemical engineering at university.
And I was looking at, you know, I really did not enjoy the practical side of chemical engineering.
I really liked the theory.
But anyways, but we had this one unit where we had this programming class and I hated it.
I was like the first six weeks I hated it.
I don't know how much JavaScript you know,
but do you remember the "this" keyword in JavaScript?
It's sort of like objects.
I do not.
I, to be clear, my programming languages extend primarily to brute force and enthusiasm.
And that can carry you surprisingly far,
but I don't have a formal computer science education
in that direction.
Right, right.
But anyway, this "this" keyword was so annoying.
And once I, but I spent some time, I understood it.
And then, you know, I started liking programming,
but still I continued on with my degree.
And once I graduated, I wanted to do something else.
And so I picked up, you know, I went to YouTube,
how can I break into this industry?
And the most common video that I saw was,
hey, learn CSS, JavaScript and HTML,
and you can apply as a front end engineer.
And, you know, it's possible.
Where we wind up having the interview process
being a whole bunch of algorithmic questions and the rest
and the actual job is, okay, can you center a div,
which is apparently way harder than you think in CSS,
but then things like Tailwind apparently make it
far easier to do if I'm understanding the grousings
of front-end engineers who let me hang out
with them sometimes.
Yeah, that's sort of correct, yeah.
And I spent, yeah, so I spent
around maybe a year or so improving my front end skills, but I did not become a front end engineer.
Somehow I became a DevOps engineer, but I still kept those skills sharp in case I would need them.
And it just so happens that I needed them for this blog post.
This episode is sponsored by my own
company, the Duckbill Group.
Having trouble with your AWS bill?
Perhaps it's time to renegotiate a contract with them.
Maybe you're just wondering how to predict what's going on in the wide world of AWS.
Well, that's where the Duckbill Group comes in to help.
Remember, you can't duck the Duckbill bill, which I am reliably informed by my business
partner is absolutely not our motto.
I'm sorry, I'm still sort of boggling
at that trajectory there,
because foundationally,
I don't think I've ever spoken to a DevOps person before
who came from a traditionally front end background.
I used to be in the DevOps SRE space myself,
managing teams and whatnot. I
love shooting my mouth off at the DevOps Days conferences because I love the sound of my
own voice. And that was an awful lot of fun, but I've met people who came from
a systems engineering background. I came from the ops side of the world, which is basically
sys admins who realized that by changing your job title, you can get a 40% raise. So do
it. But the first programming languages most of these folks
have encountered have been things like Python,
or in some cases Go these days is popular.
C is where I first started learning many, many years ago,
because we all have to haze different generations
in different ways.
But I never heard of JavaScript being applied
to the idea of, I guess, the traditional DevOps roles.
I mean, that's not where most of the tools are written in.
It's fascinating to think about.
Yeah. And now that you say it like that, it is true.
But then again, you have a lot of adoption of JavaScript as well in the AWS space.
So for example, CDK is now all, it's mainly JavaScript and TypeScript.
And a lot of these other libraries are also in JavaScript.
So JavaScript is popular, right?
It's getting, and I think it's getting more and more popular
because for a lot of people, I think it's like the gateway
programming language, at least now, compared to like many,
many years ago.
It's so odd just seeing the, I guess, the juxtaposition
where I've seen some
beautiful interactive designs of blog posts in a bunch of different places, but they don't tend
to get into the nitty gritty of AWS services, much less something that goes this deep.
I don't mean to start a war in Seattle, which I tend to accidentally do every time I open my mouth,
but with all of the people they hire on their dev rel teams at AWS,
with their massive documentation teams,
the training and certification group,
all of them do excellent work, I wanna point out.
Why does it feel like whenever they're describing
how these things work, they're basically trying
to describe a painting by gesticulating with their hands,
as opposed to, well, here's a real quick animation
that makes sense, in fact, now you try it and see if that helps it set in,
which by the way, it absolutely does.
You said this took you a few months to build.
They've had years.
Where is their version of this?
Yeah, you know, good question.
I also agree with you.
I feel like the documentation is not too great.
If you look at some of the other products,
I'm not talking about cloud providers,
but in general, like Next.js,
I really like reading their documentation.
It's very clear.
I wish more documentation was like that.
I agree wholeheartedly.
Vercel's doing some fascinating work there,
and they sweat a lot of details that other folks don't.
I mean, to be clear, I am slightly biased.
My friend Cody, who's the voice behind
Killed by Google works there,
and I periodically make it a point
to go darken their doorstep.
Yeah, and you know, right, if you've seen that documentation,
it's actually a delight, you know?
You actually like reading these documents,
but when I look at AWS documentation,
it feels like, I know it's like work.
I mean, it is work, but it feels like more work
than it needs to be.
My problem with it, I think, is that so much of it
is written from the perspective of either trying
to be exhaustive and act like a library,
like API library documentation and list every call,
or it just gives this topical, glossing treatment of it.
But it becomes remarkably clear that whatever it is
that has led to this
did not come from someone sitting down to use the product
who was smart at many things,
but had not used this product before
and tried to accomplish a number of common tasks.
That's where the gulf seems to be,
that the people writing the documentation
are very often far too close to the product
that they're attempting to document.
Like take a giant step back and assume someone
is far less smart about the specifics
of this thing you've worked on than you are,
how would you explain it to someone new?
And every bit of documentation I've seen officially
on the Managed NAT Gateway presupposes
that you are already deep in the AWS networking weeds,
even though this is one of the first
AWS networking constructs you're going to encounter
because you find it staring at you on your bill
on what you thought was a free account.
Yeah, and to be honest, I actually don't know
what the answer is. It's hard, right, for them as well:
who is their audience, right?
Maybe they assume
that the audience is very technical, and so they don't want to write too much information that might just turn off the world.
I'm not too sure myself.
But I think there are some opportunities to make it at least more interesting.
I would agree.
It's one of those big problem spaces where if you look at this through most common lenses,
this stuff is incredibly boring.
No one cares about how this networking stuff
works internally until they decide they're tired
of not knowing and they want to go diving into it.
And even then it's a slog because so much
of this stuff is so dry.
It's, you didn't just build something
with a great visualization approach.
You take the user through it from the simple to the complex
and you're engaging about it.
That is something that's very hard to find.
I would still like this blog post
even if it just had typical static images on it.
But the fact that it goes so far beyond that
means that it's something that is significantly neat.
In fact, when it was posted on Bluesky and on Reddit,
you saw the response to it.
Most people's response to "Hey, I wrote a blog post"
is complete and utter silence
because odds are
it's going to be part AI generated,
going to get some things wrong.
And basically people are just looking for clicks,
attention or clout.
This legitimately comes from a place
of wanting to educate people and it shows.
Yeah, and I think I also was inspired by Josh Comeau's website,
because I've done many courses, right?
But I feel like I forget about them once I'm done.
But Josh's one is so different.
I'm still going and just looking stuff up because I just enjoy it.
And that's sort of what I wanted to create as well.
There's the education part, but I also wanted other people to sort of experience,
you know, this sort of joy that you can actually get from a good website, you know,
because the web is really interactive and I feel like it's not well utilized.
Yeah, it's become more or less that you download a mostly static blog post that also
comes along with 75 megabytes
of JavaScript when all is said and done loaded serially because of course it is and 95% of
it is basically there to act as a surveillance apparatus.
One of the fastest and best days on the internet that I can recall was the day that GDPR came
into force and so many companies had not planned sufficiently far ahead like the major news
organizations that for a couple of days they turned their website into pure text only. And it was lightning
fast to load anything. It was this is what we could have had, except we've somehow decided it's
impossible to advertise things to people without basically spying on them and building a fairly
complete picture of their entire lives, which, I'm sorry, I have challenges with that entire model. But you
actually have used that platform for something in a way that is creative, fun,
and best of all, which I don't know if it's clear or not from the conversation,
from everything I can tell, you're not trying to sell me anything.
No, I'm not. I wish I could sell you something, but I can't.
I don't have anything to sell.
Another question I have for you is that you mentioned
at the end that this was your first blog post,
and that really was sort of the second big shock
of all of this, because normally,
as you mentioned, this took months for you to write.
Why didn't you start with something, I guess,
easier from a production perspective to put out there?
Namely, you know, text on a screen,
just hypothetically there.
Yeah, because I feel like I wouldn't read it.
That's the first reason.
So if I wouldn't read it,
I can't expect others to read it either.
So that was the first reason.
And the second reason is I think for me,
sitting down and writing is the hard part.
And like I said earlier, I started with the diagrams, right?
I started with the diagrams and with the intention
that once I create the diagrams,
I'll be forced to write the content.
It did happen.
I was dragging my feet trying to put words on the page.
Yeah, I think that you have a gift here.
You're opinionated in how you write.
The things I take issue with on some level on this are, I guess, not what you might expect.
Your big warning at the top where you say the intended audience, this blog post assumes
you already have a strong grasp of networking in the cloud.
And for 98% of blog posts on stuff like this, I would wholeheartedly agree.
You've made it so approachable that I don't believe
that that holds true.
Worst case, I feel like you could have put maybe
a hyperlink in there when you talk about
what a route table is, for example.
If you don't know what it is, here's a link
to the Wikipedia page, go read and come back
for folks to figure this stuff out.
Honestly, I feel like that warning at the top
might scare away people who otherwise
are perfectly situated to pick it up.
The other thing that I'm wondering on is great,
all you link at the bottom is your now X account,
I suppose, your Bluesky account and then LinkedIn.
Great, where's the other blog post?
I wanna read more of what it is that you're working on.
And at the moment, this is the only one.
Yeah, so I am working on another blog post,
but I was initially thinking maybe I might
start writing about VPC endpoints.
It feels like a natural next step, but I thought maybe I want to do something else instead,
rather than just going sequentially.
I'm using Cloudflare and I really love the product. I feel like I really want to write something about it,
especially comparing Cloudflare Workers
with something like CloudFront Lambda@Edge.
I feel like there's some content that I can write over there,
so that's going to be my next blog post,
the one that I'm working on.
But meanwhile, if you enjoyed this sort of content,
there are other content creators like this in the space, just not for AWS.
But I think you'll still enjoy it.
One of them is Sam Rose.
Sam Rose creates some incredibly interactive posts on computer science related topics.
And then there's Benjamin from PlanetScale.
I don't know if you've seen his blog post on latency and IO.
Yeah.
Have you seen it?
I was at their office when they put that out.
I was talking to a few folks over there about how to, how to
contextualize some of this stuff.
Yeah.
I thought it went very well.
Huge fan of what they do.
Yeah.
I love that blog.
That was amazing.
I could tell, as someone who's done a bit of this visualization, I could tell
that must have taken such a long time
for him to put that together.
And then, at least in infra,
front end and backend, I think that's it.
And then there's Josh Comeau.
If you want to have a look at his stuff,
I would highly recommend it.
I was inspired by his blog post.
Yeah, I want to put links to all of this stuff
in the show notes as well,
because I think if it
inspires you to do things like this,
I want more people to read these things.
Definitely.
I really want to thank you for taking the time
to speak with me. If people want to learn more,
where should they go to keep up to date and
figure out when your next blog post drops,
where they can read it?
Yeah, so
I'm relatively active on Bluesky and
on Reddit.
I'm not that active on Twitter, though.
No, I used to be extraordinarily active and it turns out that populations migrate, platforms
change and people decide where they want to invest their attention. The fact that so many
people found this worthy of investing their attention in tells me that you've written
something phenomenal.
Thank you.
And we'll of course put links to all of those things in the show notes. Thank you so much
for taking the time to speak with me.
I appreciate it.
Likewise.
Malith Rajapakse, DevOps engineer who just excels at telling fantastic stories through
art.
I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform
of choice.
Whereas if you've hated this podcast,
please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment
that still requires 75 megabytes of JavaScript to load so I can fully appreciate the lack of content. Bye!