Screaming in the Cloud - Best Practices in AWS Certificate Manager with Jonathan Kozolchyk
Episode Date: July 6, 2023
Jonathan (Koz) Kozolchyk, General Manager for Certificate Services at AWS, joins Corey on Screaming in the Cloud to discuss the best practices he recommends around certificates. Jonathan walks through when and why he recommends private certs, and the use cases where he’d recommend longer or unusual expirations. Jonathan also highlights the importance of knowing who’s using what cert and why he believes in separating expiration from rotation. Corey and Jonathan also discuss their love of smart home devices as well as their security concerns around them and how they hope these concerns are addressed moving forward.
About Jonathan
Jonathan is General Manager of Certificate Services for AWS, leading the engineering, operations, and product management of AWS certificate offerings including AWS Certificate Manager (ACM), AWS Private CA, Code Signing, and Encryption in transit. Jonathan is an experienced leader of software organizations, with a focus on high availability distributed systems and PKI. Starting as an intern, he has built his career at Amazon, and has led development teams within our Consumer and AWS businesses, spanning from Fulfillment Center Software, Identity Services, Customer Protection Systems and Cryptography. Jonathan is passionate about building high performing teams, and working together to create solutions for our customers. He holds a BS in Computer Science from University of Illinois, and multiple patents for his work inventing for customers. When not at work you’ll find him with his wife and two kids or playing with hobbies that are hard to do well with limited upside, like roasting coffee.
Links Referenced:
AWS website: https://www.aws.com
Email: mailto:koz@amazon.com
Twitter: https://twitter.com/seakoz
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the
Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
In the cloud, ideas turn into innovation at virtually limitless speed and scale.
To secure innovation in the cloud, you need Runtime Insights to prioritize critical risks and stay ahead of unknown threats.
What's runtime insights,
you ask? Visit sysdig.com slash screaming to learn more. That's S-Y-S-D-I-G dot com slash
screaming. My thanks as well to Sysdig for sponsoring this ridiculous podcast.
Welcome to Screaming in the Cloud. I'm Corey Quinn. As I record this, we're about a week and a half from re:Inforce in Anaheim, California.
I am not attending, not out of any moral reason not to,
because I don't believe in cloud security or conferences that Amazon has that are named after subject lines,
but rather because I am going to be officiating a wedding on the other side of the world, because I am an ordained minister of the church of There Is A Problem With This Website Security
Certificate. So today, my guest is going to be someone who's a contributor in many ways to that
religion. Jonathan Kozolchyk, but we all call him Koz, is the general manager for certificate
services at AWS. Koz,
thank you for joining me. Happy to be here, Corey. So one of the nice things about ACM,
historically, the managed service that handles certificates from AWS, is that for anything
public-facing, it's free, which is always nice. You should not be doing upcharges for security,
but you also don't let people have the private portion of the cert. You control all of the endpoints that terminate SSL, whereas when
I terminate SSL myself, it terminates on the floor because I've dropped things here and there,
which means that suddenly the world of people exposing things they shouldn't or expiry concerns
just largely seem to melt away.
What was the reason that Amazon looked around at the landscape and said,
ah, we're going to launch our own certificate service, but bear with me here,
we're not going to charge people money for it?
It seems a little bit out of character.
Well, Amazon itself has been battling with certificates for years,
long before even AWS was a thing.
And we learned that you have to automate, and even that's not enough. You have to inspect,
and you have to audit. You need a closed loop. And we learned that you need a closed loop to truly manage it and make sure that you don't have outages. And so when we built ACM,
we built it saying, we need to provide that same functionality
to our customers, that certificates should not be the thing that makes them go out,
that we need to keep them available, and we need to minimize the sharp edges customers have to deal
with. I somewhat recently caught some flak on one of the Twitter replacement social media sites for complaining
about the user experience of expired SSL certs. Because on the one hand, if I go to my bank's
website and the response is that instead the server is sneakyhackerman.com, it has the exact
same alert and failure mode as, holy crap, this certificate reached its expiry period 20 minutes
ago. And from my perspective, one of those is a lot more serious than the other. But also,
I wind up encountering this not just when I'm doing banking, but when I'm trying to read some
random blog on how to solve a technical problem. I'm not exactly putting personal information into
the thing. It feels like that was a missed opportunity. Agree or disagree? Well, I wouldn't categorize it as a missed opportunity. I think one of the
things you have to think about with security is you have to keep it simple so that everyone,
whether they're a technologist or not, can abide by the rules and be safe. And so it's much easier
to say to somebody, there's something wrong, period, stop, versus saying
there are degrees of wrongness. Now, that said, boy, do I wish we had originally built PKI and
TLS such that you could submit multiple certificates to somebody in a connection,
for example, so that you could always say, my certificate's going to expire, but I've got two,
and they're off by six months, for example. Or do something so that you don't have to fail closed because the certificate expired.
It feels like people don't tend to think about what failure modes are going to look like because
an expired certificate, what kind of irresponsible buffoon would do such a thing? But I've worked in
enough companies where you have historically the
wildcard cert because individual certs cost money once upon a time. So you wound up getting the one
certificate that could work on all of the stuff that ends in the same domain. And that was great.
But then whenever it expired, you had to go through and find all the places that you put it.
You always miss some. So things would break for a while. And the corporate response was,
oh, that was awful. Instead of a one-year certificate, let's get a five-year or a
ten-year certificate this time. And that doesn't make the problem better. It makes it absolutely
worse, because now it proliferates forever. Everyone who knows where that thing lives is
now long gone by the time it hits again. Counterintuitively, it seems the industry
has largely been moving toward short-lived certs.
Let's Encrypt, for example, winds up rotating every 90 days. By my estimation, ACM is a year, if memory
serves. So ACM certs are 13 months, and we start rotating them around the 11th month. And Let's
Encrypt offers you 90-day certs, but they don't necessarily require you to rotate every 90 days.
They expire in 90 days.
My tip for everybody is divorce expiration from rotation.
So if your cert is a 90-day cert, rotate it at 45 days.
If your cert is a year cert, give yourself a couple months before expiration to start the rotation.
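The tip above, separating the rotation date from the expiration date, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rotate_at` and `needs_rotation` are hypothetical, the default of rotating at the midpoint of the lifetime mirrors the "90-day cert, rotate at 45 days" example, and the optional `lead_time` mirrors "give yourself a couple months before expiration".

```python
from datetime import datetime, timedelta, timezone


def rotate_at(not_before, not_after, lead_time=None):
    """Return the moment rotation should start (hypothetical helper).

    If lead_time is given, rotation starts that long before expiry.
    Otherwise it starts halfway through the certificate's lifetime.
    """
    if lead_time is not None:
        return not_after - lead_time
    return not_before + (not_after - not_before) / 2


def needs_rotation(not_before, not_after, now, lead_time=None):
    """True once 'now' has reached the rotation point, well before expiry."""
    return now >= rotate_at(not_before, not_after, lead_time)


# Example: a 90-day cert rotates at day 45; a one-year cert with a
# 60-day lead time rotates 60 days before it expires.
issued = datetime(2023, 1, 1, tzinfo=timezone.utc)
ninety_day_expiry = issued + timedelta(days=90)
one_year_expiry = issued + timedelta(days=365)

print(rotate_at(issued, ninety_day_expiry))
print(rotate_at(issued, one_year_expiry, lead_time=timedelta(days=60)))
```

The gap between the rotation point and the real expiry is the window in which a failed rotation can raise an alarm while there is still time to fix it.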
And then you can alarm on it on your own timeline when something fails and you still have time to fix it. This makes a lot of sense in the second time, but then you start remembering, okay,
everywhere I use this cert, I need to start having alarms and alerts and people are bad at these
things. What ACM has done super well is that it removes that entire human from the loop because you control all of the endpoints. You folks have the ability to rotate it
however often you'd like. You could have picked arbitrary timelines of huge amounts of time or
small amounts of time, and it would have been just fine. I mean, you log into an EC2 instance role,
and I believe the credentials get passed out of either a six or a 12-hour validity window, and they're consistently rotating on the back end, and
it's completely invisible to the customer.
Was there ever thought given to what that timeline should be, what that experience should
be?
Or did you just throw a dart at a wall?
Like, yeah, 13 months feels about right.
We're going to go with that and never revisited it.
I have a guess which side of that it was.
Did you think
at all of what you were doing at the time? So I will admit this happened just before I got there.
I got to ACM after. Blame the predecessor. Always a good call.
It's a God-given right to blame your predecessor. Oh, absolutely. That's their entire job.
I think they did a smart job here. What they did was they took the longest lifetime cert that was then allowed
at 13 months, knowing that we were going to automate the rotation and basically giving us
as much time as possible to do it without having to worry about scaling issues or having to rotate
overly frequently. There are customers who, while I strongly disagree with pinning, for example,
but there are customers out there who don't like certs that change very often. I don't recommend
pinning at all, but I understand these cases are out there, and changing it once every year
can be easier on customers than changing it every 20 minutes, for example. If I were to pick an ideal rotation time, it'd probably be
under 10 days, because an OCSP response is good for 10 days. And if you rotate before then,
then I never have to update an OCSP response, for example. But changing that often would play havoc
with many systems, because of just the sheer frequency you're rotating
what is otherwise a perfectly
valid certificate. It is computationally expensive to generate certificates at scale, I would imagine.
It starts to be a problem. You're definitely putting a lot of load on the HSMs
at that point when you're generating, you know, when you have millions of certs out in deployment,
you're generating quite a few at a time.
There is an aspect of your service that used to be part of ACM and now it's its own service,
which I think is probably the right move because it was confusing for a lot of customers. Amazon looks around and sees who can we compete with next, it feels like sometimes. And it seemed like
you were squarely focused on competing against your most desperate of all enemies,
my crappy USB key, where I used to keep the private CA I used in any given job.
At the time, I did not keep it after I left, to be very clear,
for whenever I'm signing things for certificates for internal use.
You're like, ah, we can have your crappy USB key as a service.
And sure enough, you wound up rolling that out.
It seems like adoption has been relatively brisk on that just because I see it in almost every client account I work with.
Yeah. So you're talking about the private CA offering.
That's right. Private CA was the new service name. Yes. It used to be a private certificate
authority was an aspect of ACM. And now we're just going to move that off.
We split it out because like you said, customers got confused. They thought they had to only use
it with ACM. They didn't understand it was a full standalone service. And it was built as a
standalone service. It was not built as part of ACM. Before we built it, we talked to customers.
And I remember meeting with people running fairly large startups saying, yes, please run this for me. I don't know
why, but I've got this piece of paper in my sock drawer that one of my security engineers gave me
and said, if something goes wrong with our CA, you and two other people have to give me this piece
of paper. And others were like, oh, you have a piece of paper? I have a USB stick in my sock
drawer. The startup world was running their
CAs from sock drawers, as far as I can tell. A piece of paper, someone wrote out the key by hand.
That sounds like hell on earth. It was, it was a sharding technique where you needed, you know,
three of five or something like that. Oh, uh, the, uh, Shamir's secret sharing service, the SSSS.
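For readers unfamiliar with the "three of five" idea mentioned here, the following is a minimal sketch of Shamir's secret sharing in Python. It is a teaching example, not a description of how any customer or AWS actually protected a CA key. The function names, the choice of prime field, and the integer encoding of the secret are all assumptions made for illustration.

```python
import secrets

# A Mersenne prime (2**127 - 1) used as the field modulus. Assumption:
# the secret is an integer smaller than this prime.
PRIME = 2**127 - 1


def split_secret(secret, threshold, shares):
    """Split 'secret' into 'shares' points; any 'threshold' of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Example: five shares, any three of which are enough.
parts = split_secret(123456789, threshold=3, shares=5)
assert recover_secret(parts[:3]) == 123456789
```

Fewer than the threshold number of shares reveal nothing about the secret, which is why each holder can keep one share (on paper or on a USB stick) without being able to use the key alone.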
Yeah. Yes. You know, and we looked at it, and the other alternative was people would use open source or free certificate authorities, but without any of the security you'd want, like HSM backing, for example, because that gets really expensive.
And so, yeah, we did what our customers wanted.
We built this service.
We've been very happy with the growth it's taken.
And like you said, we love
the places we've seen it. It's gone into all kinds of different things from the traditional enterprise
use cases to IoT use cases. At one point, there's a company that tracks sheep and every collar has
one of our certs in it. And so I am active in the sheep tracking industry. I am certain that some wit
is going to comment on this. Oh, there's a company out there that tracks sheep. Yeah,
it's called Apple or Facebook or whatever crappy, whatever axe someone has to grind against any
particular big company. But you're talking actual sheep as in bah, smell bad, count them when going
to sleep? Yes, actual sheep. Excellent. Excellent.
The certs are in drones. They're in smart homes. So
they're everywhere now.
That is something I want to ask you about.
Because I found that there's a competition going on
between your service, ACM,
because you won't give me the private keys for reasons
that we already talked about, and
Let's Encrypt. It feels like you two are both
competing to not take my money,
which is an odd sort of competition.
You're not actually competing.
You're both working for a secure internet in different ways.
But I wind up getting certificates made automatically for me
for all of my internal stuff using Let's Encrypt
and with publicly resolvable domain names.
Why would someone want a private CA instead of an option
that, okay, yeah, we're only using it internally, but there is public validity to the certificate?
Sure. And just because I have to nitpick, I wouldn't say we're competing with them.
I personally love Let's Encrypt. I use them at home too. Amazon supports them financially. We
give them resources. I think they're great. I think as long as you're getting
certs, I'm happy. The world is encrypted. And people use private CA because fundamentally,
before you get to the encryption, you need secure identity. And a certificate provides identity.
And so Let's Encrypt is great if you have a publicly accessible DNS endpoint that you can
prove you own and get a certificate for, and you're willing
to update it within their 90-day windows. Let's use the sheep example. The sheep don't have
publicly valid DNS endpoints. Or to be very direct with you, they also tend to not have
terrific operational practices around updating their own certificates.
Right. Same with drones, same with internal corporate. You may not want your DNS
exposed to the internet, your internal sites. And so you use a private certificate where you own
both sides of the connection, right? Where you can say, because you can put the CA in the trust
store, and then that gets you out of having to be compliant with the CA/Browser
Forum and the WebTrust rules. A lot of the CA/Browser Forum dictates what a public certificate
can and can't do and the rules around that. And those are built very much around the idea of
a browser connecting to a client and protecting that user.
And most people are not banking on a sheep.
Most people are not banking on a sheep, yes. But if you have, for example, a database that requires
a restart to pick up a new cert, you're not going to want to redo that every 90 days.
You're probably going to be fine with a five-year certificate on that because you want to minimize
your downtime. Same goes with a lot of these IoT devices, right? You may
want a thousand-year cert or a hundred-year cert or a cert that doesn't expire because this is a
cert that is generated at creation for the device, and it's at birth. The machine is manufactured,
and it gets a certificate, and you want it to live for the life of that device.
Or you have supersecretproject.internal.mycompany.com,
and you don't want a publicly visible cert for that because you're not ready to launch it,
so you'll start with a private cert. Really, my advice to customers is if you own both
pieces of the connection, if you have an API that gets called by a client you own,
you're almost always better off with a private certificate
and managing that trust store yourself.
Because then you are subject not to other people's rules,
but the rules that fit the security model
and the threat assessment you've done.
For the publication system for my newsletter,
when I was building it out,
I wanted to use client certificates
as a way of authenticating that it was me.
Because I only have a small number of devices that need to talk to this thing.
Other people don't.
So how do I submit things into my queue and manage it?
And back in those ancient days, the API gateways didn't support TLS authentication.
Now they do.
I would redo it a bunch of different ways.
They did support API key as an authentication mechanism, but the documentation back then was so terrible, or I was so new to this stuff, I didn't realize what it was and introduced it myself from first principles, where there's a hard-coded UUID.
And as long as there's the right header with that UUID, accept it, otherwise drop it on the floor. Which, there are probably better ways to do that.
Sure. Certificates are a very popular way to handle that situation because they provide that secure identity.
You can be assured that the thing connecting to you can prove it is who they say they are.
And that's a great use of a private CA.
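As a rough illustration of the client-certificate approach discussed above, here is a sketch of a server-side TLS context in Python that only accepts clients presenting a certificate signed by a CA you control. It uses the standard library `ssl` module. The function name `build_mtls_server_context` and the file paths in the usage comment are hypothetical, and this is not how API Gateway or the newsletter system is implemented.

```python
import ssl


def build_mtls_server_context(ca_file=None, cert_file=None, key_file=None):
    """Build a server-side context that requires a client certificate.

    ca_file:   PEM bundle of the private CA(s) allowed to sign client certs.
    cert_file: the server's own certificate chain (PEM).
    key_file:  the server's private key (PEM).
    All three are optional here only so the sketch can be exercised
    without real files. A working server needs all of them.
    """
    # Purpose.CLIENT_AUTH yields a context meant for the server side.
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Reject any client that does not present a certificate we can verify.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        # Trust only the private CA, not the public web trust store.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file is not None:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


# Hypothetical usage with placeholder paths:
# ctx = build_mtls_server_context("private-ca.pem", "server.pem", "server.key")
# Then wrap a listening socket with ctx.wrap_socket(sock, server_side=True).
```

Because the trust store contains only the private CA, the handshake itself proves the caller's identity, and no shared secret needs to travel in a request header.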
Changing gears slightly, as we record this, we are about two weeks before re:Inforce, but I will be off doing my own thing on that day.
Anything interesting and exciting coming out of your group that's going to be announced?
With the proviso, of course, that this will not air until after re:Inforce.
Yes. So we are going to be pre-announcing the launch of a connector for Active Directory. So you will be able to tie your private CA instance to your Active Directory tree
and use private CA to issue certificates
for use by Active Directory for all of your Windows hosts,
for all of the users in that Active Directory tree.
It has been many years since I touched Windows in anger, but in 2003 or so, I was a
mediocre small business Windows server admin. Doesn't Active Directory have a private CA built
into it by default for whenever you're creating a new directory? It does. Is that one of the
FSMO roles? I'm trying to remember offhand.
What's a FSMO? FSMO. F-S-M-O. It's some trivia question that people love to haze each other with in Microsoft interviews. What are the seven FSMO roles, at least back then, and will have to be
moved before you decommission a domain controller or you're going to have tears before bedtime?
Yes. Microsoft provides a certificate authority for use with Active Directory.
They've had it for years, and they had to provide it because back then nobody had a certificate
authority, but AD needed one. The difference here is we manage it for you, and it's backed by
HSMs. We ensure that the keys are kept secure. It's a serverless connection to your Active
Directory tree. You don't have to run any software of ours on your hosts. We take care of all of it.
And it's been the top request from customers for years now. It's been quite a bit of effort to
build it, but we think customers are going to love it because they're going to get all the
security and best practices from private CA that they're used to, and they can decommission their on-prem
certificate authority and not have to go through the hassle of running it.
A big area where I see a lot of private CA work has been in the realm of desktops for
corporate environments. Because when you can pass out your custom trusted root or trusted
CA to all of the various nodes you have and can control them, it becomes a lot easier. I always
tended to shy away from it just because in small businesses like the one that I own, I don't want
to play corporate IT guy more than I absolutely have to. Yeah. Trust store management is always a painful part of PKI. As if there weren't enough
painful things in PKI, trust store management is yet another one. Thankfully, in the large
enterprises, there is good tooling out there to help you manage it for the corporate desktops and
things like that. And with private CA, you can also, if you already have an offline root that is in all of your trust stores
in your enterprise, you can cross-sign the root that we give you from private CA into that
hierarchy. And so then you don't have to distribute a new trust store out if you don't want to.
This is a tricky release, and I'm very glad I'm taking the week off it's getting announced, because there are two reactions that are going to
happen to any snarking I can do about this.
The first is no one knows what the hell this is and doesn't have any context
for the rest.
And the other folks are going to be, yeah, shut up, clown. This is going to change my workflow in amazing ways. I'll deal with your
nonsense later. I want to hear this. And I feel like one of those constituencies is very much
your target market and the other isn't, which is fine. No service that AWS offers, except the bill,
is for every customer, but every service is for someone.
That's right. We've heard from a lot of our customers, especially as they, you know, the large international ones, right?
They find themselves running separate Active Directory CAs in different countries because they have different regulatory requirements and separations that they want to do.
They are chomping at the bit to get this functionality because we make it so easy to run a private CA in these different regions.
There is certainly going to be that segment at re:Inforce that's just happy certificates happen
in the background and they don't think anything about where they come from and this won't resonate
with them. But I assure you, for every one of them, they have a colleague somewhere else in
the building that is going to do a happy dance when this launches.
Because there's a great deal of customer heavy lifting and just sharp edges that we're taking away from them. And we'll manage it for them and they're going to love it.
One thing that I have seen the industry shift to that I love is the Let's Encrypt model, where the certificate
expires after 90 days. And I love that window because it is a quarter, which means, yes,
you can do the crappy thing and have a calendar reminder to renew the thing. It's not something
you have to do every week, so you will still do it, but you're also not going to love it. It's just enough friction to inspire
people to automate these things. And that I think is the real win. There's a bunch of things like
Certbot. I believe the protocol is called ACME, A-C-M-E, always in caps, which usually means an
acronym or someone has their caps lock key pressed, which is of course cruise control for cool.
But that entire idea
of being able to have a back and forth authentication pass and renew certificates
on a schedule, it's transformative. I agree. ACM, even Amazon before ACM,
we've always believed that automation is the way out of a lot of this pain. As you said earlier,
moving from a one-year cert to a five-year cert doesn't buy you anything other than you lose even more institutional knowledge when your cert expires.
I think that the move to further automation is great.
I think ACME is a great first step.
One of the things we've learned is that we really do need a closed loop of monitoring to go with certificate issuance.
So at Amazon, for example, every cert that we issue,
we also track. And the endpoints emit metrics that tell us what cert they're using. And it's not
what's on disk, it's what's actually in the endpoint and what they're serving from memory.
And we know because we control every cert issued within the company,
every cert that's in use. And if we see a cert in use that, for example,
isn't the latest one we issued, we can send an alert to the team that's running it.
Or if we've issued a cert and we don't see it in use, we see the old one still in use,
we can send them an alert.
They can alarm and they can see that, oh, we need to do something
because our automation failed in this case.
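The closed loop described above (compare what was issued against what each endpoint reports it is actually serving, then alert on mismatches) can be sketched as a simple reconciliation step. This is an illustrative sketch only. The function name, the dictionary shapes, and the use of certificate fingerprints as identifiers are assumptions, not a description of Amazon's internal tooling.

```python
def find_stale_endpoints(latest_issued, observed):
    """Compare issued certs with what endpoints report serving from memory.

    latest_issued: dict mapping endpoint name -> fingerprint of the newest
                   certificate issued for it (hypothetical data shape).
    observed:      dict mapping endpoint name -> fingerprint the endpoint's
                   metrics say it is currently serving.
    Returns a list of (endpoint, reason) tuples that should trigger an alert.
    """
    alerts = []
    for endpoint, expected in sorted(latest_issued.items()):
        serving = observed.get(endpoint)
        if serving is None:
            alerts.append((endpoint, "no certificate metric reported"))
        elif serving != expected:
            alerts.append((endpoint, "not serving the latest issued certificate"))
    return alerts


# Example with made-up fingerprints: 'api' picked up its new cert,
# 'web' is still serving the old one, 'batch' reported nothing.
issued = {"api": "fp-new-1", "web": "fp-new-2", "batch": "fp-new-3"}
seen = {"api": "fp-new-1", "web": "fp-old-2"}

for endpoint, reason in find_stale_endpoints(issued, seen):
    print(f"ALERT {endpoint}: {reason}")
```

The important design point from the conversation is the data source: the observed value should come from what the endpoint is serving, not from what is on disk, since a renewed file that was never reloaded would otherwise look healthy.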
And so I think ACME is great.
I think the push Let's Encrypt did to say,
we're going to give you a free certificate,
but it's going to be short-lived, so you have to automate.
That's a powerful carrot and stick combination they have going.
And I think for many customers, Certbot's enough.
But you'll see even with ACM, where we manage it for our customers,
we have that closed loop internally as well
to make sure that the cert,
when we issue a new cert to our client,
you know, to the partner team,
that it does get picked up and it does get loaded.
Because issuing you a cert isn't enough.
We have to make sure that you're actually using the new certificate.
I also have learned as a result of this, for example, that AWS certificate manager,
Amazon certificate manager, the ACM, the certificate thingy that you run,
so many names, so many acronyms, it's great. But it has a limit by default of 2,500 certificates.
And I know this because I smacked into it. Why? I wasn't sitting there clicking
and adding that many certificates, but I had a delightful step function pattern called the
lambda invokes itself. And you can exhaust an awful lot of resources that way because I am
bad at programming. That is why for safety, I always recommend that you iterate development-wise
in an account that is not production and preferably one that belongs to someone else.
We do have limits on cert issuance.
You have limits on everything in AWS, as it should,
because it turns out that whenever there's not a limit,
A, free database just dropped,
and B, things get hammered to death.
You have to harden these things.
It's one of those things that's obvious once you've operated at a certain point of scale,
but until you do, it just feels arbitrary and capricious.
It's one of those things where I think Amazon is still, and all the cloud companies who
do this, are misunderstood.
Yeah.
So in the case of the ACM limits, we look at them fairly regularly.
Right now, they're high enough that most of our customers, vast majority, never come close
to hitting it.
And the ones that do tend to
go way over. And it's been a mistake in my case as well. This was not a complaint, incidentally.
It was like, well, I want to wind up having more waste and more ridiculous nonsense.
It was not my concern. No, no, no. But we do, for those customers who have
not mistake use cases, but actual use cases where they need more,
we're happy to work with their account teams and with the customer and we can up those limits.
I have always found that limit increases with remarkably few exceptions.
The process is, explain to me what your use case is here. And I feel like that is a screen
for, first, are you doing something horrifying for which there's a better solution? And two,
it almost feels like it's a bit of a customer research approach. This is fine for most customers. What are you folks doing over there? And is there a use case we haven't accounted
for in how we use the service? I always find we learn something when we look at the P100
accounts that use the most certificates and how they're operating. Every time I think I've seen
it all in AWS, I just have to talk to one more customer and it's back to school I go. Yep. And I thank them for that education. Oh, that is the best part of working with customers
and honestly being privileged enough to work with some of these things and talk to the people who
are building really neat stuff. I'm just kibitzing from the sideline most of the time. Yeah.
So one last topic I want to get into before we call it a show. You and I have been talking a fair bit out of school, for lack of a better term, around
a couple of shared interests.
The one more germane to this is home automation, which is always great because especially in
a married situation, at least as I am, and I know you are as well, there's one partner
who is really into home automation, and the other partner finds
himself living in a haunted house.
I knew I had won that battle when my wife was on a work trip, and she was in a hotel,
and she was talking to me on the phone, and she realized she had to get out of bed to
turn the lights off because she didn't have our Alexa goodnight routine available to her
to turn all the lights off and let her go to bed.
And so she is my core customer when I do the home automation stuff and definitely make sure my use cases and my automations work for her.
But yeah, I love that space.
Coincidentally, it overlaps with my work life quite a bit because identity in smart home is a challenge.
We're really excited about the Matter standard.
For those listening who aren't sure what that is, it's a new end-all, be-all smart home standard for defining devices in a protocol-independent way that lets your hubs talk to devices
without needing drivers from each company to interact with them.
And one of the things I love about it is every device needs a certificate
to identify it.
And so Private CA has been a great partner with Matter.
It goes well with it.
In fact, we're one of the leading certificate authorities for Matter devices. Customers love the
pricing and the way they can get started without talking to anybody.
So yeah, I'm excited to see, as a smart home junkie and as a
PKI guy, I'm excited to see Matter take off. Right now I have a huge
amalgamation of smart home devices at home, and seeing them all go to
Matter will be wonderful.
Oh, it's fantastic. I am a little worried about aspects of this, though, where you have things
that get access to the internet and then act as a bridge. So suddenly, like, I have an IoT subnet
with some controls on it for obvious reasons. And honestly, one of the things I despise the
most in this world has been the rise of smart TVs, because I just want you to be a big
dumb screen. Well, how are you going to watch your movies? With the Apple TV, I've plugged into the
thing. I just want you to be a screen. That's it. So I live a bit in fear of the day where these
things find alternate ways to talk to the internet and, you know, report on what I'm watching.
Yeah, I think Matter's going to help a lot with this because it's focused on local control. And so you'll have to trust your hub, whether that's your TV or your
Echo device or what have you. But they all communicate securely amongst themselves.
They use certificates for identification, and they're building into Matter a robust
revocation mechanism. In my case at home,
my TV is not connected to the internet because I use my Fire TV to talk to it, similar to your
Apple TV situation. I want a device I control, not my TV doing it. I'm happy with the big dumb
screen. And I think what you're going to end up doing is saying, there's a device out there you'll
trust maybe more than others and say, that's what I'm going to use as my hub for my Matter devices. And that's what will speak to
the internet. And otherwise, my Matter devices will talk directly to my hub.
Yeah, there's very much a spectrum of trust. On the one end of the spectrum is a Linux
distribution on a computer that I installed myself and vetted and wound up contributing to at
one point, and on the other end are the things you trust the absolute least in this world,
which are, of course, printers. And most things fall somewhere in between.
Yes. Right now, it is a wild west of rebranded white label applications, right? You have all
kinds of companies spitting out reference designs as products and white labeling the control app for it.
And so your phone starts collecting these smart home applications to control each one
of these things because you buy different switches from different people.
I'm looking forward to Matter collapsing that all down to having one application and one
control model for all of the smart home devices.
Wemo explicitly stated that they're not going to be pursuing this because it doesn't let them
differentiate the experience (read: cash grab). I also found out that Wemo, which is, of course,
a Belkin subsidiary, had a critical vulnerability in some of the light switches it offered,
including the one built into the wall in this room until a week ago, and they're not going to be releasing a patch for it
because those are end of life.
Really, because I log into the Wemo app
and the only way I would have known this
has been the fact that it's been a suspiciously long time
since there was a firmware update available for it.
But that's it.
The only way I found this out was via a security advisory,
at which point that got ripped out of the wall
and replaced with something that isn't, you know, horrifying.
But man, did that bother me.
Yeah, I think this is still an open issue for the smart home world.
Every company wants a moat of some sort,
but I don't want 15 different apps to manage this stuff.
You turned me on to Home Assistant,
which is an open source home control automation system.
And at some level, the interface is very clearly built by a bunch of open-source people.
Good for them.
They could benefit from a graphic designer or a user experience person to tie it all together.
But once you wrap your head around it, it works really well.
I have automations that let me do different things.
They even have an Apple Watch app that has complications on it.
So I can tap the thing and turn on the lights in my office at different levels if I don't want to talk to the
robot that runs my house. And because my daughter has started getting very deeply absorbed into some
YouTube videos from time to time, after the third time I call her name, I tap a
different one and the internet dies to her iPad specifically. I wait about 30 to 45 seconds and she'll find me immediately.
That's an amazing automation. I love Home Assistant. It's certainly more technical than
I could give to my parents, for example, right now. I think things like Matter are going to
bring a lot of that functionality to the easier-to-use hubs, and I think Home Assistant
will get better over time as well. I think the only way to deal
with these devices that are going to end of life and stop getting support is to have them be local
control only. And so then it's your hub that keeps getting support, and that's what talks to the
internet. And so if there's a vulnerability in the TCP stack, for example, in your light switch,
but your light switch only talks to the hub and isn't allowed to talk to anything else, how severe is that? I don't think
it's so bad. Certainly, I wall off all of my IoT devices so that they don't talk to the rest of my
network, but now you're getting into some fairly complicated networking mojo that listeners to
your podcast are, I'm sure, capable of, but many people aren't.
I had something that did something very similar, and then I had to remove a lot of those
restrictions trying to diagnose a phantom issue that it appears was an unreported bug in the
wireless AP when you use its second Ethernet port as a bridge, where things would intermittently
not be able to cross VLANs when passing through that. As in, the initial host key exchange for SSH would work,
and then it would stall and reset on both sides, and it was a disaster.
It was, what is going on here?
And the answer was, it was haunted.
So a small architecture change later, and the problem has not recurred.
I need to reapply those restrictions.
I mean, these are the kinds of things that just make me want to live in a shack in the woods.
I don't know how you manage something like that.
Like, these are just pain points all over.
I think over time they'll get better.
But until then, that shack in the woods with not even running water sounds pretty appealing.
Yeah, at some level, having smart lights, for example, one of the best approaches that all the manufacturers I've seen have taken,
it still works exactly as you would expect when you hit the light switch on the wall, because that's something that you
really need to make work. Or it turns out for those of us who don't live alone, we will not
be allowed to smart home things anymore.
Exactly. I don't have any smart bulbs in my house. They're
all smart switches because I don't want to have to put tape over something and say, don't hit that switch. And then watch one of my family members pull the
tape off and hit the switch anyways.
I have floor lamps with smart bulbs in them,
but I wind up treating them all as one device. And then I've taken the switch out from the root
because it's like too many things to wind up slicing and dicing. But yeah, there's a scaling
problem because right now a lot of this stuff, because Matter's not quite there, all winds up using either Zigbee, which is fine,
I have no problem with that, it feels like it's becoming Matter quickly, or Wi-Fi. And there is
an upper bound to how many devices you want or can have on some fairly limited frequency.
Yeah, I think this is still something that needs to be resolved.
I've got hundreds of devices in my house.
Thankfully, most of them are not Wi-Fi or Zigbee.
But I think we're going to see this evolve over time,
and I'm excited for it.
I was talking to someone where I was explaining that, well, how this stuff works.
Like, well, how many devices could you possibly have on your home network?
And at the time, it was about 70 or 80.
And they just stared at me for the longest time.
I mean, it used to be that I could name all the computers in my house.
I can no longer do that.
Sure.
Well, I mean, every light switch ends up being a computer.
And that's the weirdest thing, is that it's, I'm used to computers being a thing that requires maintenance and care and feeding and security patches and, yes, relevant to your work, an SSL certificate.
It's like, so what does all of that fancy wizardry do?
Well, when it receives a signal, it completes a circuit.
The end.
And it's, aren't we really better off for some of these things?
There are days we wonder.
Well, my light bill, my electric bill is definitely better off having these smart switches because nobody in my house seems to know how to turn a light switch off.
And so having the house do it itself helps quite a bit.
To be very clear, I would skewer you if you worked on an AWS service that actually charged
money for anything, for what you just said about complaining about light bills and optimizing
light bills and the rest.
But I've never had to optimize your service's certificate bill after you spun off the one
thing that charges, because you can't cost optimize free, as it turns out.
And I've yet to find a way to the one optimization possible where now
you start paying customers money. I'm sure there's a way to do that somewhere,
but damned if I can find it.
Well, if you find a way to optimize free,
please let me know and I'll share it with all of our customers.
Isn't that the truth? I really want to thank you for taking the time to speak with me today.
If people want to learn more, where's the best place for them to find you?
I can give you the standard AWS answer.
www.aws.com.
Yeah.
Well, I would have said koz@amazon.com.
I'm always happy to talk about certs and PKI.
I find myself less active on social media lately.
You can find me, I guess, on Twitter as @seakoz and on Bluesky as kozolchyk.com.
And we will put links to all of that in the show notes.
Thank you so much for being so generous with your time.
I appreciate it.
Always happy, Corey.
Jonathan Kozolchyk, or Koz, as we all call him, General Manager for Certificate Services at AWS.
Leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that then will fail to post because your podcast platform of choice has an expired security certificate.
If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duckbill Group.
We help companies fix their AWS bill by making it smaller and less horrifying.
The Duckbill Group works for you, not AWS.
We tailor recommendations to your business, and we get to the point.
Visit duckbillgroup.com to get started.