Screaming in the Cloud - Using SRE to Solve the Obvious Problems with Laura Nolan
Episode Date: December 12, 2023

Laura Nolan, Principal Software Engineer at Stanza, joins Corey on Screaming in the Cloud to offer insights on how to use SRE to avoid disastrous and lengthy production delays. Laura gives a rich history of her work with SREcon, why her approach to SRE is about first identifying the biggest fire instead of toiling with day-to-day issues, and why the lack of transparency in systems today actually hurts new engineers entering the space. Plus, Laura explains to Corey why she dedicates time to work against companies like Google who are building systems to help the government (inefficiently) select targets during wars and conflicts.

About Laura

Laura Nolan is a software engineer and SRE. She has contributed to several books on SRE, such as the Site Reliability Engineering book, Seeking SRE, and 97 Things Every SRE Should Know. Laura is a Principal Engineer at Stanza, where she is building software to help humans understand and control their production systems. Laura also serves as a member of the USENIX Association board of directors. In her copious spare time after that, she volunteers for the Campaign to Stop Killer Robots, and is half-way through the MSc in Human Factors and Systems Safety at Lund University. She lives in rural Ireland in a small village full of medieval ruins.

Links Referenced:
Company Website: https://www.stanza.systems/
Twitter: https://twitter.com/lauralifts
LinkedIn: https://www.linkedin.com/in/laura-nolan-bb7429/
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, Chief Cloud Economist at the
Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
Welcome to Screaming in the Cloud.
I'm Corey Quinn.
My guest today is someone that I have been low-key annoying to come on to this show for years.
And finally, I have managed to wear her down. Laura Nolan is a principal software
engineer over at Stanza. At least, that's what you're up to today, last I've heard. Is that right?
That is correct. I'm working at Stanza, and I don't want to go on and on about my startup,
but I'm working with Niall Murphy and Joseph Brunas and Matthew Girard and a bunch of other people who've more recently
joined us. We are trying to build a load management SaaS service. So we're interested in
service observability out of the box, knowing if your critical user journeys are good or bad
out of the box, and being able to prioritize your incoming requests by what's most critical
in terms of visibility to your customers. So an emerging space, not in the Gartner Group magic circle yet, but I'm sure at some point.
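The request-prioritization idea Laura describes can be sketched as a toy load shedder: when capacity is tight, admit the most business-critical requests first and shed the rest. This is an illustration only, not Stanza's implementation; the journey names, priority tiers, and capacity number are all invented.

```python
# Toy priority-based load shedder. Lower priority number = more critical.
# The journeys and tiers here are invented for illustration.
PRIORITY = {"checkout": 0, "search": 1, "analytics": 2}

def admit(requests, capacity):
    """Return (served, shed): the most critical requests up to capacity."""
    ranked = sorted(requests, key=lambda r: PRIORITY.get(r["journey"], 99))
    return ranked[:capacity], ranked[capacity:]

served, shed = admit(
    [{"journey": "analytics", "id": 1},
     {"journey": "checkout", "id": 2},
     {"journey": "search", "id": 3}],
    capacity=2,
)
# checkout and search are served; analytics is shed
```

In a real system the priority would come from which critical user journey a request belongs to, and the capacity from live observations of the downstream service, which is roughly the "out of the box" visibility being described.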
It is surreal to me to hear you talk about your day job, because for, it feels like,
the better part of a decade now, Laura, Laura, oh, you mean Usenix, Laura, because you are on
the Usenix board of directors. And in my mind, that is what is
always shorthanded to what you do. It's, oh, right. I guess that isn't your actual full-time job.
It's weird. It's almost like seeing your teacher outside of the elementary school. You figure that
they fold themselves up in the closet there when you're not paying attention. I don't know what
you do when SRECon is not in process. I assume you just sit there and wait for the next one, right?
Well, no, we've run four of them in the last year,
so there hasn't been very much waiting, I'm afraid.
Everything got a little bit smooshed up together during the pandemic,
so we've had a lot of events coming quite close together.
But no, I do have a full-time day job.
The work I do with Usenix is just as a volunteer,
so I'm on the board of directors, as you say,
and I'm on the steering committee for all of the global SREcon events, and typically serve on the program committee as
well. And I serve there annoying the chairs to, hey, do your thing on time. Very much like
an elementary school teacher, as you say. I've been a big fan of Usenix for a while.
One of the best interview processes I ever saw was closely aligned with evaluating candidates
along the
Usenix SAGE levels to figure out what level of seniority they're at in different areas. And it was
always viewed through the lens of in what types of consulting engagements will the candidates shine
within, not the idea of, oh, are you good or are you crap? And spoiler, if I'm asking the question,
I'm of course defaulting myself to good and you to crap. Like the terrible bespoke artisanal job interview process that so many companies do. I love how this company had
built this out. And I asked them about it. Oh yeah, it comes back to the Usenix Sage things.
That was one of my first encounters with what Usenix actually did. And the more I learned,
the more I liked. How long have you been involved with the group?
Relatively short period of time. I think I first got involved with Usenix in around 2015, going to LISA, and then going on to SREcon. It was all by accident, of course. I fell onto the SREcon program committee somehow because I was around, and then because I was still around and doing stuff, I eventually, you know, got co-opted into chairing and onto the steering committee and so forth. And, you know, it's like everything volunteer.
I mean, people who stick around and do stuff tend to be kept around. But Usenix is quite
important to me. We have an open access policy, which is something that I would like to see a
whole lot more of. You know, we put everything right out there for free as soon as it is ready.
And we are constantly plagued by
people saying, hey, where is my SREcon video? The conference was like two weeks ago. And we're like,
no, we're still processing the videos. We'll be there. They'll be there. We've had people
like literally offered to pay extra money to get the videos sooner. But we're like, we are open
access. We are not keeping the videos away from you. We just aren't ready yet. So I love the open access policy. And I think that what I like about it more than anything else is
the fact that we are staunchly non-vendor. We're non-technology specific and we're non-vendor.
So it's not like, say, AWS reInvent, for example, or any of the big cloud vendor conferences, we are picking vendor-neutral content by quality.
And as well, anyone who's ever sponsored SREcon
or any of the other events will also tell you
that that does not get you a talk in the conference program.
So the content selection is completely independent.
And in fact, we have a complete Chinese wall
between the sponsorship organization
and the content organization.
So, I mean, I really like how we've done that. I think as well, it's for a long time been one of
the family of conferences or organizations of conferences that has had the best diversity.
Not perfect, but certainly better than it was. Although very, very unfortunately, I see
conference diversity everywhere going down after the pandemic, which is particularly gender diversity, which is a real shame.
I've been a fan of the SREcon conferences since the SREcon EMEA talk that I co-presented with John Looney,
which was fun because he and I met in person for the first time three hours beforehand,
beat together our talk, then showed up to an hour beforehand, found there would be no
confidence monitor, went away for the next 45 minutes and basically loaded it all into
short-term cache and gave a talk that we could not repeat if we had to for a million dollars, just because it
was so, you're throwing the ball to your partner on stage and really hoping they're going to be
able to catch it. And it worked out. It was an anger subtext translator skit for a bit, which
was fun. All the things that your manager says, but actually means, you know, the fun sort of
approach. It was zany, ideally had some useful takeaways to it,
but I loved the conference.
That was one of the only SREcons
that I found myself not surprised to discover
was coming to town the next week.
Because for whatever reason,
there's presumably a mailing list that I'm not on somewhere
where I get blindsided by,
oh yeah, hey, didn't you know SREcon is coming up?
There's probably a notice somewhere
that I really should be paying attention to, but on the plus side, I get to be delightfully surprised every
time. Indeed. And hopefully you'll be delightfully surprised in March 2024. I believe it's the 18th
to the 20th when SRECon will be coming to town in San Francisco, where you live.
So historically, in addition to the work with Usenix, which is, again, not your primary
occupation most days, you spent over five years at Google, which of course means that you have
strong opinions on SRE. I know that that is a bit dated, where the gag was always, it's only called
SRE if it comes from the Mountain View region of California. Otherwise, it's just sparkling DevOps.
But for the initial take of
a lot of the SRE stuff was, here's how to work at Google. It has progressed significantly beyond
that to the point where companies who have SRE groups are no longer perceived incorrectly as,
oh, we just want to be like Google or we hired a bunch of former Google people.
But you clearly have opinions to this. You've contributed to multiple books on SRE. You have spoken on it at length. You have enabled others to speak on it
at length, which in many ways is by far the better contribution. You can only go so far
scaling yourself, but scaling other people, that has a much better multiplier on it, which feels
almost like something an SRE might observe. It is indeed something an SRE might observe.
And also, you know, good catch,
because I really felt you were implying there
that you didn't like my book contributions.
Ah, the shock.
No, to be clear, I meant it,
because I was going to say that strictly speaking,
books are also a great one-to-many multiplier,
because it turns out you can only shove so many people
into a conference hall,
but books have this ability to just carry your words
beyond the room that you're in, in a way that video just doesn't seem to.
Ah, but open access video that is published on YouTube, like six weeks later. That scales.
I wish. People say they want to write a book and I think they're all lying. I think they want to
have written a book. That's my philosophy on it. I do not understand people who've written a book.
Like, so what are you going to do now? I'm going to write another book. Okay. I'm going to smile, not take my eyes
off you for a second and back away slowly. Cause I do not understand your philosophy on that,
but you've worked on multiple books with people. I actually enjoy writing. I enjoy the process of
it because I always, I always learn something when I write. In fact, I learned a lot of things when
I write and I enjoy that crafting. I will say I do not enjoy having written things because for me,
any achievement, once I have achieved it, is completely dead. I will never think of it again
and I will think only of my excessively lengthy to-do list. So I clearly have problems here.
But nevertheless, it's exactly the same with programming projects, by the way.
But back to SRE.
We were talking about SRE.
SRE is 20 now.
SRE can almost drink alcohol in the US.
And that is crazy.
So 2003 was the founding of it then?
Yes.
Yeah, I can do simple arithmetic in my head still.
I wondered how far my math skills had atrophied.
Yes, good job.
Yes, apparently invented in roughly 2003.
So the, I mean, from what I understand,
Google's publishing of the 20 years of SRE at Google, they have, in the absence of an actual
definite start date, they've simply picked Ben Treynor's start date at Google as the start date
of SRE. But nevertheless, discipline about 20 years old. So is it all grown up? I mean,
I think it's become heavily commodified. My feeling about SRE is that it's always been this, I mean, you said it earlier, like it's about, you know, how do
I scale things? How do I optimize my systems? How do I intervene in systems to solve problems, to
make them better, to see where we're going to be in pain in six months and work to prevent that.
That's kind of SRE work to me, you know, figure out where the problems are, figure out good ways to intervene and to improve. But there's a lot of SRE as bureaucracy around at
the moment where people are like, well, we're an SRE team. So, you know, you will have your
SLO golden signals and you will have your production readiness checklists, which will
be the things that we say, no matter how different your system is from what we designed this checklist for. And that's it. We're doing SRE now. It's great. So I think we miss a lot there.
My personal way of doing SRE is very much more about thinking not so much about the day-to-day
SLO excursion type things, because not that they're not important, they are important,
but they will always be there. I always tend to spend more time thinking about how do we avoid the risk of, you know, a giant production fire that will take
you down for days or, God forbid, more than days. You know, the sort of big Roblox fire or the time
that Meta nearly took down the internet in late 2021, that kind of thing. So I think that modern SRE misses quite a lot of that. It's a little bit like,
so when BP, when they had the Deepwater Horizon disaster, on that same, very same day,
they received an award for minimizing occupational safety risks in their environment. So, you know,
it's just things like people tripping. Must've been fun the next day. Yeah, we're going to need that back.
People tripping and falling
and, you know,
hitting themselves with a hammer.
They got an award
because they were so safe.
They had very little of that.
And then this thing goes boom.
And now they've tried to pivot
into an optimization award
for efficiency.
Like we just decided to flash fry
half the sea life in the Gulf at once.
Yes, extremely efficient.
So, you know,
I worry that we're doing sre a little bit
like bp we're doing it back before deepwater horizon i should disclose that i started my
technical career as a grumpy old unix sysadmin because it's not like you ever see one of those
who's happy or young didn't matter that i was 23 years old i was grumpy and old. And I have viewed the evolution since then of I going from
calling myself a sysadmin to a DevOps engineer, to an SRE, to a platform engineer, to whatever
we're calling it this week. I still view it as fundamentally the same job in the sense that
the responsibility has not changed. And that is keep the site or environment up.
But the tools, the processes, and the techniques we apply to it have evolved.
Is that accurate? Does it sound like I'm spouting nonsense?
You're far closer to the SRE world than I ever was.
But I'm curious to get your take on that perspective.
And please, feel free to tell me I'm wrong.
No, no, I think you're completely right.
And I think one of the ways that I think has shifted, and it's really interesting, but when you and I were, when we
were young, we were, we could see everything that was happening. We were deploying on some sort of
Linux box or other sort of Unix box somewhere, most likely. And we, if we wanted, we could go
and see the entire source code of everything that our software was running on. And kids these days, they're coming up and
they are deploying their stuff on RDS and ECS. And, you know, how many layers of abstraction
are sitting between them? I run Kubernetes. That means I don't know where it runs and neither does
anyone else. It's great. Yeah. So there's no transparency anymore in what's happening. So it's very easy. You get to a point where sometimes you hit a problem
and you just can't figure it out
because you do not have a way to get into that system
and see what's happening.
Even at work, we ran into a problem
with Amazon-hosted Prometheus.
We were like, this will be great.
We'll just do that.
And we could not get some particular type
of remote write operation to work. We just
could not. Okay. So we'll have to do something else. So one of the many, many things I do when
I'm not trying to run the SREcon conference or do actual work or definitely not write a book,
I'm studying at Lund University at the moment. I'm doing this master's degree in human factors
and system safety. And one of the things I've realized since doing that
program is in tech, we missed this whole 1980s and 1990s discipline of cognitive systems theory,
cognitive systems engineering. This is what people were doing. They were like, how can people in the
control room, in nuclear plants and in the cockpit, in the airplane, how can they get along with their
systems and build a good
mental model of the automation and understand what's going on? We missed all that. We came of
age when safety science was asking questions like, how can we stop organizational failures
like Challenger and Columbia, where people are just not making the correct decisions?
And that was a whole different sort of focus. So we've missed all of this 1980s and 1990s
cognitive system stuff. And there's this really interesting idea there where you can build two
types of systems. You can build a prosthesis, which does all your interaction with the system
for you. And you can see nothing, feel nothing, do nothing. It's just this black box. Or you can
have an amplifier, which lets you do more stuff than you could do just by yourself, but lets you
still get into the details. And we build mostly prostheses; we do not build amplifiers. We're
hiding all the details, we're building these very, very opaque abstractions, and I think it's to our
detriment. I mean, it makes our life harder in a bunch of ways, but I think it also makes life
really hard for systems engineers coming up, because they just can't get into the systems as easily anymore
unless they're running them themselves.
I have to
confess that I have a
certain aversion
to aspects of SRE
and I'm feeling echoes of it
around a lot of the human factor stuff
that's coming out of that Lund program.
And I think I know what it is.
And it's not a problem with either of those things, but rather a problem with me.
I have never been a good academic.
I have an eighth grade education because school is not really for me.
And what I loved about being a systems administrator for years was the fact that it was solving puzzles every day.
I got to do interesting things.
I got to chase down problems and firefight all the time. And what SRE has represented is a step
away from that to being more methodical, to taking on keeping the site up as a discipline rather than
an occupation or a task that you're working on. And I think that a lot
of the human factors stuff plays directly into it. It feels like the field is becoming a lot
more academic, which is a luxury we never had when, holy crap, the site is down. We're going
to go out of business if it isn't back up immediately. Panic mode. I'm going to confess
here. I have three master's degrees. Three. I have problems, like I said before. I get what you
mean. You don't like when people are speaking in generalizations and sort of being all theoretical
rather than looking at the actual messy details that we need to deal with to get things done,
right? And I know what you mean. I feel it too. And I've talked about the human factors stuff
and theoretical stuff a fair bit at conferences. And what I always try to do is I always try and illustrate with the details because I think it's very easy to get away from the actual problems and spend too much time in the models and in the theory.
And I like to do both.
I will confess I like to do both.
And that means that the luxury I miss out on is mostly sleep.
But here we are. I am curious as far as what you've seen,
as far as the human factors adoption in this space, because every company for a while
claimed to be focused on blameless postmortems, but then there'd be issues that quickly turned
into a blame Steve postmortem instead. And it really feels, at least from a certain point of view, that there was a time
where it seemed to be gaining traction, but that may have been a zero interest rate phenomenon,
as weird as that sounds. Do you think that the idea of human factors being tied to
keeping systems running in a computer sense has demonstrated staying power? Are you seeing a
recession? It could be I'm just looking at headlines too much. It's a good question. There's still a lot of people interested in it.
There was a conference in Denver last February that was decently well attended for, you know,
a first initial conference that was focusing on this issue and this very vibrant Slack community,
the LFI and the learning from incidents and software community. I will say everything is
a little bit stretched at the moment in industry,
as you know, with all the layoffs and a lot of people are just,
there's definitely a feeling that people want to hunker down and do the basics
to make sure that they're not, you know,
not seen as doing useless stuff and on the line for layoffs.
But the question is, is this stuff actually useful or not?
I mean, I contend that it is.
I contend that we can learn from failures,
we can learn from what we're doing day to day,
and we can do things better.
Sometimes you don't need a lot of learning
because what's the biggest problem is obvious, right?
You know, in that case, yeah,
your focus should just be on solving your big obvious problem, for sure.
It feels there's a hierarchy of needs here on some level.
Step one, is the building currently on fire? Maybe solve that before thinking about the longer term
context and what this does to corporate culture. Yes, absolutely. And I've gone into teams before
where people are like, oh, well, you're an SRE, so obviously you wish to immediately introduce
SLOs. And I can look around me and go, nope, not the biggest problem right now. Actually,
I can see a bunch of things are on fire. We should fix those specific things. I actually personally think that if you
want to go in and start improving reliability in a system, the best thing to do is to start a weekly
production meeting if the team doesn't have that. Actually create a dedicated space and time for
everyone to be able to get together, discuss what's been happening, discuss concerns and risks,
and get all that stuff out in the open.
I think that's very useful, and you don't need to spend however long it takes to formally sit down
and start creating a bunch of SLOs,
because if you're not dealing with a perfectly spherical web service
where you can just use the golden signals,
and if you start getting into any sorts of thinking about data integrity or backups
or any sorts of asynchronous processing, these sorts of things, they need SLOs that are a lot
more interesting than your standard error rate and latency. Error rate and latency gets you so far,
but it's really just very cookie cutter stuff. But people know what's wrong with their systems,
by and large. They may not know everything that's wrong with their systems,
but they'll know the big things for sure.
Give them space to talk about it.
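The "error rate and latency get you so far" point rests on standard SLO arithmetic, which is easy to make concrete. This is a generic error-budget calculation for a plain availability SLO, not anything specific to the conversation; the target and request counts are invented for illustration:

```python
def error_budget(slo_target, total_requests, failed_requests):
    """For a simple availability SLO (e.g. 99.9% of requests succeed),
    return the number of allowed failures and the fraction of the
    error budget already burned."""
    allowed_failures = total_requests * (1 - slo_target)
    if allowed_failures == 0:
        return 0.0, float("inf")  # a 100% target leaves no budget at all
    return allowed_failures, failed_requests / allowed_failures

# With a 99.9% target over 1,000,000 requests, 1,000 failures are allowed,
# so 250 failures burns a quarter of the budget.
allowed, burned = error_budget(0.999, 1_000_000, 250)
```

This kind of cookie-cutter math works fine for a request-serving web service; the argument above is that data integrity, backups, and asynchronous pipelines need SLOs that this shape of calculation doesn't capture.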
Speaking of bigger things and turning into the idea of these things escaping beyond pure tech,
you have been doing some rather interesting work
in an area that I don't see a whole lot of people that I talk to communicating
about. Specifically, you're volunteering for the Campaign to Stop Killer Robots, which 10 years ago
would have made you sound ridiculous. And now it makes you sound like someone who is very
rationally and reasonably calling an alarm on something that is on our doorstep. What are you
doing over there? Well, I mean, let's be real. It sounds ridiculous because it is ridiculous. I mean,
who would let a computer fly around in the sky and choose what to shoot at? But it turns out that
there are, in fact, a bunch of people who are building systems like that. So yeah, I've been
volunteering with the campaign for about the last five years, since roughly around the time that I
left Google, in fact, because I got interested in that around about the time that Google was doing the Project Maven work,
which was when Google said, hey, wouldn't it be super cool if we took all of this DoD video
footage of drone video footage and, you know, did a whole bunch of machine learning analysis on it
and figured out where people are going all the time. Maybe we could click on this house and see like a whole timeline of people's comings and goings and which other people they are sort of
in a social network with. So I kind of said, maybe I don't want to be involved in that. And I left
Google. I found out that there was this campaign and this campaign was largely lawyers and
disarmament experts, people of that nature, philosophers, but also
a few technologists. And for me, having run computer systems for a large number of years
at this point, the idea that you would want to rely on a big distributed system running over
some janky network with a bunch of 18-year-old kids running it to
actually make good decisions about who should be targeted in a conflict seems outrageous. And I
think almost every software operations person or in fact software engineer that I've spoken to
tends to feel the same way. And yeah, there is this big practical debate about this in
international relations circles, but luckily there has just been a resolution in the UN just in the last day or two as we record this. The first committee has,
by a very large majority, voted to try and do something about this. So hopefully we'll get
some international law. The specific interventions that most of us in this field think would be good
would be to limit the amount of force that an
autonomous weapon, or in fact, an entire set of autonomous weapons in a region, would be able to
wield. Because there's a concern that should there be some bug or problem or sort of weird factor
that triggers these systems to... It's an inevitability that there will be. That is not
up for debate. Of course it's going to break. In 2020, the template slide deck that AWS sent out for reInvent speakers
had a bunch of clip art, and one of them was a line art drawing of a ham with a bone in it.
So I wound up taking that image, slapping it on a t-shirt, captioning it AWS ham bone,
and selling that as a fundraiser for 826 National.
Now, what happened next is that for a while, anyone who tweeted the phrase AWS Hambone
would find themselves banned from Twitter for the next 12 hours due to some weird algorithmic
thing where it thought that was doxing or harassment or something.
And people on the other side of the issue that you
are talking about are straight-facedly suggesting that we give that algorithm and ban tool a gun.
Or many guns. I'm sorry, what? Absolutely. Yes. Or missiles or let's build a whole bunch of them
and turn them loose with no supervision, just like we do with junior developers.
Exactly. Yes. So many people think this is a great idea, or at least they purport to think
this is a great idea, which is not always the same thing. I mean, there's a lot of
different vested interests here. Some people who are proponents of this will say, well,
actually, we think that this will make targeting more accurate. Less civilians will actually
die as a result of this.
And the question there that you have to ask is, there's a really good book called A Theory of the Drone by Grégoire Chamayou. And he says that there's actually three meanings to accuracy. So are
you hitting what you're aiming at is one thing. And that's a solved problem in military
circles for quite some time. You've got, you know got laser targeting, very accurate. Then the other
question is, how big is the blast radius? So that's just a matter of how big an explosion
are you going to get? That's not something that autonomy can help with. The only thing that
autonomy could even conceivably help with in terms of accuracy is better target selection. So instead
of selecting targets that are not valid targets, selecting more valid targets. But I don't think
there's any good reason to think that computers can solve that problem. I mean, in fact, if you
read stuff that military experts write on this, and I've got, you know, lots of academic handbooks
on military targeting processes, they all tell you it's very hard and there's a lot of gray areas,
a lot of judgment. And that's exactly what computers are pretty bad at. Although, mind you,
I'm amused by your Hambone story, and I want to ask if AWS Hambone is a database.
Anything is a database if you hold it wrong. It's fun. I went through a period of time where I,
just for fun, I would ask people to name an AWS service, and I would talk about how you could use
it incorrectly as a database. And then someone mentioned, what about AWS Neptune, which is their
graph database,
which absolutely no one understands. And the answer there is I give up. It's impossible to
use that thing as a database, but everything else can be like, you know, the tagging system. Great.
That has keys and values. It's a database now. Welcome aboard. And I didn't say it was a great
database, but it is a free one and it scales to a point. Have fun with it.
All I'll say is this, you can put labels on anything.
Exactly.
We missed you at the most recent SREcon EMEA.
There was a talk about Google's internal chubby system
and how people started using it as a database.
And I did summon you in Slack, but you didn't show up.
No, sadly, I've gotten a bit out of the SRE space. Also, frankly, I've gotten out of the community space for a little while when it comes to conferences.
And I have a focused effort starting in 2024 to change that.
I am submitting CFPs left and right.
My biggest fear is that a conference will accept one of these because a couple of them are aspirational.
Here's how I built a thing with generative AI, which, spoiler, I have done no such thing yet, but by God, I will by the time I get
there. I have something similar around Kubernetes, which I've never used in anger, but soon will if
someone accepts the right conference talk. This is how I learned Git. I shot my mouth off in a CFP
and I had four months to learn the thing. It was effective, but I wouldn't say it was the best
approach. You shouldn't feel bad about lying about having built things in Kubernetes and with LLMs because everyone has, right?
Exactly. It'll be true enough by the time I get there.
Why not? I'm not submitting for a conference next week. We're good.
Yeah, future Corey is going to hate me.
Have it build you a database system.
I like that.
I really want to thank you for taking the time to speak with me today.
If people want to learn more, where's the best place for them to find you these days?
I'm sort of homeless on social media since the whole Twitter implosion,
but you can still find me there. I'm Laura Lifts on Twitter and I have the same tag on Blue Sky,
but haven't started to use it yet. Yeah, socials are hard at the moment. I'm on LinkedIn.
Please feel free to follow me there if you wish to message me as well.
And we will of course put links to that in the show notes. Thank you so much for taking the time
to speak with me. I appreciate it. Thank you for having me.
Laura Nolan, Principal Software Engineer at Stanza. I'm cloud economist, Corey Quinn,
and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of
choice. Whereas if you hated this podcast, please leave a five-star review on your podcast platform
of choice, along with an angry, insulting comment that soon, due to me screwing up a database system,
will be transmogrified into a CFP submission for an upcoming SREcon.
If your AWS bill keeps rising and your blood pressure is doing the same, then you need the Duckbill Group.
We help companies fix their AWS bill by making it smaller and less horrifying.
The Duckbill Group works for you, not AWS.
We tailor recommendations to your business and we get to the point.
Visit duckbillgroup.com to get started.