Screaming in the Cloud - Episode 34: Slack and the Safety Dance of Chaos Engineering
Episode Date: October 31, 2018
In the early days, angry nerd corners on the Internet viewed Slack and some of its predecessors as, “Oh, it’s just IRC. Now, you pay someone for it.” Many fell into that trap of wondering about what value such systems offered. The big differentiator? Slack is built as a collaborative business tool.
Today, we’re talking to Holly Allen, who helped make government software better while serving as the director of engineering at 18F. Now, she’s a senior engineering manager at Slack, a collaborative chat program where you can do most of your work through a rich platform of integrations. Holly enjoys taking a weird set of skills that make a computer do things, and convincing people who know how to make computers do things to do things.
Some of the highlights of the show include:
Safety engineering brings chaos and resilience engineering, incident management, and post-mortem processes together for resiliency and reliability
Slack strives to move really fast while being in complete control
Slack is primarily on AWS, but is working on a multi-cloud strategy because if AWS is down, Slack still needs to work
Slack has a close relationship with AWS and is a collaborative company; it has immediate access to AWS staff anytime there’s a problem
Slack uses Terraform and Chef and is working to determine whether running its production workflows in Kubernetes would be worthwhile
Disasterpiece Theater: create a real scenario that might happen and surmise what will happen; don’t cause production issues, but teach Slack employees
Slack hires collaborative, empathetic people to create a collaborative environment where everyone works together toward a goal
Slack was firmly in a centralized operations model, but is transforming toward development teams taking on more responsibility and service ownership
Slack doesn’t encourage remote work because it’s not in a position to put in that investment; day-to-day work happens in hallways and between desks
Slack sees itself as an enterprise software company; an enterprise software company must have enterprise software reliability, stability, and processes
Slack has thousands of servers, so events and disruptions happen more often; the system needs to respond, react, and repair itself without human intervention
Links: Holly Allen on Twitter, 18F, Slack, Freenode IRC, HipChat, AWS, Kubernetes, Terraform, Chef, QCon, Datadog
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, cloud economist Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
This week's episode is sponsored by Datadog.
Datadog is a monitoring
and analytics platform that integrates with more than 250 different technologies, including AWS,
Kubernetes, Lambda, and Slack. They do it all. Visualizations, APM, and distributed tracing.
Datadog unites metrics, traces, and logs all into one platform so that you and your team can get full visibility into your infrastructure and applications.
With their rich dashboards, algorithmic alerts, and collaboration tools, Datadog can help your team learn to troubleshoot and optimize modern applications.
If you give it a try, they'll send you a free t-shirt.
I've got to say I love mine. It's comfortable, and my toddler points at it and yells,
"Dog!" every time that I wear it. It's endearing when she does it, and I've been told I need to leave their booth at
re:Invent when I do it. To get yours, go to screaminginthecloud.com slash datadog. That's
screaminginthecloud.com slash d-a-t-a-d-o-g. Thanks to Datadog for their support of this podcast.
Hello and welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined today by Holly Allen, senior engineering manager at Slack.
That's right.
Okay, wonderful. First, thanks for taking the time to join me.
Absolutely.
And secondly, let's pretend for a second
that I've been living under a rock
for the last five years.
What's a Slack?
Slack is a collaborative chat program
and it tries really hard to differentiate itself
from other chat programs
by being a program where you can do most of your work. And it does that through
providing a really rich platform of integrations. So for example, you can make a Jira ticket from
a message or you can write Slack bots that automate sections of your workflow. And as a
manager, I really appreciate that I can approve vacation time from the workday integration without
leaving Slack.
So it's definitely more than just chatting.
It's a place where you can do all of your work.
Once upon a time, I was volunteer staff for the Freenode IRC network, where we spent an
awful lot of time yelling at people on the internet.
We considered it almost multiplayer notepad as far as that collaboration aspect goes. And when Slack and some of its predecessors first came out,
there was a sense, to some extent, I'm not sure it's ever really left some of the
arcane, angry nerd corners of the internet that some of us still spend time in,
that, oh, it's just IRC. It's something that
you can put IRC in the cloud and that's all it is, but now you pay someone
for it.
And I admit, in early days, I fell into that trap myself, where it seemed like, what's the value add on this?
And I have my answer to that, but what do you see as being the big differentiator?
That's a great question. I think that because Slack is really built as a collaborative business tool, first and foremost,
those integrations that I talked about really make it for me.
I know that at a previous, two companies ago, I used to use HipChat.
And now for two companies, I've been using Slack.
And I was pretty shocked, frankly, at how much better Slack is
in terms of getting your day-to-day work done on the platform.
I'm not going to disagree with anything that you said,
but the added benefit from my perspective was when I was running operations teams
and started to roll out something company-wide,
was instead of giving non-technical users in business functions
a 20-item list of what to do and midway through have to send someone over to help them with it,
I could send them a single link and suddenly there they were. They were part of the conversation.
They were participating. And that, for me at least, was an absolutely transformative experience.
Absolutely. Running incident process or any technical process in Slack is pretty amazing.
It was definitely built from the ground up for those sorts of workflows. So being a service
engineering manager, I get to see that every day.
So you're in service engineering.
What does that group do?
Engineering is one of those areas where across the board,
titles and department group names don't tend to translate from one company to another.
What is service engineering at Slack?
Service engineering at Slack at a different company would be called operations.
We have storage, we have site reliability engineering,
we have observability, visibility teams,
internal tools, build and release.
So all of these functions are within service engineering.
I'm sorry, did you say storage?
As in there's a giant SAN living somewhere in San Francisco that has a giant pile of disks?
Absolutely not. Every bit of our storage and compute is up in the cloud. So I mean,
MySQL, Vitess, Redis, Kafka running up in the cloud.
Wonderful. Well, we'll get there in a minute or two. But your specific area of service engineering
is what exactly?
I have storage, site reliability engineering, build and release, and safety engineering under my remit.
Most of those make some degree of sense to me, except for safety engineering.
What is that? That's something that's new to me.
Safety engineering is our attempt to bring everything that's great about chaos engineering, resilience engineering,
and really good incident management and postmortem process together
to help make Slack engineering more resilient and reliable.
What inspired Slack to call it safety engineering
rather than a lot of the, shall we say, trendy terms of art at the moment?
For example, chaos engineering, regardless of what it means, is a really cool title.
Safety engineer doesn't seem to have that same aggressive,
move fast, break all the things type of aspect to it.
Absolutely.
Well, you know, Slack is supposed to be where work happens,
so you can't really be breaking all the things.
The analogy that my boss, Dusty Pierce, uses is really good.
It's of a race car, right?
You want to be able to go as fast as possible around that racetrack
safely and not crash and be able to make a pit stop incredibly fast and have it be as boring
as possible. So I think we really take that to heart and say, we can still move really fast
and do it in a way that we feel completely in control and that we're still providing a really
great product, even as we try
to move as fast as possible.
So getting back to what you said about not having a giant SAN living
in a data center somewhere, Slack is a heavy AWS user, or at least so the general public can
probably surmise, given the fact that you've had executives at various AWS event stages,
you've been mentioned in a whole bunch of slides with your logo showing up when Amazon does an event.
So either you're an AWS customer
or you have an incredible marketing arrangement with AWS
where they get to pretend that you are.
So are you all in on AWS
or are you looking at multiple cloud providers these days
to deliver the service?
We are primarily in AWS right now, that is true.
But we're working hard on our multi-cloud strategy
because Slack is one of those few companies
that other software companies really, really need to work.
You were mentioning earlier that working together
in a technical context in Slack can be really transformative.
And we certainly find that ourselves as we run our own incident process in Slack when that's possible.
So when, say, AWS is having a big problem, it's actually really important that Slack be able to work so that everyone else that's using Slack can run their
incident and recovery process in Slack. And so in success, that'll be one of our measures.
When a big part of AWS is down, when a big availability zone in AWS is down, Slack still
works.
A common thread through a number of these episodes that I've hosted has been around the idea that as a best practice in isolation,
going to multiple cloud providers as a design goal is generally not a great idea. It's something that
incurs an awful lot of complexity and there are generally not a terrific number of reasons for
people to pursue that path. You just touched on one of the examples that I like to give,
which is the idea of PagerDuty.
When something is responsible for waking you up because there's been an outage of a major provider,
whatever it is that's waking you up definitionally has to be able to withstand the outage of that
provider itself. So there is a narrative where making sure that Slack works despite any given
cloud provider, whoever that provider might be, is important.
I guess the question I have for you as you go down that path is, are you viewing all
of what Slack does, by which I mean not just the communications portion, but also the user
onboarding flows, the marketing sites, all of the various integrations?
Is there perceived value to Slack in being able to migrate those
things seamlessly
between providers almost on a whim? Or are you focusing largely on individual specific workflows?
That's a great question. Well, first and foremost, our status site, for example,
has to be in a different availability zone; a different cloud provider is what we've chosen.
Well, credit where due as well. Your status site has been unfailingly honest when there's
been an issue, as opposed to the entire internet's on fire and I'm looking at a sea of green,
like some providers whom I'm too polite to name. AWS.
We definitely take our status site very, very seriously to the point where if a major integrator,
for example, Giphy, is down, we will actually tell
people on our status site, by the way, you might notice that the slash Giphy command is not working
right now. We apologize and hope that it will be up again soon.
I simply can't do my work without
funny videos of cats doing things in a loop.
I mean, who can? But when we talk about our strategy
for multi-cloud and those failure scenarios of AWS, for example, having a major outage,
then what we're really talking about is being able to connect, send, and receive a message.
That is bare bones, right?
If URLs aren't unfurling and if Giphy's not working,
and I mean, maybe we have to add emoji reacting into this list
because honestly I don't know if I can do my work
without emoji reacting either.
Exactly, without that you have to talk to people
using actual words and possibly even complete sentences.
That's something that I generally can't countenance myself.
Realistically if an emoji doesn't convey the depths
of what I'm trying to tell them, how important could it be?
So as a large customer of AWS,
are you able to talk at all about what your customer experience with them has been?
And I'm not saying that in a sense of,
please, this is an open invitation to throw a cloud provider under the bus.
No, we do those off the microphone.
But my question about it is,
what has your experience been as far as seeing new things come to light,
seeing service offerings evolve,
experience about the reliability? I mean, the challenges that the rest of us have with our
$12 a year AWS accounts generally tend to be mired in frustration from time to time.
Is that something that goes away with scale? Does it get worse?
That's a good question. We have a really close relationship with AWS. So I feel like one of the best things about being Slack scale is that we have very, very immediate access to AWS staff, and that means that when we're seeing a problem, we can
immediately write to them in our support channels and say, hey, we're seeing these problems in these
areas, what's going on?
So the big K that everyone is talking about these days is Kubernetes,
the container orchestration system that people can neither spell nor properly pronounce,
nor in many cases articulate a business case for, yet somehow people are barreling forward into it.
Is Slack currently working with containers?
Is it something that you're considering?
Is it something that you're potentially pushing off
until other things get done?
How does a darling of the internet age today,
as Slack is, tend to think about this stuff?
That's a great question.
Right now we are using Terraform and Chef,
and we are actively working on the Kubernetes project
to see if getting some of our production workflows
into Kubernetes will be really valuable.
We surmise, of course, that it will be.
Otherwise, we wouldn't be spending the time.
But we're running experiments to sort of de-risk this project
to make sure that it is going to give us the benefit that we expect.
In my work with some of my clients, a common
pattern that I see that tends to emerge is there's a certain instinctive desire to keep their account
reps from AWS at arm's length. They don't tend to loop them into strategy planning sessions.
They don't tend to tell them when they're starting to see an outage. And it's almost as if they're ignoring a tremendous
amount of the value that a close relationship with a cloud provider can lend to the process.
From what you're saying, that's not a problem at Slack. How have you been able to get away
from a pattern where there's an instinctive distrust of outsiders?
Well, I think that Slack is built from the ground up to be a collaborative company.
And so that is definitely part of it.
I think like most companies, we could do better at the looping them in at the early stages
of a design.
Although we're pretty good at looping them in when we see any signs of trouble.
Everything's on fire.
Is it us or is it you?
And the honest answer in some cases
at something like Slack scale is,
does it matter?
Either way, it's going to be a bad day for the internet.
That's correct, yes.
Either way, make it work again.
So you mentioned a minute ago
about when we talked about Kubernetes,
the fact that you're viewing this
as something of an experiment.
I'd like to get a little bit into the idea of Slack's culture at this point. How do you folks
view experiments in a scenario where, as you mentioned earlier, you're working with people's
production workloads, where if you start working with the ideas of chaos engineering and intentionally
degrading certain aspects of the service,
and this has a customer impact,
how do you square that circle?
Because you're not going to be able to significantly improve safety
or reliability without those experiments.
But doing them does seem ethically challenging
when companies depend upon a service being available to do their work.
I know that when Slack goes down, I'm having a terrible day.
Virtually all of my clients are having a terrible day.
And Ops Twitter gets very grumpy.
Although Ops Twitter also gets very friendly.
The last time we had a major outage, we got about half a dozen deliveries of cookies.
So that was really lovely.
Oh, that's adorable.
One of the chaos programs we have is very, very lightweight compared to many other more mature chaos
programs out there, but it is called Disasterpiece Theater. And what we do in Disasterpiece
Theater...
I'm sorry, did you say Disasterpiece Theater?
Disasterpiece Theater. And what we do in Disasterpiece Theater is we create a scenario,
which is a real scenario that might happen, for example, an edge PoP going down,
and surmise what we think will happen when that happens.
Where will the traffic flow, for example?
How will the reconnects happen?
We will run that experiment first in a non-production environment
to make sure that we aren't about to do something really ridiculous that will ruin people's day.
And then when we feel very confident in a very controlled way, we will do it in production.
And I have been able to witness about a dozen of these, and we have never caused a production issue in this way. So we get around that with very, very careful planning and doing the experiments
that we know will not cause production issues and yet will also teach us something.
What you speak about alludes to a somewhat interesting reputation that Slack has in the
larger community. And I say interesting because Slack historically hasn't told a whole lot of
stories about how it works internally from an engineering context.
So people guess and speculate wildly.
And there is an established reputation for Slack having a highly collaborative culture,
which I suppose does make sense since you're making a tool that primarily focuses on collaboration.
How much of that is accurate and how much of that is just a wonderful marketing story?
And as a manager, your primary go-to tool is a horse whip.
That's a great question because it was really on my mind when I joined Slack.
Being in collaborative environments where we're all trying to get roughly the same thing accomplished and we're all working together towards that goal is really, really important to me personally.
And so I did a lot of homework, both in the interviews, but then back channeling.
And I'm happy to report
that it's totally true.
The reputation is well-deserved.
We hire for collaborative,
empathetic people,
and that makes for a pretty good
set of coworkers.
I will say that in the brief time
I've been sitting here with you,
no one has barged into the conference room
and yelled at me to get out.
And people at the front desk
were extremely nice as well.
So fundamentally, yeah, so far the people that I know who work at Slack tend to mirror exactly what you're saying.
I haven't seen anyone crying at their desk as I walk on my way over here.
That might be on a different floor.
So somewhat related to your culture, how does Slack view operations?
You said earlier that in another company,
what service engineering does would better be described as operations.
And I'm seeing, to some extent, across a large swath of the tech sector, a bit of a tension between an idea of a centralized ops model
versus a model where developers are responsible for the code that they write once it enters production.
Are you able to comment at all on where Slack is on that spectrum?
We are in motion. I would say about a year, year and a half ago, we were firmly in that
centralized operations model where there was one group of humans that got almost every page.
Those pages were pretty low level and developers were never on call.
And we are in the maybe final stages of the first part of that transformation towards development
teams having a bigger list of things that they're responsible for. And the thing that we're calling
this here is service ownership. So putting the tools and
processes in the hands of the development teams that are writing the software to support it in
a more rich way through its whole lifecycle, including getting pages for it. So service
engineering in that world becomes the maker of those platforms: the cloud platform you're
deploying to, the observability tools you're using to make your alerts
and figure out how your software is doing in production,
and of course, site reliability engineers
who can help you with specialized knowledge and experience
and embed with your team to make your service more reliable,
more performant, easier to recover in an outage,
whatever the problem for your service happens to
be.
Do you find that that tends to lead to an interesting cultural reaction from a perspective
of implementing this? Because historically, even when I was on operations teams and one day people
would throw something over the wall and say, congratulations, when this thing breaks,
we're going to be waking you up in the middle of the night and expecting you to fix it. And my response was a poorly articulated version of, what? And I can't
imagine that that instinctive knee-jerk reaction of no is something that's gone away since the
time I was hands-on keyboard in an engineering sense. How are you finding that being adopted
culturally? That's a great question.
And with any change like this,
you're going to have the narrative that is company-wide,
and then you're still going to have pockets of individual reactions.
It's happened slowly.
So last fall, engineering teams started going on-call.
They were escalation on-call only.
So if a human on the operations team
found that that
development team's software was having a problem, then the operations team could page them. And so
switching from being on call but not having machines page you to being on call and having
machines page you is not nearly as big of a change as not being on call at all and suddenly you've got machine pages.
So that's part of it.
Another part is that it has become clear at Slack that the way that software teams can ensure that their software does what they need it to do in production is really to just be with that software through its whole life cycle.
And even though no one likes being woken up in the middle of the night and no one likes being told, hey, you're on call this week, the fact is that everyone here does want Slack to succeed and does want Slack to be up.
And so they're all willing to do their part.
One common observation that I've made historically has been that as an industry, technology is so innovative and so disruptive that we've taken a job that can
be done from literally anywhere on the planet and created a land crunch in a single eight square
mile block of land that's in the middle of an earthquake zone. So to that end, I'll take that
a step further even. Slack is one of those tools that enables people to collaborate remotely. In
fact, I have a couple of clients who are pure remote and spend a bulk of their time talking
to one another via Slack. So I guess my question is, why does Slack not encourage remote work?
In other words, must be willing to relocate to one of several cities that host a Slack engineering office.
When I was director of engineering at 18F, I definitely saw the power of having a remote-first workforce.
Almost half of the engineers on that team were working from their home offices in northern Idaho or Wyoming or Texas, wherever it was, definitely not where there were offices.
And coming to Slack and of course hearing this joke many times, I can definitely say...
I take no credit for it myself.
I can definitely say that supporting a remote workforce is something that the entire organization
has to bake in. And 18F was able to do it and support
everybody and support itself and get good work out. And Slack, for better or worse, right now
is not in a position where it's going to put in that investment. And so hiring a large number
of remote workers would not be setting Slack or those workers up for success. They would
just be very isolated because most of the day-to-day work still does happen in a combination
of Slack, yes, and just standing around in the hallways getting work done between desks,
sitting at each other's desks in those old-fashioned co-working kind of situations. And so even, I would say, my colleagues in New York and Dublin and Melbourne
have complaints about how San Francisco doesn't stay in as much touch
with those offices as we should,
even though we have this amazing, groundbreaking tool that is Slack.
So this may sound like a bit of an obvious question to you,
but I am curious.
What made you decide to want to work at Slack?
Someone with your background at 18F, which is widely respected across the aisle, it's
something that people generally should look into if they haven't heard of it already.
I'll throw a link to it in the show notes.
But going from there to Slack seems like an interesting move to make.
Because when you're done doing what you did at 18F,
the world is very much your oyster and you could go virtually anywhere.
What made you pick Slack out of all those other opportunities?
One of the things I've really enjoyed in my technical career
is taking this weird set of skills, understanding
how to make computers do things and, as a manager, how to convince people
that know how to make computers do things to do things, and taking that work to
different domains. So I've been in healthcare and movie making and publishing and then yes,
at 18F in government and civic tech. I had never worked for an enterprise software company,
which sounds weird.
Well, just as a quick question in there: is Slack an enterprise software company at
this point?
I mean, my exposure to it was when I was at a tiny startup with 40 people, and it was
sort of, oh, they're a small, scrappy startup, just like us.
Given what I'm reading in the papers about various rounds of funding that you've raised
and the discussions that we've had about some of the things you're viewing,
some of the things you're considering from a cloud architecture perspective,
it sounds like it may be time for me to update my mental map of Slack's sense of scale.
Yeah, I think most of us who use Slack have that giant bar of Slack teams that we are in,
and maybe five or six of those might be professional,
and then the rest are social or vaguely professional
but mostly not where you are being paid to do something.
And so it is easy to think of it still as a scrappy startup
and a place for non-work collaborating.
But we have some gigantic enterprise customers.
Our largest customers have 250,000 users in their Slack teams.
And at that point, you are an enterprise software company,
and you also have to have enterprise software reliability
and stability and processes.
And so Slack definitely sees itself as an enterprise software company.
Okay, thank you. I appreciate that.
But fundamentally, there are an awful lot of enterprise software companies you could pursue.
What specifically about Slack was appealing?
Well, first, I had never worked at a company that was doing cloud computing at this scale before.
And so that in itself is teaching me a lot.
The second thing is that there are lots of enterprise software companies, and there are several that are working at this scale.
But Slack was the one that seemed like it was the kind of company where I could come in, be the kind of worker and manager and leader that I want to be because the culture was going to support that.
That was super important to me, and that has definitely been true.
When you say that this is exposing you to a scale that you've never seen before,
what changes when you get to that level of scale?
Services, right down to just EC2 nodes and the networking between them,
the reliability of any one of those nodes and any availability zone and any other service in
a cloud provider. So the small company is going to have some few dozen EC2 instances,
some networking between them, some storage. That all works great. When you start having about 10,000 servers like
Slack does, events and disruptions in those services that are pretty infrequent start to
happen a lot more often. And having your system be able to respond and react and repair itself
without human intervention from that is really interesting. And also thinking about doing that while providing
as much high performance through, for example, our stateful edge cache, Flannel, and keeping things
really robust is an interesting problem space that I had never been exposed to.
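The scale effect described above can be made concrete with a little arithmetic. What follows is a minimal sketch, assuming every server independently has the same small daily chance of a disruption. The 0.1% figure and the fleet size of 24 are illustrative assumptions, not numbers from the episode; only the 10,000-server figure comes from the conversation.

```python
def p_any_disruption(p_per_server: float, servers: int) -> float:
    """Chance that at least one of `servers` servers is disrupted in a day.

    Assumes independent, identical per-server probabilities, which is a
    simplification: real failures are often correlated.
    """
    return 1.0 - (1.0 - p_per_server) ** servers


def expected_disruptions(p_per_server: float, servers: int) -> float:
    """Average number of disrupted servers per day under the same assumption."""
    return p_per_server * servers


if __name__ == "__main__":
    DAILY_P = 0.001  # hypothetical per-server daily disruption probability
    for fleet in (24, 10_000):  # "a few dozen" vs. the ~10,000 mentioned above
        print(fleet,
              round(p_any_disruption(DAILY_P, fleet), 4),
              expected_disruptions(DAILY_P, fleet))
```

Under these assumed numbers, the small fleet sees a disruption on only a couple of percent of days, while the 10,000-server fleet should expect roughly ten per day, which is why repair without human intervention stops being optional.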
It turns into one of those scenarios where once-in-a-million occurrences start happening... If you ask AWS to spin up 10 instances, in most cases the answer is sure, no problem.
Ask them to spin up 10,000 instances
and suddenly you're getting a very weird phone call from people
saying, "We don't necessarily have that at the moment."
It's the old story of cloud does not scale infinitely.
Source: tried to do it.
It winds up becoming something that you have to start viewing
a little bit differently and what happens if certain capacities aren't available.
It was something I never understood either
until I started playing around with companies that are at significant points of scale.
But I agree absolutely with what you're saying
where you have to change the mindset you bring to this.
Do you find that it was what you expected it to be since joining Slack?
Yeah, all the teams that are writing software here at Slack really understand the scale
from the ground up for their services and bake the kind of resiliency into those services that
you need at each of those layers. And in a way that shows that Slack has a really great engineering
culture around operating at this scale.
One thing that continues to impress me about Slack is how, first, you're hiring an ever-increasing number of people that I know, which generally means either I know the right people or we both
have the exact same failure modes as far as judging people on their character. So either way,
at least I'm in good company. But what also continues to encourage me is the nice things people continue to say about it. I mean, everything is terrible,
all infrastructure is on fire,
and every company is crap.
That's something that you can almost inherently assume on.
I don't get the vibe from people at Slack
that that is how they see the world.
Now, yes, I'm incredibly cynical,
but even the cynical people I know
who you folks have let sneak through the interview process
tend to approach this in the same way. I don't want to say true believers because that comes
across as being very cultish and that's not what I'm trying to get at. But every person I've spoken
to here seems to not only agree with the mission that Slack is on its way towards, they understand
and approve of the way that Slack is going towards that. And that's something that's special.
I don't see that in too many places.
I think that's true.
And I think I was a little maybe blind to that because 18F had it as well.
And it is very special when you're in an environment like that.
People genuinely believe we are going to succeed if we do the right things.
This will work.
We are building something great. Let's do it together. And it's pretty magical to be a part of and kind of table stakes for a good job, in my opinion.
To that end, I'm going to go out on a limb and guess that you folks are hiring.
Oh, just a few. We might have a few positions open. We're definitely hiring, always looking for really good engineering talent
and talent in a lot of other departments,
although I'm most familiar with our engineering needs.
Must be willing to relocate to San Francisco?
No, or New York, Melbourne, Dublin, and soon Denver.
Wonderful. That's places people might actually want to live at some point.
What a
thought. One other question. Do you have anything coming up in the near future as far as where
people could wind up hearing more about the things you have to say? Should they follow you on Twitter?
Are you doing anything in a conference sense anytime soon, etc.?
Yep. I'm Holly J. Allen
on Twitter. And I'm giving a talk at QCon in San Francisco in November about this journey from
a centralized operations model to an embedded site reliability engineering model.
Perfect. Well, be sure to check that out. Holly, thank you so much for taking the time
to speak with me today.
Thank you.
I'm Corey Quinn. This is Screaming in the Cloud.
This has been this week's episode of Screaming in the Cloud.
You can also find more Corey at screaminginthecloud.com
or wherever fine snark is sold.