Command Line Heroes - DevOps: Tear Down That Wall
Episode Date: February 13, 2018
As the race to deliver applications ramps up, the wall between development and operations comes crashing down. When it does, those on both sides learn to work together like never before. But what is DevOps, really? Developer guests, including Microsoft’s Scott Hanselman and Cindy Sridharan (better known as @copyconstruct), think about DevOps as a practice from their side of the wall, while members from various operations teams explain what they’ve been working to defend. Differences remain, but with DevOps, teams are working better than ever. And this episode explores why that matters for the command line heroes of tomorrow. Read Cindy Sridharan's attempt to demystify DevOps. And check out Gordon Haff's take on how to improve DevOps.
Transcript
I want you to imagine a wall.
The wall stretches as far as you can see to the right
and all the way off to the left.
It's taller than you. You can't see over it.
And you know there are people on the other side.
Lots of people.
But you just don't know if they're anything like you.
Are they enemies or friends?
Developers created their code and threw it over the wall to operations,
and then it was operations' problem.
Just doing whatever they feel like, not really caring about the quality of the service.
These two sides have almost opposing jobs.
One to make changes and one to resist those changes as much as possible.
But they're not on the same page about what it actually is they're trying to achieve.
I'm Saron Yitbarek and this is Command Line Heroes, an original podcast from Red Hat.
Episode 4: DevOps, Tear Down That Wall.
So yeah, for decades, the IT world was defined by that division of roles.
You had developers on one side.
They were incentivized to create as much change as quickly as possible.
And then you had the operations team on the other side.
They were incentivized to prevent too much change from happening.
In the meantime, code was getting tossed blindly over that wall
with no real empathy or communication between these two worlds.
What would it take to tear down a wall like that?
It would take a seismic shift.
Last episode, we heard how new Agile methodologies were making it possible to produce constant, iterative improvements.
And that was great.
But with change comes unintended consequences.
Agile increased the rate of changes we were making.
But then suddenly, all that throwing code over the wall and hoping for the best, it just wasn't fast enough anymore.
In our little silos, we were comfy with the way things were.
But siloed people can't get things done as fast as they should.
We'd put a speed limit on ourselves because we weren't working together.
And that speed limit was getting to be more and more of a problem because...
It's all about faster time to market, increased agility,
doing more iterative rather than longer term big pieces of work.
Richard Henshaw is an Ansible product manager.
You know, I remember the days when you put in an order for a server
and it turned up four months later.
Everything was converged together.
So the entire stack was one thing and it took years for those to be designed and built. That doesn't fly anymore. And it's just disappeared to the point that it's just throw something up, try it, bring it back down again for a lot of organizations.
These days, a company like Amazon will deploy new code several times every minute. Imagine trying to get that done using some step-by-step waterfall workflow. It's just impossible.
Soon enough, those ops concerns about stability, security, reliability would get pushed to the side in favor of moving fast. Developers, meanwhile, didn't see it as their responsibility to produce code that worked in the real world.
Developers had little interest in stability and security issues, but those are very real issues that we need to address.
So we end up with a lot of needless revisions down the pipe,
back and forth across the divide.
Think how much that division can slow a company down.
Think how inefficient that could get.
But developers were rarely encouraged to look beyond their own command line.
The size of their directories would just grow and grow and they would never clean up. They wouldn't be able to get any work done without cleaning up.
Sandra Henry-Stocker is a retired sysadmin who writes for the IDG magazines.
So I was kind of often having to be a nag saying, hey, look, you know, using this much disk space, isn't there something you can get rid of, you know, so that we have more space to work because we're running out of space on this server? And yeah, we'd go through that a lot.
Ultimately, this is a mindset problem. This divisive attitude between developers and operations
where one didn't have to understand the concerns of the other,
well, in the past, that had been just fine.
But as speed became a premium,
that culture became more and more unstable.
Being siloed in your own work bubble was just way too inefficient.
Jonah Horowitz works for the reliability engineering team at Stripe. He describes how, even if developers and operations had wanted to work together, they couldn't have because, in a sense, they'd been placed on opposite teams.
The operations team is often measured by uptime and reliability. And one of the biggest ways to increase uptime is to decrease the amount of change in the system. But of course, releasing new features is changing the system. And the software engineers who are doing product work are incentivized to ship as many features as quickly as possible. So you set up this conflict between dev and ops when you've got these separate roles.
Developers committed to building features.
Operations committed to keeping the site working.
Two goals at odds with each other.
But, like I said, because of the increasing need for speed,
for iterative rapid-fire releases,
this disconnect between dev and ops was reaching a crisis point.
And something had to give.
Around 2009, the wall dividing dev and ops
was looking a lot more like a prison wall than anything else.
What we needed was a new methodology
that would smooth the transition from development to operations,
allowing both sides to work in a faster, more holistic way.
Patrick Debois, CTO of the video platform Small Town Heroes,
launched a conference for people who wanted to tear down that wall.
He called his brainchild DevOps Days.
He shortened it to DevOps for the hashtag.
And thus the movement was given a name.
But a name is not a process. It was clear why DevOps was needed, but how would it work?
How are we supposed to bring dev and ops together without starting a war?
Thankfully, I have Scott Hanselman to walk me through this. Scott's the Principal Program Manager for .NET and ASP.NET at Microsoft.
So, Scott, I've known you for, I feel like I've known you for forever.
Definitely a few years.
Forever.
And I want to talk to you about the relationship between being a developer and what DevOps has looked like over the years.
How does that sound?
Yeah, that sounds like a plan.
Okay. So I think a good place to start is just defining what DevOps is. How would you describe it?
The Wikipedia from 2008 that defines DevOps is actually very good. So it's a set of practices
that is intended to reduce the time between committing a change and that change going into production while ensuring quality.
So if you think about, hey, I checked in some code, it's Tuesday, and that'll be going out in the June release.
Right?
That sucks.
That would be not continuous integration.
That would be a couple times a year integration.
If you have a good, healthy DevOps system,
if you've done a set of practices,
then you are going to be continuously integrating into production.
So it's what can you do, what best practices can you define, can you create, that will allow you to get it? So I checked in some code on Tuesday and it's in production on Thursday. Now here's the important part, pause for effect: while ensuring high quality.
So what's really interesting about that definition is it's a set of practices, but I feel like when I hear people talk about DevOps, it's a little bit more crystallized, I guess. They talk about it like it's a role, a job, a position, a title. Does that conflict with the idea that it's a set of practices?
I think that when a new set of practices or a new buzzword comes out, people like to put it on a business card. No disrespect to people who are, like, listening to this podcast and now are offended and looking at their business cards. This sucks. And now they're going to, like, I don't know, slam their laptop shut and rage quit this podcast. There was a really great thread by Brian Guthrie, who is a ThoughtWorker and he worked at SoundCloud, and he talked about DevOps and he said that DevOps is a set of practices, period. It's not a job title, it's not a software tool, it's not a thing you install, it's not a team name, and the way he phrased it was, it's not magic enterprise fairy dust. If you don't have best practices, if you don't have good practices, you have no DevOps. So it's more a mindset than it is putting out a job title and like, we're going to hire DevOps engineers. And then we're going to sprinkle these magical DevOps engineers into the organization without the organization having organizational willpower and buying into the mindset that is DevOps. So if you think it's a toolkit or a thing you install, then you've missed the point.
Okay, so let's go back in time.
Before DevOps was a term,
before we had DevOps on our business cards
or talked about it as a set of practices 10 years ago,
how would you describe the relationship between developers and those people who were on the ops side of things?
It was rather combative. Like, the people in ops controlled production and developers never got near production. We were on different sides of a wall that was an opaque wall. And we over in development tried as much as we could to make something that looked like production, but you never actually, it never looks like production. So we had a couple of issues. We had development environments that didn't look or feel or smell like production. So inevitably you'd have those, hey, it works different in production than it does in development kind of environments. And then the distance between the check-in and when it got into production was weeks and weeks and weeks. So your brain wasn't even in the right headspace because I worked on that feature in January and it's just now rolling out in April. So then when the bug inevitably comes down, it's not going to be fixed until June. And I don't even remember what we were talking about, you know? So people in ops, it was almost like their job was to consciously slow us down. They existed to make developers slower. And then of course, they felt that we wanted to break production at all times.
So why was it like that? Was it just a fundamental misunderstanding of what developers
wanted and were trying to do? Was it a trust issue? Why was it so combative?
I think that you nailed that. You answered it all correctly. There was a trust issue.
There was a sense, I think, that developers thought they were special or somehow better
than IT people. And IT people thought that developers had no respect for production.
So I think that that culture came kind of from the top, the idea that we were different orgs and that somehow our goals were different.
I think that there's some maturity that's happened in software where we all realize that we write software in order to move the business forward, whatever that business is. So that sense of we're all pushing in the right
direction, you know, but it was definitely trust because, you know, DevOps engineers don't trust
product engineers to deploy, right? And no one understood the deployment process
and people trusted only themselves. And they also, um, like, I only trust myself to go into production. I can't trust Saron to go into production. She doesn't know what she's talking about. I'll do it. Um, so no one truly understood the system. Like, the idea of a full stack engineer was a, was a mythic thing, but now we're starting to think about the whole stack as an organization.
We've had terms like full product ownership,
and the Agile methodology has come along saying that everyone should own the product,
and that sense of community ownership and community around the code
all slowly changes things to bring an environment of trust.
I'm Saron Yitbarek, and you're listening to Command Line Heroes,
an original podcast from Red Hat.
So for DevOps to hit its potential,
we were going to need a lot of trust on both sides.
And that means a lot more communication.
Back to Richard Henshaw.
He sees empathy for both sides as the cornerstone of DevOps.
Some of the DevOps practitioners, some of the really good ones, have done both roles.
And I think that is where the real power comes, is when people actually get to do both roles rather than just seeing the other side.
So you don't keep the separation.
You actually, you know, you are going to live in their shoes for a period of time.
And I think that gives, that's what brings the empathy back.
Now, this isn't just communication for the sake of warm fuzzies. What Richard is describing is the industry swerving toward that focus Scott mentioned, a focus on continuous integration. Software was going to be not just written and released in small rapid-fire batches, but also tested in small rapid-fire batches. And that meant developers needed instant feedback on the code they were writing and how it would perform in the real world. As time to market shrank from months to days to hours, we cast around for a new set of tools that could automate any element that could be automated.
You really need a whole new ecosystem of tooling to do DevOps most effectively.
Gordon Haff is a senior manager at Red Hat.
What we see is this huge collection of new types of tooling and platforms that DevOps can make use of, and they're really all coming out of open source.
Gordon's right.
The collection of new tools is huge.
And he's right about the open source angle, too.
The growth of automation tools never would have been possible in a strictly proprietary system.
A lot of monitoring tools out there. Prometheus is a common one. Istio for service orchestration is starting to interest a lot of people, so that's out there.
GitHub lets you track changes.
PagerDuty manages digital operations.
NFS mounts file systems across a network.
Jenkins lets you automate testing on your build.
So many tools, so much automation.
The end result?
Developers can push their changes live,
the build is automatically created,
compilation is managed, and automated tests are run against it.
Sandra Henry-Stocker describes what a change this made.
So I could take something that I was working on and rapidly deploy it, and I could control many systems just from the command line on one, rather than having to work at a lot of
different places or wonder how I was going to get something that I was working on sent
across a network and deploy it on a lot of different machines.
It became easier to basically sit in one spot and yet make my changes across a wide range
of computer systems.
Automation tools had solved the speed problem.
But I don't want us to just praise tools at the expense of the actual methodology.
Scott Hanselman and I talked about that fine line.
You started this conversation by saying DevOps is a set of practices.
It's a mindset. It's a way of thinking.
And it sounds like the tools that we created are the manifestation, the code version of the way we should be thinking and we should be operating.
I love that. You're a genius. Exactly. We used to have the product owners write in these Word documents about how the code should work. They write the spec, right?
When was the last time a Word document broke the build?
Right.
Okay, partly I just wanted you to hear Scott calling me a genius.
But I do think those tools are almost like symbols of our cultural shift.
They encourage us to broaden our roles.
We developers have been forced to look up,
at least occasionally, from the command line.
That way, the priorities of dev and ops partly come into alignment. In fact, what the rise of DevOps has made clear is that in a world of ever-increasing speed, nobody can afford to remain siloed.
Jonah Horowitz has worked for a number of Bay Area companies, including Quantcast and Netflix. He explains how even some of the largest companies in the world have reimagined their culture in this light.
We had sort of this cultural buy-in from the entire company that was like, this is how we're going to deploy software. We're going to do it in these small batches. We're going to do it using these deployment procedures. I don't think DevOps can be, I don't think it can be successful if it's just being driven by the ops team.
It has to be something that the management and leadership of the company buy into.
And it's very much a cultural shift.
When McKinsey surveyed 800 CIOs and IT executives, 80% said they were implementing DevOps in some part of their organization,
and more than half planned to implement it company-wide by 2020.
Executives are realizing that automation tools ramp up the speed of delivery.
These are the same people who used to be okay with having a pallet arrive in a data center
and then have it sit there for a whole month before a new machine was brought online.
Today, if you're waiting longer than 10 minutes
to have something provisioned,
you're doing something wrong.
With competitors hitting speeds like that,
nobody can afford to be left behind.
I can imagine that ops teams must have been nervous, handing all those tools over to developers.
Ops was used to being the grown-up, and now they were supposed to hand over the keys to the car?
Yikes.
I think we developers are learning to move fast without breaking things.
But as the dust settles on the DevOps revolution, the biggest changes may be for the ops team.
Does DevOps actually threaten the role of operations? Is dev using its shiny new tools to eat ops?
Cindy Sridharan is a developer who wrote a long investigative piece about all this.
In your article, in your blog post, you mentioned that operations people were not necessarily happy with the way things were going. What was going on? What were you saying?
Let's put it this way, right? The DevOps ideal was that, you know, responsibilities will be shared, right? Where, you know, developers and operations will have like, you know, more 50-50 split, you know, for really ensuring the holistic delivery of software, right? And I think a lot of the unhappiness from engineers, from operations engineers, stems from the fact that that is not really the reality on the ground, right? And that they're still sort of like, they're still the ones who are always picking the short straw. They're still the ones who are sort of like, you know, always doing the grunt work. They're still the ones who are primarily shouldering the responsibility for, like, actually running the applications, and the developers aren't necessarily doing enough.
The question will be a crucial one
over the next few years.
How Opsy is DevOps going to be?
As we automate, does the role of ops get diminished
or does it transform?
Maybe the responsibilities of older ops
will get automated
so their teams can focus on creating new services
instead of just maintaining old ones.
However the ops role evolves, this much is clear.
The DevOps methodology is actually shaping the tech.
And in turn, the tech is shaping the methodology.
There's this amazing feedback loop.
Culture makes the tools, and the tools reinforce the culture.
And in the end, that wall we described at the top of the episode,
the one dividing dev from ops,
I don't even know if the whole throw your code over the wall analogy
is going to make sense to a developer in five years.
And that's sort of a great thing. Already, when I talk to folks today, I'm hearing a new story.
Cloud architect Richard Henshaw.
I think it is starting to make people realize what the other side of the equation was concerned about more. I've seen a lot more understanding.
Sysadmin Jonah Horowitz.
I think there is a craft to writing really good software. And one thing that I see in the best developers that I work with is that they really, they push the craft of software engineering or software development forward.
Sysadmin Sandra Henry-Stocker.
I think that developers are becoming much more astute and much more careful.
So they're constantly having to up their skills.
And I know that takes a lot of work.
It's a love-in.
Turns out there were some friends on the other side of that wall.
Nice to meet you.
So a confession. I always used to think DevOps was boring. Just a bunch of hardcore automation
scripts and scaling issues. My resistance was partly just practical. As developers,
every week there's some new tool coming out, some new framework. DevOps has been part of those scary, fast changes.
But now, especially after hearing these stories, I get it.
DevOps is more than its tools.
It's how we can work together to build better products faster.
And here's the good news.
As we develop new platforms for developers like you and me,
my work is becoming better, faster, and more adaptive to different environments.
The circle of interest can keep expanding too. You see people widening DevOps to include security,
so we get SecDevOps, or they include business, so we get BizDevOps. The debate we're going to
have now is, how important is it for a developer to understand not just how to use these tools,
but how all that DevOps stuff even works? And how realistic is it to expect developers to understand that new world?
The way we settle that debate is going to define the work of tomorrow's command line heroes.
You might have noticed that in all that talk about tools and automation,
I left out some big ones.
Well, I'm saving those for next time
when all this DevOps automation hits light speed
and we track the rise of containers.
It's all in episode five.
Command Line Heroes is an original podcast from Red Hat.
For more information about this and past episodes,
go to redhat.com slash command line heroes.
Once you're there, you can also sign up for our newsletter.
And to get new episodes delivered automatically for free, make sure to subscribe to the show.
Just search for Command Line Heroes in Apple Podcasts, Spotify, Google Play, CastBox, or however you get your podcasts.
Then hit subscribe.
So you'll be the first to know when new episodes are available.
I'm Saron Yitbarek. Thanks for listening, and keep on coding.
Hi, I'm Jeff Ligon. I'm the Director of Engineering for Edge and Automotive at Red Hat.
Even 10 years ago, the chaos of running hundreds and thousands of containers in a cluster,
it didn't feel like you could go from that to running just dozens in a car.
But these days, it's coming.
In fact, containers are a big part of the future vision of software-defined vehicles.
And look, if we can get the container revolution to work in cars, then everything a cloud-native developer can do today can apply to cars. This huge ecosystem
of engineers can start to write applications for automotive. We can completely change the industry.
This is why Red Hat's open-source approach to edge computing is so important. The way we
collaborate, the way we build together, it's already making some pretty incredible things
possible.
Learn more about them at redhat.com slash edge.