a16z Podcast - a16z Podcast: Harnessing the DevOps Movement -- Don’t Go Chasing Waterfalls
Episode Date: January 8, 2016
In this world of massive cloud-based applications and services, rolling out software has moved from an episodic event to an almost continuous release cycle. In that environment, software products aren’t as “done” as they used to be -- they can’t be -- so the focus has shifted to reversibility. Building a development organization with the design tools and processes that can aggressively iterate while also creating safety nets. So if things do get screwy they can be fixed before customers even notice. Call it DevOps or application operations. Steven Sinofsky leads a discussion with Karthik Rau from SignalFx and Alex Solomon from PagerDuty about the evolution of I.T. operations – and the requirements and challenges that modern distributed applications pose for a development organization.
The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z.
(An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.
Transcript
The content here is for informational purposes only, should not be taken as legal, business, tax,
or investment advice or be used to evaluate any investment or security and is not directed
at any investors or potential investors in any A16Z fund. For more details, please see A16Z.com
slash disclosures.
Welcome to the A16Z podcast. I'm Michael Copeland. In this world of massive cloud-based
applications and services, rolling out software has moved from an episodic event to an almost
continuous release cycle.
In that environment, software products aren't as, quote-unquote, done as they used to be.
They can't be.
So the focus has shifted to reversibility, building a development organization with the design
tools and processes that can aggressively iterate while also creating safety nets.
So if things do get screwy, they can be fixed before customers even notice.
Call it DevOps or application operations.
Steven Sinofsky leads a discussion with Karthik Rau from SignalFx and Alex Solomon from PagerDuty
about the evolution of IT operations and the requirements and challenges that modern distributed applications pose for a development organization.
Steven Sinofsky starts the conversation.
What we thought we'd do is have a little bit of a discussion about the role of DevOps and how that really changes how things are going in IT,
represented by two great founders with some wonderful tools in the space.
But just diving right in, I think one of the most interesting things is that historically
IT has really thought about waterfall development and requirements gathering and really trying
to solve these customer problems where the customer is an internal-facing organization.
How do the cloud and modern techniques and the consumer Internet really alter the way that
people think about the different roles and the types of work to get done in IT?
Maybe you start with Karthik on that.
Yeah, I think one of the best examples or responses to that question I read,
it was a Facebook engineer who wrote a blog post about how traditionally the goal was always
to reduce complexity or mitigate complexity, and that's what waterfall and kind of all
the phase checks are essentially all about: managing that complexity.
And the point that he made was if you're really trying to be innovative and move quickly,
you can't really manage the complexity, because at Facebook they've got so many small teams
and they're all releasing updates very aggressively. And in that kind of a world,
you really have to focus more on organization design, tools, and process that focus on
what he called reversibility. And so this is: you still move very aggressively,
but you have to create the safety nets so that as you're making changes,
if you make any change that is potentially destructive, you recognize it very quickly
and you have the means, both in kind of how your software is designed,
your processes are designed, your teams are designed,
so that you can roll it back very quickly before your customers even notice.
When you do that, you then have confidence, a lot more confidence,
that you can be much, much more aggressive in rolling out software, right?
So I think that, to me, summed it up in a really crisp way of, you know,
the world has changed a little bit,
and if you're going to really support fast release cycles
and you want to be competitive and be very responsive to the marketplace,
you can't control complexity the same way you could before.
You just have to focus on other things, primarily reversibility.
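To make the reversibility idea concrete, here is a minimal sketch of a safety net that routes work to a new code path, watches its error rate, and reverts to the old path on its own. Everything in it is an assumption for illustration: the `ReversibleRollout` class, the 5% threshold, and the checkout functions are hypothetical and are not described in the conversation or in the Facebook post it mentions.

```python
from dataclasses import dataclass


@dataclass
class ReversibleRollout:
    """Hypothetical guard: try the new code path, revert to the old one on errors."""

    max_error_rate: float = 0.05  # assumed threshold, chosen only for this example
    min_samples: int = 20         # wait for some traffic before judging the change
    enabled: bool = True
    calls: int = 0
    errors: int = 0

    def run(self, new_path, old_path, *args):
        """Call the new path while enabled; fall back to the old path on failure."""
        if not self.enabled:
            return old_path(*args)
        self.calls += 1
        try:
            return new_path(*args)
        except Exception:
            self.errors += 1
            self._maybe_roll_back()
            return old_path(*args)

    def _maybe_roll_back(self):
        """Disable the new path once its observed error rate crosses the threshold."""
        enough_traffic = self.calls >= self.min_samples
        if enough_traffic and self.errors / self.calls > self.max_error_rate:
            self.enabled = False


def old_checkout(total):
    # Known-good behavior that stays available as the fallback.
    return round(total, 2)


def new_checkout(total):
    # Deliberately broken change, used only to exercise the rollback.
    raise RuntimeError("bad release")


rollout = ReversibleRollout()
results = [rollout.run(new_checkout, old_checkout, 19.999) for _ in range(30)]
print(rollout.enabled, rollout.calls, rollout.errors)
```

In this sketch every caller still gets an answer from the old path while the bad change is live, which is one way to read "roll it back very quickly before your customers even notice."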
And so with this role, though, of DevOps,
like, how do you see customers, like, you know,
in the world of, like, break and response and escalate incidents,
how do you see the internal customers sort of managing
when their products don't appear as done as they used to,
but they're done a little bit sooner?
And how does that influence the way to think about the engineering cycle
and also just the communication with those internal customers?
Yeah, it's a good question.
So what we've seen is that the requirements stage
essentially boils down to doing customer development
and being able to talk to customers.
And what's really important as part of that is showing them something.
So as part of the development cycle,
you would show them wireframes and something to react to.
And then you'd make it a much more iterative process.
It's a shift away from the waterfall, get-it-done-in-one-big-bang approach, which actually is very risky
because if you've made any mistakes along the way and those mistakes actually add up,
at the end of the day, you don't deliver what the customer needs.
So being able to develop the software much more iteratively and show them,
here's what we have so far, what do you guys think, and get the reaction back from the customer
and then adjust and learn and iterate, that's a big part of DevOps.
Wow. In the consumer space, one of the things that's so interesting is, you know, there's this perception that you throw it out there and you see how people react and then you adjust and you iterate and things like that.
But often in the business world, people say, well, we can't, we're just not able to do that because our requirements are fixed.
Like, yeah, go build a messaging app. Go build a shopping app. But those requirements aren't, you know, they're flexible.
Whereas we have to, this is our expense report process, our performance review process, our
cash-to-quote process. How do you see the role of, in the customers you work with, how do you see
the role of an MVP or just these early releases? Do you see that evolving in any way?
Well, I think one of the things is you don't have to build the entire stack, right?
I mean, I think in the web services economy, you can leverage a lot of other components and
focus on the things where you really, you know, where you want to invest, and it makes it a lot
easier to get something up and running very quickly, right? And I think ultimately, even in the
enterprise world, markets are changing very quickly. And so if you're taking two years to get
something out, the markets will probably change in those two years. So it's very advantageous
to get something out quickly. I think the key is just to have focus and, you know, leveraging the
sort of web services ecosystem. There are all of these different technologies that you can leverage
without having to kind of wrap it up and build it up into this one giant software package that
takes two years to release. I think it certainly makes things easier.
Let me put you on the spot a little bit. Have either of you, you know, really gone through that
with a customer, with a particular kind of app
where it's really jumped out
at them in terms of, you know,
wow, this was an app where we were generating
way more tickets than we used to expect
because we have way more telemetry.
How are you seeing the actual deployments
of like these modern cloud-based applications
really evolving in terms of the level of support
and the level of understanding that's really going on?
Yeah.
What we've seen is that customers are becoming a lot more demanding as the world becomes more
powered by software.
These are the customers of the app or the customers of yours?
The customers of the app. Yeah.
Yeah, they become more demanding. They expect everything to be up 24-7. You can't take hours to fix
an outage. You have to automatically detect outages. You can't have your customers detect the outage
for you, and you have to respond quickly and fix it. And if you don't do that, your bottom line
hurts, and so does your reputation. Many of these applications have SLAs.
So if the app is not up, you're actually, you have to refund money back.
And so there's a lot more pressure on the IT department to deliver 100% uptime.
Of course, 100% is not realistic, but you have to get as close as possible to that.
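The point about getting "as close as possible" to 100% can be put in numbers. The sketch below converts an availability target into the downtime it allows over a period. The function name, the 30-day period, and the sample targets are assumptions picked for illustration; no specific SLA figures are given in the conversation.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Return how many minutes of downtime a target allows over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Sample targets are illustrative, not taken from any real SLA.
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per 30 days")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why detection and response time matter so much more as the target rises.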
Yeah, that's one of the things that's a good comment, like the 100%,
because one of the things that's so interesting is that IT used to think of we can deliver 100%
or we can get really, really close if we own all of the parts from the network,
routers on up. But in the SaaS world, you know, wow, you might OAuth in with your Azure
Active Directory ID. You might be using this storage system from somewhere else and this other
service and you might be involved with an integration. How do you parse the notion of 100% uptime
in that? Or how does IT think of accountability even in that?
Yeah, I think that's an interesting question because your customers don't know the difference, right?
They're just like, I can't log on. I can't file an expense report.
I was talking to a media company and they had the situation,
they had a really big event and they had a streaming app and one of their ad networks was taking
an abnormally long time to load the ad before the video streamed. And all their users were on
Twitter complaining you suck and just, you know, and it was terrible for their brand, but it wasn't
their fault. It was a third party that was just taking a really long time to load the ad.
Well, it wasn't their fault. I mean, it was, it was, it was, it wasn't their tech.
It was someone else's technology, right?
But from a customer point of view, it didn't matter.
They just felt like the experience was poor.
And so, you know, for example, they're working with us on instrumenting the calls they make out into their third-party services
and being able to measure it and having the real-time visibility as they see events happening.
If they see some particular networks that are taking longer to load ads, at least they have that data.
They can make real-time decisions if they need to, and they can hold their vendors more accountable.
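A rough sketch of what instrumenting outbound third-party calls can look like: wrap each call in a timer, keep the measurements per vendor, and flag the vendors that are slow. This uses only the Python standard library and is not SignalFx's API. The vendor names, the `timed_call` and `slow_vendors` helpers, the simulated delays, and the threshold are all hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import mean

# Vendor name -> list of observed call durations in seconds.
latencies = defaultdict(list)


@contextmanager
def timed_call(vendor):
    """Record how long the wrapped third-party call takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        latencies[vendor].append(time.perf_counter() - start)


def slow_vendors(threshold_seconds):
    """Return the vendors whose mean observed latency exceeds the threshold."""
    return sorted(
        vendor
        for vendor, samples in latencies.items()
        if mean(samples) > threshold_seconds
    )


def fake_ad_request(delay_seconds):
    # Stand-in for a real network call to an ad network.
    time.sleep(delay_seconds)


# Simulated traffic: one fast vendor and one slow vendor (made-up names and delays).
for vendor, delay in (("ad-network-a", 0.001), ("ad-network-b", 0.12)):
    for _ in range(3):
        with timed_call(vendor):
            fake_ad_request(delay)

print(slow_vendors(threshold_seconds=0.05))
```

With per-vendor data like this, a team could decide to skip a slow ad network during a live event, and it would have measurements to bring to the vendor afterward.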
So let's switch here a little bit just in terms of one of the interesting things about being a person in DevOps now, particularly inside the enterprise, is really balancing these needs of control, which used to come from being all on-prem, which now might move to sort of a hybrid cloud model.
And we're all very forward looking here, but many customers are sort of dealing in these kind of hybrid environments.
How do we help people to understand both the advantages, you know, of moving as fast as possible, which I think most people want to do to a cloud world, and then the realities that they're dealing in, in terms of just these mixed environments and just responding to what's going on from a DevOps perspective?
Yeah, I mean, I think the on-prem, you know, the control aspect of it is a false illusion. I mean, the complexity of these systems.
I mean, but what do you do with just, I have to interrupt, because, like, people, you know, made big bets on delivering on that expectation.
Like, are we just kind of bursting their bubble, that it's not true? Are we going to finally tell their boss it's not possible?
Well, I mean, the truth is that a cloud service or a SaaS service, these guys have teams of people who are dedicated to keeping that service up.
And, you know, trying to take that on yourself, I mean, I'd rather pay
someone else to do it for me, especially if they're really good at it.
So this is why people use increasingly AWS and infrastructure as a service, platform as a
service, and now SaaS, because someone else has to worry about that uptime.
And I pay them, and then if they don't deliver, I get my money back, SLAs and such.
So I think that's why a lot of companies, including like the CIA and government,
are paying AWS to do it for them because AWS has that expertise in house.
And it's hard to gain that same expertise for every single, you know, company out there.
It's a good way to think about, you know, the different levels of DevOps.
How do you, real quick for folks, just even define what DevOps is?
How do you help them to understand when they go to hire them and things like that?
Yeah, well, we, at SignalFx, we like to think of it more as application operations, you know, or just operations.
You know, it's just the evolution of IT operations.
It's just focused more on modern distributed applications
and focusing your tool set and your process
on a different set of challenges
from maybe what you did before.
And that's, it's, there's still,
there's a significant amount of work involved
in building the processes and the tooling
to make a cloud infrastructure usable
and highly effective for an end-user development organization.
And that's, you know, what we think of as the modern role
of operations, application operations, DevOps,
whatever you want to call it.
Cool.
Well, thanks everybody.
This was just a quick chance to see some
excellent work and really think a little bit about DevOps in the modern world.
Thank you very much, guys.
Thank you.