Screaming in the Cloud - Episode 1: Feature Flags with Heidi Waterhouse of LaunchDarkly
Episode Date: March 19, 2018

This podcast features people doing interesting work in the world of cloud. What is the state of the technical world? Let's first focus on the up-or-down, on-or-off function of feature flags.

Today, we're talking to Heidi Waterhouse, a technical writer turned developer advocate at LaunchDarkly, which is a feature flag service: a way to wrap a snippet of code around your feature and turn that feature into an instrument you can switch on or off. Feature flags let you turn things on and off in your codebase quickly without having to do several commits. However, flags become difficult to track once there are more than about a dozen of them, so LaunchDarkly provides a way to manage your features at scale with a usable interface and an API.

Some of the highlights of the show include:

- A feature flag allows you to hide items before you want them to go live on your website. You hide the work behind a feature flag and do it all ahead of time. Then, at some point, you turn it all on instantly without the risk of pushing untested code into production.
- You can test at scale to gain authentic data. Test something with your team, your company's employees, your customers, and so on. However, no matter how good your integration tests are, there are always wobbles to watch for in the system.
- For implementation, a few paths can work, such as the massive reorganization path. Or you can start incrementally, using feature flags only for new features.
- LaunchDarkly thinks of the cloud as its home turf because it mostly works with people doing web-based delivery of features.
- Major companies, like Google and Facebook, run services similar to feature flags for their own development. They operate at such a giant scale that they have internal teams doing it.
- Companies use feature flags on the front end and for other purposes. Flags work through the whole stack, from front-end page delivery, pricing tiers, white labeling, and style sheets to safer deployments.
- Documentation: you should not have to read documentation for anything you don't own. Every feature should have its documentation tied to its code, which creates a customized experience.
- Feature flags effectively manage and minimize risk. There is always risk in the world, but what causes disaster is not just one failure; it is a multiplication of failures. This goes wrong and that goes wrong. Feature flagging breaks monolithic releases into tiny chunks that can go forward or backward.
- LaunchDarkly holds a monthly meetup called Test in Production, where people share their use cases for continuous integration, continuous deployment, DevOps, and more.

Links:

- LaunchDarkly
- iPad
- Autodesk
- Slack
- IBM

Quotes by Heidi:

- "What feature flags do is make it possible for you to push out a deployment with things hidden; we call it launching darkly."
- "We're all about avoiding risk. I think this is our motto this year, eliminate risk… you can't eliminate risk, but you can make it much less risky."
- "Go ahead and write your feature. You know that it's hidden behind the magical feature flag curtain until you're ready to turn it on."
- "If 20 years of technical writing taught me anything, it's that nobody wants to be reading documentation."
Transcript
Hello, and welcome to Screaming in the Cloud, with your host, cloud economist Corey Quinn.
This weekly show features conversations with people doing interesting work in the world
of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles
for which Corey refuses to apologize.
This is Screaming in the Cloud.
This episode of Screaming in the Cloud is sponsored by my friends at GorillaStack.
GorillaStack's a unique automation solution for cloud cost optimization,
which of course is something near and dear to my heart.
By day, I'm a consultant who fixes exactly one problem,
which is the horrifying AWS bill.
Every organization eventually hits a point where they start to really, really care
about their cloud spend, either in terms of caring about the actual dollars and cents that they're
spending, or in understanding what teams or projects are costing money and starting to build
predictive analytics around that. And it turns out that early on in my consulting work, I spent an
awful lot of time talking with some of my clients
about a capability that GorillaStack has already built. There's a laundry list of analytics
offerings in this space that tell you what you're spending and where it goes, and then they stop.
Or worse, they slap a beta label on the remediation side of it and then say that they're
not responsible for anything and everything that their system winds up doing. So some folks try to go in the direction of
writing their own code, such as spinning down developer environments out of hours,
bolting together a bunch of different services to handle snapshot aging,
having a custom Slack bot that you build that alerts you when your budget's hitting a red line. And this is all generic stuff; it's the undifferentiated heavy lifting that's not
terribly specific to your own environment. So why build it when you can buy it? GorillaStack
does all of this. Think of it more or less like If This Then That, IFTTT, for AWS. It can manage
resources, it can alert folks when things are
about to turn off. It keeps people apprised of what's going on. More or less the works.
Go check them out. They're at gorillastack.com, spelled exactly like it sounds. Gorilla like the
animal, stack as in a pile of things. Use the discount code screaming for 15% off the first
year. Thanks again for your support, GorillaStack.
Appreciate it.
Hello and welcome to Screaming in the Cloud.
I'm Corey Quinn.
Today I'm joined by Heidi Waterhouse of LaunchDarkly, where she's currently a developer advocate.
Welcome to the show, Heidi.
Thanks.
I'm glad to be here.
So your backstory is fascinating to me.
You were a technical writer for a couple of decades. At one point, I think you mentioned to me that you used to write Patch Tuesday release notes, which I think is the definition of a thankless job to some extent.
After a year or so of that, people begin to think that you might know what you're doing and invite you to come give talks. And so instead of being my marketing side hustle as a technical writer, writing blog posts and giving technical talks is my full-time job, which is like a dream come true.
Was that a role that you found or did you have to have it built when you started
talking to them? So I applied to be a technical writer, and they asked me to come in and interview
to be the developer advocate. And so I'd never done that role or really, like I knew a bunch of
dev advocates and dev rel people. But I had never thought of myself that way. But LaunchDarkly asked
me to come in and give them a 15-minute presentation on feature flags
to see what I could do with that.
And I ended up giving them a 20-minute presentation on how you could use feature flags to do documentation
better.
That sounds like a fun story.
But let's rewind a little bit first.
When I first met you, I knew you as the person who was doing a bunch of live presentations and all, from an iPad.
And that is something that turned into a surprisingly interesting area.
Isn't that great?
I find it so fascinating that the iPad really is a powerful enough computer that that's all you need to travel with. I'm also so glad I have that method because it turns out that the USB-C connection
on the new MacBooks is not 100% reliable for HDMI
and can fail not just frustratingly,
but like catastrophically.
So this happened to me.
I had just gotten a brand new laptop for this new job.
I'm doing my first
conference as a dev advocate. I plug in my laptop and it got this weird jaggy sideways graphic and
failed to project. And I'm sitting there thinking, good, good. This is good. This is a brand new talk
at a conference where I don't know anyone for my brand new job, and I have just bricked my computer.
I feel like the only appropriate response there is,
and there's the metaphor, and just wait for the applause.
Exactly. I actually ended up giving a 35-minute talk purely from memory and adrenaline and fear
sweat. But after that, I was like, MacBook, you are fired for all time, and I'm going back to my
iPad.
Okay, back on topic a little bit. What is a LaunchDarkly?
LaunchDarkly is feature flags as a service. So it turns out that a lot of people want to be able to turn things on and off in their code base really quickly without having to do a lot of commits.
But they have a lot of trouble tracking it when you get over, say, a dozen flags.
So what LaunchDarkly is providing
is a way to manage your features at scale
with a usable interface and also an API.
Okay, for those who don't have a background
in software development, namely me, what is a feature flag?
So imagine if you are creating a piece of music, and you know those big sample boards that the DJs use and they use in theaters?
Oh yeah.
It's like that: you wrap a snippet of code around your feature and make that into an instrument to turn it on and off.
Wonderful.
And the advantage of doing this as opposed to a fresh code deploy would be?
Speed and risk.
So I am old enough that I remember when code and products came on CDs,
actually floppy disks, but we won't talk about that.
Lotus 1-2-3.
But it used to be a big deal to push out a deployment,
and then that was all you got.
So what feature flags do is make it possible for you to push out a deployment with things hidden.
We call it launching darkly.
Ah.
Ah, see?
And then when you're ready, you can turn it on. So imagine you want
to do a big website refresh in September. And you want it to have all the things that you're going
to need for Black Friday on it. Well, you don't want to show the Black Friday stuff yet. So you
hide it behind a feature flag, do all the work ahead of time. And then on Thursday at midnight, you can go ahead and turn it all on instantly without the
risk of pushing untested code into production.
That makes a stunning amount of sense. Wonderful.
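To make that concrete, here is a minimal sketch, in Python, of what hiding the Black Friday work behind a flag might look like in application code. The flag store, flag key, and rendering function are hypothetical stand-ins for illustration, not LaunchDarkly's actual SDK; the point is that the feature ships dark and the launch becomes a single toggle rather than a deploy.

```python
# Hypothetical in-memory flag store; a real system would fetch flag state
# from a service such as LaunchDarkly rather than from a module-level dict.
FLAGS = {"black-friday-banner": False}  # deployed to production, but off

def is_enabled(flag_key: str, default: bool = False) -> bool:
    """Look up a flag, falling back to a safe default if it's unknown."""
    return FLAGS.get(flag_key, default)

def render_homepage() -> str:
    page = "Regular storefront"
    # The Black Friday code is already deployed; it simply isn't shown
    # until someone flips the flag at, say, midnight on Thursday.
    if is_enabled("black-friday-banner"):
        page += " + Black Friday deals"
    return page

print(render_homepage())             # "Regular storefront"
FLAGS["black-friday-banner"] = True  # the launch is a data change, not a redeploy
print(render_homepage())             # "Regular storefront + Black Friday deals"
```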
Yes. We're all about avoiding risk. In fact, I think this is our motto this year: eliminate risk. Which I argued with, because you can't eliminate risk, but you can make it much less risky.
So as far as doing this in production and minimizing risk, pushing it further down the deployment chain, how does this start to impact larger-scale environments?
So I think one of the exciting things about cloud and scale is
that you're doing things across servers and time zones and areas of control. And you don't know
exactly what's going to happen in production. There is no way to test a massive distributed
system except in production. But if you're doing that, you would
like not to be showing everybody you're testing. So imagine you have a massive enterprise-grade
system, and you want to know if this new feature, let's say a toolbar, is going to work right.
Well, the first thing you do is you deploy it
with nobody able to see it.
And then you turn it on just so you and your team can see it.
And then you turn it on so that only people in your company
can see it based on IP.
And then you turn it on for 10% of your customers.
And then you scale up the percentage of customers
who can see it.
This whole time you're doing sampling and metrics
and analysis
to make sure that it's not causing edge case problems or somehow causing your system to
fail or conflate or fall over. So testing isn't a binary, it's a degree.
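A rough sketch of how that kind of percentage rollout can be evaluated: hash the user key so each user lands in the same bucket every time, then compare the bucket against the current rollout percentage. The bucketing scheme and flag key here are illustrative assumptions, not how any particular vendor implements it.

```python
import hashlib

def bucket(user_key: str, flag_key: str) -> float:
    """Deterministically map a user onto [0, 100) so rollouts are sticky."""
    digest = hashlib.sha256(f"{flag_key}:{user_key}".encode()).hexdigest()
    return int(digest[:8], 16) % 10000 / 100.0

def sees_feature(user_key: str, flag_key: str, rollout_percent: float) -> bool:
    return bucket(user_key, flag_key) < rollout_percent

# Ramp a hypothetical toolbar: nobody, then a few users, then more and more.
for percent in (0.0, 1.0, 10.0, 50.0, 100.0):
    enabled = [u for u in ("alice", "bob", "carol", "dave")
               if sees_feature(u, "new-toolbar", percent)]
    print(f"{percent:5.1f}% rollout -> {enabled}")
```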
How does this apply to, I guess, other methods of testing large-scale distributed software
in production or otherwise?
So one of the ways that we've tested large-scale software before is to run a bunch of fake data
through. And the problem with fake data is that it's fake and frequently sanitary. It's sort of
like trying to test whether an antiseptic works, but only in a sterile environment. You're just not going to find out
because you're feeding it data that isn't contaminated and grody. So being able to test
in production means that you're going to get authentic data. Another thing that's important
to remember when you're testing at scale is that no matter how good your integration tests are, there's always going to be some sort of wobble in the system.
I think about, say, Autodesk.
I was working for a company doing cloud integration stuff a couple of employers ago. They would spin up thousands and thousands of servers, and then spin them down again really rapidly because they were
using them for basically scaling user 3D printing stuff. And if you couldn't handle the fact that
you were spinning up 2000 servers all at once, it was a real problem. But it's hard to get the
testing capacity to do that.
Gotcha. So does this apply, I would say, let me take a step back. There are some technologies that you tend to see that make
an awful lot of sense for certain use cases, generally at software company startups based
in San Francisco. And if you try and take that model to something like, I don't know, we control all
of the ATMs in North America. A lot of the paradigms that work when you're Twitter for pets
start to fall down at bank of the world. And for example, when a dog can't tweet for two minutes,
that tends to be a different failure domain than the ATM is now spitting out wrong balance
information to a subset of users.
Or 20s.
Oh, yeah, exactly.
That depends entirely on what level of happy or sad you want your users to be.
But the question I'm getting at here is, are feature flags something that maps reasonably
to most workloads?
Or is this something that is better suited for stuff that errors won't
really leave nasty marks and bruises? We certainly think that it is enterprise grade,
and I cannot talk about a lot of our customers. I have to go through and see how we can talk about,
but I will say that Atlassian and Jira are using us, which I think is a pretty significant use case.
I bet a lot of your listeners have a Confluence somewhere.
And we think that it is exactly because you can do feature flagging that it's safer to deploy at enterprise grade.
Because if you have a button, a knob that can turn on a feature, you also have a knob that can turn it off instantly.
We think it's about 200 milliseconds from the time you hit what we call the kill switch
to the time servers stop delivering that broken feature. Imagine the power to say,
wow, we're spitting out 20s from our ATM. Let's roll that back right now.
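A minimal sketch of the kill-switch idea, with hypothetical flag and function names: the risky path is gated on a flag with a safe default, so turning the flag off reroutes traffic without a rollback deploy.

```python
# Hypothetical flag lookup; imagine this value being pushed out to the edge
# by a flag service, so flipping it propagates in a few hundred milliseconds.
FLAGS = {"new-withdrawal-flow": True}

def new_withdrawal_flow(amount: int) -> str:
    return f"new flow dispensing {amount}"

def legacy_withdrawal_flow(amount: int) -> str:
    return f"legacy flow dispensing {amount}"

def dispense_cash(amount: int) -> str:
    # Default to the battle-tested path if the flag is missing or the
    # flag service is unreachable.
    if FLAGS.get("new-withdrawal-flow", False):
        return new_withdrawal_flow(amount)
    return legacy_withdrawal_flow(amount)

# Ops notices the ATMs are spitting out the wrong bills:
FLAGS["new-withdrawal-flow"] = False  # the kill switch -- no rollback deploy needed
print(dispense_cash(100))             # traffic is immediately back on the legacy path
```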
That's compelling, which I guess leads into the next big question here.
How do you get there from where many shops are today? It's easy, or at least relatively easy, to implement something like this in something
completely greenfield where you're not necessarily going
to be having to retrofit it to things. But in practice, we rarely get to work with environments
like that. Here's a 20-year-old PHP app. Time to go ahead and re-architect it to take advantage
of something like feature flags. What does that journey look like?
So people can take a couple of different paths. There is a massive reorganization
path, which is not ideal. Like, nobody enjoys it. It burns a ton of developer time and your
value add is very small right away. But if you're doing something like using feature flags to do
price tiering, where you're showing people the same page, but with different features,
depending on whether they're paying you or not, it's what you have to do. Most of the time,
what we recommend is that people start using it for new features. So you just, as your new coding practice, as your best practice, whenever you create a feature, instead of necessarily making
it a branch, you just wrap it in a feature flag and add some, if this is on and this is off,
defaults, and then go ahead and write your feature.
And you know that it's hidden
behind the magical feature flag curtain
until you're ready to turn it on.
So we say, like, this incremental approach, just new features or just new code bases, is still going to help you a ton. You're still going to see a lot of benefit from it without disrupting and randomizing a working ColdFusion environment.
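One way to picture that "wrap it in a flag with on/off defaults" practice, sketched below with invented names: the new feature merges to the main line immediately, but its committed default keeps it dark until someone flips it.

```python
# Flag defaults committed in the same change as the feature code, so a fresh
# checkout behaves exactly like production until someone turns the flag on.
FLAG_DEFAULTS = {
    "new-checkout-flow": False,  # the half-finished feature this commit adds
    "invoice-pdf-export": True,  # an older feature that already launched
}

def flag_is_on(name: str, overrides: dict | None = None) -> bool:
    """Runtime override if one exists, otherwise the committed default."""
    overrides = overrides or {}
    return overrides.get(name, FLAG_DEFAULTS[name])

def checkout(cart: list[str], overrides: dict | None = None) -> str:
    if flag_is_on("new-checkout-flow", overrides):
        return f"new checkout for {cart}"  # written today, hidden today
    return f"old checkout for {cart}"      # what every user sees for now

print(checkout(["book"]))                               # old path, default off
print(checkout(["book"], {"new-checkout-flow": True}))  # flipped on for testing
```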
Which makes sense. So the next logical evolution of that question, given that this is screaming
in the cloud, how does rolling something like feature flags out change
when you're doing it in a cloud-based context
instead of a traditional on-prem style deployment?
Or does it?
So interestingly, when I say we're feature flags as a service,
we are a cloud-first organization.
We are tightly linked with the CDN.
We're all about the distributed network.
It's not like we're having your request come all the way back to our servers to be evaluated
or to your servers to be evaluated.
We're evaluating on the edge, which gives us a lot of power.
But it also means that it was a little hard for us to move into on-prem.
We do have some on-prem installations, but it's less powerful because we don't have that edge service.
The edge of the cloud is giant.
And sharp.
And sharp.
And the on-prem servers are small and localized.
And so we can do it, but it's sort of not how we think.
We're very much thinking in the cloud as a service because mostly we're working with people who are doing
web-based delivery of features. We're working with companies that are giant web pages or
retailers or people who really need to have always on control of their features.
Which definitely does tend to make sense as you look at the current crop of companies and the historical migration pattern that we're seeing as companies move out of on-prem and into cloud.
A question I have for you, though, is whenever I hear someone say something as a service, in this case, feature flags as a service, my immediate, instinctive, knee-jerk reaction is, oh, okay, so how long is it until AWS comes out with a confusingly named offering around this that tries to eat your lunch, more or less, but somehow manages to have 15 confusing pricing dimensions? Does that change what they can build out for their customers, or how tightly integrated to a particular vendor's offering it becomes?
Interestingly, there is a level of organization that's not interested in buying us because they're doing it themselves. And I think it's possible AWS is not offering this
to customers, but I do think they are using something very much like feature flags for
their own internal development. I know Google is, I know Facebook is. They're operating at
such a giant scale. They have entire teams that are already doing this so that they can serve you
the 15 degrees of confusing, badly worded pricing.
Because they're serving you your 15 confusing things, but they're serving someone else 15
other different confusing things.
And the only way that can be happening is if they're doing feature flags.
Gotcha.
So this is A-B testing taken to an extreme level.
Yes, this is A-B testing, but I like to call it on-beyond A-B testing, on-beyond Z, because A-B testing is just one of the ways that you could be manipulating what people are experiencing.
Wonderful.
I feel like when we break into the level of alphabets to name that across that many dimensions, we run a reasonable risk of inadvertently summoning a demon. As a general rule, are feature flags considered to be a front-end
technology, or is this something that starts to work its way throughout the rest of the stack?
It works through the whole stack. People are using it for, like, our customers are using it for front end page delivery, but also, like I said, pricing
tiers, and also just safer deployments. So if you're doing a significant back end revision,
like I was reading about how Slack upgraded their database back end, they put a new database on top
of their old database, and then switched over slowly,
almost like a blue-green, but they weren't identical. And that was done using feature
flags so that they could slowly shift traffic from one to the other without having anything
irrevocable. Fascinating. I guess this ties into my next question rather neatly, which is,
you mentioned at the beginning
that you've done some documentation work with feature flags or mentioned that in your interview.
What sort of wacky things can you do with feature flags that aren't continuous integration
or delivery based?
You can use it to do white labeling.
So imagine if instead of having 15 different custom websites that are slightly different and you have to maintain, you have one website and you're just pulling the customized look and feel, the CSS, out using a feature flag.
I think you could use it for some really interesting localization and market segmentation. So if you wanted to target all the people in Germany who have previously
expressed an interest in Hamburg United, you would be able to say, deliver that to them.
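A sketch of what white labeling and that kind of attribute-based targeting could look like if you rolled it yourself; the rules, user attributes, and stylesheet names are made up for illustration. The flag here returns a value per user rather than a simple boolean.

```python
# Hypothetical targeting rules for a multivariate flag: instead of a plain
# on/off, the flag serves a value (here, a stylesheet) chosen per user.
WHITE_LABEL_RULES = [
    # (predicate over user attributes, variation to serve)
    (lambda u: u.get("customer") == "acme-corp", "acme.css"),
    (lambda u: u.get("country") == "DE" and "football" in u.get("interests", []),
     "football-promo.css"),
]
DEFAULT_STYLESHEET = "default.css"

def stylesheet_for(user: dict) -> str:
    for matches, variation in WHITE_LABEL_RULES:
        if matches(user):
            return variation
    return DEFAULT_STYLESHEET

print(stylesheet_for({"customer": "acme-corp"}))                     # acme.css
print(stylesheet_for({"country": "DE", "interests": ["football"]}))  # football-promo.css
print(stylesheet_for({}))                                            # default.css
```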
And I'm working on a blog post right now. My CTO is a little dubious about this idea,
but I think you could use feature flags to do some really interesting localization stuff to pull out
different files and do your localizing on the fly using feature flags instead of having to do
browser-based dependencies. Fascinating. Getting back to what you said originally about using
feature flags for documentation, how does that work? So I don't think you should have to read documentation for anything you don't own.
So every feature should have its documentation tied to it, committed as code.
And then if you don't have the extreme module, you will never see the documentation for the
extreme module.
That just won't appear because we'll have turned that flag off.
So being able to synchronize exactly the code that you're using
with exactly the documentation you get
will really cut down on the amount of documentation people want to read.
Because if 20 years of technical writing taught me anything,
it's that nobody wants to be reading documentation.
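A small sketch of that idea, with hypothetical section and flag names: documentation sections carry the flag that guards the feature they describe, so rendering the docs against a reader's enabled flags hides everything they don't own.

```python
# Hypothetical: each documentation section is tagged with the flag that guards
# the feature it describes, so readers only see docs for features they have.
DOC_SECTIONS = [
    ("Getting started", None),                     # always shown
    ("Using the extreme module", "extreme-module"),
    ("Exporting reports", "report-export"),
]

def visible_docs(enabled_flags: set[str]) -> list[str]:
    return [title for title, flag in DOC_SECTIONS
            if flag is None or flag in enabled_flags]

print(visible_docs({"report-export"}))
# ['Getting started', 'Exporting reports'] -- the extreme-module docs never
# appear, because that flag is off for this reader.
```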
It seems sometimes that no one wants to read it at all, to the point where RTFM has become almost a trope in our space. Oh, how do I do this? Read the manual. Yes, I tried. It's an encyclopedia. Could you be slightly more specific, please?
Yeah, I used to call IBM to get the help desk to give me the page that I needed to be reading. It was like, your indexing is really bad, people. But I think it helps if
we remember that everyone who's reading documentation is already a little bit angry
because they couldn't figure it out. And all documentation is essentially a failure of user
interface at some level. I like the idea that everyone who's reading documentation is already slightly upset. They've had a negative experience.
Other than removing, I guess, parts of the documentation that don't apply to them,
with your background in technical writing, how else can you see reasonable ways to address that?
I know that I spend more time swearing at various cloud provider documentation bundles than I'd care to admit publicly.
Well, I think that you should be able to have a customized experience.
And that would mean that not only is stuff
that you're not using hidden from you,
but also you would get synopses of things
that you already know.
So for instance, if you log in as Corey Quinn,
expert AWS person,
nobody's going to explain to you how the certificates work because you've done that enough times.
You know that.
So, it'll just be like a consolidated summary.
And then we'll go into how certificates don't work in these particular circumstances.
And that part would be expanded. So I think you'd be able to do level setting to answer these three questions, and then
we will give you a more accurate representation of what you're actually trying to get an answer
for.
You flatter me.
It tells me my marketing has worked.
I have no idea how most of the certificate stuff works.
I smile, shrug, hand wave over it, and hope no one presses me too closely on it.
My website was down for 10 days because I couldn't figure out how to renew certificates.
Welcome to the eternal joy of anything involving infrastructure. So something you mentioned
earlier was using feature flags to effectively manage and minimize risk. How does that
wind up progressing as companies start to embrace the idea of feature flags?
The thing that we're trying to do is accept that there is always risk in the world.
And what causes disaster is not one failure. It is a multiplication of failures. This goes wrong
and this goes wrong. It's not just that the O-ring got too cold. It's that PowerPoint made it difficult for people to explain to their bosses that the O-ring was too cold and the space shuttle might blow up. All of the failure analysis
you've ever read involves a lot of different factors. And so what we're trying to do with
feature flagging and continuous integration and continuous deployment is break these monolithic
releases up into tiny bite-sized chunks that we can make go forward or go back.
So if you think about it, it's less like putting all of your money on one color of the roulette table
and more like putting it all over the roulette table.
Your odds of something catastrophic happening are much lower.
Gotcha. So effectively what you're doing is reducing failure domain by having fewer
deltas at any given time?
Exactly. Because if we say, I'm only changing this one particular thing, then you can track
what's happening with that feature. And if something goes wrong, you have a way to back
it out, or, as I like to call it, roll it forward. So if you have a deploy that has 20 features in it, which is a really huge deploy in the CI/CD world,
and one of them goes wrong, the old style was to panic and push out the old version,
effectively rolling back all 20 features. Feature flag style is you go, oh, feature X is not working. We're going to
turn that off. All the rest of the features are fine. They're going to roll forward. We're going
to go find out what happened with feature X. Gotcha. It seems almost counterintuitive in
some ways where you have a deploy, things are broken. It's a terrifying moment. The instinct is, since that hurts so badly,
companies generally want to do fewer releases as opposed to more releases that have smaller
change sets. Right. I think this is why people have trouble with weightlifting. It turns out
very few of us can actually lift 300 pounds, but a lot of us could lift 30 pounds 10 times or 3 pounds 100 times.
I want more companies to be lifting 3 pounds 100 times when they release instead of trying to lift this one massive 300 pounds.
I like the metaphor quite a bit.
You're going to hurt yourself if you try and lift that.
Yes, I don't even try to lift the 30 pounds 10 times.
Good Lord.
Yeah, that's not really my skill set
these days. Yeah. How old is your baby? Eight months at this point. And yeah, not quite to 30
pounds yet. Not quite. Hopefully, won't get there for a while, but we'll see. So is there anything
else that you'd like to talk about or mention that you'd like people to take a look at, participate in, throw fire and brimstone upon, etc.?
So I've got a couple things.
If you're in the Bay Area, my company does a monthly meetup called Test in Production.
And we talk more about these things.
And we have people come in and talk to us about how they're testing in production and what their use case is for continuous integration,
continuous deployment, DevOps, that sort of stuff. And it's super fun. And I would like people,
if they have time, to write me in with stories about trunk-based development versus branch-based
development. Because I think it's a philosophical split that we haven't explored a lot in the DevOps industry yet.
The way that I've always found that worked very well
to get people's feedback is to stake out a strong opinion
on one side of an issue or the other,
and then just wait.
You don't even need to give them addresses.
They will come back and give it to you themselves
without prompting.
It's true.
Yeah.
So I want to say, let's just say trunk-based development is probably a better way to go for your enterprise organization than branch-based development.
Well, thank you very much for joining me today, Heidi.
I'm going to disagree with you vehemently as soon as we stop recording.
My name is Corey Quinn.
This has been Screaming in the Cloud.
Thank you for joining me. Thank you. Have fun.
This has been this week's episode of Screaming in the Cloud. You can also find more Corey
at screaminginthecloud.com or wherever fine snark is sold.