PurePerformance - 012 Automating Performance into the Capital One Delivery Pipeline
Episode Date: September 12, 2016

Adam Auerbach (@Bugman31) has helped Capital One transform their development and testing practices for the Digital Delivery Age. Practicing ATDD and DevOps allows them to deploy high-quality software... continuously. One of their challenges has been the rather slow performance testing stage in their pipeline. Breaking up performance tests into smaller units, using Docker to allow development to run concurrency and scalability tests early on, and automating these tests into their pipeline are some of the actions they have taken to level up their performance engineering practices. Listen to this podcast to learn how Capital One pushes code through the pipeline, what they have already achieved in their transformation, and where the road is heading.

Related Links:
* Hygieia Delivery Pipeline Dashboard https://github.com/capitalone/Hygieia
* Capital One Labs http://www.capitalonelabs.com/#welcome
* Capital One DevExchange https://developer.capitalone.com/
Transcript
It's time for Pure Performance.
Get your stopwatches ready.
It's time for Pure Performance.
My name is Brian Wilson and as always we have Andy Grabner. Hello Andy.
Hey Brian. It's been a while since we've last recorded, because I've been gone for a while. I know it doesn't matter to the audience, because they keep hearing the episodes; they keep coming on, I think, a bi-weekly schedule now. Is that what we have?
Yeah, a couple of others inserted in between just because we had to squeeze them in, but yeah, pretty much bi-weekly with some special shows.
Yeah, well, I'm happy to be back. The summer is getting towards an end, but not the end yet.
And I had a fantastic, fabulous time in Europe the last three weeks.
I got some tan.
I got the chance to dance with my girlfriend in Slovenia at the Salsa Congress.
Wow.
Yeah, it was really cool, actually.
But now back to reality, back to performance.
I got to spend some time in lovely New Jersey, humid, hot New Jersey, visiting friends and family. So I'm nice and refreshed as well.
But before we continue, I do need to give a shout-out to one of our listeners, Ted. He's my wife's cousin's fiancé. And due to some sad circumstances, her cousin's father (her uncle) dying, they were together, and Ted listens to us a lot. And one night at the bar after the repass, Megan sent me a video of Ted doing an impersonation of our intro right off the cuff. I was pretty impressed, but I was also embarrassed for myself.
So hi, Ted, and thanks for being a loyal listener.
So we have a special guest today. You have a history with him, so why don't you go ahead and introduce him?
Yeah, the thing is, I remember the last time, actually: we had him on air already on one of my Performance Cafés. It's Adam Auerbach. Although, Adam, when we asked you earlier (and I know you're on the line already) how to pronounce your name, you said "Our-back," and I, with my German background, would say "Our-buck," because at least that's how I would say it. So I call you Adam Auerbach, or I just call you Adam. But Adam and I have a history, because I think we met last year at Velocity. And
since then, I've just happened to bump into you at different conferences. You've been promoting
a topic that is dear to my heart, which we will be talking about a lot, which is continuous testing,
continuous performance, and speeding up the pipeline by pushing performance early in the lifecycle.
And without further ado, I don't want to go on, because I think you can introduce yourself better.
Adam, welcome to Pure Performance.
And maybe you want to give the listeners a little background about yourself,
what you do and what drives you, and then we go right into the topic.
Awesome. Thanks, Andy. And thanks, Brian, for having me.
Yeah, so Adam Auerbach. I work at Capital One.
I am a senior director of technology,
and I lead the enterprise group advanced testing and release services for Capital One.
And so our big mission is enabling continuous testing,
continuous delivery for our feature teams.
My background has mostly been in the testing profession for over 15, 16 years now,
and I've been at different banks and whatnot and have gone through different transformations. But over the last four years, I've been at Capital One, and we've been on just a magical journey, which started with our agile transformation and then moved into DevOps and continuous testing, continuous delivery.
And now we have teams that are actually living that dream, which is pretty exciting.
So I'm happy to be here and happy to talk about this with you guys.
Awesome.
And so now this is actually an interesting – you mentioned something very interesting.
It's the transformational process, and you said you have a background in testing, and so does Brian as well.
We're all testers kind of in our initial profession.
And you said you're transformed, and you mentioned DevOps.
You mentioned continuous delivery. And I remember the first presentation you gave at Velocity, which is something I always reference in my own presentations. Correct me if I'm wrong, but what you said back then was that three years ago (and now probably four years ago) you used to have eight testing teams and you did a lot of manual testing, and now, three years later, there are no testing teams anymore, and I think you're the only one left, doing a lot of mentoring and basically providing the framework, the tooling, the automation support. And I think that, for me, was phenomenal. And I think I'm seeing a lot of this transformation also going on with other companies. But did I get that right? Is that kind of the major thing, the agile movement first that you had?
Yeah, that's been a big key. You know, we still have testers.
The testers are on teams and they're part of the development organization.
And we really have structured around delivering value to our customers. So, yeah, the testers are on the teams, and my team is really focused around enablers. Originally, enterprise groups were doing performance testing, or doing test automation, or even doing testing work on behalf of a line of business. But as we've transitioned to agile and put everyone on teams, and then as we start embracing DevOps, you start to focus on flow and on really having true feature teams that are able to deliver quality software early and often. And in order to do that, you really have to enable them to be able to do everything. So from an enterprise perspective, it's really been about getting out of being a service-based organization and being product-based: helping people have something that they can use to get the data that they need, or something that they can use to do performance testing, or something that they can use to help with their test automation, and giving them those tools and resources so that they can do that themselves.
I'm at a testing conference today, the Practical Software Quality and Testing Conference in
San Diego.
And actually, most of my morning has been spent talking to people around the future
of testing and how testers are still important, but they just have to embrace these technologies,
embrace the ability to do these things where they're able to automate the validations in
real time, giving developers fast feedback.
And that's how you continue to have this career in testing.
And so that's been part of the transformation that my team and I have been on, and now we're trying to spread that word to others so that they don't get left behind, in essence, by what's happening.
Sorry, I wanted to ask: we often talk about that in terms of leveling up. I'm sure we didn't make up that term, but it's the general concept that if you're a tester, no matter what kind, you have to level up if you're going to keep your job and keep running with these concepts. How difficult, either for you or in general, do you think that process is? Do you see a lot of people not making it through it in the environments you've been around? Or have you seen more that, as long as people are willing to put some skin in the game, they can level up, get into these positions, and maintain them as well?
Yeah, so it definitely takes some work.
And I will tell you, when we first started this journey, we had some good debates about this. Part of the conversation was that before, you used to build frameworks to dumb it down, to obfuscate that layer, so a manual tester could do all these things. But now what we're trying to do is give them the technical skills, so that they know how to create jobs in Jenkins, or they know how to create a Docker container to do their automation, or they know how to use the open source tools.
And that was a big leap that we took.
And not everyone's going to be able to make that transition.
It is not easy. It does take work. Capital One made a large investment. We have a
software engineering college where we've built courses to help people get them some programming
skills or skills with the different cloud technologies or whatever the case may be.
We have technical coaches that are there. We've done pair programming to partner people together.
But then we've also given people alternatives.
So if you don't want to go down the technical route or it's just not your thing, the subject matter expertise that you have is still valuable.
And so we have had people move into product owner roles or scrum master roles or other non-technical roles. You know, I would say the majority of people are able to do it. It just does take some work, and there is an investment that the company needs to make to help people go through it. And that's not just an investment in the resources, but also in the time, right? Because people need to have time to take their classes, or to spend time just putting the technology into play and learning on the job. And you've got to give them some leeway when it comes to velocity. That's one thing I've heard people say: well, we have this deadline, so how do I have time to learn this if I have this deliverable? And that is something that management has to be able to accept. And Capital One has done a really good job of giving people the resources, but also giving them the space to pick it up. And we have seen that the majority of our people are able to do it.
That's all.
So that's interesting.
So it's a top-down commitment.
And also, I think it makes Capital One as an employer much more interesting for employees
because we all know it is hard to find the people with the right skills.
But you basically said, well, we are giving people the chance to actually move into that
field.
We invest the time and the money, because eventually it will pay off, and because we all know how hard it is to get the right people. So you took at least one of the routes, saying we are going to invest in the people within our company and level them up to where they need to be, and that's paying off.
Yeah. I mean, remember, the people who have been with the company for some time have some really great subject matter expertise. They have the battle scars of what we went through when we were a smaller company
and through the different acquisitions and where we are today. And those battle scars have a lot of value and great perspective. So you definitely don't want to throw that all away. It's definitely good to bring on new talent and new perspective, but at the same time, we have a very large technology group, and you don't want to lose good people. So yeah, Capital One is definitely investing in new technologies, new skills, and keeping people relevant as we continue to push forward.
So you might be able to say that Capital One asked their employees, what's in their wallet?
Definitely.
Hey, so you talked about feature teams and that basically you migrated or you moved from
service-oriented testing teams to feature teams.
That means the feature teams themselves have the quote-unquote testing or quality role within the team. So does this mean you have quality engineers fixed and assigned in every feature team, or are you still moving people around a little bit, or are they just part of these feature teams?
So teams are fixed. When we put a team together, that team is usually together for at least three months, or six sprints. But yeah, typically, when we go through planning sessions, those teams are pretty much set.
I mean, that's a core Agile construct, right?
That the team doesn't have too much variance.
You know, and then who does the testing on the team? I think that's where, you know, that becomes a little bit more nebulous,
right? When we originally started the agile transformation, sure, you had like one or two
testers on the team and they did all the testing. But now as you get into continuous testing,
continuous delivery, everybody plays a role. You know, a story is not done unless you have automated tests that are running and passing.
And that's not just one person's job; it becomes the job of the team, or the people that own that story. And so who would you say owns quality? The whole team owns quality.
Yeah. The whole team owns quality. And I think I remember one of the quotes that – well, I got a lot of strange – good and strange comments.
But one of your quotes last year at Velocity was – and I have it on my slides.
And it says, we don't log bugs.
We fix them.
And basically, that's just the mentality of an agile team.
And the benefit of an agile team is what you just said: the story is not done until the whole team says this is done, done. And therefore there's no need at a later stage to find and log a bug, because, at least in the theoretical world, everything that exits the feature team at the end of the sprint, if the story is complete, has the right quality, right?
Exactly. Part of everything that we're doing is around being able to deliver something into production. And so there is nothing after the fact; it's being coded and tested so that it can ship today. That's the goal.
And if there's a defect, you need to fix it.
Otherwise, it doesn't ship.
Or if it's not something that's worth fixing, then it's not a defect.
It becomes a story in the backlog for a later point, and it's an enhancement.
And that definitely is – that might be the first thing that we tackled when we started our transformation, but it's a big deal in the sense that do things now in the moment.
Anything: performance issues, security issues, regression issues. Everything that we're doing today is about shifting all of that left and bringing it right now, so that you get those insights. Because we know that when a developer checks in their code and then moves on to something else, having to go back and fix something a day, a week, two weeks later, most likely they've already forgotten what they did and how they did it, and they're going to break something else. That's why the shift left is really so overarching and applies to so many things.
But we really want to get them focused on doing things right now.
And with defects, you're just not communicating.
There are so many things wrong with defects.
I could go on a whole show just about why logging defects is bad.
But, yeah.
I wanted to ask about something, a question on my side that I still haven't satisfactorily wrapped my head around, or maybe haven't gotten a good answer to. It's in terms of, you know, we can look at performance testing and load testing. And as Andy and I have discussed in previous episodes, performance testing can be done without load, right? That's where you're taking a look at the performance metrics of a single call or whatever, right?
There's a lot that can be done on the left side there.
There's a lot that can be done
during all the CI components.
But when it comes time to a load test, and this is where I always get confused in these more agile, DevOps-y, CI/CD areas: my specialty in the past was load testing, and it was always a behemoth. We had all these scripts, and we had to make sure they all worked. Sure, we can test an individual component, but we really would always want the test to be complete. We would always want to test the system as a whole, which brought back the whole problem of whether all the scripts still work, and all that. How is load testing being handled, or how do you see that it should be handled, in this kind of scenario? How do you keep it from being a bottleneck, basically, is where I'm getting at.
Yeah. So I think, um, I'll answer and I know Andy probably has a perspective as well.
So from Capital One's perspective, you have to be able to do several different things.
And so to your point, when a developer checks in code, there should be some level of performance tests that they can run just at a component level just to make sure that the response rates are what they should be and there's been no degradation.
And those types of tests can run pretty quickly.
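(To make that concrete, here is a minimal sketch of the kind of component-level response-time check Adam describes running at check-in, assuming a Python/pytest-style suite; the endpoint, baseline, and degradation budget are illustrative assumptions, not Capital One's actual setup.)

```python
# Hypothetical component-level performance check that runs on every commit.
# Endpoint, baseline, and allowed degradation are illustrative assumptions.
import statistics
import time

import requests

SERVICE_URL = "http://localhost:8080/api/accounts"  # assumed component under test
BASELINE_P95_MS = 120          # assumed p95 from the last known-good build
ALLOWED_DEGRADATION = 1.20     # fail the build if we regress more than 20%


def test_account_endpoint_response_time():
    samples_ms = []
    for _ in range(20):  # small sample, fast enough to run in CI
        start = time.perf_counter()
        response = requests.get(SERVICE_URL, timeout=5)
        samples_ms.append((time.perf_counter() - start) * 1000)
        assert response.status_code == 200
    p95 = statistics.quantiles(samples_ms, n=20)[18]  # rough 95th percentile
    budget = BASELINE_P95_MS * ALLOWED_DEGRADATION
    assert p95 <= budget, f"p95 {p95:.0f} ms exceeds budget of {budget:.0f} ms"
```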
And then using monitoring tools in your non-prod environments has been a great way for us to see.
Now, granted, again, it's not load testing.
It's still just performance testing and response rates. But it's a great way to get insights.
While you're exercising the system for your acceptance tests, you're able to get insights into degradation right there.
And then lastly, you do need to do some level of load testing.
But what we've started doing is taking advantage of the cloud tools. So with something like Docker, for example, you can take your Selenium tests, spin up many, many Docker containers, and in essence just ramp up your functional testing to create load. And you can do that multi-threaded, so you don't need that many servers to be able to have a thousand or so browsers or whatnot to generate that load. And so you're still able to do it parallelized, in a parallel execution, and get that information in real time. For us, that's been the way we've been doing it: we basically have just broken things down, getting component-level information, doing it using the monitoring tools, and then being able to use things like containers to take advantage of the tests we already have, to then generate additional load.
All right, so that's cool. So basically you're just repurposing the scripts, or not even repurposing, just reusing the scripts that you already have for all or most of your other testing, and pushing it out at scale to generate load?
Exactly. I mean, there will be times when maybe we'll do something one-off and use a different tool set and dedicated performance testers for something.
But at the end of the day, the people on the team know the scripts that they've created.
And so why make them learn something else, or create something additional that then takes more work to support, when you can take the tests that they've already written, just scale them up, and get a nice, good end-to-end view of performance under load?
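(A rough sketch of that reuse-and-scale idea: the functional Selenium journey a team already owns, pointed at a Selenium Grid that could be backed by many Docker containers, and fanned out with a thread pool to approximate load. The grid URL, application URL, and concurrency numbers are invented for illustration and are not Capital One's setup.)

```python
# Illustrative only: reuse an existing functional Selenium test and fan it out
# to generate load. Assumes a Selenium Grid (e.g. Dockerized nodes) at GRID_URL.
from concurrent.futures import ThreadPoolExecutor

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

GRID_URL = "http://selenium-grid:4444/wd/hub"   # hypothetical grid endpoint
APP_URL = "https://test.example.com/login"      # hypothetical app under test
CONCURRENT_USERS = 50                           # scale up by adding grid nodes


def user_journey(user_id: int) -> float:
    """The same functional script the team already maintains, reused as load."""
    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Remote(command_executor=GRID_URL, options=options)
    try:
        driver.get(APP_URL)
        # ... the team's existing functional steps and assertions go here ...
        return driver.execute_script(
            "return performance.timing.loadEventEnd - performance.timing.navigationStart;"
        )
    finally:
        driver.quit()


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=CONCURRENT_USERS) as pool:
        timings = list(pool.map(user_journey, range(CONCURRENT_USERS)))
    print(f"ran {len(timings)} journeys, avg page load {sum(timings) / len(timings):.0f} ms")
```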
So one thing that I love about this: what you said is that basically it is all about identifying regressions, and regressions are not only functional. It's not only red and green from a functional perspective; performance issues can also, or at least some of them can, be identified at an earlier stage, where you just run a functional test and look at the right metrics.
It's a theme that I've been promoting and we've been promoting for a while.
I call it metrics-driven CICD,
where I basically look at key architectural performance
and scalability metrics while executing something
like a unit test, a component test, a functional test.
Because if I'm a developer and I make a code change,
and that code change now means my local JMeter test that is testing five API calls is now sending 20% more bytes over the wire, or it is making 5% more database calls, then I know this is going to become a problem later on, because it just needs more resources, or I'm just inefficient in the way I am letting my components talk with each other.
And that will become a scalability and performance issue.
And it seems that you are doing the same thing, right?
You're allowing your developers or you're asking the developer, you're demanding your feature teams that they do a lot of these tests early on, identify
regressions early on.
So you can find a lot of problems much earlier without pushing these code changes into later
stages of the pipeline where executing your longer running tests would find the same problems,
but they would just take longer.
And so you're optimizing your pipeline flow.
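(A minimal sketch of that kind of metrics-driven gate, assuming a monitoring agent has already written per-test metrics such as bytes sent and database call counts to a JSON file; the metric names, baseline values, and thresholds are assumptions for illustration, not a specific vendor integration.)

```python
# Hypothetical metrics-driven CI gate: fail the build when key architectural
# metrics captured during a functional test regress against the baseline.
import json
import sys

# Per-test metrics a monitoring agent might have captured; values are made up.
BASELINE = {"bytes_sent": 48_000, "db_calls": 4, "remote_calls": 2}
THRESHOLDS = {"bytes_sent": 1.20, "db_calls": 1.05, "remote_calls": 1.00}  # allowed growth


def check_regressions(current):
    failures = []
    for metric, baseline_value in BASELINE.items():
        limit = baseline_value * THRESHOLDS[metric]
        if current.get(metric, 0) > limit:
            failures.append(f"{metric}: {current[metric]} exceeds allowed {limit:.0f}")
    return failures


if __name__ == "__main__":
    # e.g. python check_metrics.py metrics_from_this_build.json
    with open(sys.argv[1]) as f:
        current_metrics = json.load(f)
    problems = check_regressions(current_metrics)
    if problems:
        print("Performance regression detected:\n  " + "\n  ".join(problems))
        sys.exit(1)  # non-zero exit stops this pipeline stage
    print("Architectural metrics within baseline; build can proceed.")
```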
Yeah, I think that's so critical. And I think that's the realization that many people don't have. You think of some of these tools and you think just production, but by using them in non-prod, you get these insights, and then you can create baselines, fail the build, stop your pipeline, and really react. And it gets the team more comfortable with the tools and with how to bake some of this stuff in.
And then when you ask them to do something in production, they already have that familiarity.
They can already start to write alerts in production.
It's just – to us, it's been really eye-opening and has had a lot of value.
And even when you do find something late, it used to take us a while, days if not weeks, to find root cause of a performance issue.
And now that time is cut down drastically because of those types of tools.
And do your development teams, your feature teams,
I think you just said it, but just to confirm,
your feature teams also define what the monitoring strategy should be
for production, like how they can monitor the technical performance
and scalability behavior of their features?
I'm sorry, say that again. I said, do the feature teams,
are they also responsible for not only the testing,
but also defining the monitoring strategy?
That means what needs to be monitored
later on in production.
So for them to close the feedback loop,
to see how their feature
is actually really behaving in production,
because their feature might behave
a little different in production
than in what they've tested.
So are the developers and the feature teams also building in the monitoring
or defining the monitoring, the dashboards, the metrics, and all that stuff?
Absolutely. Absolutely.
Yeah, I mean, especially those are the same developers that have the pagers.
So when something's not working, they're the ones being called.
And so they absolutely have that accountability.
How do developers like it if they have to be on call?
They don't like it at all.
And we've had some teams that want to hire somebody to be a dedicated on-call person.
You know, so there is a lot of effort spent on making sure it's right up front. I mean,
um, I know Gene Kim has talked about it many times, right. But the first time a developer
has woken up at two in the morning because there's a performance issue in production,
uh, that's a great reminder when they're building something the next time to make sure that it's
working all around. Um, so, I mean, it definitely is really powerful
because no one wants that call.
Yeah, hiring somebody sounds like a cop-out.
Exactly.
I mean, it's just going backwards.
Yeah.
Yeah.
Wow, that's pretty.
And so can you tell me how many feature teams do you have
that work in parallel?
And how often do these guys deploy into production?
How long does it take, like, for a code change to actually go all the way through?
I know it's like kind of three questions now.
But kind of we'll be interested in seeing how your development team actually looks like,
how often you deploy, what's the throughput of your pipeline.
It would be interesting to know.
Yeah, so we have multiple lines of business. Just in general, we follow the Scaled Agile Framework, or SAFe, and our teams are aligned to trains. Typically in SAFe, they talk about 10 to 12 feature teams working on a train. And that's very common to what you would see at Capital One.
I'm sure it changes where you go, but that's the general rule of thumb,
and they're all working in parallel.
We try to practice where everyone is committing back to main or master
and not having prolonged feature branches.
And so in order to do that, you have to have some of these best practices
that we're talking about.
But yeah, I mean, code right now,
we have teams that are delivering multiple times a day
through this process.
And it's these best practices.
It's also taking advantage of blue-green environments,
feature toggles.
There's other enablers that make that
a reality, but I think you definitely could find instances of someone making a commit,
and then an hour or two later, that being in production. Now, whether or not an end user can actually see it and touch it might depend on the other pieces of that feature and on how we're rolling it out, whether we're doing canary builds or whatnot. But yeah, it's pretty frequent.
And who defines your performance criteria? Who defines what is acceptable performance under which load? Is this the feature team itself, or is there a quote-unquote business person that says, well, we're working on this feature and therefore we're expecting X amount of users and it has to be that fast? How does this work? How do you define the performance criteria and the SLAs?
Yeah, so we practice acceptance test-driven development.
And one of the big things with ATDD is the three amigos, where a business person, a developer, and a tester sit down and go through user scenarios, which we then develop and test against.
And so the same thing from a non-functional perspective.
So the business is there to provide that input around what the expected,
you know, user load is going to be.
And then we're able to work with the developers to figure out what the TPS
or the transactions per second is going to be and then, you know, build it from there.
Sometimes we'll do some exploratory stuff, looking at response times, looking at where the degradation is, but the business is involved in helping us define those NFRs.
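(As a back-of-the-envelope illustration only, and not necessarily how Capital One derives its numbers: an expected user load from the business can be translated into a transactions-per-second target using Little's Law. The figures below are invented.)

```python
# Rough, illustrative translation of a business NFR into a TPS target
# using Little's Law: throughput = concurrency / (response time + think time).
EXPECTED_CONCURRENT_USERS = 2_000   # from the business (assumed)
AVG_THINK_TIME_S = 8.0              # time a user pauses between requests (assumed)
TARGET_RESPONSE_TIME_S = 0.5        # agreed response-time goal (assumed)

target_tps = EXPECTED_CONCURRENT_USERS / (TARGET_RESPONSE_TIME_S + AVG_THINK_TIME_S)
print(f"Target throughput: about {target_tps:.0f} transactions per second")
# -> about 235 TPS, which the team can then design and test against
```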
Now, I also just opened up my browser here, and I found the Hygieia framework that you guys put up on GitHub, which I think is awesome stuff in terms of visualizing your pipelines. I know you've been promoting this at different conferences, and you're encouraging people to integrate with it. And I still want to get in touch with you, and with Topo Pal, about getting a Dynatrace integration as well, so that we can feed into your dashboard, seeing whether a particular build phase, or a particular code change in a particular phase, is good or not. Do you want to tell us a little bit about Hygieia, how you use it, why you built it, and also what your plans are?
And I think, Brian, we should probably put up a link on the podcast recording in case people don't know how to spell it based on my pronunciation.
I'll put up a link to the GitHub and also a direct link to the video on YouTube.
I watched that today.
It was a great overview.
Was that you recording?
I don't think that was you recording the audio.
Was that on the YouTube video? No. Okay. I don't think so. So Hygieia
came out of an executive decision we made a couple of years ago: that in order to compete going forward, our competitors weren't going to be the big banks; our competitors were going to be these other fintech companies, and Google and Apple and whomever. And so in order to be able to play with them, we need to be known as an engineering technology company. And open source
is a key driver in that being able to move fast and being able to adopt quickly,
as well as when you bring in top talent,
there's an expectation that they want to be able to use and contribute to open source.
So as a way to kind of eat our own dog food in this journey,
we said, well, hey, as we're building out our pipelines,
the teams need some type of dashboard to be able to see what's happening,
where they are from an intent perspective, where they're at with their builds and commits,
what's the health of their pipeline, what's the quality overall. And so that's how we came up
with the idea to build Hygieia, and then use that as a way to cut our teeth on what is needed, from our perspective and a compliance perspective, to be able to contribute back to open source, get our name out there, and have other people help us.
And it was named an Open Source Rookie of the Year last year.
We have a lot of big companies that are using it and contributing to it.
And for us, it's been a big tool to help provide teams visibility into their pipelines.
And now the latest thing that we just rolled out is this product view.
So many products are a combination of multiple pipelines. So you might have a pipeline from a UI perspective,
from different APIs and other backend systems that you're using.
And so in order to get a good picture of flow,
we have this product view.
And basically what that tells you is for each one of your pipelines,
where are the commits and what stage are they at?
So how many commits are waiting for a build?
How many commits are waiting for a deploy?
How many commits are waiting to be tested?
How many are waiting for performance?
And then thus, what is your overall time to market
from a commit all the way through to production?
And so that's been able to help us get really crisp
on where we have bottlenecks.
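(A small sketch of the kind of calculation behind such a product view, assuming you already have per-commit timestamps for each pipeline stage; the stage names and timestamps here are invented, and Hygieia's real collectors and data model are of course richer.)

```python
# Illustrative lead-time and bottleneck calculation over per-commit stage timestamps.
# Stage names and timestamps are invented; a real dashboard would pull these
# from its collectors (SCM, build, deploy, and test tools).
from collections import defaultdict
from datetime import datetime

STAGES = ["commit", "build", "deploy", "functional_test", "performance_test", "production"]

commits = [
    {"commit": "2016-08-01T09:00", "build": "2016-08-01T09:05", "deploy": "2016-08-01T09:20",
     "functional_test": "2016-08-01T10:00", "performance_test": "2016-08-03T10:00",
     "production": "2016-08-11T09:00"},
]


def parse(ts: str) -> datetime:
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M")


dwell = defaultdict(list)
for c in commits:
    for current_stage, next_stage in zip(STAGES, STAGES[1:]):
        hours = (parse(c[next_stage]) - parse(c[current_stage])).total_seconds() / 3600
        dwell[current_stage].append(hours)
    lead_days = (parse(c["production"]) - parse(c["commit"])).days
    print(f"commit-to-production lead time: {lead_days} days")

for stage, hours in dwell.items():
    print(f"time spent waiting after {stage}: {sum(hours) / len(hours):.1f} h on average")
```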
You know, one of the slides that we showed at Velocity this year showed that, when we first started rolling this dashboard out, a commit might sit in all these other stages for a couple of minutes or maybe an hour at most, but when it gets to performance, it might sit there for days. And so it really gives you a good indicator that we have a problem: performance right now is still happening way too late and taking too long. That was a really good reinforcement that we needed to lean in more to that space. We knew that was somewhat the case, but you wouldn't have had that without this visualization; it just gives you a good sense of where things are at. And frankly, if something is stopped, what impact does it have on other things?
Yeah, I'm just looking, I think, at the screenshot that you showed at Velocity. It's also on the GitHub page, and it says it's about 10 days from commit to production for the first project, or the first product, that is on there. It's just phenomenal, because it shows you a heat map of where on your production line you have your bottlenecks. And obviously there are multiple options for how you deal with the bottlenecks. One thing you said earlier about performance testing: with new cloud technologies, you can now parallelize a lot of work. But on the other side, if you shift left some of this, break it down into smaller units that can run earlier and faster, and basically stop that build earlier, before it clogs the pipeline later at heavier-weight stages, then this is much more efficient, right? Because, I'm looking at this dashboard now, and I'm not sure how many builds are in here, but it says on average a test runs 65 minutes in perf, and you have like 10 or 15 builds in there. So if you can achieve the same level of performance testing with a test that only runs a minute or two, I mean, not the same level, but if you can identify a lot of these problems earlier with something that only runs like five minutes, then you are shaving off 60 minutes all of a sudden, and you get faster feedback to engineers. They don't have to, as you said earlier, face the biggest pain: as an engineer, you're in the middle of implementing a cool new feature, and then you get a notification that the stuff you committed 10 days ago is now a big blocker. It's the feedback loops; we need to tighten them up. And I think what you guys are doing is just phenomenal.
Yeah. You know, Gene Kim, Jez Humble, and Nicole Forsgren,
they help run and execute the Puppet Labs survey every year. And people talk about metrics all the time, but that lead time, or time to market, is really one of the biggest indicators of how well you're doing. But then, how can you get a metric like that in a way that is actionable from a leadership perspective? That is how we got to this dashboard, where we can actually dive into what bottlenecks we're facing. It's not just one pipeline, it's across them, but you're able to see that visualized and then actually click into it to see specifically what that commit is for and what might be getting held up. It definitely was very eye-opening for us. And it did shake out other issues, not just the performance testing one; we had some other things where people had poor branching strategies, where they had feature branches that lasted a long time before they got back into main. And again, their lead time was very long. So it's been very valuable for us.
So I know where you are, and I kind of have an understanding of where you are right now. And I believe that Capital One, with what you've done, is seen by a lot of people as somebody to follow, because you did some great work and you're far ahead of many. But I know you never stop, and you never stop innovating and getting better. So what are the next steps? What are the next big milestones that you want to implement, that you want to change, in order to stay ahead of the competition, in order to stay flexible and innovative, in order to keep the talent?
Any other big projects that you're currently working on or that are coming up?
You know, I think right now we're focused on continuing to roll this out at scale. It's easier to do some of this stuff on greenfield work than on some of the legacy stuff. And so the big mission right now is some of our core banking systems, which are older: how do we make sure that these same great concepts can apply to them? So that's some of the things that we have going on now.
I think the other thing too is just around open source
and how do we continue to make more contributions
outside of Hygieia?
We have another product called Cloud Custodian,
which helps you better manage your AWS instances.
Right now we're looking at service virtualization
and we're looking at some of the chaos monkey type tools.
So I think that's probably the next foray
is just a greater presence within the open source communities, and more contributions from Capital One.
And so I think what you just said is a problem that many companies face. Obviously,
you started all of this, what we just talked about with some new projects where you had a greenfield experience and you could test new things and you could do things.
And now you try to apply that to your, let's say, older legacy enterprise systems.
I think this is a big challenge that a lot of people have, like how to get started.
And it seems on your case, you also started with some projects where you could actually try something new.
And now you try to apply that back.
Maybe not all of it because maybe not every system can be changed in a way you're doing
development here.
But at least you apply a lot of the lessons learned to the other systems to make them
faster, more agile, and I guess also motivate the people and keep them that are still working
on these older projects, right?
Yes, absolutely.
I mean the overarching theme of shifting left, removing constraints, being able to get fast feedback early, that applies to everybody.
And so maybe you might not be able to do every single thing that we talked about,
but you're able to implement some of those things and still have a big impact for your customer. One of the other things that we've done as part of this transformation is use things like value stream analysis, where you sit down with a legacy group and you go through what it takes for business intent to get developed and tested and out the door, and where they are spending time doing things manually and where they have waste. And that's been a really great exercise for us, to then be able to put together an action plan of: okay, you're spending 25% of your time on your builds and your deployments.
Let's focus on that first.
Then, okay, you have high defect leakage rates.
You're spending time after the fact, and you're losing productivity,
and this is all done manually, so let's get some stuff automated. And basically you're able to then build out a plan for them, and you can show them, in their world, how this will have an impact for them. That's been a really powerful tool for us to get business commitment, because these things do take money. They do take resources and time. And, you know, generalizing at a high level is one thing, but being able to give someone very specific direction based on their experience has been very powerful for us in making the argument on why they need to do these things.
One question I had, and
I think Andy and I have discussed this in the past.
It's something that I think people are starting to think about more, but I haven't seen it deployed too much: another operational metric could be cost of operations, you know,
especially if you're going into public clouds or, you know, areas where you can actually track how much you're
spending on hardware, network, disk, and all these other components so that you could see
build to build, feature to feature, how much your costs are going up and down.
Is that something you've thought about?
Is that something you're doing at Capital One?
Is that something that's ever been on your radar?
It definitely is on our radar. We use it from the sense of, you know, when we first went to the cloud, we weren't as efficient as we should be.
Meaning we didn't have immutable servers.
The infrastructure might not have been all scripted out.
And so when there was a patch, there was downtime.
Servers were always up; the environments weren't very elastic. So yes, we did look at those infrastructure costs, in great detail, by each one of our platforms and applications. And we use that as a mechanism to then go back and drive best practices. You could get very transparent about who did things through brute force versus the right way, looking at the costs from a cloud perspective and then translating that back into how well people have engineered their pipelines and their utilization of the cloud to stay efficient.
Because the whole play about going to the cloud is that it was going to be more economical for us than having these big data centers.
But we definitely saw early on that that was not the case because of how we implemented it.
And so it really is an important metric you have to look at.
Do you run the public cloud as well or do you build your private clouds?
So our CIO, Rob Alexander, has been very public about us moving into AWS. So right now, we're still one or two years into that journey.
But I think at some point,
the goal is that majority of things are in the cloud,
in the public cloud.
Hey, and actually, I know it seems the questions keep coming, but it's just interesting to have somebody like you here that we can ask questions. So besides the cost effect of, let's say, the features that the software runs, are the feature teams also monitoring which features are used and how, and which features may not be used? That is, to make a decision later on which features to either kick out, if they're not used that much, or which features to optimize, because you see, wow, it's amazing, people keep using that feature versus the other feature, and in order to make it more efficient it makes more sense now to optimize it to keep down the operational costs. So do you have something built in where you keep monitoring a feature set and then use that as a prioritization vehicle for the upcoming sprints?
So definitely on our customer-facing sites and apps, the business absolutely has that information,
and they're using it from a product owner, product management perspective and prioritization. I'll tell you that we're using it,
we're experimenting with it in some other aspects around team performance and trying to get teams
more familiar with those metrics. One of the things that we've been talking about is, how do you demonstrate a high-performing team? And what we talk about is that a high-performing team is delivering value to their customer early and often. And then you get into what value is. And so having some of those types of metrics really has given us some insight. And again, these are experiments.
It's not necessarily widespread just yet, but like we've used it internally on some of our test data management tools.
So we built a homegrown application called OneSource,
which gives feature teams the ability to get their own data in real time,
on demand, as part of their pipelines.
And so for this tool, we've implemented monitoring tools so that we can see how people are using it, what features they're using, where they might have fallout rates or abandonment, and then focus the next features that we're building on those areas. We're seeing some really positive results with that, and so hopefully it will grow more.
Cool.
Yeah.
Nice.
Well, I guess we could probably go on for a while
because I have a list of questions that are still up,
but I know we also have to make sure we stay within our time constraints, Brian,
because otherwise we keep going.
I mean, Adam, I want to say thank you so much for doing the same thing that we try to do, which is spreading the word that we just need to build better software and release it more often, and that in order to do that, we need to shift quality checks left.
And I think that's just awesome.
Hopefully, you also find our podcast useful to spread the word.
And we do keep educating the people also at the speaking engagements that we have.
I'm really looking forward to also contributing to Hygieia.
I think it's just an amazing project that you do and actually giving back to the community.
But obviously, there's also benefits for you, for everybody that opens up to open source and that contributes and puts projects out there.
So I think it's just a win-win situation. It's awesome to see that a company like Capital One says our main competition is not the big banks,
but it's Google and all the other companies, because we are a technology company,
even though our main business is dealing with money.
But I think that's just a wake-up call for a lot of companies out there that need to transform
and think differently about how they go about their business.
Because we know we are a software-driven world,
and that's true for every business out there.
Yeah, I also think it's really cool that Capital One
gave everybody a chance to level up.
You know, that's kind of the scariest thing in the testing field is,
as all this stuff changes, what's going to happen to me, right?
And I think it's really awesome the way they tackled it. There are always going to be, unfortunately, some people who aren't up for the challenge, but anyone who is listening, or who likes to follow this stuff, is most likely up for the challenge. And I think it's really great that they gave them that opportunity, because I'm sure it's a lot more fun than running through tests from top to bottom and putting checks in boxes, or whatever the old model was.
Yeah. Well, I appreciate you guys having me. Hopefully people find this interesting and educational. And I think, to your point, there are a lot of great podcasts, there are books,
there is a lot of information out there.
And so it's not impossible to transform.
It's just a matter of like participating and doing it.
Yeah.
So, any appearances coming up for you folks?
So I'm speaking at Star West in Anaheim in October,
and then in November I'm speaking at the DevOps East in Orlando.
Andy?
Well, I think when we did that little rehearsal, actually, I was smiling
because we will see each other at these two conferences.
So I think the stalking continues, but please don't see it that way. And Adam, as I said earlier when we were off the mic, I will definitely make sure we get a couple of glasses of beer or wine, or whatever your preference in beverage is, to make sure we show our appreciation that you are on this podcast.
Additionally to the stuff that I mentioned, I'm going to be at JavaOne, which is going to be just a couple of days after this airs, in San Francisco.
And I also have QCon in San Francisco and CMG.
Both of them are in November.
So looking forward to that.
Hopefully, if some listeners are out there at these conferences, ping me.
You can also ping me on Twitter at GrabnerAndy.
And I think, Adam, you also have a pretty cool Twitter handle, don't you?
Yes, I am bugman31.
So please follow me, tweet me, ask me questions.
I am emperorwilson on Twitter.
I actually have a performance coming up, an appearance for the first time, Andy.
My old band is getting back together for a reunion show.
So November 5th in Jersey City at Monty Hall in the performance space at WFMU will be opening for one of our influences from the 60s, the Silver Apples.
So if you're in Jersey City,
American Watercolor Movement is the name of the band.
I just found out from somebody else that it was officially on, not directly from one of my bandmates, which was kind of funny. They're like, oh, I heard you guys are playing. I'm like, oh, I didn't know that was official.
So yeah, it's not a performance. Well, it is performance related, but a different kind of performance.
Anyhow, yeah, you can also tweet. If you want to tweet anything about the podcast, do it with the hashtag #PurePerformance at @Dynatrace, or you can also email us at pureperformance@dynatrace.com.
We'd love to hear from you.
Anybody who wants to come on, be a guest,
please don't hesitate.
We love having guests,
and we love hearing stories from everybody.
So participate, be part of us, be one of us.
Join us.
Adam, thank you so much for taking the time today.
Enjoy the rest of your time up in warm California.
Awesome. Thanks, guys. I appreciate it.
All right. Goodbye, everybody. Thank you.
Bye.