Screaming in the Cloud - Summer Replay - That Datadog Will Hunt with Dann Berg
Episode Date: August 15, 2024

In this Screaming in the Cloud Summer Replay, we revisit our conversation with Dann Berg. At the time, he was a Senior Cloud Analyst at Datadog, but he now provides community support for the FinOps Foundation. Dann and Corey go into the weeds of cost optimization, and each of them brings their respective experiences forward. Dann offers his take on multi-cloud and how Datadog is tackling its customer needs there. But the talent doesn't end there: Dann is also an emerging thinker and influencer in the space, and to boot, an accomplished writer and playwright. Two of his plays have been produced in NYC and China. Check out their conversation!

Show Highlights:
(0:00) Intro
(1:02) Duckbill Group sponsor read
(1:36) Transitioning to Senior Cloud Ops Analyst
(5:12) The composition of Dann's team
(6:54) Cloud cost optimization in the regular business cycle
(10:43) Helping customers understand their cloud bills
(17:42) Paying attention to pricing changes
(21:06) The psychology of cloud economics
(23:20) Working with multiple clouds
(25:02) Duckbill Group sponsor read
(25:46) Spending too much money to save too little money
(31:12) The dangers of relying on third-party tools
(34:01) Pricing woes
(36:25) Where you can find Dann

About Dann Berg
Dann Berg currently works part-time with FinOps after spending more than a decade in the industry. He is also an active member of the larger technical community, hosting the monthly New York City FinOps Meetup, and has been published multiple times in places such as MSNBC, Fox News, NPR, and others. When he's not saving companies millions of dollars, he's writing plays, and has had two full-length plays produced in New York City and China. Dann is the Director of Community at Vantage. Previously, first FinOps Practitioner at Datadog and FullStory. Host of the NYC FinOps Meetup for almost three years. He also writes plays.

Links:
Datadog: https://www.datadoghq.com
Personal Website: https://dannb.org
LinkedIn: https://www.linkedin.com/in/dannberg/
Twitter: https://twitter.com/dannberg
Monthly newsletter: https://dannb.org/newsletter/
Previous SITC episode with Dann Berg, Episode 51: https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/episode-51-size-of-cloud-bill-not-about-number-of-customers-but-number-of-engineers-you-ve-hired/
Original Episode: https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/that-datadog-will-hunt-with-dann-berg/

Sponsor:
The Duckbill Group: https://www.duckbillgroup.com/
Transcript
Like I said, the product lifecycle, when you're building something new, you want to go as fast as possible.
When you're launching it, you want it to be as reliable as possible.
Once you're launched, once you're reliable, then you can start focusing on costs.
Welcome to Screaming in the Cloud. I'm Corey Quinn.
If there's one thing that I love, it is certainly not AWS billing.
But for better or worse, that's where my career has led me.
Way back in episode 51, I had Dann Berg, the cloud ops analyst at Datadog, and now he's
back for more.
Things have changed.
He's now a senior cloud ops analyst, and I'm hoping my jokes have gotten better.
Dann, thanks for being bold enough to come on and find out.
Yeah, I'm excited to see if these jokes have gotten better.
That's the main reason for coming back.
Exactly.
Because it turns out that death, taxes, and AWS bills are the things that are inevitable
and never seem to change.
Yeah, they just keep coming.
They never stop.
And they're always slightly different than you expect.
I guess just like death and taxes.
This episode is sponsored in part by my day job,
the Duckbill Group.
Do you have a horrifying AWS bill?
That can mean a lot of things.
Predicting what it's going to be,
determining what it should be,
negotiating your next long-term contract with AWS,
or just figuring out why it increasingly resembles
a phone number, but nobody seems to quite know why that is.
To learn more, visit duckbillgroup.com. Remember, you can't duck the Duckbill bill.
And my CEO informs me that is absolutely not our slogan.
So when we spoke back in, I want to say 2019 is when it aired. So probably
that ish is when we had the conversation, if not a little bit before that.
You were effectively a team of one and, as mentioned, had the CloudOps Analyst title.
Now you're a senior CloudOps Analyst, which I assume just means you're older.
Is the team larger as well?
What does that process look like?
How has it evolved in the last couple of years?
Yeah, it's been interesting, especially being at a single organization and that organization being Datadog, to be able
to grow the team a little bit. So as you said, it was just me. Now it's a total of four people,
including myself. So three others. And yeah, it's been interesting just in terms of my own
professional development, being able to identify what needs to be done, how much capacity I have, and being able to
grow it over time, especially in this fairly new space of being specifically focused on
cloud cost billing. So kind of that bridge between engineering and finance, which itself is kind of a
fairly new space still. It is. And my favorite part of having these conversations with folks
who have no idea what this space is, is learning when I was starting out how to talk about this in a way that didn't
lead down weird paths. It's, oh, you save money on Amazon bills. Can you help me save money on
socks? It's like, no. Well, yes, get the prime card. It gives you 5% off, but no. And yeah,
I talk about CamelCamelCamel and other ways of working around the retail side, but that's not
really what I do.
It's similar to back when I was doing SRE style work.
I made it a point never to talk about being someone involved in working in tech,
or suddenly you're the neighborhood printer repair person.
Similarly, you have, I guess, gone in a strange direction because you weren't, to my recollection, someone who had a strong SRE background.
That's not where you came
from in the traditional sense, is it? No, not an SRE background at all. Yeah, I mean, it's really
interesting. So talking about this space, I mean, people are calling it a lot of different things,
cloud economics. The term FinOps, financial operations, is being used a lot now.
Cloud financial management is another popular one. Oh, swing a dead cat, you'll hit 15 different words. And I give my advice on that, even though I hate some of the
terms, is cool. If people are going to pay you to have a title, even if you think it's ridiculous,
you can take the money or you can die on a petty naming hill. And here we are.
Yeah. And it's interesting because the role that I was hired for at Datadog was very much this niche, very specific role that I didn't
realize was a niche, very specific role at the time. So previously I was at a company and I was
building out their data center. So I was working with vendors, buying servers, sometimes going on
site, installing, racking those, dealing with RMAs. And I was getting more involved as their cloud usage was growing and bringing some
of those like hardware capitalization cost procedures to the cloud. And so I found myself
in this kind of niche role in my previous company. And at Datadog, they basically had the exact same
role that was dealing with all of the billing stuff around the cloud kind of from an engineering
perspective, because it was on an engineering team, but working closely with finance. And I was like, oh,
these are the skills that I have. And it kind of fit perfectly. And it wasn't until after I got to
Datadog and kind of was doing more research about the specific space that I discovered just how
wide open it was. And I mean, meeting you was one of the earliest things I did in the industry,
discovering the FinOps Foundation
and a few other things is kind of like opened my eyes
to this as an actual career path.
It's an expensive problem
that isn't going away anytime soon.
And it is foundational and core
to the entire rest of how companies
are building things these days.
My argument has been for a while that when
it comes to cloud, cost and architecture are the exact same thing. You don't have the deep SRE
architect background, but you're also now a member of a four-person team. Does everyone in the team
have the same skill set as you, or do you wind up effectively tagging in subject matter experts
from different areas? How is the team composed? People love to ask me this question, and I strongly believe there's no one way to do it,
but what's your answer? Yeah, I mean, the team works very much in terms of everybody kind of
taking on tasks that they need to do, but we did hire for specific skill sets when we tried to
find people. So the first person that we hired, we wanted them to have more of a developer,
engineer-type background, writing code, stuff like that. The third hire, we were looking for somebody that was more of kind of a generalist.
I see myself more as a generalist in the space. Anything that's going on, I can pick it up and
make some progress on it and build something out. And then the fourth person, we were lacking some
of the deeper FP&A or finance experience. And so we found
somebody with more of that kind of background and less of the engineering experience, but they were
eager to kind of move from finance into more of an engineering role. And I feel like this is the
perfect role for that because I feel like there are a lot of non-engineers that want to break
into engineering and don't really know how to do it. And if you are in finance, in FP&A, finding one of these more cloud cost optimization specific roles is a
great way to bridge that gap, I feel. The last time we spoke, I was independent,
doing this all myself. And it turns out that taking all of the things that make me me
and trying to find those in other people is a relatively heavy lift, even if you discount the
things like obnoxious on Twitter. So how do you start decomposing that? Well, now we're a dozen
people and we've found ways to do it. But by and large, in our experience, for the way that we
interact, and I want to get to that in a second, is that it's easier for us to teach engineers how
finance works than it is the opposite direction. And there are exceptions to
that. And as we scale, I can easily see a day in the near future where that is no longer the case.
However, we also have two very specific styles of engagement. We do our cost optimization projects,
where we go into an environment and, oh, fix this, turn that thing off. Do you really need
eight copies of those four petabytes of data? Oh, you didn't realize they were there. Great. Maybe delete it. And we look like wizards from the
future and things are great. The other project that we do is contract negotiation with AWS,
especially at large scale. It's never as simple as people would have you believe because, oh,
you're doing co-marketing efforts and you're,
you have a very specific use case and there are business partnerships on 15 different levels. And that all factors into how this works. It's nuanced and challenging. And of course,
because it's a series of anecdata, I can't really tell too many stories in public about that.
But those are the two things that we wind up focusing on. You are focusing on a very different
problem. You're not moving from company to company,
basically re-implementing the same global problem,
solving it locally for them.
You are embedded in an account for the duration,
almost four years now by my count.
And okay, I guess I could just do a whole bunch
of cost optimization projects on a quarterly basis
in an environment like that.
It doesn't seem like it solves the problem
in any meaningful way.
What does your team do?
Yeah, well, I mean,
that's such an interesting question.
Just in terms of, yeah,
if you're doing consulting,
you're kind of starting from square one
every time you get a new contract,
a new engagement.
And being at the same company for,
like you said, about four years,
going on four years now,
you really have a chance to dive in
and think about, okay, what does it mean to work cloud cost optimization into just the regular
business cycle of like how it works, right? Because, I mean, you have the triangle that
everybody's familiar with. Things can either be like cheaper, faster, efficient, and at different
stages in a product lifecycle, you want to be focusing on these areas more or less. And so on our team, kind of the different things that I'm
thinking about is first is visibility, right? Is you want to provide engineers visibility into
their cost and not just numbers, right? Actionable visibility, where if something needs to change,
they need to do something, they know what that is. And a lot of times that means not just cost, but also efficiency. So like these are the metrics
that this particular application should be scaling against as this application grows,
as usage grows, are we remaining as cost efficient? Then there's also the piece,
as you're saying, like discovering things within the infrastructure that like,
hey, if we make this change or if you turn this off, if we do things this way,
we'll save a bunch of money. Let's do those. There's things like reservations, committed use discounts for GCP,
all of those kind of things we manage. And then dealing closely with verifying our bill,
working with finance, FP&A on cost modeling, forecasting, both short term, like within a
month, like what are we going to be at the end of this month? And it's the 10th right now.
And also like, what does our next quarter look like? What do our next two years look like? And that kind of bleeds into
the contract negotiations, those kinds of things as well. So, I mean, it's setting up the cycles of
how do you prioritize this work? What is the company focusing on at the time? And what can
you do when the company is not focusing explicitly on deciding to save money?
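The short-term forecasting question raised here ("what are we going to be at the end of this month, and it's the 10th right now") can be illustrated with a simple run-rate projection. This is only a minimal sketch, not how Datadog's team does it: the function name `forecast_month_end` and the dollar figures are hypothetical, and it assumes spend accrues evenly across the month, which real bills often do not.

```python
import calendar
from datetime import date


def forecast_month_end(spend_to_date: float, today: date) -> float:
    """Project the full-month bill by extending the average daily spend so far.

    Assumes `spend_to_date` covers day 1 through `today` inclusive and that
    the remaining days cost the same per day (a simplifying assumption).
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_run_rate = spend_to_date / today.day
    return daily_run_rate * days_in_month


if __name__ == "__main__":
    # Hypothetical numbers: $42,000 spent by the 10th of a 30-day month.
    projected = forecast_month_end(42_000.0, date(2021, 9, 10))
    print(f"Projected month-end spend: ${projected:,.2f}")
```

A straight-line projection like this is mostly useful as a sanity check against a richer model, since it ignores weekly cycles, one-time purchases, and planned launches.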
One of the more interesting aspects of my work that I didn't expect is
whenever I wind up starting an engagement
or even in the prospect stage,
I love asking the dumbest possible questions
I can think of because it turns out they're not.
And the most common one that I always love to start with is,
oh, okay, your AWS bill's too high.
Why do you care?
And that often takes people aback.
But once you dig down underneath the surface just a little
bit, it becomes pretty clear that the actual goal is not that it's too much money, because, spoiler,
payroll always costs more than infrastructure. Instead, it's how do I think about this? How do
I rationalize what the additional costs are going to be per thousand monthly active users or whatever
metric it is you're choosing to use.
And how do you wind up forecasting that?
Because the old days of data centers where you,
well, we're going to spend a boatload of money and then we'll have capacity for the next two years,
maybe down to 18 months, depending on growth.
That's easier for companies to rationalize around
rather than this idea of incremental cost
on a per unit basis, but not exactly
because it also turns out that
architecture changes, problems of scale, AWS pricing changes from time to time all tend to
impact that. What I think is not well understood in this space is that, yeah, if you have a 20%
overage this month, people are going to have some serious questions, but they're also going to have
those same questions if you're 20% low. Yeah. I mean, understanding why people
care about the cost is definitely the first step because with a single company, so it's just
constantly looking at the numbers rather than understanding exactly what motivations a company
has to contact somebody like you, like a consultant, right? Because usually I imagine that it's going
to be a bill, maybe two bills, three bills come in and they keep going up and up and up and they need to go down.
And they're going to have an explicit reason why it needs to go down.
Finance is going to say, like, margins are X, Y and Z or revenue has done this.
Our costs can't do this. There's going to be explicit reasons because if there aren't reasons, then they shouldn't necessarily be focusing on costs at that moment in time. And what you want to do is kind of have, I mean, this is way more complicated
than just saying it out loud, but have a culture of like cloud cost mindfulness where people aren't
just spinning up resources willy nilly. But also like my goal is for people not to have to really
think about cost that much other than just like in a way that helps them do their work. Because I mean,
I want engineers to be able to build stuff and build stuff fast. That's what the cloud is all
about. But I also want to be able to do it in a way that isn't inappropriately high in cost.
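Two ideas from this exchange lend themselves to a small worked example: expressing cost per thousand monthly active users, and treating a 20% miss in either direction as worth a question. The sketch below is illustrative only. The function names, the 20% default tolerance, and all figures are assumptions chosen to mirror the conversation, not anything either speaker describes implementing.

```python
def cost_per_thousand_users(monthly_cost: float, monthly_active_users: int) -> float:
    """Return infrastructure cost per 1,000 monthly active users."""
    if monthly_active_users <= 0:
        raise ValueError("monthly_active_users must be positive")
    return monthly_cost / (monthly_active_users / 1000)


def variance_flag(actual: float, forecast: float, tolerance: float = 0.20) -> str:
    """Classify actual spend against forecast.

    Returns "over" or "under" when the deviation reaches the tolerance in
    either direction, because an unexpectedly low bill raises questions too.
    """
    if forecast <= 0:
        raise ValueError("forecast must be positive")
    deviation = (actual - forecast) / forecast
    if deviation >= tolerance:
        return "over"
    if deviation <= -tolerance:
        return "under"
    return "ok"


if __name__ == "__main__":
    # Hypothetical figures: $50,000 bill, 2 million monthly active users.
    print(cost_per_thousand_users(50_000.0, 2_000_000))
    print(variance_flag(actual=130_000.0, forecast=100_000.0))
    print(variance_flag(actual=70_000.0, forecast=100_000.0))
```

Tracking the per-unit number over time, rather than the raw bill, is what lets a growing bill be read as healthy or unhealthy.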
I have my thoughts on this and I've shared them before and I'll dive into them again,
but how do you approach that? If Datadog makes a grievous error and hires
me to write code somewhere as an engineer, what is the, I guess, cost approach training for me as I
wind up going through my onboarding as part of an SRE team or an application team? I mean, this feels
so basic as to not even be the right answer, but honestly, visibility is the easiest and best thing
that you can give people.
And so we've built out some visibility reports that engineers get on a regular basis.
We also meet with our top, what is it, 10 or 15 spending internal engineering teams
on a monthly basis to go over those costs so that they understand what they're looking
at, so that we understand the context behind it, so that we can understand what's on the
roadmap going forward, so that when things and the costs happen, we're aware.
And then we're just staying on top of things.
And if we have questions, we have an open dialogue with engineers and things like that.
In an ideal space, it would be great to have cost, I guess, more fit into the product development
lifecycle in a more deeply ingrained way. But at the same time,
I really don't want to serve as a gatekeeper. Our goal is not to stop any sort of engineering
process. And we haven't needed to do anything like that, although I guess every company is
going to be different in terms of what their needs are. But yeah, I'm totally happy kind of
being a little bit more reactionary in terms of looking at the numbers and responding and then proactive just in terms of the regular communication with people.
us over-fixating on it. Left to my own devices in my personal account, I'll see a $7 a month bill,
and oh, I'm going to spend two weeks knocking that down to $4. And of course I can do it,
but is that the best use of my time? Absolutely not. Very often, what is a lot of money to an
engineer is absolutely not to the business, and vice versa. When you bring in a data science team,
it's, oh yeah, we need at least four more exabytes of data because we never learned to do a join properly. Yeah, maybe don't
do that. Like understanding the difference between those two approaches is key. But I've always been
of the mindset that I would rather bias for letting developers build and experiment and have
things that catch outsized things quickly than trying to wind up
putting a culture of fear around cost.
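The $7-to-$4 example above is a payback calculation in disguise. As a rough sketch, the time to recoup an optimization effort can be estimated as below. The function name `payback_months` and the $100 hourly rate are hypothetical, and the model ignores everything except engineering time and monthly savings.

```python
def payback_months(engineer_hours: float, hourly_cost: float, monthly_savings: float) -> float:
    """Months of savings needed to repay the engineering time spent.

    Returns infinity when the change saves nothing, since it never pays back.
    """
    if monthly_savings <= 0:
        return float("inf")
    return (engineer_hours * hourly_cost) / monthly_savings


if __name__ == "__main__":
    # Two weeks of work (80 hours) at an assumed $100/hour to cut a bill
    # from $7 to $4 a month, i.e. $3 of monthly savings.
    months = payback_months(engineer_hours=80, hourly_cost=100.0, monthly_savings=3.0)
    print(f"Payback period: {months:,.0f} months")
```

Under those assumed numbers the effort takes thousands of months to pay for itself, which is the point being made about what counts as a lot of money to an engineer versus to the business.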
Because I'd much rather see
whether the thing they're trying to build
is possible to build,
then go back and optimize it later
once that's proven out.
But again, this is a nuanced thing.
Everyone seems to think I have this back pocket answer
that will apply to all companies.
And you've been doing this at Datadog
for almost four years with a team of people. I am an outsider. I see the global trend. I see what
works in different ways in different companies. But the idea that I can sit down and say, oh,
well, clearly the thing you're doing is completely wrong because that's not how I think about it,
is the hallmark of a terrible consultant. There are reasons that things are the way that they are.
And it's generally not that people are expecting to do a terrible job today, you know, unless they work in
the Facebook ethics department, which is neither here nor there. Yeah. I mean, like I said, the
product life cycle, when you're building something new, you want to go as fast as possible. When
you're launching it, you want it to be as reliable as possible. Once you're launched, once you're
reliable, then you can start focusing on costs. It's kind of like not the universal rule, but kind of the flow that I tend to see. So as you're at
a company that is regularly innovating, creating new products, going through that cycle, you're
going to have these kind of periods, as well as you have the products that have been around.
There's a lot of legacy code. There's a lot of stuff going on that maybe isn't the best or some efficiency work that has been
deprioritized for whatever reason that maybe it's time to start considering doing this.
So keeping track of all of that. And like I said, if for whatever reason the business wants to focus
on cloud cost efficiency or a team has decided
that in a particular quarter, or for a particular reason, they want to focus on that, being able to
assist as much as you can, being able to save all that work so that there's kind of like a queue
that you can go to when it is time to focus on cost efficiency stuff.
So here's a fun one for you. As of the time of this recording,
it's a couple weeks old,
but if you're anything like what we do here
for some of our more sophisticated clients,
we do occasionally build out prediction models,
models of economics that wind up
defining how some architectural patterns
should be addressed, et cetera, et cetera.
What's always fun is the large clients
who have this significant level of spend
on an outlier service.
Like every once in a while, it was great that we got to do a deep dive into the Washington Post's
use of Lambda because normally Lambda is a rounding error on the bill. They had a specific
challenge and they did a whole blog post on this for the AWS blog. I believe the monitoring tools
blog, but don't take that at face value. I never remember which AWS blog is which because
AWS doesn't speak with a single voice on anything. But yeah, most of the time it's block, tackle,
baseline stuff that is the big driver of spend. But a few weeks ago, they changed the pricing
dimensions for S3 intelligent tiering, where there's no longer a monitoring charge for objects
that are smaller than 128 kilobytes, and there's no 30-day minimum. So the fact that
those two things went away removed almost every caveat that I can picture for using S3 intelligent
tiering, which means that for most use cases, that should now be the default. I imagine you
caught that change as well, since that's one of those wake up and take notice no matter what time
of the world it is where you are when that gets dropped. How did that change your modeling? Or did that not
significantly shift how you view any of this? No, I mean, I think part of our role within the
organization is to pay attention to stuff like that. And then to just have those conversations
with the teams that I know were either exploring intelligent tiering. We do some pricing modeling
for different products, S3 storage for
different types. So updating those and being like, hey, this might be something we want to actually
use and explore now. Similar, and I guess more of something that I actively worked on that I
consider in the same category is when Amazon announced savings plans as replacing convertible
reservations, right? Because at first they announced and being like,
okay, well, it's going to automatically like rebalance between different instance families
across regions too, which convertible RIs could never do. And it's going to be the exact same
price for a compute savings plan as a convertible RI. And we were kind of like, what's the catch?
And we spent a few weeks doing
a deep dive, working with our data science team, kind of being like, where is the catch here?
Yeah, the real catch is that you can't sell it on the secondary market if it turns out you bought
the wrong thing, which if that's your plan A, then good luck. Yeah, we definitely don't use
that secondary market. I don't have as much experience there, although I'm sure some people can use it to their advantage.
Almost no one does.
In fact, the reason that it exists, my pet theory,
is that once upon a time,
companies would try and classify
some of their reserved instance purchases
as capital expenditures,
which there has since been guidance
from regulatory authorities not to do that.
But at the time, the fact that you could sell it
to a third party on the secondary
market helped shore up that argument. If you're listening to this and you're classifying some of
your RIs as CapEx, please don't do that. Feel free to reach out to me. I can dig out the actual
regulation and send it to you. There are two of them. It's a nuanced topic. If you're listening
to this and have no idea what I'm talking about, God, do I envy you. Yeah, definitely don't do that.
There was a lot that was interesting about savings plans. When I was read in on a month or so in
advance of them being announced, it was great. I want to see this and this and these other things
too. And some of those things came to pass. It was extended to work with Lambda. Now, I don't
believe that that is financially useful in almost every case, but it doesn't
need to be because so much of cloud economics, from where I sit, is psychological in nature,
where, oh, we have this workload that lives on EC2 instances, and we want to move it to Lambda,
but we already bought the reserved instances, so we're not going to do it because of sunk cost
fallacy, which is not much of a fallacy when it's that kind of money in some cases.
Okay, great.
Now, if it can migrate to Lambda and still wind up getting the discounts you've paid for,
you have removed an architectural barrier.
And that's significant.
Now I want to see that same thing apply to,
oh, if you move from EC2 to RDS
or DynamoDB or anything else,
that should be helpful too.
But whatever you do,
don't do what SageMaker did
and launch their own separate savings plan
that is not compatible with the compute savings plans.
So effectively, it's great.
You're locked in architecturally to one or the other
because machine learning is, once again,
a marvelously executed scam
to sell pickaxes into a digital gold rush.
I mean, I like savings plans a lot,
and we've been slowly, as convertible RIs have expired, replacing them with savings plans.
And I think that it is pushing the other cloud providers forward because we're definitely multi-cloud.
And so that's really useful.
And I hope more people take on the compute savings plan type model just because it makes our lives so much easier or makes my life so much easier in terms of like planning it, selling the commitment internally, just everything about it has made my life easier. So, I mean, how many years later are
we? I definitely haven't found any big gotchas, I guess, other than the secondary market, but that doesn't
really impact me. Yeah. I spent a lot of time looking for it too, doing deep analyses of, okay,
for which instance classes in which regions is there a price discrepancy?
And I finally got someone to go semi on record and say, yeah, there should not be any. Please
ping us if you find one. Okay, great. That is enough for me to work with.
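The kind of check described here, looking for instance classes and regions where a Savings Plan rate differs from the convertible RI rate, amounts to joining two price sheets on region and instance type. The sketch below assumes both sheets have already been loaded into dictionaries. The `PriceSheet` shape, the function name, and every rate shown are invented for illustration and are not real AWS prices or an AWS API.

```python
from typing import Dict, List, Tuple

# (region, instance_type) -> effective hourly rate in USD (hypothetical shape)
PriceSheet = Dict[Tuple[str, str], float]


def find_discrepancies(
    convertible_ri: PriceSheet,
    savings_plan: PriceSheet,
    tolerance: float = 1e-9,
) -> List[Tuple[str, str, float, float]]:
    """List entries present in both sheets whose hourly rates differ."""
    mismatches = []
    for key in sorted(convertible_ri.keys() & savings_plan.keys()):
        ri_rate = convertible_ri[key]
        sp_rate = savings_plan[key]
        if abs(ri_rate - sp_rate) > tolerance:
            region, instance_type = key
            mismatches.append((region, instance_type, ri_rate, sp_rate))
    return mismatches


if __name__ == "__main__":
    # Made-up rates purely to exercise the comparison.
    ri_sheet = {("us-east-1", "m5.large"): 0.060, ("eu-west-1", "m5.large"): 0.070}
    sp_sheet = {("us-east-1", "m5.large"): 0.060, ("eu-west-1", "m5.large"): 0.072}
    for region, itype, ri, sp in find_discrepancies(ri_sheet, sp_sheet):
        print(f"{region} {itype}: RI {ri:.3f} vs SP {sp:.3f}")
```

Entries that appear in only one sheet are skipped here. A fuller audit would likely report those separately, since a missing row can hide a discrepancy as easily as a differing one.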
Exactly. We got that too and didn't believe it. So we were like downloading price sheets and like
doing comparison, doing all that stuff. Oh, trust but verify. And when we're talking this kind of
money, I don't trust very far. They make mistakes on billing issues from time to time.
And I get it.
It's hard.
But there are challenges here and there.
I am glad you mentioned a minute ago
that you are multi-cloud
because my position on that has often been misconstrued.
I think that designing something from day one
to work on multiple cloud providers is generally foolish.
I think that unless you have a compelling reason
not to go all in on one cloud provider,
that's what you should do.
Pick a cloud, I don't care which, and go all in.
Conversely, you have a product like Datadog
where your customers are in multiple clouds.
And first, no one wants to pay egress
to send all of the telemetry from where they are into AWS.
And secondly, they're not going to put up in many
cases with their data going to a cloud provider they have explicitly chosen not to work with.
So you have to meet your customers where they are. In your case, it is absolutely the right
thing to do. And Twitter often gets upset and calls me a hypocrite on stuff like this because
Twitter believes that two things that take opposite positions cannot possibly both be true, but the world is messy.
Yeah. And I mean, the nice thing about us being in multiple clouds is we are our own biggest user,
right? And that's actually one of the reasons why I love working at Datadog is because I get to use
Datadog all the time. And not only that, Datadog is on everything and we have all of our products. I'm very spoiled with all of this.
But I mean, we are running in these different cloud providers.
We are using Datadog and those different cloud providers.
And that is just helping everything overall, too.
In addition to like supporting customers that are in each cloud, because that is a huge
reason as well.
Here at the Duckbill Group, one of the things we do with, you know,
my day job is we help negotiate AWS contracts. We just recently crossed $5 billion of contract
value negotiated. It solves for fun problems such as how do you know that your contract that you
have with AWS is the best deal you can get? How do you know you're not leaving money on the table?
How do you know that you're not doing what I do on this podcast and on Twitter constantly
and sticking your foot in your mouth? To learn more, come chat at duckbillgroup.com.
Optionally, I will also do podcast voice when we talk about it. Again, that's duckbillgroup.com.
One of the problems that I keep running into across the board is that with things like Datadog,
and again, not to single you out, every monitoring vendor to some extent has aspects of this problem.
It's that when I'm a customer and I'm hooking my accounts up to Datadog, I want you to tell
me about things that are going on. But the CloudWatch charges can be so egregious
on the customer side where it is bizarre
and frankly abhorrent to me
when I wind up paying more for the CloudWatch charges
than I am for Datadog.
And let's be clear here,
I am in fact a Datadog customer.
I pay you folks money, not a lot of money,
but I pay you money
because I have certain things that I need to know
are working for a variety of excellent reasons. And the problem that I keep smacking into on this is
it's not your fault. It's not anything you can do. In fact, you are one of the better providers as
far as not only not being egregious with the way that you slam the CloudWatch endpoints, but also
in giving guidance to customers on how to tune it further. And I really wish that more folks in your space would do things like that. It always bugs me when
I wind up using a tool that tries to save money that in turn winds up costing me more than it
saves. Yeah. Yeah. It's tricky there. I have less experience myself setting up Datadog and running
it in my own infrastructure as I'm more digging deep into the cost stuff and
us using the cloud. So I can't speak to that specifically. But yeah, you're not the first
person that I've heard have that experience. And again, it's not your fault at all. I've
been beating up the CloudWatch team for years on this, and I will continue to do so until
I'm safely dead, which, depending on Amazon's level of patience, might be in mere minutes.
In the larger picture-wise, we have to remember that we're super early in the cloud adoption,
even looking at the cloud economics, FinOps, cloud cost optimization world, right?
I feel like most businesses at this stage in their journey are still in data centers,
and they're dealing with the problem of how do we move to the cloud and do it cost efficiently? How do we set everything up? And that's where the world is right now. And I think
that dealing with, okay, we are 100% running in the cloud. What are the processes that we have in
place? How do we think of finance and the finance organization, not through the lens of we once had
data centers and now we don't, but how do we look through that
in the lens of, okay, we are cloud native from day one. What does the finance department look like?
And dealing with those problems is really interesting because Datadog has never been
in a data center, right? We are cloud native from the very beginning. And so it was interesting for
me to join the company and build up a lot of these processes because it is different than what a lot
of other
people were dealing with and doing. And it presents some really interesting problems and questions
that I think are going to be the foundation for the next decade of building companies and operating
in the cloud. I always love having conversations with folks who are building out teams to handle
these things because usually the folks I keep talking to or who want to have conversations
like this are building tools themselves to solve this problem through the
miracle of SaaS, where they will bend over backwards to avoid ever talking to a customer.
And we're all dealing with the same AWS APIs. There's not that much of a new spin you can put
on most of these things. But understanding what customers are actually trying to do instead of
falling down
the rabbit hole trap of,
hey, turn off those idle instances
that are all labeled DR site
because you probably don't need them
is foolish.
And after a few foolish recommendations,
tooling doesn't get there.
I'm a big believer that tools
can assist the process
and narrow down what to look at.
I believe they shouldn't have to exist.
I think that the billing dashboard should
be a hell of a lot better natively than having to pay a third party to make sense of it for me.
But by and large, I do believe this is a problem that is best solved from a consultative approach.
I mean, when I started this place, I was planning to build out some software,
tried doing it called DuckTools and wound up mothballing the whole thing because what we
were building was not what the industry claimed to want. And frankly, educating people into a position
where then they see the value and only then will they buy has never been a game that I wanted to
play. Yeah, I really like that article that you guys published about exploring that product and
the reason why you decided not to pursue it. But it's super interesting in terms of where the industry is going and building out those
tools, because I found that there isn't really any sort of new thing that you can do with the tools.
All the tools that exist for looking at your costs are largely the same. The main differences that
I've seen is that the UI is slightly different and they have different sales teams.
And if the sales teams are better, they're going to get more of the market share.
And if the sales teams are not as good, it's going to be a smaller market share.
And like, it's weird to be in this industry for as long as we have been and seeing like, OK, well, Andreessen Horowitz just funded this new company and this other company
got invited into Y Combinator or like all of these things that are happening. And I'm kind of like,
okay, but what is this tool really doing differently? And there are a few of them that
are doing something kind of innovative and different, but there's also a few that are just
like, this is a space where people are in, there's money here, we're doing the same thing, but we got
our sales team and we'll carve out our little
corner, and then we'll get acquired, and that'll be that.
Although I guess we're just at that stage of innovation in this space, I guess.
Yeah, I have no earthly idea what the story is around how these companies plan to differentiate,
because it seems to me that they're directly attempting to compete with Cost Explorer,
which it's taken some time for that thing to improve to the point where it is now, and it'll take further time for it to improve beyond
it. But long-term, I don't think you're going to outrun AWS on a straight line like that.
Yeah. I mean, when you work for one of these third-party cost tooling things,
and you're working with one of your customers, and they're like, how do I view this?
And it's kind of like, that is the easiest thing to find in Cost Explorer as well.
I can't imagine being like, well, you should pay me thousands, tens of thousands, hundreds of thousands of dollars a month to view it here when like Cost Explorer is free.
And I think Cost Explorer, it doesn't do everything, but it's gotten a lot better at what it does.
And it could probably solve 90% of people's problems without using a third-party tool.
You are at significant scale in multiple clouds.
So the answer that these companies always give is, ah, but we provide a single dashboard
so that you can look at costs across multiple providers in one place.
Is that even slightly useful to you?
Man, if you need dashboards, get a dashboard tool.
Don't get this crazy cost analysis tool, right? I mean, like there are some great dashboard
solutions that you can get where you can connect your detailed billing, cost and usage report,
whatever cloud provider is calling it, but like that really detailed gigabytes per hour report,
and then visualize it, build reports, do all that kind
of stuff, because that's not something that the tooling does well right now in terms of like
building out cost dashboards and stuff. But that's also right now. It could in the future.
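As a rough illustration of the "connect your detailed billing report and visualize it" idea, here is a minimal Python sketch that totals cost per service from a CSV export, which is the kind of table a dashboard tool would then chart. The column names (usage_date, service, cost_usd) and the sample rows are assumptions made up for this example; a real cost and usage report uses provider-specific column names and is far larger.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data. The column names are illustrative assumptions,
# not the actual schema of any cloud provider's billing export.
SAMPLE_CSV = """usage_date,service,cost_usd
2024-08-01,AmazonEC2,120.50
2024-08-01,AmazonS3,30.25
2024-08-02,AmazonEC2,99.50
2024-08-02,AmazonCloudWatch,45.00
"""


def cost_by_service(csv_text):
    """Sum the cost column per service across all rows of the export."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["service"]] += float(row["cost_usd"])
    return dict(totals)


def top_services(totals, n=3):
    """Return the n most expensive services, largest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    for service, cost in top_services(cost_by_service(SAMPLE_CSV)):
        print(f"{service}: ${cost:,.2f}")
```

The per-service totals are what you would hand to whatever BI or dashboard tool you already have; the sketch deliberately stops short of any charting.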
Yeah. If you're a BI tool, wind up passing out templates that normalize these things. I am so
tired of building it all from scratch in Tableau myself. If you're Tableau, sell me a whole bunch of things that I can use to view this stuff
through so I don't have to wind up continually reinventing that particular wheel. Yeah. Oh,
I like your approach. I didn't know the answer when I was asking the question. I was about to
learn something if you'd gone the other direction, but nope, it's good to know that my impressions
remain intact. Yeah. I mean, I've used different tools in the past. Again, I hesitate to
name any of them, but there's a few in the space that I feel like everybody, if they're in the
space, they know which tools I'm talking about. Yes, we do. And yeah, I've used them. They're
okay. A few of them are okay. A few of them are better than others. But I mean, I was trying to
evaluate the value add over me manually setting some things up and having some
sort of visualization, and just like the value add in terms of what they were charging, even if it was
like a significantly smaller percent of the bill, because that alone, like percent of bill, is such a
difficult kind of cost model to do. I hate that. Pricing is hard. Let's start there. Yeah, I hate the
percent of bill because then it's, let me get this straight.
I'm paying you a percentage of things like data transfer charges that I know are fixed that I can't optimize.
I'm paying you a percentage of my AWS Enterprise support subscription.
I'm paying you a percentage of the marketplace stuff and so on and so forth.
And it doesn't work.
At some point of scale as well, it's, I could hire a team of 20 people and save money versus what you're charging me.
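To make the percent-of-bill arithmetic concrete, here is a small Python sketch. Every number in it (the 3% rate, the monthly cost per person, the bill sizes) is a hypothetical assumption chosen for illustration, not a figure quoted in this conversation or by any vendor.

```python
def percent_of_bill_fee(monthly_bill, rate):
    """Monthly tool fee when the vendor charges a flat share of the whole bill."""
    return monthly_bill * rate


def fee_on_unoptimizable_spend(unoptimizable_spend, rate):
    """Portion of the fee paid on line items the tool cannot reduce
    (for example support subscriptions or marketplace charges)."""
    return unoptimizable_spend * rate


def break_even_bill(team_size, monthly_cost_per_person, rate):
    """Monthly bill at which the percent-of-bill fee equals the cost of a team."""
    return team_size * monthly_cost_per_person / rate


if __name__ == "__main__":
    rate = 0.03  # assumed 3% of bill
    bill = 1_000_000  # assumed monthly cloud bill in USD
    print(f"Fee on full bill: ${percent_of_bill_fee(bill, rate):,.0f}")
    print(f"Fee on $200k of fixed spend: ${fee_on_unoptimizable_spend(200_000, rate):,.0f}")
    # Assumed fully loaded cost of 15,000 USD per person per month.
    print(f"Bill where fee equals a 20-person team: ${break_even_bill(20, 15_000, rate):,.0f}")
```

Under these assumed inputs the fee scales linearly with the bill, while the cost of a team stays roughly flat, which is why the comparison flips at large enough scale.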
The other side of it,
oh, we'll charge you percentage of savings.
Well, then you wind up with people
doing a whole bunch of things.
Like before they bring you in,
they'll make a bunch of ill-advised
reserved instance purchases
or savings plan purchases.
You can then unwind after the fact.
When I was setting this place up,
I looked long and hard at different billing models.
The only thing I found that worked is fixed fee, the end.
Because at that point, suddenly everyone's on board with, hey, let's solve the problem and then get out as soon as possible.
We're not trying to build ourselves a forever job nestled in the heart of your company.
And it's the only model I found that removes a whole swath of conflicts of interest.
And that's the hard part. We have no partners with anyone in this space, including AWS themselves, just because as soon as we do,
it becomes extremely disingenuous when we suggest doing something for your sake that happens to
benefit them, such as maybe back that S3 bucket up somewhere. Well, okay, if we're partnered with
them, does that mean we're trying to influence spend the other direction? It just becomes a morass that I never found it worth the time to
deal with. Yeah, that doesn't work for SaaS. Yeah, that makes a lot of sense. And I haven't
actually thought about pricing model for consulting in the space that closely. But I mean, when you're
charging a percent of bill or percent of savings, you have the opportunity to screw the customer,
right? Through all the things that you were saying. If you charge a fixed fee, you have the possibility of undervaluing yourself,
which the only one that's true in that case is you potentially. And if you're okay with that risk,
and you're okay with those dollars, that's great. Because yeah, if you're able to be like, okay,
here's the services that I do. Here's the fixed cost done, done. It just sets everybody's
expectations for the relationship in a much better way
that you're not constantly worried about upsells
and other things that might happen
along the way that screws the customer.
And that's the hardest part, I think,
is that people lose sight
of the entire customer obsession piece of it.
That's one of the things Amazon gets super right.
I wish more companies embraced that.
Dann, I want to thank you for taking so much time out of your day to suffer my slings, arrows,
and half-formed opinions. If people want to learn more about who you are and what you're up to,
where can they find you? Yeah, I have a website you guys can go to that links kind of everywhere
else. It is dannb.org, and I spell my name with two Ns, so D-A-N-N-B.org. And I have LinkedIn.
I have Twitter.
I have a monthly newsletter
that is not really about FinOps or anything,
but I really enjoy it.
I've been doing it for like a year now
that you should sign up for.
And links to that will, of course,
be in the show notes.
Dann, thanks again for your time.
I really appreciate it.
Yeah, thanks so much for having me again.
It's been a blast.
It really has.
Dann Berg, Senior CloudOps Analyst at Datadog. I'm
Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast,
please leave a five-star review on your podcast platform of choice. Whereas if you've hated this
podcast, please leave a five-star review on your podcast platform of choice, along with a comment
featuring a picture of several cork boards full of Post-it notes and string and a deranged comment telling me
that you have in fact finally found the catch in savings plans.