The Data Stack Show - 191: From Amazon to Consulting: Time Series Forecasting and How to Communicate Data Analytics Insights with David McCandless of McCandless Consulting
Episode Date: May 29, 2024
Highlights from this week's conversation include:
David's Background and Journey in Data (0:30)
Transition to Time Series Forecasting (2:03)
Working on Time Series Forecasting at Amazon (2:55)
Challenges and Experience in Time Series Forecasting (4:32)
Transitioning to a New Role at Amazon (5:52)
Tools and Methods for Time Series Forecasting (8:17)
Forecasting Impact and Accuracy (15:30)
Explaining Variance and Lessons Learned (18:58)
Understanding Downstream Consumers and Empathy for Business Leaders (20:36)
Amazon's Culture and Decision-Making Process (24:27)
Assimilating into Amazon's Culture (26:04)
Interpreting Data for Business Stakeholders (28:34)
Consulting for Small Businesses (30:28)
Challenges in Automation and Maintenance (32:18)
Analyzing Financial Metrics for Small Businesses (34:51)
Tooling and Data Solutions for Small Businesses (39:52)
Empowering Small Businesses with Data (46:02)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Discussion (0)
Welcome to the Data Stack Show.
Each week we explore the world of data by talking to the people shaping its future.
You'll learn about new data technology and trends and how data teams and processes are run at top companies.
The Data Stack Show is brought to you by Rudderstack, the CDP for developers.
You can learn more at rudderstack.com.
We're here on the show with David McCandless.
David, welcome to the Data Stack Show. Thanks so much, Eric. All right. Well, so much to talk
about. And I love your story of going from sort of the biggest of the big to small. We'll unpack
what that means throughout the course of the show. But give us a little bit of your background. How'd you get into data?
And then what do you do today?
Yeah.
Yeah.
So I studied chemical engineering at Georgia Tech and started my career in oil and gas.
And then basically got married, needed to change jobs when the oil market was down.
So I kind of had to reinvent
myself. And I ended up in this budding field called analytics. I thought that it would suit
me really well. Started a master's degree online at Georgia Tech. I was actually in the beta cohort
of GTOMSA and graduated in May, 2020. So kind of 2017,
started different analytics positions,
first analyst, then manager.
And then my last full-time position for a corporation was with Amazon
from 2020 to 2022.
First year there, forecasting.
Second year, kind of a data engineering role.
And towards the end of my time at Amazon,
started working for myself and really enjoyed that.
And then towards the very end of my employment with Amazon,
they offered a voluntary severance package.
And I saw that as a way to make my side hustle my full-time job.
So that's what I've been doing since early 2023.
Nice.
And David, one of the topics I'd really like to talk more about is
time series forecasting. I think we were talking before the show, that's an area that gets ignored
and it has so many practical, you know, practical applications for a business. So I'm excited to
jump into that topic. Is there anything you want to discuss? Yeah, I'd be happy to jump into the topic of
time series forecasting. Nice. All right. Let's do it. Let's do it. David, I want to talk about,
just give some context for the types of things that you worked on at Amazon to start out with.
So you had a number of different analytics positions. Were there
any specific projects that you worked on? Amazon is such a big company. So just interested in any
of the specific projects that you worked on? Yeah. Yeah. So I worked for a part of Amazon
called the Employee Resource Center. Amazon is much different than most corporations I've worked for
in that they really like to insource things
rather than outsource things.
And part of that is philosophical.
They have a very particular way
that they like to run teams.
But part of that's practical.
A lot of vendors have a hard time growing
with their scale of growth.
So whereas a lot of corporations I've worked for, they choose to outsource kind of their high transactional HR stuff like, hey, where's my W-2?
Or, hey, I think you messed up my time card.
Amazon chose to insource that. So I was responsible for time series forecasting for a team of about 500 agents distributed around the world.
And those agents served the roughly 1.2 million people that worked for Amazon at that time. I was forecasting demand for when and how much employees were going to call in,
down to a 30-minute interval. So our capacity planning team could plan shifts
out for the next two years so we can think more strategically about how big a budget
do we need to expand to another location to hire agents.
That's crazy. How did you get into time series? Did you work with time series data previously?
Yeah, good question. So prior to that position, I had taken one course on time series forecasting
in my master's degree. And the experience on the job was actually pretty different
from the experience in the classroom. I felt like the classroom was very theory heavy, like very much let's run some statistical tests to test the stationarity of this data.
Whereas when I got to Amazon and was developing forecasts, I won't say that theory went out the window,
but we took a much different approach to time series forecasting.
So I'm curious with roles like that,
did you take over for someone else?
Or was this a, I mean, I don't imagine
that would be a Greenfield project at that scale.
So how was that transition?
And that seems like a big step from the classroom
to like, wow, there's 500 agents supporting over
a million employees.
Yeah.
Yeah.
So I worked for a really smart guy named Reninder.
And so you all might laugh about this.
So like I said, Amazon, they really like to insource things.
So there's this part of
HR called disability and leave services. And this group, they would handle really run-of-the-mill
stuff like, hey, I'm having a baby. I need to set up my maternity leave to really complex,
gnarly stuff like, hey, I got in a car accident. I was severely injured. I can't do my job like I used
to. Can you accommodate me so I can still work for Amazon? So Amazon had outsourced that service
and then they brought it in-house. Literally the week that they brought it in-house and went live
was March 1st, 2020. And so not only is the world turned upside down for this team, but then this team was told, oh, by the way, this whole pandemic COVID-19 thing, y'all get to handle this. And whenever you forecast with little history,
you're asking for a challenge, but then you add COVID-19 into the mix,
and you just want to pull your hair out.
Fortunately, now I'm bald, so I don't have much hair to pull out.
So anyway, Reninder — he had been wearing this hat
all along, forecasting for other teams.
So a position was created.
That was my position: to own forecasting just for that part of
disability and leave services.
So yeah, I didn't come in and reinvent the wheel.
Fortunately, I got to benefit from a lot of knowledge transfer from
Reninder, and then really took over forecasting for that team
and built out some of our frameworks for that team.
Makes total sense.
That's pretty wild timing.
I'm sure that was a heck of a ride.
That's like a hundred-year low
of times to be in forecasting.
Or, depending on how you look at it,
maybe a high,
because you could develop some skills quickly. But I mean, that's asking for magic. Yeah, and to
get anything half accurate, it's like literally magic. But that's fascinating, David. I'm interested
in so what are the tools and sort of methods that you use to take time series data from analysis to forecast, right?
And time series data is interesting.
It can be so helpful,
but it's pretty challenging to wrangle and to get right.
You know, of course, at RudderStack,
we deal with time series data as one of our, you know,
sort of core pieces of like what we capture and deliver.
And so we're really familiar with the use cases. And time series, like rearward looking time series
analytics is the most common first use case, right? Like, well, how is this changing over time,
right? How many users did X or did not do X over the past 30 days, right? Describe going from time series analytics to time
series forecasting and what all goes into that from a methodology standpoint and then even a
tooling standpoint. Yeah, sure. Yeah. And I'll try to kind of work backwards a little bit from
what was the impact? Like, why did we even have people trying to
forecast? Like I said, we had 500 agents. That's a lot of money. And Amazon also had very high
standards for how long they wanted employees to wait on the phone. They wanted 98% of employees to wait less than one minute before
a human got on the line and helped them. So now I know Amazon catches some shade about how they
treat their employees. And when you're that big, there's going to be balls dropped, but I really admired the way that they made budget
available to treat employees right in that way. I can remember being on the phone 30
minutes for AT&T and I'm thinking like, I pay like 200 bucks a month. It doesn't make
much sense. So anyways, to have that kind of service level of 98% of your customers only having to wait one minute max, you basically can do two things.
One, you can staff way too many people, or you can have really accurate forecasts of demand and then translate those accurate forecasts into a really efficient staffing plan.
So Amazon did their best to take the latter approach.
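That "latter approach" — translating an accurate 30-minute call forecast into an efficient staffing plan against a 98%-answered-within-a-minute service level — is classically done with an Erlang C calculation. The episode doesn't say what Amazon's workforce management team actually used, so this is only a textbook sketch; the 300-second handle time in the usage note below is an invented parameter:

```python
from math import exp, factorial

def erlang_c(agents: int, load: float) -> float:
    """Erlang C: probability an arriving call has to queue,
    given `agents` on duty and offered `load` in Erlangs."""
    if agents <= load:
        return 1.0  # unstable queue: effectively everyone waits
    top = (load ** agents / factorial(agents)) * (agents / (agents - load))
    bottom = sum(load ** k / factorial(k) for k in range(agents)) + top
    return top / bottom

def agents_needed(calls: float, aht_sec: float, target_sl: float = 0.98,
                  answer_sec: float = 60.0, interval_sec: float = 1800.0) -> int:
    """Smallest headcount whose predicted service level (fraction of
    calls answered within `answer_sec`) meets the target for one
    interval of `calls` forecasted volume."""
    load = calls * aht_sec / interval_sec  # offered load in Erlangs
    n = int(load) + 1
    while True:
        sl = 1 - erlang_c(n, load) * exp(-(n - load) * answer_sec / aht_sec)
        if sl >= target_sl:
            return n
        n += 1
```

For example, `agents_needed(100, 300)` asks: if the forecast says 100 calls in a half hour and calls average 300 seconds to handle, how many agents keep 98% of waits under a minute? This is exactly why forecast accuracy at the 30-minute grain matters — the headcount answer moves directly with the call-volume input.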
How often did you create the plan?
Yeah, yeah, yeah.
So good question.
The short answer is at a minimum weekly. And I would create every week a forecast for the next two weeks at 30-minute increments.
And then my friends in workforce management, they would take that and translate it into
how many people do we need staffed in this 30-minute increment?
But then every week, we also refresh what we call a long
term forecast, which was at a weekly granularity through at least the end of the year. So that was,
you know, kind of the next level and planning like, okay, you know, do we need to start recruiting?
Or hey, you know, maybe volume is going down, maybe we need to let some of these temporary
employees, you know, not renew their contracts. And then at least quarterly, we're looking at demand for the next year, two years out,
and then trying to translate that into a headcount slash cost figure.
But back to your question, Eric, of what does that actually look like? Again, I'm trying to
focus on what is the business problem, basically trying to provide an excellent experience to customers,
but at the same time, doing that in a cost-efficient manner.
So our stack was pretty simple for most of the time that I was there.
I, on my laptop, would run an R script,
and that R script would query our Redshift database
and get some aggregated time series data.
If I'm going to forecast weekly,
aggregate it weekly,
that data is coming out of our telephony platform.
And then when it's in R,
do some minimal cleaning and formatting, and then using a really common library called the forecast package by the godfather of forecasting.
This guy named Rob Hyndman — he has a great free book.
If you want to know anything about time series forecasting, just look it up. So we would use his package
and use a model that's been around for decades.
We'd use ARIMA.
And then, voila, we get a forecast
for the next, whatever it is,
14 days in 30-minute increments
or longer-term forecasts,
weekly granularity through the end of the year.
And then we would load that back to Redshift,
and from there, my friends in Workforce Management,
they would pick it up and use it to plan shifts.
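For a rough sense of the fit-then-iterate step in that pipeline: the real stack was an R script calling auto-selected ARIMA models from Rob Hyndman's forecast package, so the stdlib-only stand-in below is just a toy — it fits only an AR(1), the ARIMA(1,0,0) special case, by plain least squares:

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1], i.e. ARIMA(1,0,0):
    the simplest member of the ARIMA family."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    phi = (sum((a - mx) * (b - my) for a, b in zip(x, y))
           / sum((a - mx) ** 2 for a in x))
    return my - phi * mx, phi  # intercept c, coefficient phi

def forecast_ar1(series, steps, c, phi):
    """Iterate the fitted recurrence forward to get point forecasts."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out
```

In the pipeline described, the aggregated series would come out of the Redshift query, and the forecast rows would be loaded back to Redshift for workforce management to pick up. A real auto.arima run would also choose differencing, moving-average terms, and seasonality, which this sketch skips.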
More so for the long-term forecasting,
there is also a pretty high-touch process of taking the output, interpreting why there was change, and then sharing that with the business leaders who were responsible for the cost of running the business and the service level goals. And that was honestly the hardest
part of the job was explaining to, I mean, pretty, very sharp people, leaders at Amazon, but, you
know, they're not maestros in time series forecasting, trying to explain to them why there is movement.
Like, you know, why do we think demand is going to increase?
Or why do we think demand is going to decrease?
Or this is my least favorite question.
Why do you think demand is going to increase in October when three weeks ago you told me that it was going to decrease? And like I said, I was forecasting in the middle of a pandemic, and COVID-19 was very unpredictable,
and people and how they responded to COVID-19 were very unpredictable.
And that was probably the harshest part of the job for me: just giving the best answer I could to the decision makers, but, you
know, knowing as soon as it's released, it's wrong. It's a forecast. Yeah. Was there a goal? Like, I mean,
during the pandemic, I don't know how you could measure this, but what was like an accuracy goal?
What was, like, good accuracy during the pandemic, and maybe good accuracy pre-pandemic, post-pandemic?
Yeah, yeah, yeah.
So for our short-term forecasting, so that's for the next 14 days, if we were off by more than 5%, then we had to give an explanation of why there's a variance of more
than 5%. If I forecasted 100 calls and there were fewer than 95, or more than 105, then I had to
give an explanation of why that happened. And then same thing with the long-term forecasting. But basically, you're only allowed
5% of unexplained variance. Wow. Interesting. That's wild.
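That "more than 5% needs an explanation" policy, as described, boils down to a one-line check (this is my own sketch of the rule, not code from the show):

```python
def needs_explanation(forecast: float, actual: float,
                      threshold: float = 0.05) -> bool:
    """True when the miss exceeds the allowed unexplained variance —
    e.g. a forecast of 100 calls with fewer than 95 or more than 105
    actuals triggers a write-up."""
    return abs(actual - forecast) / forecast > threshold
```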
So that must have been the majority of your time. You probably spent a very small fraction of time
forecasting and the majority of the time, like, investigating? Is that fair? Yeah. So, I mean,
the actual, like, refreshing of the forecast, that was like a 15-minute exercise. And, you know, if I
was more savvy, I would have automated all that anyways. I've gotten to that point now. But
anyways, yeah, the variance explanation took the majority of the time. But the good news about that was
you could feed that back into the forecast to make it more accurate. Like, whenever there's a
big driver of variance, there's one of two outcomes. One, it's a black swan event that's
never going to happen again, like the Snowpocalypse of February 2021.
And you can at least encode into your training data a dummy variable to say, like, hey, there's this one-time event.
We never expect it to happen again.
Right.
Or, you know, if you're really smart, you can, you know, file that away somewhere, like in a table of events, and say, you know, the next time you think that there's a Snowpocalypse coming, you can say, well, hey, you know, back in February 2021, here's what
happened. That's most frequently what happens. Sometimes, in a minority of cases, you find a new
driver of your time series. You find, like, hey, you know, I find a strong correlation between
X metric and my time series. So then at that point you can say, well, great,
maybe I should incorporate this into my model if it makes my model more accurate. But then you have
the challenge of, well, great, you know, this
other thing — is there a prediction, is there a forecast available for it? No? If it's something
like, you know, world population, yeah, there's a lot of forecasts available for world population.
But if it's something like, you know, I don't know,
how many thunderstorms
there are going to be in Texas
in the next three months,
like, your guess is as good as mine
as to how many thunderstorms
there are going to be
in the next three months in Texas.
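Both moves — flagging a one-off event with a dummy variable and tracking a candidate driver — start with building regressor columns alongside the series. A minimal sketch of the dummy-variable part (the dates and event name here are invented for illustration):

```python
def event_dummies(dates, events):
    """Build 0/1 columns flagging known one-off events, so a
    regression-style model can attribute a spike to the event
    instead of folding it into trend or seasonality.

    dates:  ordered list of period labels (e.g. ISO date strings)
    events: {event_name: set of period labels the event covers}
    """
    return {name: [1 if d in days else 0 for d in dates]
            for name, days in events.items()}

# Hypothetical usage: three weekly buckets, one covering the storm week.
cols = event_dummies(
    ["2021-02-08", "2021-02-15", "2021-02-22"],
    {"snowpocalypse": {"2021-02-15"}},
)
```

These columns would then be passed to the model as exogenous regressors (`xreg` in R's forecast package), which is what makes the "file it away in a table of events" habit pay off later.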
Yeah.
One, I want to dig in
a little bit more
to the process of explaining
variance.
And I want to tie that to, you know, one of the topics we've covered on the show recently,
is just talking about tying data to business results.
And I couldn't think of a more direct way of doing that.
You know, that was a huge part of your specific job role was actually explaining variance.
And so you kind of talked through understanding whether a particular variable had a strong correlation with the forecast. But what are some of the lessons that you learned about,
you know, or some of the big takeaways after repeatedly having to explain this variance to people who,
which is often the case, right?
Like if you're a business leader, you're probably like smart and driven.
And, you know, that's why you're successful.
And that's why you have the responsibility of looking at data and making decisions.
But you're not an analyst, right?
And you don't know the ins and outs of the R script
that's, you know, running the query against Redshift.
And you don't even know,
and you shouldn't necessarily know
what all the individual variables are,
you know, that are inputs into the forecast, right?
So what are some of the lessons that you took away
from having to do that repeatedly,
or maybe some of the ways that you grew in that over time? Sure, yeah, definitely. One of the ways that I grew over time —
a couple of things come to mind. One: keeping the narrative as concise and as consistent as possible.
And like I said, I worked for a really smart guy, Reninder.
Reninder had a great way of basically saying,
look, David, whatever drives our volume,
it boils down to one of three things.
Which of the three things is driving movement here?
And yeah, that's a little bit of an oversimplification,
but 80% of the time time that kind of approach worked.
And those are encoded in the model.
Your time series is a function of its own history plus these other three variables that influence it.
So that was one thing, just making the narrative more concise and consistent to consumers of the forecast over
time. And yeah, they had a really hard job. They were the ones that had to go to clients and say,
hey, I need more budget to hire more people. And these are the reasons why. So the more consistent
you can make that narrative, the more they're going to understand it.
When you start adding in all these one-off things, it can just confuse them and make it harder for them to do their jobs.
So that was one thing.
The other thing was understanding who is going to be the downstream consumer of a forecast. Like, is this just, hey, you,
my direct customer, you want to know the latest that I think? Or are you going to, like, take this to
finance and argue for budget? Um, because depending on how you answer that, I'm going to treat this
forecast differently. Like, is this a one-time 30-minute effort, or, like, oh, I need to spend the next week thinking really hard about this?
Yeah, I call that like a fidelity question, right?
Is this like a lo-fi directionally correct or is this like, hey, like you said, we're going to make financial decisions.
And, you know, this is going to be a problem if we're wrong.
Yeah. Do you feel like you, it took some learning to sort of grow? Like,
one thing you said that stuck out to me was, you know, they have a really hard job because they have to go to finance. Is that something you grew in empathy for over time?
You know, or did you sort of know that from the beginning?
No, that was something I definitely grew in empathy for over time, especially once I started
getting dragged to the meetings with finance.
And I'm like, okay, well, here's the guy who made the forecast.
Not the smart business leader — that doesn't matter to them. Yeah, David, explain yourself. Yeah, right. Yes, especially
when finance started holding my feet to the fire. So yeah, I definitely grew in empathy for them over time. And yeah, I think that even tied into
how I forecasted,
there's this trade-off between
explainability and accuracy.
Now, maybe we could have included
another two variables
that might have made the model
2% more accurate.
But then there's two other things
that the business leader has to be able to
basically vouch for and explain to finance.
Yeah, just the simpler that we could keep the model,
the easier it was going to be for everybody to work together.
I've got to ask this question because Amazon's famous, you know,
for the memo.
Oh yeah.
Is that, is that like a whole company thing?
Like how would you prepare for a meeting?
And like, did you use visuals?
Like, did you have, you know, Excel or something?
Like, I'm really curious to some really practical things about like what
looks like. Oh yeah. Yeah, yeah. So prior to working at Amazon, I worked at AT&T Business,
and I don't know if I could have had any sharper of a culture shock, going from,
like, uber-bureaucracy — like, I worked at the corporate headquarters on, like, the 15th floor —
to Amazon. They say it's always day one and they want leaders to be single-threaded owners
of their fate. So even though I worked for a company that had more than a million employees,
it felt like I worked for a company that had 2000 employees because my director, Janelle, she had so much power and moved so nimbly.
Now, yeah, we had dependencies on other organizations, but it felt like I worked for a company of 2,000 given how fast we moved.
So, yeah, and how decentralized decision-making was, right? So, back to your question about the memos —
yeah, we call them PRFAQs, or press release / frequently asked questions. And so I can remember being
frustrated initially. Like, I would have an idea that I wanted to get legs. I think this is a good idea. Let me go talk to some people, maybe write an email about it. And they would say, hey, this is great, but you need to write a doc. And enough people told me that I realized, if I want to get anything done here, I need to write docs. I guess I assimilated into that culture.
And then once I did, I was like, oh, this is amazing.
You know, if you can just take the time to put pen to paper and granted, also get the
right people to read your doc, I mean, you can innovate like Amazon has done. It's not a matter of like,
who makes the prettiest slides or like, is the most compelling public speaker in front of a room.
So yeah, very peculiar thing that Amazon does. And I've, even though I've left Amazon, I continue
writing PR FAQs for different parts of my life.
Yeah, that's impressive. So when you were talking about the culture shift, I think you just listed
two really practical things. One of like, who has the best public speaking skills? Who's the best
PowerPoint slide creator with the best graphic designer behind them, right? Like, versus, you know, the doc, the, you
know, memo, which I would have felt like, sure, that probably makes people, like, flesh ideas out more.
But I didn't at all think about the equalizing factor, right? Yeah. Now, like, there's probably some
difference in writing quality between people, but that stands out less than, like, a professionally really well-done presentation and graphics versus just writing style. Yeah. David, one more question about
Amazon. And then I want to switch gears and talk about, you know, we said at the beginning of the
call, you know, sort of very big to very small and your work with smaller businesses. But before
we go there, I want to ask one last question on, you know, sort of taking data and speaking to business results or business stakeholders.
So I totally agree with you. And that's been a learning for me, like the more concise,
the better, right? And I loved your description of like, you know, accuracy versus explainability, and there's a trade-off there. Did you ever face
a situation where you had enough conviction to say, okay, we actually do need to dig into the
details here because optimizing for explainability would obfuscate something that was really
important? And if so, how did you handle
that with a business stakeholder? Because once you go down that path, that's a really tricky path to
walk. Yeah, that's a great question. Yeah. So I guess I go back to when you're trying to interpret
why a number's moving, it boils down to it's a
black swan event, it's not going to happen again, or it's a recurring theme.
And then you can try to move in response to that recurring thing moving.
But if that happens, then you have to have a forecast for that thing, for that predicting variable.
So there was a point where our demand was moving so much
in response to something we were aware of.
Basically, there was a change in staffing.
But there was so much movement that I don't
think I had to do much arm wrangling because the business really valued accuracy.
I said, we need to start predicting based off of this variable.
We need to make decisions based off of this variable but the
only way this is going to work is if we have a forecast for it moving forward, and I kind of need
to get that from y'all. So there's kind of some shared ownership of the problem and the solution. Yeah. Yeah, I love it.
okay let's switch gears Amazon
you know over a million employees
you're forecasting for all of that gigantic data sets
big decisions the clients
you work with now look very different so
tell us about your average customer
as a consultant doing data and analytics.
Yeah. So my average or median customer has less than 100 employees, more than 10.
And I'll say has been in business for like 20 or 30 years.
So for the vast majority of them, I am their only data resource.
And so you could think of me like a fractional data team for them. So typically, it looks like they have some question that they want to be able to answer better to better serve their customers.
And or they have some really tedious processes that they want automated.
Sometimes those go hand in hand. And so I'll work with them for engagements that sometimes are short of a couple weeks, sometimes a couple months, to build them a solution and get it up and running.
And then I'm available afterwards for little tweaks and enhancements, maintenance.
Yeah. So on the maintenance side, we talked about this a little bit before the show.
I think that's a really tricky problem
in this space, for this SMB space, where
there's tons of value in the automation. When it works,
it's great, but then somebody will change their API or something will happen.
It doesn't mean that you architected anything wrong. You could have made all the, you know,
perfect right decisions. How does that typically work for you? And even do you have like a
philosophical approach to that problem? Yeah, that is a great question. Yeah, I'm thinking about one project,
an automation project,
that kind of hinged on this one API.
In testing, we
tested with a variety
of, I'll just say, organizations
to play nicely with that API
and faced no issues.
And then as soon as the customer tried to start onboarding real clients to use
the solution, all the organizations did not play nicely with the API.
Sounds right.
Yeah, that was lovely.
And I don't know if that's, you know,
completely stopped them from using the solution or just kind of hampered the rollout of the solution i'm not sure but i think that
a way around that problem is to pursue gain sharing models instead of the customer just paying for time and materials or paying a fixed bid. and consultant to see the solution through to the finish line and kind of close the loop
and make sure there are results. And, you know, if the results are not what's expected,
then the consultant is, you know, financially incentivized. I want to, you know, do what I can,
tweak what I can to make sure this delivers results for my customer.
I haven't done any gain sharing models as a consultant. I've been on the other side of the
table as a client, but I think if there were like a silver bullet, that's the one that comes to mind
right now. Interesting. Yeah, that is super interesting.
What, like thinking of gain sharing, what are the types of questions?
You mentioned that, you know, okay, between 10 and 100 employees have been in business
for a couple decades.
What types of questions are they asking?
And how is that?
I'm interested in the contrast between the questions they're asking and the types of questions that Amazon was asking. Yeah, really just any kind of financial metrics
surrounding a customer right now.
How profitable is this customer?
How many resources does it require
to service this customer?
What's the lifetime value of this customer?
And then, of course, for my more SaaS or e-commerce,
for my more startup-oriented customers,
you can probably guess, you know, they want to know ARR and MRR.
The XRR goals.
But yeah, and you know, what, and that can
also start to get into the automation space, like, okay, well, you know,
great. We've built a very robust calculation
of all these sales metrics — you know, what if you
started automating more of your sales
commission process?
Oh, that's a good one.
Yeah, yeah, yeah.
So it's like, yes, the
insight leads to,
like, actual process
change or process automation. And so you
do a lot of that work too?
I don't think I can say a lot, but I've done that work, yeah. The sales commission one, that's a sticky
one, right? Because, like, that is not just about the data plumbing. Like, because people — if you
get into automating something like that, there's going to be so many opinions. People are like, oh, this
is a great time to change it. That's a sticky one.
Yeah.
How are these businesses answering these questions
today or are a lot of them just not
answering it or we're just, you know,
it seems like
on a binary level
like we're not losing money
on this customer, and so that's fine, and we'll just,
we're okay with dealing with that, even if we don't
know the specific margin? Yeah. Yeah. That's a great question. As far as, like, the how,
it makes me think of — I recently saw these higher-ups from Tableau, Power BI and Oracle
showing off their database products. And they said, but the world's preeminent and
favorite BI solution is not on stage right now. And that is Microsoft Excel. And so, yeah,
that was at the Gartner 2024 conference. And I think that's true for, you know,
businesses on almost any scale.
And so,
yeah,
technically for people,
they've got Excel as like their glue between systems.
And what that means practically is they don't have the reporting that they want as often as
they want it. Or maybe they get it on the frequency they want it, but it's riddled with errors, because
with Excel, it's just, it's hard to maintain something scalable. Or, you know, for that one employee that knows how to make that Excel file work, their life is like hell.
Because, you know, like, what if they have a baby or they're going to go on vacation?
Right.
The macro.
Yeah, I was going to say, don't touch the macro.
Yeah.
And then IT, like, blocks the — you can't send the file anymore because it has a macro in
it. Like, yeah, totally. Yeah, you know, yeah. I'll find it's common that customers, they have, you know, they
have their SaaS platforms, and maybe their SaaS platform will give them a figure that they're looking for, like maybe an aggregate.
Maybe, as an aggregate, the SaaS platform gives them MRR,
but they want to be able to get more granular.
They want to slice and dice,
and if they want to ask questions at that kind of grain,
then they might be calling somebody like me for help.
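Getting below the platform's single MRR number usually just means keeping subscription-level rows and grouping them yourself. A toy sketch of that slice-and-dice step (the field names here are hypothetical, not from the show):

```python
def mrr_by(subscriptions, key):
    """Roll subscription-level monthly recurring revenue up by any
    dimension (plan, region, channel...) instead of settling for the
    one aggregate figure a SaaS platform reports."""
    totals = {}
    for sub in subscriptions:
        totals[sub[key]] = totals.get(sub[key], 0.0) + sub["mrr"]
    return totals

# Hypothetical subscription rows pulled from a billing export.
subs = [
    {"plan": "pro",   "region": "us", "mrr": 100.0},
    {"plan": "basic", "region": "eu", "mrr": 20.0},
    {"plan": "pro",   "region": "eu", "mrr": 50.0},
]
```

The same grouping by `"region"` or any other column answers the "which customers drive this number" questions that a single rolled-up MRR can't.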
Well, and you have to do that, right?
Like if you want to get to drivers and improve,
like you can't just have a rolled up number.
Like that just doesn't work
even from a business standpoint.
Yeah.
Sure.
Yeah.
And so, give us just an example walkthrough of a typical client.
Let's say they're 50 employees, and just pick maybe, like, an industry or something that they do. What does a typical engagement look like for you? And one of the things I'd love to hear about is, from a tooling standpoint, if they're using Excel, where do you go from there? You know, SaaS can get really expensive really quickly when you start throwing subscription services at,
you know, the data problems.
So how do you approach that for a typical client?
Yeah, that's a great question.
Yeah, given most of my customers are small, it automatically knocks out a lot of enterprise
solutions just because the minimum contract values are so high for those
solutions, like a Teradata, for example. So I've gotten comfortable with a number of products
that offer freemium, or they offer pay-as-you-go. So one great example of a freemium solution
is Retool. Retool has a free Postgres database. Yeah, I was going to say,
it's Postgres now. Yeah, I think up to, like, four or five users. So I have one customer who had this quarterly reporting they had to give to the state.
It's like it had to get done.
This was not just like, oh, you know, the board wants some shiny metrics.
This had to get done.
And it just made like the end of the quarter hellacious for them to try to do all this Excel wrangling.
So I built them a Retool database solution that really streamlines
and organizes the data entry.
And then getting the reports that they need for the state is,
like, a one- or two-minute exercise.
So Retool has been great.
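A sketch of the kind of report automation described here, using SQLite from the Python standard library as a stand-in for the hosted Postgres database behind the data-entry app. The table, columns, and figures are invented for illustration, not the client's actual schema.

```python
import sqlite3

# Stand-in for the hosted Postgres database behind the data-entry app.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (entered_on TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [
        ("2024-01-15", "grants", 5000.0),
        ("2024-02-10", "fees",   1200.0),
        ("2024-03-05", "grants", 3000.0),
        ("2024-04-01", "fees",    800.0),  # next quarter; excluded below
    ],
)

def quarterly_report(conn, start, end):
    """Aggregate entries in [start, end) by category, ready to file."""
    rows = conn.execute(
        "SELECT category, SUM(amount) FROM entries "
        "WHERE entered_on >= ? AND entered_on < ? "
        "GROUP BY category ORDER BY category",
        (start, end),
    ).fetchall()
    return dict(rows)

print(quarterly_report(conn, "2024-01-01", "2024-04-01"))
# {'fees': 1200.0, 'grants': 8000.0}
```

Once data entry lands in a structured table like this, the end-of-quarter scramble becomes a single query instead of Excel wrangling.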
Um, and then, like I i said pay as you go solutions
my most common data warehouse option of choice is snowflake especially since they offer a one
month trial up front i can tell customers like hey you know based off of your data volume i think
this is what your bill is going to be but you you know, we're going to get one month in, and I can give you a more refined
estimate of how much it's actually going to cost. And then, you know, kind of in a freemium space,
it's great that dbt, they offer one license for dbt cloud., you know, if I'm basically the data team,
then that model works
and we can have a great transformation
slash orchestration,
even some documentation tool.
And then for data viz,
most of these customers are on Microsoft.
And so the cheapest option for them is just to pay an extra 10 bucks per month for Power BI per user.
But if they're not on Microsoft, then there's a lot of other great options where, you know, you just pay by the license, like Tableau.
That's some of the common tooling that I'll use on a project.
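The up-front Snowflake estimate David mentions can be approximated with back-of-envelope arithmetic. The rates and usage figures below are hypothetical placeholders, not Snowflake's actual pricing, which varies by edition and region.

```python
def estimate_monthly_cost(
    storage_tb,
    warehouse_hours_per_month,
    credits_per_hour=1.0,       # e.g. a small warehouse size; assumption
    price_per_credit=2.5,       # placeholder rate, varies by edition/region
    storage_price_per_tb=23.0,  # placeholder compressed-storage rate
):
    """Rough monthly bill: compute credits plus storage."""
    compute = warehouse_hours_per_month * credits_per_hour * price_per_credit
    storage = storage_tb * storage_price_per_tb
    return round(compute + storage, 2)

# A small client: 0.5 TB of data, warehouse running ~2 hours a day.
print(estimate_monthly_cost(storage_tb=0.5, warehouse_hours_per_month=60))
# 161.5
```

The one-month trial then replaces these assumed rates and hours with observed usage, which is why the refined estimate after month one is more trustworthy than any up-front number.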
Yeah, I love it.
I love it.
I love keeping it simple and scalable.
And I think one thing that's interesting about all of the tools you mentioned
is that you don't think about a
like, I don't know, I mean I would say even
maybe this is just my own, not maybe, this is certainly my own bias
being in the data industry
but you don't really think about a
30 person company or 50 person company using Snowflake or DBT necessarily,
especially if they've been around for 30 years and they don't have a data team.
And so it is really cool to sort of see
those tools and David sort of bringing that tool
set to them. Did they end up
taking over management of some of those things? I mean, you
mentioned maintenance, John, but do you eventually do a handoff for someone internally to run those
processes? Yeah, I do a handoff to somebody internally, and I also develop such that I don't need to be a part of the loop for
the show to go on, like, you know, needing David to do some XYZ command once a month or
something like that. Yeah. Yeah. All right. Well, time for like one or two more questions.
John, I'm going to let you take us home because I've been dominating the conversation. Think about what companies were using, okay, maybe more than 15 years ago, like Oracle,
Informatica, like, tools like that. They were not licensed or structured in a way
where a small business could use them at all. Like, a fraction of it? Like, nothing. It was like a
corporate implementation. And it's interesting, so then you had this cloud
computing revolution, right? Yep. But that also opens up this really interesting new thing where you've got this Fortune 500-level,
you know, like, what the top-tier companies are using, where you can actually use it
at a small business, whether you're using kind of a fraction of an instance or you have your
own little, like, micro-instance, basically. Yeah. That's actually a pretty unique time in history where that's even possible. And I mean, the last
five years-ish for data. Like, data was never done this way. If you were a top-tier company, you
bought Oracle, and you paid Oracle more money. Like, when your developers made mistakes and wrote
bad code, you just paid them more money and the databases ran faster,
and you could keep it going forever
until you had equity and it was
their problem.
It's just a really cool
time in history where this
is all possible.
I agree.
David, thoughts?
Yeah, I agree. I really appreciate players that, I'm sure, make most of their money off of enterprise.
Like Snowflake, I'm sure that 80% ... available to you that, we'll say, Apple does. And I think that can act as a great equalizer
for
smaller businesses
I live in a small town
just 30,000 people
and
often the way I try to
explain
what I do
to people
I'll say,
you like baseball?
And they say,
oh yeah, I love baseball. And I'll ask, have you seen the
movie Moneyball or read the book? And most people say, oh yeah. And if they haven't, I'd say, well, in
short, in the early 2000s, the Oakland Athletics had the third-lowest budget in MLB, but they
finished the regular season with the second-best record.
They even had a better record than the New York Yankees, who had a budget three times larger
than theirs. And the way that they did that was they were really scrappy in the way that they
allocated their small budget. They basically used data to make better decisions. And I think that's encouraging for small businesses, particularly in regions like mine in Louisiana,
where we're always losing talent, losing opportunity to our giant neighbor, Texas.
So the solution for us to grow is not just pour more money and resources on it because
we're never going to win that way. We've got to do more with less. And I think data and some of
these solutions that we've been talking about are a way for smaller businesses to do more with less
and win the underdog battle. I love it. Well, David, this has been such a
fun episode talking about insane forecasting at Amazon and the ways that small businesses can use
very similar or many times the same tooling. So it really has been great. Thanks so much
for giving us some of your time today. Yeah, thanks, David. Thanks so much for your time.
We hope you enjoyed this episode of the Data Stack Show.
Be sure to subscribe on your favorite podcast app
to get notified about new episodes every week.
We'd also love your feedback.
You can email me, Eric Dodds, at eric at datastackshow.com.
That's E-R-I-C at datastackshow.com.
The show is brought to you by RudderStack, the CDP for developers.
Learn how to build a CDP on your data warehouse at rudderstack.com.