Y Combinator Startup Podcast - #128 - Michael Babineau and Kevin Hale
Episode Date: May 29, 2019

Michael Babineau is cofounder and CEO of Second Measure. Second Measure analyzes billions of credit card transactions to answer real-time questions on consumer behavior. They were in the Summer 2015 batch of YC and you can check them out at SecondMeasure.com.

Kevin Hale is a Partner at YC. Before working at YC he cofounded Wufoo.

You can find Michael on Twitter @mikebabineau and Kevin is @ilikevests.

The YC Podcast is hosted by Craig Cannon.

Y Combinator invests a small amount of money ($150k) in a large number of startups (recently 200), twice a year.

Learn more about YC and apply for funding here: https://www.ycombinator.com/apply/

***

Topics
00:00 - Intro
00:35 - What idea did Mike apply to YC with?
01:20 - Where did the idea come from?
04:35 - From project to company
10:20 - What info did investors want to know that Second Measure could provide?
12:05 - Their first customers
14:35 - The primary use case of Second Measure for VCs
15:20 - What questions are they trying to answer?
19:35 - Data examples from their blog
21:05 - Post: Fashion retailers have nothing to fear (yet) from the rise of Stitch Fix
23:35 - Post: Holiday sales rocket Peloton memberships ahead of SoulCycle active riders
25:05 - Post: Prime members deliver for Amazon every day
27:35 - Second Measure's product development process
29:35 - Finding good data scientists who work from first principles
37:05 - Why is credit card data so messy?
42:05 - Cleaning data
44:20 - Using their product for competitive analysis
47:35 - Their sales process
49:05 - Raising money from Goldman Sachs and Citi
52:05 - Focusing on a specific problem
54:05 - Keeping the product compelling when it's table stakes

Transcript
Hey, how's it going? This is Craig Cannon, and you're listening to Y Combinator's podcast.
Today's episode is with Michael Babineau and Kevin Hale. Michael is co-founder and CEO of Second Measure.
Second Measure analyzes billions of credit card transactions to answer real-time questions on consumer behavior.
They were in the summer 2015 batch of YC, and you can check them out at SecondMeasure.com.
Kevin is a partner at YC. Before working at YC, he co-founded Wufoo.
You can find Michael on Twitter at @mikebabineau,
and Kevin is @ilikevests.
All right, here we go.
Mike, Kevin was your group partner when you did YC in the summer 2015 batch.
What idea did you apply with?
So our basic idea at the time was really to use credit card data to help investors make better investment decisions.
And that is actually not really far from what we do today.
The main evolution is that now we work with companies as well, not just investors. But I think a big part of the idea is not just to look at credit card data, try to find interesting things, and then tell investors about it, but instead to build an analytics platform, put that in front of investors, and then let them answer their own questions.
And what led you to coming up with that idea?
Oh, that is a good question.
So I don't come from an investing background, and I don't come from finance at all.
I actually worked in video games. And the same is true of my co-founder, Lillian. She and I met at Electronic Arts. We worked together there and then at another gaming startup. Before that, I was in ad tech. We're both software engineers. We've always been in the tech world. But we've got plenty of friends in finance.
And one of those friends, just out of the blue, called me one day. He was like, Mike, I need your help.
I've got two terabytes of data on a hard drive. How do I load this into Excel?
And that was one of those moments where, again, as a software engineer, right?
Yeah.
You know, I get this question.
I'm like, oh, God.
You know, like, why?
Like, why are you asking me this, right?
So he's in New York.
Like, I'm in the Bay Area.
It's the middle of the afternoon.
Why, like, why am I fielding this?
And I wasn't feeling particularly helpful.
I was like, what did your engineers, you know, did you ask your dev team?
Yeah.
Your engineering team?
And I just hear silence.
And then: Mike, what are you talking about?
We've got an IT guy.
And that's it.
And that blew my mind because he was at a $30 billion hedge fund.
And I just assumed that all hedge funds, you know, look like Two Sigma or RenTech, these places that have hundreds of quants and hundreds of engineers.
But in reality, most hedge funds have a handful of analysts and just some back office support, right?
They don't have any coders in house.
I think that's when we realized there's this huge opportunity, because investors make money off of having an information edge, right?
Off of knowing things that other people don't.
And a lot of people who work at hedge funds are very, very clever, right?
They're looking for this edge wherever they can find it.
And over recent years, increasingly they've been looking at things like Google Trends, right, to see, oh, is there some leading indicator in search terms that would indicate some bigger shift in consumer sentiment about, I don't know, some company.
Very unsophisticated sort of analysis.
Yeah. But at the same time, a clever idea, and it oftentimes works, right? You've got investors subscribing to things like ComScore, looking at how many visits to a website are happening, because that is roughly correlated with actual sales. And it's also this nice leading indicator in the sense that public companies only report metrics once a quarter. And it's not right at the end of the quarter. It's actually sometime afterwards. So if you can see how many people visited Amazon.com over the past quarter, then you can look at the full quarter of information and see how well that correlates with the resulting reported performance.
So how'd you go from helping someone with a two-terabyte Excel problem and working on video games to being like, okay, it's now time for us to quit our jobs and solve this problem? Like, what were you doing there at EA? What was your role?
Yeah. So we are not video game programmers, right? We were working at a video game company, but my specialty was building high-scale infrastructure.
And Lillian's specialty is building data pipelines and analytics teams.
And when you look at the video game space, a company like Zynga, I think Zynga epitomizes this, right?
They're very metrics driven, very data driven, right?
One of the things they did very well is optimize.
They optimized the hell out of their games.
So when you think about an online game and you think about what you want to optimize for, right?
Let's just talk about this in terms of fun.
If your game is too hard, no one's going to play it.
And if the game is too easy, no one's going to play it.
So you have to find this balance where the game is not too hard and not too easy.
And if you have an online game, then you have this amazing leg up over games that are, you know, pressed to disc and then shipped out, because you can update them.
And the best way to tell if your game is too hard or too easy is to simply look at how far people make it into the game. So, for instance, you could look at how many players make it from level one to level two. And if there's a severe drop-off, right, if not enough people are doing it, then it's a signal that, hey, maybe we need to tweak this. Of course, the person who needs to answer that question is a game designer. And usually game designers aren't writing SQL. And so you're tracking all these events, right? Like, oh, a player passed level one, or a player died, or whatever. All of these events are being tracked, and this is a standard sort of analytics pipeline, right?
You instrument your application. You have all these events streaming out. You store them somewhere, you do some sort of processing on them, and then you dump them into some place that you can query. But then you've got these people who typically aren't coders, like a game designer or a product manager, and they want to answer questions about how people are behaving in the game. And you basically have two paths at this point, right?
If a game designer says, how many people have made it to level two, then as somebody on the data team, you can say, okay, let me run that report for you.
And then you go and you query it and you put together the results and you send it back.
And then they look and they say, oh, this is great.
How many people made it to level three?
And you roll your eyes and you're like, I see where this is going.
And at that point, you're like, okay, I have a choice.
I can either play this go-between, right, fetching data over and over again.
Or I can build tools, right?
And if I build a tool and hand that tool to the designer and say, here, answer this yourself, then I can focus on doing much cooler and much more interesting things. Also, I'm out of the way now. I'm no longer in the way of this person answering their own questions.
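[Editor's illustration, not part of the conversation.] The level-progression question described here (how many players reach each level, and where the drop-off is) can be sketched in a few lines of Python. The event format, the sample data, and the function names `level_funnel` and `drop_off` are all assumptions made for this example; the transcript does not describe the actual pipeline or schema used at any of the companies mentioned.

```python
from collections import defaultdict

# Hypothetical event log: (player_id, level_completed) pairs, as might be
# emitted by an instrumented game. The data is invented for illustration.
events = [
    ("p1", 1), ("p1", 2), ("p1", 3),
    ("p2", 1), ("p2", 2),
    ("p3", 1),
    ("p4", 1),
]

def level_funnel(events):
    """Return {level: number of distinct players who completed that level}."""
    players_by_level = defaultdict(set)
    for player_id, level in events:
        players_by_level[level].add(player_id)
    return {level: len(players) for level, players in sorted(players_by_level.items())}

def drop_off(funnel):
    """Return {level: share of players from the previous level who did not complete it}."""
    levels = sorted(funnel)
    return {
        nxt: 1 - funnel[nxt] / funnel[prev]
        for prev, nxt in zip(levels, levels[1:])
    }

funnel = level_funnel(events)
print(funnel)
print(drop_off(funnel))
```

A self-serve tool of the kind described would wrap a query like this behind an interface, so that the designer can change the level or the date range without asking the data team.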
This is exactly what we did in the video game space, and this is a pattern that we recognized could be really useful in the investment space, right? You've got all these investment analysts, and they know so much about the companies that they are making investment decisions on, and they know what questions they want to answer. And so you can either put yourself in a position where you're trying to guess at the questions, prewriting reports and trying to sell them reports, or you can just give them some sort of tool that actually empowers them to answer all the crazy questions that they thought they couldn't answer.
So this is the thing that's fascinating. You guys are building tools for understanding how to improve video games. How does that become, all of a sudden, the skill set needed to sell financial and analytics software and insights to people who run hedge funds and investment firms, or even to do corporate competitive tracking?
Because to me, I imagine they're going to ask about, like, what's your background? How do I know?
Like, how does that start?
What made you realize that, like, we could probably do this?
Yeah.
I think it comes down to what is the fundamental problem being solved.
And the fundamental problem is that you've got somebody who probably isn't a coder, and they want to answer a question of behavioral data.
So you're at video games, and then how did you decide what the first product was going to be?
Yeah, I mean, I think for us, it was really digging in further to understand what types of data investors were most interested in.
And what we found is that transactional data, specifically credit card transaction data, is one of the things that they were really excited about, but they were banging their heads against it, right? Fundamentally, credit card transaction data is a messy data set with unstructured data problems baked into it. And the skill sets of investors, even the more technical ones, tend to lean more towards time series analysis as opposed to dealing with large, messy data sets.
What kind of questions were the investors interested in from that data set?
Yeah.
I mean, I think one of the main things is just, how is Chipotle doing, right?
They famously had a food poisoning incident a couple of years ago.
Actually, I think they had several.
But eager investors wanted to know, what is the impact to their actual revenue?
How come there was no way to answer this before you guys came onto the scene?
Yeah. So this is one of the interesting things. There actually was a way to answer it. It was just a terrible path, right? The way to answer this before was with a survey. So you go to some market research company and you say, hey, there's this whole Chipotle food poisoning thing. Can you help me understand how many people stopped going to Chipotle?
And they just have to try to find a bunch of people that match a demographic and then hope these people answer. Are they going to be representative?
Exactly. It takes weeks or months. It costs tens or hundreds of thousands of dollars. You end up with this tiny sample of, you know, oh good, we got a hundred respondents, and from this pool of a hundred, like 20 of them said that they have considered stopping their Chipotle dining altogether.
And then what do you guys do instead?
So for us, because we have direct observations of millions of U.S. consumers, we see all the purchases, right? We can just look it up right away. In fact, we don't even need to look it up. We can just give you a tool, and then you can find the answer yourself.
And so you got your first customers during YC, correct? How did you go about even getting them?
Yeah, that's a good question. I have to think back.
So this is 2015. Our very first customer was a VC who also ended up investing in us.
And this is one of the things where I feel like we got to cheat a little bit, because we were in YC, and VCs are always excited to talk to YC companies.
I mean, they're trying to figure out who is in the batch and then try to invest before Demo Day.
That's exactly right. And so we had a whole bunch of these funny meetings where we're trying to get in front of them to pitch them on a product, and they're happy to take the meeting because they want to hear about what we do. And so it ends up being this dual-purpose thing where they're like, okay, show me the product. Now tell me your business model. And you're like, well, would you like to buy the product? And fortunately, a lot of times the answer did end up being yes. And now most of the VCs here in the Bay Area are our customers, but it was really interesting navigating those early conversations.
What were they excited about?
Because with credit card data, there are some things that it's really good at showing and identifying and some things that it's not so good at.
So, for example, it tends to be great for predicting consumer trends.
Yeah, I mean, I think you just have to keep in mind, what is it we're actually seeing?
And what we're seeing is spending for a large proportion of U.S. consumers.
And so if you want to understand a company that doesn't target consumers, if it doesn't target specifically U.S. consumers, and more specifically, if it doesn't sell things directly to them, then we're not going to see it, right? Like, we're not going to see General Mills, right?
That's all sold through grocery stores.
But if it's something that you might see on your credit card statement, then those are the things that we can help with.
Like Uber, Lyft.
Exactly.
All the meal kits, like Gobble, et cetera.
And then what you're not going to see is like B2B enterprise companies, et cetera.
It tends to be that lots of people are interested in consumer stuff because it's the fastest growing, most interesting segment.
Exactly. The market is more than big enough.
And so are they using it as a market sizing tool?
Not as much, because if you're investing in a seed stage company...
Yeah. So probably the primary use case among VCs is actually diligence, right? Put yourself in the shoes of a venture capitalist. Some company walks in and they throw some numbers on some slides. They show it to you. And you're like, okay, great, I have lots of follow-on questions. Do I try to get the numbers from you? And then additionally, there's a whole bunch of questions I have about your market, which you may not even know the answers to. So a good example of this: actually, let's just talk about Bird and Lime, right? Imagine you're a VC. Bird comes in and pitches you, and they show you this chart, and it's the perfect hockey stick chart. You're like, this is amazing, I've never seen growth like this before. At the same time, though, you've heard of other companies. Lime is out there. You've heard of Jump Bikes.
You want to pick the best one.
Exactly. Are you talking to number one? Are you talking to number two? And also, fundamentally, if Bird is showing good unit economics, is that best in class, or could it be even better? This is one of the key areas where we help VCs: giving them visibility not just into the company they're talking to, but into their competitors, right? Into every company in that space in relation to one another.
So we can say, oh yeah, Bird, Lime.
Here's where Bird's winning.
Here's where Lime's winning, right?
Here are the differences in how well those customers perform, how much they spend, and so on and so forth.
But when you say unit economics, how do you uncover that data?
So we don't see unit economics.
Sorry.
So obviously, we don't see the cost side.
We just see the spending side.
Right. So you could say an average Bird customer spends $40 a week versus a Lime customer that might be 20.
Exactly. And generally, again, if you're a VC, you have your own ideas for how to estimate the cost side of the equation.
Okay. Gotcha.
What other metrics are you able to show? I was always impressed when looking into the dashboard inside of Second Measure, like, wow, I can not just see how much revenue is being pulled in,
but also things like cohorts, lifetime value, et cetera.
And so what metrics get investors super excited?
Yeah.
I mean, I guess taking a step back, let's think about what the main problems are that we're trying to solve.
So one is generally focused on company performance, right?
And this includes things like competitive intelligence and benchmarking, right?
Like, show me, I don't know, what is the relative market share of the various meal kit players?
How long do their customers stick around, right?
How much do they spend over time, right?
What are the lifetime sales after 12 months?
And again, if we split those into different cohorts, are newer cohorts performing better or worse than older cohorts?
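[Editor's illustration, not part of the conversation.] The cohort question raised here (lifetime sales after 12 months, split by when customers first purchased) can be sketched as follows. The transaction format, the sample data, and the function names are hypothetical; this is a minimal sketch of one common way to define cohorts (by calendar month of first purchase), not a description of how Second Measure computes the metric.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount_usd).
transactions = [
    ("a", date(2018, 1, 5), 60.0),
    ("a", date(2018, 3, 9), 60.0),
    ("b", date(2018, 1, 20), 40.0),
    ("c", date(2018, 2, 2), 50.0),
    ("c", date(2018, 2, 25), 50.0),
]

def months_between(start, end):
    """Whole calendar months from start to end (ignores day of month)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def cohort_lifetime_sales(transactions, horizon_months=12):
    """Average spend per customer within horizon_months of their first
    purchase, keyed by the (year, month) of that first purchase."""
    first_purchase = {}
    for cust, day, _ in sorted(transactions, key=lambda t: t[1]):
        first_purchase.setdefault(cust, day)  # earliest date wins

    spend = defaultdict(float)
    for cust, day, amount in transactions:
        if months_between(first_purchase[cust], day) < horizon_months:
            spend[cust] += amount

    by_cohort = defaultdict(list)
    for cust, total in spend.items():
        start = first_purchase[cust]
        by_cohort[(start.year, start.month)].append(total)
    return {cohort: sum(v) / len(v) for cohort, v in sorted(by_cohort.items())}

print(cohort_lifetime_sales(transactions))
```

Comparing the values across cohort keys is what answers "are newer cohorts performing better or worse than older cohorts?", with the caveat that recent cohorts have not yet had a full 12 months to spend.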
So there's all of these things in and around company performance, and then separately, there's stuff around consumer behavior, right?
And these are things like, where else do my customers shop?
Things intended to help you get a better picture of who your customers are and really help you hone in on who your best customers are.
And I'm saying you, but really, it could be you.
It could be your competitor.
It could be a company you're doing diligence on, you know, some target company.
What are some good examples of that?
Because your blog is basically just this, right?
It's just insights.
Yeah, yeah.
It's interesting, right?
Because our core product is really about empowerment and saying, hey, you as a user can answer whatever questions you want within this space of U.S. consumer spending.
But we don't sell research.
Oh, so you don't answer questions for people directly?
So we'll do it on a project-by-project basis,
but we're not the ones coming up with the questions, right?
If somebody comes to us and says, I have this specific question, I tried this in your application, I can't quite answer it yet, could it be answered?
Those are cases where we can do it as a one-off research project.
And those are paid projects, but we don't publish those.
The thing we don't do is proactively do research and go out and, you know, call up 10 of our clients and try to sell it to them.
Gotcha.
What's some stuff that you guys have put on the blog recently that's your favorite?
Yeah.
So actually, if we talk about our blog, we also need to talk about our press mentions.
We actually work with the press a whole ton, right?
And so we keep getting quoted in, like, the Wall Street Journal, Financial Times, et cetera.
And I mean, this has been great for us.
It's great for the reporters too, because they're trying to write about, like, the upcoming potential Lyft IPO or whatever, and they want to support their reporting with more information, and we can help provide them with that information. We're happy to do so.
The Uber/Lyft thing is a recurring topic, and so in our blog we've decided, you know what, we're just going to keep periodically publishing updates on that.
So when you choose a question you want to ask about, like, Uber versus Lyft, have you guys come up with the initial questions, and now you listen to what the press are asking you that they want to verify? Or is it always you guys coming up with them?
So I'd say it is always us coming up with it.
We actually have a dedicated editorial team.
Gotcha.
So we literally have a team of data scientists and writers who just pay attention to what's going on in the news, what's going on with companies that could potentially be interesting to others.
The person who runs it has a journalistic background. I mean, this is their core focus, right? Find interesting things to write about and then write about them.
So let's talk about some examples. Before we started recording, one you mentioned was Stitch Fix and where the customers of Stitch Fix do and do not spend.
Yeah. So this is a really interesting thing, right? Part of understanding what people are asking is just going out and talking to people. And one recurring question we heard about Stitch Fix was, is Stitch Fix cannibalizing department store sales, right? Are they competitive with department stores? And so we decided to dig in. We had no idea what the answer was. We attacked the problem by basically saying, okay, let's look at people's spending at department stores before and after they become a Stitch Fix customer.
And what we found is that Stitch Fix had no impact on department store spend.
People just started spending more on clothes, period.
Right.
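[Editor's illustration, not part of the conversation.] The before-and-after comparison described here can be sketched as below. Everything in it is an assumption made for illustration: the signup dates, the purchase records, the 90-day window, and the function name `spend_before_after`. It is a naive comparison that does not control for seasonality or overall spending trends, which a real analysis would need to address, and it is not a description of Second Measure's actual methodology.

```python
from datetime import date

# Hypothetical data: each customer's subscription signup date, and their
# department store purchases as (customer_id, purchase_date, amount_usd).
signup = {"a": date(2018, 6, 1), "b": date(2018, 7, 1)}
dept_store_purchases = [
    ("a", date(2018, 4, 10), 100.0),
    ("a", date(2018, 8, 15), 120.0),
    ("b", date(2018, 5, 3), 80.0),
    ("b", date(2018, 9, 1), 80.0),
]

def spend_before_after(signup, purchases, window_days=90):
    """Average per-customer spend in the window before and after signup."""
    before = after = 0.0
    for cust, day, amount in purchases:
        delta = (day - signup[cust]).days
        if -window_days <= delta < 0:
            before += amount
        elif 0 <= delta < window_days:
            after += amount
    n = len(signup)
    return before / n, after / n

avg_before, avg_after = spend_before_after(signup, dept_store_purchases)
print(avg_before, avg_after)
```

If the service were cannibalizing department store sales, the "after" average would be expected to fall below the "before" average for the same customers.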
And in fact, Stitch Fix's best customers actually spent even more on clothes after becoming a Stitch Fix customer than before.
Oh, like Stitch Fix inspired them to go out and find more clothes or to buy more.
One way to characterize it is that it piques their interest in fashion, and so they don't spend any less.
But part of it is, it probably jump-starts a variety. They're like, oh, I'm introduced to a variety of stuff I never would have considered beforehand, and now when I'm out there in the real world looking at stuff, there are more things that might appeal to me because I've been exposed to them.
Yeah. And the key thing is that it's not displacing the spend, right? I mean, that was a real surprise. And it's also a really important question to answer, because if you're at a department store and you're trying to figure out, is Stitch Fix friend or foe, right? This really points more to friend.
So do you actively track the rise and fall of brands? Because I'm wondering, there must be instances of certain things being swapped out. A recent post was Peloton memberships going up ahead of SoulCycle, right? So that's really interesting. Are there trades happening that you could follow?
So sorry, when you say trades, do you mean people, you know, sign up for Peloton instead of SoulCycle?
Yeah.
So, I mean, again, this is something we will attack from an editorial perspective.
But again, our core business is about putting a product in front of our clients through which they can answer their own questions.
Now, on the blog side, yeah, I mean, the Peloton and SoulCycle story is super interesting. Peloton is a beast. And actually, after we came out with this article, SoulCycle basically had a nice non-denial denial, where they basically said, we don't know what they're talking about, our numbers are great, but didn't actually dispute the metrics.
To give some context, what did your blog post say? And then what was it that SoulCycle was nervous about?
I mean, the short version is that Peloton has now surpassed SoulCycle in terms of the number of active members, right? And this is based on spending behavior. Active Peloton members on a monthly basis have surpassed the number of SoulCycle active riders on a monthly basis.
Is there an overlap, like a Venn diagram of people who used to be SoulCycle and they've switched to Peloton?
There is. There's both a current overlap and there's, you know, the Sankey diagram type thing of people who used to be one and are now another.
Well, have you been following how Amazon Basics has developed their products?
I am generally familiar with it. I'd say for us, that is not something that we have a lot of visibility into, because at the end of the day, we just see Amazon.
General Amazon.
Exactly.
But you've done some research about Amazon Prime people.
Yes.
Yeah, we did.
So this is a case where we did a much deeper dive, and we actually gave several talks on this.
And this was spearheaded again by our editorial team.
One of our data scientists, Brandon, dug into Amazon's customer base.
And specifically, he wanted to understand the differences in behavior between Amazon Prime members and non-Prime members, how that's changed over time, and really how important Amazon Prime members are to Amazon.
And I think one of the interesting takeaways is that increasingly Amazon is looking more and more like a subscription business.
They're increasingly reliant on Amazon Prime customers for their revenue.
And then another interesting thing is that people who became Amazon Prime subscribers, even if they lapse, right, even if they are no longer subscribers, are still spending more on Amazon after than they did before.
How do you get to that conclusion? What was the evidence that showed that, oh, Amazon is more focused on subscribers?
Like, how do you guys get to that conclusion?
I would characterize it as, it's not that they're more focused on subscribers, but instead that an increasing proportion of their revenue derives from people who are Amazon subscribers.
I got you.
So it's one of these things where it's turning out that Amazon's most valuable revenue stream comes from the Amazon Prime subscribers.
Yes.
And we don't know the reason why, but there are obvious things that people can speculate on. For example, hey, they already pay for this membership, so they might as well use it when ordering and buying stuff. So it's like an excuse to have something delivered to your house versus going to the store, because I'm already paying for the membership. It's like a sunk cost thing.
And so when it comes to product development on your side, are you incorporating this data in any way, or is it just talking to your users and developing product from there?
Yeah, so when we think about improving a product, we have a few different streams for really feeding the backlog. One is internally driven, right? It's based on where we know we want to take our application. It also factors in us going out and proactively speaking with our own customers, doing that user research and really digging into their use cases, and then figuring out where the gaps are and attacking those. That's one. Another is, I mentioned earlier that we do some custom research for customers.
Think of it as a professional services approach.
This is something that also helps feed our backlog, because if we see recurring requests, then this is probably something we should add to our product.
And then finally, we have the editorial side, which for us is the best form of dogfooding, right?
We can go in and try to use our app to answer a question. If we find that we hit a wall, right, it's like, well, we've dug as far as we can go, and now we have to go to the data behind it to answer the question, that's a great signal that this is something we should probably build.
One thing that's interesting to me is that I feel like we just recently talked
to Jake Klamka at Insight Data Science.
And I feel like, for data scientists hearing about your company, this seems like a dream job.
I work on interesting problems and questions,
and even if it's with your editorial board
that's figuring that stuff out,
it seems fascinating to me,
as in, oh, every problem is going to be kind of different.
We put that out there,
whether it's solving stuff for your customers
or stuff that promotes the company.
Like, how do you look at finding...
because you guys are hiring right now, right?
Yes.
Like, how do you find a good data scientist?
Like, what are traits that you're looking for
so that you know it's going to be a good fit for this kind of, like,
nebulous work? Yeah. It's such a good question. I feel like data scientist is such an overloaded
and, I think, a bit overused, like an overstretched term. I think for us, specifically what we're
looking for are people who are like scientists with a capital S who have very strong quantitative
backgrounds and can understand from first principles like the problems that they're trying to
solve. I think very frequently what you find are, you know, people interested in data
science, you know, they learn a lot of the tools, but maybe skip over the fundamentals.
When you say, like, are able to think from first principles, I think this is something I hear
as a common theme also for people who are looking for good engineers or product managers, etc.
Like, what does that mean exactly? Yeah. So let's think about it this way.
So a third of our company have PhDs, right?
We have, we're basically equally, so most of the team is technical.
How big is the team?
So we're 60 people today.
And most, so most of the team is technical.
And it's about a, you know, 50/50 split between engineers and data scientists.
Now, on the data side, what you'll find is that we have people, you know, with backgrounds
ranging from statistical genetics to cognitive neuroscience to string theory to
earth science to, you know, climate science, like really all over the place.
And the common theme, though, is that all of them are extremely good in statistics, right?
So there's sort of this statistical foundation that, you know, in our
opinion, everything is built on top of. And it's our view that if you come in with
that strong, like, you know, mathy foundation, the tools can
be taught, right? We're happy to help people get onboarded with,
like, using Python. Like, okay, cool, you've only used R. Like, that's fine, right? We can help
you, like, learn to switch over to Jupyter notebooks. But the thing that we're not going to teach you
is how to do math. Mm-hmm. And then how does that translate,
like, into the first principles? Because I usually think of it as, like, someone who's willing to
challenge. Like, I will give someone a task, and
sometimes they will come back and say, like, actually, can we just dive out?
I was like, what's the reason behind this task?
And maybe just be able to be like, oh, actually, I think I can improve the question we need to be looking into instead.
Yeah, I think this, a lot of this ties in with like the nature of the types of problems we're trying to solve.
Right?
You can't, like, there's no, there's no like, I don't know, playbook of best practices for dealing with the problems associated with transactional data.
Right?
There's no playbook on building an analytics platform focused on consumer spending behavior.
A lot of the things that we're doing, you know,
we're doing them for the first time.
And in some cases, maybe they are simply being done for the first time.
So it's something where we benefit from people who, you know, who can approach these like big,
nebulous and open-ended problems and come in and figure out how to structure and decompose the problem.
And then tackle it piece by piece.
So do you train for that or you just hope that they have it?
Like, what is the test is my question, really?
Yeah.
Because I mean, because it's really just like, here's a problem.
But then before you get overwhelmed by the problem, because often you're told like,
hey, you have to take route A or B.
Usually there's options like C through infinity, right?
And so you have to ask why.
And so how do you, whether it's through interviews or training, get that out of employees?
Yeah.
For us, I mean, I think of this
as less something that we train people to do and more something that we hire for,
like we screen for in the hiring process. So we've taken great care in designing and actually
iterating on our interview process. And I'd say that there is significant technical evaluation
where we're trying to test for exactly these types of things. For data scientists, one of the
things that we do is we actually, you know, give them a big messy data set. And we say,
do some, like, it's open-ended. Do some research. Tell us what you're, and then present it to us.
Right. Tell us what you were looking for and tell us what you found. What's some common mistakes
that, like, people do that end up not working out so well? And what's some stuff that the really
great employees and applicants have been able to do? I know I'm trying to help people, like, cheat on
your... I'd say, like, the number one mistake that people
make is that they, you know, they assume too much of the data. They assume the data is perfect, right?
They assume that what we give them, you know, that like, oh, like, this is easy. All I have to do is just,
like, you know, load it into whatever, like into pandas or load it into, like, throw it on a database and
just start running queries, get the answers, and then throw it into a slide and be done with it.
Like, that never really works, because this just isn't how data in our
world works. Like, there are always dragons somewhere. And so a big part of this exercise
is, like, well, you know, how diligent were you in looking for dragons, right, and anticipating
these problems? And then, you know, you don't necessarily need to solve all of them, but you need to be
aware of them because they actually can distort your findings. And so as long as you, like,
if you identify them and even if you have findings that are invalid, but you're able to identify
that, you know, hey, like, I found this thing, but I made this, like, I deliberately made this
assumption, the simplifying assumption so I could complete it in a reasonable amount of time.
Like, that's fine. So the good people, what they're good at is like not starting from their own
assumption, but actually trying to query and figure out what were the assumptions that I'm
working with.
Yeah, exactly.
Whether it's in the data, the question, et cetera.
And so once you have that, it helps you understand as like, how strong or how weak is my
ultimate conclusion going to be as a result.
Yeah.
I mean, it's like, it's sort of like building a house, right?
If you, if you were to hire a construction crew to come out and build a house and they just
came and they just like came out on site and they just started like erecting walls and then,
you know, they hand over the keys.
you slam the front door, the whole thing falls over because it was on a shaky foundation,
right?
Then, like, clearly they failed.
And so for us, you know, what we like is to find people who really like to understand
the foundation that they're working with to make sure that it will be sound when they build
the house.
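The "looking for dragons" step Mike describes can be sketched in a few lines. This is only an illustration under stated assumptions: the field names (`card_id`, `date`, `description`, `amount`), the sample rows, and the particular checks are hypothetical and are not taken from Second Measure's interview exercise. The point is simply to count known problems before trusting any result.

```python
from collections import Counter

def audit_transactions(rows):
    """Count common data problems instead of assuming the data is clean.

    The checks here are illustrative assumptions, not an exhaustive list.
    """
    issues = Counter()
    seen = set()
    for row in rows:
        key = (row.get("card_id"), row.get("date"),
               row.get("description"), row.get("amount"))
        if key in seen:
            issues["exact_duplicate"] += 1  # same row appears more than once
        seen.add(key)
        if not row.get("description"):
            issues["missing_description"] += 1
        amount = row.get("amount")
        if amount is None:
            issues["missing_amount"] += 1
        elif amount <= 0:
            issues["non_positive_amount"] += 1  # refunds or bad records
    return dict(issues)

# Hypothetical sample rows, invented for this sketch.
sample = [
    {"card_id": 1, "date": "2019-01-02", "description": "S BUCKS 1234", "amount": 4.50},
    {"card_id": 1, "date": "2019-01-02", "description": "S BUCKS 1234", "amount": 4.50},
    {"card_id": 2, "date": "2019-01-03", "description": "", "amount": 12.00},
    {"card_id": 3, "date": "2019-01-04", "description": "MW SAN CARLOS", "amount": -80.00},
    {"card_id": 4, "date": "2019-01-05", "description": "UBER", "amount": None},
]
report = audit_transactions(sample)
print(report)
```

A report like this does not fix anything by itself, but it makes any simplifying assumption explicit, which is the behavior Mike says they screen for.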
So I've never done a project involving credit card data.
But then I use these tools like Mint, and it consistently classifies things
as the wrong thing, right?
Can you explain to me why this stuff is
not normalized? Because it seems like incredibly valuable, potentially not that difficult.
Obviously, it is difficult.
But like, why isn't it normalized?
Why do you have to clean it all?
Yeah.
So I think, I guess, you know, maybe the easiest place to start is like, think about your,
think about your last credit card statement, right?
Like, think about a time where you've looked at your credit card statement and you saw a
transaction on there and it says something like S bucks or like, I don't know, like MW
San Carlos, which would be like Men's Wearhouse San Carlos. It doesn't say Men's Wearhouse. It doesn't
say Starbucks, right? It says something which, if you like squint at it and you scratch your head a little
bit, like you as a human can probably figure out what it is. Now, the problem is that
there are many different companies all, you know, putting in, you know, some piece...
actually the fundamental problem here is that some human decided how to represent
that store in a credit card statement.
And they're working within this constraint of a limited space, right?
They only have a certain number of characters and they have to type something in,
which, again, communicates to a human that, like, yeah, you were at Walmart.
Yeah.
So you don't dispute the charge.
But it was never designed for a machine to read.
And so, like, the result of this is that
you end up with this cardinality problem, right?
You end up with many different variants for a single merchant.
And part of our job is to find all the variants and to map them back to that singular merchant.
But so you're saying there are multiple text strings associated with Men's Wearhouse in San Carlos or San Jose or whatever.
Yeah.
So within our data set, we have, so we're looking at,
like 50 plus billion transactions, we have one billion unique transaction descriptions.
And I'll tell you what, there are not one billion merchants in the U.S.
Right.
Okay.
So like Macy's alone has like three million different representations.
Yeah, I'm just like kind of baffled that it was never like, hey, Macy's, your store number 1,200, whatever.
So there are basically two layers of problems.
So one is the human layer, right, where you've got a human and they're setting up the point of sale, you know, system, like the swiping device for, you know, a certain Macy's store.
Let's actually, let's just talk about McDonald's for a second.
So McDonald's, you've got franchises.
So when somebody sets up their franchise, you know, they work with like a point of sale provider and they get their point of sale set up.
And like, okay, well, you know, what should this be?
It should be like McDonald's, I don't know, like
F139. Okay, great. Right. Now we've got this one location. The problem is, depending on how the transaction is processed, the apostrophe that you expected to appear in McDonald's could be a space, it could be a star. It could be deleted, right? Could just be, you know, McDonalds, nothing. Right? And like, basically, the two problems, you know, one is a human one where different humans could describe things differently. They can even
typo the name of their own company, which happens.
And then the second problem is there are like various perturbations that can take place
in the processing chain.
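The variant problem described here (an apostrophe turning into a space, a star, or nothing, plus a store code tacked on) can be illustrated with a small normalization sketch. Everything in it is an assumption made for illustration: the `CANONICAL` alias table, the regular expressions, and the sample descriptors are hypothetical, and nothing in the conversation says this is how Second Measure's pipeline does it.

```python
import re

# Hypothetical alias table: normalized key -> canonical merchant name.
CANONICAL = {
    "MCDONALDS": "McDonald's",
    "MACYS": "Macy's",
}

def normalize(description):
    """Collapse a raw statement descriptor into a comparison key."""
    text = description.upper()
    text = re.sub(r"[^A-Z0-9 ]", "", text)      # drop apostrophes, stars, other punctuation
    text = re.sub(r"\b[A-Z]?\d+\b", "", text)   # drop store codes such as F139 or 1200
    return re.sub(r"\s+", "", text)             # drop all remaining whitespace

def resolve(description):
    """Return the canonical merchant name, or None if the key is unknown."""
    return CANONICAL.get(normalize(description))

# Hypothetical descriptor variants for one franchise location.
variants = ["McDonald's F139", "MCDONALD S F139", "MCDONALD*S F139", "MCDONALDS F139"]
print({v: resolve(v) for v in variants})
print(resolve("Macy's 1200"))
print(resolve("UNITED 0042"))
```

String cleanup of this kind only handles punctuation and store-code noise. Typos would need fuzzy matching, and two different merchants that share a name would still collide, so more context than the descriptor alone is needed.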
I think part of it was, like, the corrections had to happen by users of Mint.
Yeah.
And I think humans don't want to correct that data.
No.
Diligently.
And also if it turns out, it's like, oh, I can see
a human getting really frustrated where it's like, this is the 50th time I had to correct that
this is coming from McDonald's. And therefore, like, I no longer want to correct this anymore
because, like, this is just not any good. And so the problem actually is like, oh, all of them
are so different. And so humans are giving up on the classification when really it's like,
this is actually more complicated. I have like such limited incentive to classify my end.
I don't really care. I mean, I'm sure some people do, but I don't really care how much I spend
on food. The problem gets even worse, right? Sometimes I don't want to know. It's like I need to sit in
that like fast food denial. Yeah. If Amazon was all classified in one category, that would not be good.
So, I mean, like this, you know, if you're coming into this with, I don't know, like a software
mindset, right, you're thinking like, oh, yeah, there should be some unique identifier
for Blue Apron, right? But if you actually just look at all the Blue Apron transactions,
what you're going to find out is that, you know, there's actually more than one Blue Apron.
Did you know that there's a Blue Apron grocery store? That's very close. It's in Brooklyn.
Yeah, like things like that or like United, like United Airlines, of course, but then there's also a United grocery store.
And they show up, in some cases, they show up the exact same on your credit card statement.
How much time are you guys spending cleaning up data?
Is it like perpetual and nonstop?
So we don't think of it as, like, a fundamentally human process.
There are human elements of it, but I mean, really it's something where we, you know, try to use machine-based approaches
to, like, operate as a giant lever.
I guess we think of it this way, right?
We've basically had to build two different products.
So one is this pipeline, which ingests raw transactional data
and then outputs something useful.
And like, you know, the things that we do in that process are things
like this entity resolution, which is what we've just been talking about with merchants.
But it also includes, like, you know, figuring out, for an Uber transaction,
it says San Francisco.
It always says San Francisco.
But obviously not all Uber rides are, like, in this city.
Oh, so looking at other transactions around it to see, like, oh, maybe this originated somewhere else.
Exactly.
So we figure out the location of the purchaser based on where their other purchases are.
And that lets us like fill in the gap.
So we say like, oh, you know what?
Ignore this location for Uber and instead, you know, use this computed location.
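The "computed location" idea can be sketched as follows. This is a minimal illustration, assuming each transaction already carries a resolved `merchant` and a reported `city`; the `UNRELIABLE_LOCATION` set, the field names, and the "most frequent city" rule are hypothetical choices for the sketch, not a description of the actual pipeline.

```python
from collections import Counter, defaultdict

# Hypothetical: merchants whose reported city does not reflect where the purchase happened.
UNRELIABLE_LOCATION = {"Uber"}

def impute_locations(transactions):
    """Replace unreliable cities with the cardholder's most frequent other city."""
    cities_by_card = defaultdict(Counter)
    for t in transactions:
        if t["merchant"] not in UNRELIABLE_LOCATION:
            cities_by_card[t["card_id"]][t["city"]] += 1

    result = []
    for t in transactions:
        fixed_row = dict(t)  # copy so the input rows are left untouched
        counts = cities_by_card.get(t["card_id"])
        if t["merchant"] in UNRELIABLE_LOCATION and counts:
            fixed_row["city"] = counts.most_common(1)[0][0]
        result.append(fixed_row)
    return result

# Hypothetical sample data.
sample = [
    {"card_id": 7, "merchant": "Starbucks", "city": "Chicago"},
    {"card_id": 7, "merchant": "Walgreens", "city": "Chicago"},
    {"card_id": 7, "merchant": "Macy's", "city": "New York"},
    {"card_id": 7, "merchant": "Uber", "city": "San Francisco"},
    {"card_id": 8, "merchant": "Uber", "city": "San Francisco"},
]
fixed = impute_locations(sample)
for row in fixed:
    print(row)
```

In this sketch a card with no other purchases keeps its reported city, since there is nothing to infer from. A real system would presumably also weigh how recent and how close in time the other purchases are.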
There are other things that we need to solve.
And then there's this whole other thing around debiasing, right?
Because we basically have this longitudinal study going on, right?
We have this panel, the panel of consumers.
And obviously, it's not going to be a perfectly representative sample of the U.S.
So we endeavor to figure out all the ways in which it isn't representative and then apply corrections to make sure that, you know, whatever results you get do represent the greater population.
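Mike doesn't say how the corrections are applied. One common technique for this kind of panel correction is post-stratification weighting, sketched below as an assumption rather than as Second Measure's method. The group labels, population shares, and spend figures are invented for illustration.

```python
from collections import Counter

def poststratification_weights(panel_groups, population_share):
    """Weight each group by (population share) / (panel share)."""
    counts = Counter(panel_groups)
    total = len(panel_groups)
    return {g: population_share[g] / (counts[g] / total) for g in counts}

def weighted_mean(values, groups, weights):
    """Mean of values where each panelist counts by their group's weight."""
    numerator = sum(v * weights[g] for v, g in zip(values, groups))
    denominator = sum(weights[g] for g in groups)
    return numerator / denominator

# Hypothetical panel: the west is over-represented relative to the population.
groups = ["west", "west", "west", "east"]
spend = [10.0, 10.0, 10.0, 30.0]
population_share = {"west": 0.5, "east": 0.5}  # assumed true shares

weights = poststratification_weights(groups, population_share)
raw = sum(spend) / len(spend)
adjusted = weighted_mean(spend, groups, weights)
print(raw, adjusted)
```

Here the under-represented group is weighted up, so the adjusted average moves toward what a representative sample would show. In practice the groups would be defined on several dimensions at once, and finding the right dimensions is the hard part Mike alludes to.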
So anyway, so that's one thing that we're building is this pipeline.
And we've got 10 to 15 people working on that.
But then we also have our analytics platform, right?
This is the, think of it as the hyper-specialized Tableau where, you know, we've built in lots of different analyses that operate on this nice, clean data set that the pipeline has output.
One increasingly growing set of customers for you guys is, like, corporations doing this for, like, sort of,
I guess, competitive analysis?
Yeah.
How did that come up?
And so like why is that?
I mean, I can see why it would be interesting to them.
But I'm just wondering,
are they looking at questions very differently when they're looking at your platform to answer them?
Yeah.
I think this is a really interesting journey for us because,
you know,
we started out building a platform that was focused on helping investors
understand company performance, right?
And YC,
you know,
hammers in that you need to, like, focus, focus, right? That it's better to have
something a small number of people love than something that many people just like. And we took that,
like, you know, we really took that to heart, and we didn't want to work with companies for a long
time because we were afraid that it would spread out our focus. One of the things that
changed our thinking was this: there's a book from Clayton Christensen. So he's a professor
at HBS and he wrote The Innovator's Dilemma.
More recently, he published a book called Competing Against Luck.
And in it, he talks about the theory of jobs to be done.
And like the basic premise is that when you're thinking about, you know,
substitutes for your product, you shouldn't be thinking about things that just look similar
to your product.
Instead, you should be thinking about, fundamentally, what is the job that your
customer is hiring your product to do, right?
And this, I guess, changed the way we thought about focus because, you know, this whole time we've been thinking like, oh, investors, investors, investors.
But in truth, there are many different use cases for investors, right?
A fundamental discretionary hedge fund, right?
Like, think of it as a group of analysts who are, you know, working in Excel and trying to figure out, like, you know, is Stitch Fix
poised for growth in the longer term?
Like, they have a very different use case from a quant investor who's focused, like, someone
who has a purely systematic strategy and is trying to trade, you know, on a daily, weekly
or even like, like, just quarter to quarter based on where they think companies are likely
to beat or miss relative to expectation.
Right?
These are different use cases.
Now, if we think about one of our core use cases
as being, uh, helping people understand company performance, then that's when we began to
understand, like, okay, well, investors want to know how companies are performing, but so do other
companies, right? Companies want to know how their, uh, competitors are doing.
And, um, we had a really convenient way into this because we were working with so many VCs.
They were actually bringing our product into the boardroom. You know, they were showing like they
were showing their portfolio companies. And then the CEO would raise their hand and say, like,
wait, how do I get that? It's an interesting sales strategy. Yeah, I think, like, maybe you
could speak to that a little bit more because there are so many YC companies. And oftentimes people
just think, like, YC is just consumer. Very much not true. YC is just software, also not true.
How do you guys think about your sales process? Yeah. I mean, this is, this is an area of
focus for us now. We were very, very fortunate to have just a ton of, I mean, really, like,
a ton of virality, which is like a funny thing to talk about in the context of really enterprise
sales. But we actually haven't done any outbound sales yet. We have 150 clients. Every single
one of them came to us through inbound, right? They basically, you know, somebody signed up and
then they told their friend about us. Their friend reached out,
loved what they saw, signed up, told their friends, and so on.
I mean, it's a box of secrets.
Yeah.
And so to me, it's just like, hey, I have this thing and it lets me see stuff
that I've never been able to see before.
And so, like, that's a very remarkable thing that's easy to spread around.
Yeah, exactly.
Like, yeah, everyone knows that, you know, Uber's bigger than Lyft, but, like, how much? We can
actually quantify it.
And I think that's, like, it's a lot of fun.
And for certain people, right, it sort of unlocks, like, a new way of doing their job.
And so it's become, like, table stakes, and that's been great for us. But now, like,
you know, we just raised our Series A. So that was led by Bessemer and co-led by Goldman Sachs.
And then we also had participation from Citi. The Citibank? Correct. Goldman Sachs and Citi, that's such
interesting partners or investors to be leading a round. Why were they super excited especially? Yeah. I think,
So we fall into, so I'd say that the reasons are different for each.
So we fall into this general category.
When you're talking about the investment world, we fall into this category of companies,
generally known as like alternative data companies.
So alternative data basically refers to anything that can, any information that can help you
understand how companies are performing, that isn't just the traditional reported fundamentals
or like stock prices or things like that. So this collectively, it's referring to credit card
data, satellite imagery, web traffic data, geolocation, like data for mobile devices and so on.
Goldman Sachs has made a big push into the alternative data space. And, you know,
they had not made an investment in any company dealing with credit card data.
And so we're like, you know, we're their horse in that race, if you will.
Awesome.
And they've been just phenomenal.
I think, like, here in the Bay Area, there's so much of, like, you know,
everybody's focused on working with, you know, big traditional VCs.
But I think, you know, we've actually had tremendous success working with sort of, like,
I don't know, less expected players, I guess, out here.
So our seed round was actually led by Jefferies, another investment bank.
And one thing that we found to be true for both Jefferies and Goldman is that they are
extraordinarily well connected, you know, like in New York City, on the East Coast, with not just
investors, but also with companies, right, because they're investment banks.
So they've been just tremendous in terms of helping us get in front of more, you know, more of the types of, you know, clients we want.
Now, for Citi, of course, they have a ton of transactional data.
And like, this is a pain point that they feel internally.
Like all the things that I described about messy transactional data, they understand.
It seems odd to me that they wouldn't have a handle on this
already themselves.
So it's a really,
really hard problem.
Like,
I can't understate that enough.
Like, why are they so bad?
Why is everyone else so bad?
It's not,
I wouldn't say that it's,
that everyone else is so bad.
I think it's just that,
you're so good.
Or their other products are so profitable.
Yeah.
I think it's that people are focused on solving specific problems.
And so, like,
I wouldn't say that Mint is terrible
at, like, understanding transactions, right?
They're good at different things because they're focused on solving
a different problem, right?
Like, Mint.com is not trying to... like, they're trying to solve the problem of, you know what,
we need a best guess as to what this transaction is, but we need to do it for all the transactions,
right?
Like, we flip that problem upside down.
We say, you know what?
We don't care about most transactions.
We only care about the, you know, 5,000 or so companies that we track and growing, right?
We care about that and we can't be wrong.
Because if we're wrong, somebody's going to lose millions of dollars.
So the constraints actually help make it much easier as a result of not having to focus on everything.
Exactly.
It makes the problem tractable.
And because we're focused on that, like, what we're discovering is that there are surprisingly interesting applications of this thing that we built for this, like, hyper-specific use case.
You know, suddenly we're finding out that, like, oh, this could, you know, this could help, you know, this type of company.
I don't know, find, find new customers, right?
Like, it's a company that sells to other businesses
and they want to find fast-growing businesses
so they can sell to them.
This is, I think this has been one of the interesting parts
about our journey is discovering, like, really by accident,
you know, all of these additional use cases
that we really didn't anticipate.
One thing that's tricky,
and it's probably one of these, like, great problems
to have as a company,
is that if you're, like, people's secret weapon
and it becomes table stakes
to be like, hey, if we want to stay ahead of the game
I have to... like, Bloomberg is a good example.
It's like, oh, I have to sign up for Bloomberg
if I'm a trader to use this.
And I think Second Measure might easily come
into that category as well for a lot of investors.
I feel like the tricky part is then
like if all of a sudden now everyone is using us,
like how do you develop the product?
Like how do you keep it interesting?
Yeah, so...
Keep people on board.
versus like jumping ship or trying to find some other solution.
Yeah, I mean, this is a really, really good point in particular for the investment audience, right?
Because investors are looking, like they make money off of information edge.
They make money off of knowing things that other people don't.
And this actually informed a lot about how we tackled this problem because we could have very easily focused on selling
quote insights or quote signal to hedge funds, right?
Where we say like, oh, here are the most interesting, I don't know, like trading signals
and we send those out.
But as we add more and more customers, then, you know, the value to each one becomes
significantly diluted.
And so, you know, we took the view that, in particular because there's no
single owner of transactional data,
there's no way to, like, control how many people have access to it.
Why not just assume everybody's going to have access to it one day and then focus on
building a tool to help people, you know, answer more creative questions, right?
And our view is that even if everybody has access to the same data, that if they simply focus
on asking better questions, they'll still find their own edge.
Now, that's for the investment community, though.
On the corporate side, I mean, really, the fact that
somebody else uses the product doesn't... I think that would be delightful. It's like every major
corporate company is like, we have to use this for competitive analysis. I mean, like, if the
worst case scenario was you were Bloomberg, you'd be okay. Yeah. I think Bloomberg's doing just
fine. Yeah. Right. All right. Awesome, Mike. Thanks for coming in. Oh, definitely. Thank you.
All right. Thanks for listening. So as always, you can find the transcript and the video at
blog.ycombinator.com. And if you have a second, it would be awesome to give us a rating and review
wherever you find your podcasts. See you next time.
