The Data Stack Show - Shop Talk: Will the Future of the Customer Data Platform Include a Shared Logic Layer?
Episode Date: October 17, 2022
In this bonus episode, Eric and Kostas talk shop around the wide world of data. ...
Transcript
Eric: Welcome to the Data Stack Show Shop Talk, where Kostas and I talk shop.
I know it's an extremely creative name.
And this week, Kostas, I think it's your turn to bring the question, which I have not heard in advance.
So what's been going through your head?
Kostas: Yeah.
I'd like to ask you, Eric, about your thoughts regarding the future of the customer data platform.
Wow.
Yeah.
I want to know what's going to happen.
You are the perfect person for that.
Okay, I'm going to respond.
Initially, I'm going to respond with a question.
Okay.
Do you have a definition in your mind for customer data platform?
Like, what does that mean to you?
Or is that part of your question to me?
I mean, I have something in my mind.
Do you want me to share that?
Yeah.
Like your way to deflect my question?
No, I'm happy to answer, but I'm trying to use the Socratic method because it seems very appropriate.
Yeah.
Yeah.
I think when we are talking about a customer data platform, we are talking about some kind of data infrastructure that has two main components.
One is, let's say, more of a database system that can manipulate, with a good developer experience, the data that is usually associated with user activities, because that data has some unique dimensions, some unique ways of working with it, and some very specific processing that you want to do on it.
So that's one thing, which is the more technical side of things, the more data engineering side of things. And then you have anything that has to do with, okay, sure, we do that, then what?
How do we use this data? How do you expose this data to the right audience, and how can the marketing teams work with it, how can the product teams work with it?
And that's a completely different, let's say, functionality.
I don't think that CDPs in the general case need to implement both to be considered a CDP, in my mind, at least.
But the company...
Yeah, different flavors of the CDP.
Yeah, but at the end, I think that a company will need both sets of functionalities if they want to extract value out of the customer data.
So that's what I have in my mind when I'm thinking about CDP.
Yeah.
Yeah, that's a great question. And it's something I think about a lot, obviously. At RudderStack, we're building products that sort of directly answer that question. But I'll try as much as possible to put my objective hat on and answer as a user of these products, because I've used a lot of them over the years and I'm fairly familiar.
So I think there are a couple of things. I think we're already at the very early stages of the first phase, which is moving to some sort of central data store as your primary repository for all of this data. And I say we're really early because there are still tons of customer data platforms that are doing really well as businesses, and that are really good products, that manage this customer profile store in their own systems, right? Whatever that is, they basically have their own set of databases, and they store a copy of the data and they operate on it and can do a bunch of different stuff. But I think increasingly we will see that customer profile store move to some sort of data store owned by the company that's trying to collect all their customer data.
Currently, the data warehouse seems to be the predominant pattern that you see there, which makes a lot of sense. I mean, there are a lot of really great things about data warehouses, but I think we also have to keep in mind the sheer practicality of it. They were already there. They were already being used for a lot of different things. And so literally from a convenience standpoint, it's just the path of least resistance, right? That's the initial tool that is used for that. You're probably already replicating your product database into your warehouse to do some sort of analytics, pulling transactional data in from this system or that system, right? So if you're trying to collect all of your data in one place, you've already started that process in the warehouse. And so it doesn't make sense to try to create something really different.
I think that trend will continue. I actually don't know about the... It'll be interesting to see what happens in the future. I think that probably has a lot of legs, but I also think we'll see interesting different kinds of architectures emerge around enabling that, that aren't necessarily just the data lake or just the data warehouse.
But conceptually, I think that makes a ton of sense.
And I would say, as a user, I'm a big fan of that just because it gives you ultimate flexibility, which is really nice. That tends to be the challenge with packaged products, at least with the ones that I've used before. And actually most SaaS is like this intentionally, right? You build your products to cover a set of use cases for a set of users.
And even then, though, it is a bell curve, right? You can cover this primary set of use cases for a certain number of users. And when you get out to the edges, it's really hard to serve all of those particular use cases. When it comes to data especially, like what you can do with your customer data, manipulating it, all that sort of stuff, I think having it on your own data store, you can almost think about it as flattening the bell curve a little bit in terms of the things that you can do. Calculations, I mean, whatever, right? It's your data warehouse, so the sky's sort of the limit.
I also think that one of the big things is, okay, this new architecture comes in, and you have tools that are enabling that. And that's sort of the David and Goliath story. But I absolutely think we'll start to see some of the really, really big players move down market towards that architecture, right? So you have the Salesforces, the Adobes, the Oracles, right? I think Microsoft even. All these gigantic companies have some version of a large enterprise CDP.
And what's really interesting when you think about that is that a lot of these companies also have data stores themselves. I mean, Google has a marketing cloud themselves, with much more emphasis on advertising, but they also have data stores, right?
And so it is pretty interesting to think about those companies also moving towards that architecture. It's a lot harder for them because they've spent decades building products that are not on that architecture, but that's really interesting to think about.
So here's another one. That, I think, is probably pretty established. I think for a lot of people that's not a huge revelation.
Here's a thought that I have that I think will be really interesting to see play out, and that maybe isn't discussed as much.
I actually think we'll see a lot of the logic around acting on the data move further down in the stack.
You know, so you said, you collect all this data. Great. What do you actually do with it?
Marketing needs to go drive more page views, create more leads or whatever, right? Customer success needs to upgrade accounts, mitigate tickets, blah, blah, blah. Product needs to optimize towards feature adoption, et cetera, et cetera.
And one of the challenges you have there is that you have a huge amount of business logic, even if your data is not siloed anymore because you have it all in a centralized data store.
So let's assume you have all of your customer data, and let's say you're activating this data, you're getting it out to all your downstream tools, et cetera, right? I mean, those are pipelines that exist. Those patterns are available and they're not rocket science. There's nothing novel there.
But even still, you have a ton of logic that lives in these downstream tools.
So for marketing, you say, if these certain conditions are met, then a status changes, right? Which is actually a pretty big deal. That's kind of an event, but also kind of a user trait. Status changes are really interesting, and that happens a ton in sales CRM software or whatever. In product, for example, you have a bunch of logic around participation in experiments or stage of onboarding, all that sort of stuff. And then the classic is business logic silos in analytics, right? Where you have a bunch of complex logic that lives in reports and stuff like that, which is probably the lowest level in the stack.
And so that actually creates a lot of challenges, because sharing logic across downstream tools, especially across teams, is very challenging. Right?
I think it's become a lot easier to solve the data silo problem, but business logic being siloed is pretty difficult.
Kostas: So do you think that, let's say, with tools like Marketo or Braze or whatever, we are going towards a future where more of the business logic that is being implemented there is going to move to the data warehouse, and these tools will become thinner in terms of the functionality that they have, and they will be more into the distribution and the marketing execution side only?
Great question.
I think that, I mean, the honest answer is I don't know exactly what this will look like, but in short, my answer would be yes.
I think that in a way, those downstream tools like Marketo will get thinner. Now, I mean, Marketo is an Adobe product, right? When you think about the primary use cases that drive revenue for those companies, this change isn't going to happen quickly, especially for those gigantic incumbents, right? We're talking about decades and decades of established ways of working, processes, and ways of doing business logic.
And unwinding that is no small thing. But if you remove all of those practical barriers and think about having a shared logic layer, it really makes a lot of sense, right? Because you can have multiple downstream tools access that logic layer, which means that they get thinner from a logic standpoint. And ultimately, they become more of a last-mile mechanism as opposed to a keeper of business logic.
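A minimal sketch of the shared logic layer idea, assuming a made-up rule and made-up field names (`plan`, `sessions_last_30d`, the threshold of 10); none of these come from the episode. The point is only that the rule is defined once and both downstream "tools" stay thin:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical customer traits, for illustration only.
    email: str
    plan: str
    sessions_last_30d: int


def is_expansion_candidate(customer: Customer) -> bool:
    """The single, shared definition of the business rule (threshold is invented)."""
    return customer.plan == "free" and customer.sessions_last_30d >= 10


def marketing_payload(customers: list[Customer]) -> list[str]:
    """What a thin marketing tool might receive: just the emails to target."""
    return [c.email for c in customers if is_expansion_candidate(c)]


def sales_payload(customers: list[Customer]) -> list[dict]:
    """What a thin CRM might receive: a status change per matching account."""
    return [
        {"email": c.email, "status": "expansion_candidate"}
        for c in customers
        if is_expansion_candidate(c)
    ]


customers = [
    Customer("a@example.com", "free", 14),
    Customer("b@example.com", "free", 2),
    Customer("c@example.com", "paid", 30),
]
```

Because both payload functions call the same predicate, changing the rule in one place changes it for every downstream consumer, which is the "last-mile mechanism" framing in a nutshell.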
I don't know if it's the warehouse. It almost has to be a layer on top of the warehouse.
You know, is it something that Snowflake will build into their ecosystem? I don't know.
I think in a way they do already, with this whole thing of data applications. I think that's what it's actually targeting, or it's a good use case for that. Like, how you can build...
Yeah, it's a great point.
...the business logic from these kinds of tools.
Because if you think about it, in a way that's already happening, right? Like, what happened with reverse ETL is that part of the work that you would do on the downstream application, like creating a new audience, right? Now you do it on the data warehouse. You just have to create the new audience and put the data there, and the audience is created.
You don't have to do any querying or filtering or whatever of the data in Marketo or whatever tool you use to create the audience, right? It's already done in the data warehouse.
Now, can you do this with all the complex cases that you have there, with the signals and all that stuff that you talked about, with reverse ETL?
Probably not.
But I think that the database (not the data warehouse, okay, just to make it more generic) has pretty much at least 99% of the expressivity that is required...
Totally.
...for these kinds of jobs, specifically for the stuff that you're doing with marketing, right?
Yeah.
At least from the use cases that I have seen.
So instead of pulling data out, enriching the data, and pushing it back to the downstream application so the marketer can go there and at the end make a filter to filter down the audience based on some attributes...
Yep.
Why not do that in the data warehouse, right?
It's a complete waste of resources and time, going back and forth to do all that stuff.
So that's what I'm wondering.
And I'd love to learn more about what is missing, let's say, from the Databricks and Snowflakes of the world out there to make this happen.
We are not there yet, right?
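A minimal sketch of the "build the audience in the warehouse" idea, using an in-memory SQLite database as a stand-in for a real warehouse. The table names, columns, and the "trial users with at least three events" rule are all hypothetical, chosen only for illustration:

```python
import sqlite3


def build_audience(conn: sqlite3.Connection) -> list[tuple]:
    """Materialize the audience as a table, so downstream tools only read it."""
    conn.execute("DROP TABLE IF EXISTS audience_trial_power_users")
    conn.execute(
        """
        CREATE TABLE audience_trial_power_users AS
        SELECT u.user_id, u.email
        FROM users AS u
        JOIN events AS e ON e.user_id = u.user_id
        WHERE u.plan = 'trial'
        GROUP BY u.user_id, u.email
        HAVING COUNT(*) >= 3
        """
    )
    return conn.execute(
        "SELECT email FROM audience_trial_power_users ORDER BY email"
    ).fetchall()


# Tiny, made-up data set standing in for warehouse tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT, plan TEXT);
    CREATE TABLE events (user_id INTEGER, name TEXT);
    INSERT INTO users VALUES
        (1, 'a@example.com', 'trial'),
        (2, 'b@example.com', 'trial'),
        (3, 'c@example.com', 'paid');
    INSERT INTO events VALUES
        (1, 'login'), (1, 'login'), (1, 'export'),
        (2, 'login'),
        (3, 'login'), (3, 'login'), (3, 'login'), (3, 'export'), (3, 'export');
    """
)
audience = build_audience(conn)
```

Here the filtering and aggregation happen once, in SQL, and the marketing tool would only need to receive the finished `audience_trial_power_users` table rather than re-implement the filter.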
This is what I think is interesting. I agree 100% that the functionality, from an expressivity standpoint, is definitely there.
When I said, is the data warehouse the place? I guess what I meant is, it's more of a question around the interface, which we've actually talked about on a couple of recent shows, right? What does the actual experience of the end user look like? Now, we were talking about developer experience and how they interface with whatever.
And that is a really tricky question, right? Because one of the big reasons that a lot of this logic has remained in these downstream tools is because it is really easy for those downstream teams like marketing to interact with an interface that ultimately produces some sort of logic without being super technical.
To your point, though, it is becoming increasingly technical, right? And the place that a lot of those base operations are done is increasingly becoming some sort of database, right?
And yeah, it is interesting to think about. I mean, take reverse ETL as an example, which is a weird example because companies have been doing this pattern for years and years, right? It's not necessarily something new. Really, what reverse ETL is, is just an interface that makes it easier to interact with a database and then sort of orchestrate it, right? Scheduled jobs, all that sort of stuff, you don't have to do that manually. It really is just a basic interface layer for pipelines that have existed for a long time.
There are two things that come to mind that I think are really helpful examples of patterns. We'll use reverse ETL as one, and then I'll actually use Great Expectations as the other example, because they show this dynamic happening in two directions.
So if you think about reverse ETL, one interesting pattern there is that ultimately what's happening is that you're building logic in some sort of interface, and then it's SQL at this point, right? Like, okay, I want to get some data from the warehouse and pull it into Salesforce or Marketo or whatever, right?
And so I use this interface to interact with it.
And ultimately what it's doing is producing SQL under the hood that runs an operation and executes that or whatever.
So that's interesting, right? You have an interface that's producing logic as code underneath the hood.
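A minimal sketch of an interface producing logic as code under the hood: a small config object, the kind a UI form might fill in, is compiled into a parameterized SQL query. `SyncConfig`, `Filter`, and `compile_sync` are hypothetical names invented for this example and do not describe how any particular reverse ETL product works:

```python
from dataclasses import dataclass, field

# Only these comparison operators may be chosen in the hypothetical UI.
ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}


@dataclass
class Filter:
    column: str
    op: str
    value: object


@dataclass
class SyncConfig:
    table: str
    columns: list[str]
    filters: list[Filter] = field(default_factory=list)


def compile_sync(config: SyncConfig) -> tuple[str, list]:
    """Turn the UI-style config into a SQL string plus bound parameters."""
    names = [config.table, *config.columns, *(f.column for f in config.filters)]
    for name in names:
        # Table and column names cannot be bound as parameters, so restrict them.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")

    clauses, params = [], []
    for f in config.filters:
        if f.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {f.op!r}")
        clauses.append(f"{f.column} {f.op} ?")
        params.append(f.value)

    sql = f"SELECT {', '.join(config.columns)} FROM {config.table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


config = SyncConfig(
    table="customers",
    columns=["email", "plan"],
    filters=[Filter("plan", "=", "trial"), Filter("sessions_last_30d", ">=", 10)],
)
sql, params = compile_sync(config)
# sql    -> "SELECT email, plan FROM customers WHERE plan = ? AND sessions_last_30d >= ?"
# params -> ["trial", 10]
```

The person filling in the form never writes SQL, but the artifact that actually runs against the warehouse is still code that can be reviewed and versioned.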
Great Expectations is interesting because it's the same concept, but in the other direction, right? You define data definitions in Great Expectations. I don't know if you remember this, but it automatically produces human-readable documentation for those definitions, even though it's essentially a Python library, right? And so you're making definitions in a Python library using Python, and it literally produces human-readable documentation.
So those two patterns, I think, are really instructive for thinking about potentially the ways that this could look in the future. But, you know, I don't have a crystal ball, so we'll have to see.
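A minimal sketch of the "other direction", where definitions written in code produce human-readable documentation. This deliberately does not use the Great Expectations library or imitate its API; `Expectation`, `not_null`, `between`, `render_docs`, and `validate` are hypothetical helpers written only to illustrate the pattern:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Expectation:
    column: str
    description: str                 # the human-readable half
    check: Callable[[object], bool]  # the executable half


def not_null(column: str) -> Expectation:
    return Expectation(column, f"`{column}` must not be null.", lambda v: v is not None)


def between(column: str, low: float, high: float) -> Expectation:
    return Expectation(
        column,
        f"`{column}` must be between {low} and {high}.",
        lambda v: v is not None and low <= v <= high,
    )


def render_docs(expectations: list[Expectation]) -> str:
    """Generate plain-text documentation from the same objects used for checks."""
    return "\n".join(f"- {e.description}" for e in expectations)


def validate(rows: list[dict], expectations: list[Expectation]) -> list[tuple]:
    """Return (row index, description) for every failed expectation."""
    return [
        (i, e.description)
        for i, row in enumerate(rows)
        for e in expectations
        if not e.check(row.get(e.column))
    ]


expectations = [not_null("email"), between("age", 0, 120)]
rows = [
    {"email": "a@example.com", "age": 30},
    {"email": None, "age": 200},
]
docs = render_docs(expectations)
failures = validate(rows, expectations)
```

The same definitions drive both the checks and the documentation, so a non-technical reader can see what the rules are without reading the Python.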
We'll see. Indeed, indeed. Is that the buzzer? That went by really fast. Did I talk too much?
Oh, we both talked too much. It's okay. That's why we are doing this.
Indeed. All right. Well, thank you for joining us for Shop Talk.
And we'll have more good tantalizing conversation for you on the next one.