The Data Stack Show - Shop Talk: Will the Future of the Customer Data Platform Include a Shared Logic Layer?

Episode Date: October 17, 2022

In this bonus episode, Eric and Kostas talk shop around the wide world of data. ...

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show Shop Talk, where Kostas and I talk shop. I know it's an extremely creative name. And this week, Kostas, I think it's your turn to bring the question, which I have not heard in advance. So what's been going through your head? Yeah. I'd like to ask you, Eric, about your thoughts regarding the future of the customer data platform. Wow. Yeah.
Starting point is 00:00:41 I want to know what's going to happen. You are the perfect person for that. Okay, I'm going to respond. Initially, I'm going to respond with a question. Okay. Do you have a definition in your mind for customer data platform? Like, what does that mean to you? Or is that part of your question to me?
Starting point is 00:01:05 I mean, I have something in my mind. Do you want me to share that? Yeah. Like your way to deflect my question? No, I'm happy to, I'm happy to answer, but I'm trying to use the Socratic method because it seems very appropriate. Yeah. Yeah.
Starting point is 00:01:22 I think when we are talking about a customer data platform, I think we are talking about some kind of data infrastructure that has like two main components. One is, let's say, more of like a database system that can manipulate, with a good developer experience, the data that is usually associated with user activities, because it has some unique dimensions or some unique ways of working with it and some very specific processing things that you want to do on it. So that's one thing, which is the more, let's say, technical side of things,
Starting point is 00:02:07 and the more like data engineering side of things, and then you have anything that has to do with like, okay, yeah, sure, like we do that, then what? Like, how do we use this data? And like, how do you, I don't know, like expose this data to the right audience, and how the marketing teams can work with that, how the product teams can work with that. And that's a completely different, let's say, functionality. I don't think that CDPs in the general case need to implement both to be considered like a CDP, in my mind, at least.
Starting point is 00:02:42 But the company... Yeah, different flavors of the CDP. Yeah, but at the end, I think that a company will need both sets of functionalities if they want to extract value out of the customer data. So that's what I have in my mind when I'm thinking about CDP. Yeah. Yeah, that's a great question. And something I think about a lot, obviously, at RudderStack, we're building products that sort of directly answer that question. But I'll try as much as possible to put my objective hat on and answer as a user of these products, because I've used a lot of them over the years and I'm fairly familiar. So I think there are a couple of things. I think we're already at the very early stages of the
Starting point is 00:03:33 first phase, which is moving to some sort of central data store as your primary repository for all of this data. And I say we're really early because there are still tons of customer data platforms that are doing really well, you know, as businesses and that are really good products that manage this customer profile store in their own, you know, in their own systems, right? Whatever that is, right? But they basically have their own set of databases
Starting point is 00:04:09 and they store a copy of the data and they operate on it and can do a bunch of different stuff. But I think increasingly that we will see that customer profile store move to some sort of data store owned by the company who's trying to collect all their customer data or whatever.
Starting point is 00:04:35 Currently, the data warehouse seems to be the predominant pattern that you see there, which makes a lot of sense because I think, you know, I mean, there are a lot of really great things about data warehouses, but I think we also have to keep in mind the sheer practicality of it. They were already there. They were already being used for a
Starting point is 00:04:57 lot of different things. And so literally from a convenience standpoint, it just is sort of the path of least resistance, right? That's the initial tool that is kind of used for that, right? You probably already have, whatever, you're like replicating your product database into your warehouse to do some sort of analytics, you're, you know, whatever, pulling transactional data in from this system or that system, right? So it's like, okay, well, if you're trying to collect all of your data in one place, you've already started that process in the warehouse. And so it doesn't make sense to try to create something really different. I think that trend will continue. I actually don't know about the...
Starting point is 00:05:35 It'll be interesting to see what happens in the future. I think that probably has a lot of legs in the future, but I also think we'll see interesting different kinds of architectures emerge around enabling that, that aren't necessarily just the data lake or just the data warehouse. But conceptually, I think that makes a ton of sense. And I would say, as a user, I'm a big fan of that, just because it sort of gives you ultimate flexibility, which is really nice. And that tends to be the challenge of packaged products, you know, at least with the ones that I've used before. And actually, most SaaS is like this intentionally, right? You build your products to cover a set of use cases for a set of users. And even then, though, it is a bell curve, right?
Starting point is 00:06:33 You can cover this primary set of use cases for a certain number of users. And when you get out to the edges, it's really hard to serve all of those particular use cases. When it comes to data especially, like what you can do with your customer data, manipulating it, all that sort of stuff, I think having it on your own data store, you can almost think about it as sort of flattening the bell curve a little bit in terms of the things that you can do. Calculations, I mean, whatever, right? I mean, it's your data warehouse, so the sky's sort of the limit. I also think that, you know, one of the big things is like, okay, this new architecture comes in, you have tools that are enabling that. And that's sort of the David and Goliath story. But I absolutely think we'll start to see some of the really,
Starting point is 00:07:18 really big players, like, move down market towards that architecture, right? So you have like the Salesforces, the Adobes, you know, like the Oracles, right? All these gigantic companies, I think Microsoft even, right? They all have some version of like a large enterprise CDP. And what's really interesting when you think about that is that a lot of these companies also, I mean, Google has a marketing cloud themselves, with much more emphasis on advertising, but they also have like data stores themselves. Right.
Starting point is 00:07:55 And so it is pretty interesting to think about those companies also moving towards that architecture. It's a lot harder for them because their entire companies are sort of, they've spent decades building products that are not on that architecture, but that's really interesting to think about. So here's another one. So that I think is probably pretty established. I think a lot of people can like, you know, that's not like a huge revelation. Here's a thought that I have that I think will be really interesting to see play out. Maybe isn't discussed as much. I actually think we'll see a lot of the logic around like acting on the data move further down in the stack.
Starting point is 00:08:46 You know, so you said, well, what are you, like, you collect all this data. Great. Like, what do you actually do with it? You know, marketing needs to go drive more page views, create more leads or whatever, right? Customer success needs to,
Starting point is 00:09:01 you know, whatever, upgrade accounts, mitigate tickets, blah, blah, blah. Product needs to, like, optimize towards feature adoption, et cetera, et cetera. And one of the challenges you have there is that you have a huge amount of business logic that is, even if your data is not siloed anymore, right, because you like have it all in a centralized data store.
Starting point is 00:09:27 So let's assume you have all of your customer data and whatever, right? You're, and let's say you're like activating this data, you know, you're getting it out to all your downstream tools, et cetera, right? I mean, those are pipelines that exist. Like that's not, those patterns are like available and they're not rocket science, right?
Starting point is 00:09:43 There's nothing novel there. But even still, you have a ton of logic that lives in these downstream tools. So for marketing, you say like, if these certain conditions are met, then like a status changes, right? Which is actually a pretty big deal. That's like kind of an event, but also kind of a user trait. Like, status changes are really interesting. That happens a ton in like sales CRM software or whatever. You know, in product, for example, you have a bunch of logic around like participation in experiments or stage of onboarding, you know, all that sort of stuff. You even have a ton, I mean, the classic business logic silo is in
Starting point is 00:10:26 analytics, right? Where you have a bunch of complex logic that lives in reports and stuff like that, which is probably like the lowest level in the stack. And so that actually creates a lot of challenges because sharing logic across downstream tools, especially across teams, is very challenging. Right? I think it's become a lot easier to solve the data silo problem, but the business logic being siloed is pretty difficult. So do you think that, let's say, tools like Marketo or
Starting point is 00:11:01 Braze or whatever, do you think that we are going towards like a future where more of, let's say, the business logic that is being implemented there is going to move to the data warehouse and these tools will become like thinner in terms of like the functionality that they have and they will be more into the distribution and like the marketing execution sides only? Great question. I think that, I mean, the honest answer is I don't know exactly what this will look like, but in short, my answer would be yes.
Starting point is 00:11:40 I think that in a way, those downstream tools like Marketo will get thinner. Now, I mean, Marketo is like an Adobe product, right? I mean, you know, when we think about the primary use cases that drive revenue for those companies, like this change isn't going to happen quickly, especially for those gigantic incumbents, right? We're talking about decades and decades of established ways of working and processes and ways of doing business logic. And unwinding that is no small thing. If you remove all of those sort of like practical barriers, if you think about having a shared logic layer, it really makes a lot of sense, right? Because you can have multiple downstream tools access that logic layer, which means that they get thinner from like a logic standpoint. And ultimately, they kind of become like more of a last mile mechanism as opposed to like a keeper of business logic. I don't know if the warehouse, it almost has to be like a layer on top of the warehouse. I don't know if the, like, it's, you know, is it something that Snowflake will build, like, into their ecosystem?
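A minimal sketch of the shared logic layer idea described above: the business rule for a customer's status is defined once, and each downstream tool only reshapes the result. Everything here is hypothetical and for illustration only, including the `Customer` fields, the thresholds in `lifecycle_status`, and the two payload shapes; none of it reflects a real Marketo, Braze, or Snowflake API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    # Hypothetical customer traits, as they might be stored centrally.
    email: str
    plan: str
    open_tickets: int
    sessions_last_30d: int


def lifecycle_status(c: Customer) -> str:
    """Single shared definition of a customer's status (made-up rules)."""
    if c.plan == "free" and c.sessions_last_30d >= 10:
        return "upgrade_candidate"
    if c.open_tickets >= 3:
        return "at_risk"
    return "healthy"


# Thin "last mile" adapters: they hold no business rules of their own and
# only reshape the shared result for one (hypothetical) downstream tool.
def to_marketing_payload(c: Customer) -> dict:
    return {"email": c.email, "segment": lifecycle_status(c)}


def to_support_payload(c: Customer) -> dict:
    priority = "high" if lifecycle_status(c) == "at_risk" else "normal"
    return {"contact": c.email, "priority": priority}


customer = Customer("ada@example.com", plan="free", open_tickets=0, sessions_last_30d=14)
print(to_marketing_payload(customer))
print(to_support_payload(customer))
```

Changing a threshold in `lifecycle_status` changes what both tools receive, which is the sense in which the downstream tools get thinner.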
Starting point is 00:13:08 Like, I don't know. I think in a way they do already, like, this whole thing of, like, data applications. I think that's what it's, like, actually targeting, or it's a good use case for that. Like, how you can build... Yeah, it's a great point. ...the business logic from these kinds of tools. Because if you think about it, in a way that's already happening, right? Like, if something happened with reverse ETL, it's that part of the work that you would do on this
Starting point is 00:13:43 downstream application, like creating a new audience, right? Now you do it on the data warehouse and you just have to create a new audience and put the data there and the audience is created. You don't have to do any querying or filtering or whatever of the data in Marketo or whatever tool you use to create the audience, right? Like it's already done in the data warehouse. Now, can you do this with all the complex cases that you have there, with the signals and all that stuff that you talked about, with reverse ETL? Probably not.
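As an illustration of creating the audience in the warehouse rather than filtering in the downstream tool, here is a small sketch. It assumes an in-memory SQLite database standing in for the data warehouse, and the table names, columns, sample rows, and the `audience_free_pricing_viewers` view are all invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for the data warehouse (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (email TEXT PRIMARY KEY, plan TEXT, country TEXT);
CREATE TABLE events (email TEXT, event TEXT);

INSERT INTO customers VALUES
  ('ada@example.com', 'free', 'US'),
  ('bob@example.com', 'pro',  'US'),
  ('cy@example.com',  'free', 'DE');
INSERT INTO events VALUES
  ('ada@example.com', 'viewed_pricing'),
  ('bob@example.com', 'viewed_pricing'),
  ('cy@example.com',  'signed_up');

-- The audience is defined once, next to the data, as a view.
CREATE VIEW audience_free_pricing_viewers AS
SELECT DISTINCT c.email
FROM customers AS c
JOIN events AS e ON e.email = c.email
WHERE c.plan = 'free' AND e.event = 'viewed_pricing';
""")

# A sync job would only read the finished audience and push it downstream.
audience = [
    row[0]
    for row in conn.execute(
        "SELECT email FROM audience_free_pricing_viewers ORDER BY email"
    )
]
print(audience)
```

The downstream tool receives a finished list of members and does no querying or filtering of its own.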
Starting point is 00:14:19 But I think that, I mean, the database, not the data warehouse, okay? Just to make it more generic. I think it has pretty much at least like 99% of the expressivity that is required. Totally. For these kinds of jobs, specifically for the stuff that you're doing with marketing, right? Yeah. At least from the use cases that I have seen. So instead of enriching,
Starting point is 00:14:51 like pulling data out, enriching the data, pushing it back to the downstream application so the marketer can go there and at the end make a filter to filter down based on some attributes, the audience. Yep. Why not do that, like, in the data warehouse, right? Like, it's a complete waste of resources, like, and time, like, going
Starting point is 00:15:11 back and forth to do all that stuff. So that's what I'm wondering. And I'd love to learn more about, like, what is missing, let's say, from the Databricks and Snowflakes of the world out there to make this happen. We are not there yet, right? I think this is what I think is interesting. I agree 100% that the
Starting point is 00:15:33 expressivity, like, the functionality from an expressivity standpoint is definitely there. When I said, like, is the data warehouse the place? I guess what I meant is, it's more of a question around the interface,
Starting point is 00:15:52 which we've actually talked about on a couple of recent shows, right? What does the actual experience of the end user look like? Now we were talking about developer experience and how they interface with whatever, or something like that. And that is a really tricky question, right? Because one of the big reasons that a lot of this logic
Starting point is 00:16:11 has remained in these downstream tools is because it is really easy for those downstream teams like marketing to interact with an interface that ultimately produces some sort of logic without being super technical. To your point, though, it is becoming increasingly technical, right? And the place that a lot of those base operations are done is increasingly becoming some sort of database, right?
Starting point is 00:16:40 And yeah, it is interesting to think about. I mean, take reverse ETL as an example, right? Which is a weird example because companies have been doing this pattern for years and years, right? It's not necessarily something new. Really, what reverse ETL is, is just an interface that makes it easier to interact with a database and then, you know, sort of orchestrate it, whatever, right? Like scheduling jobs, all that sort of stuff. You don't have to do that manually, right? It really is just a basic interface layer for pipelines that existed for a long time. I think, though, there are two things that come to mind that I think are really helpful examples of patterns. So we'll use reverse ETL as one, and then I'll actually use Great Expectations as the other example, because they show this dynamic happening in two directions. So if you think about reverse ETL, one interesting pattern there is that ultimately what's happening is that you're building logic in some sort of interface, and then it's SQL at this point, right? Like, okay, I want to like get some data from the warehouse
Starting point is 00:18:06 and pull it into Salesforce or Marketo or whatever, right? And so I use this interface to interact with it. And ultimately what it's doing is producing SQL under the hood that runs an operation and sort of, you know, executes that or whatever. So that's interesting, right? So you have an interface that's
Starting point is 00:18:26 producing like logic as code underneath the hood. Great Expectations is interesting because it's the same concept,
Starting point is 00:18:33 but in the other direction, right? You define data definitions in Great Expectations. I don't know
Starting point is 00:18:41 if you remember this, but it automatically produces human readable documentation for those definitions, even though it's essentially a Python library, right? And so you're making definitions in a Python library using Python, and it literally produces like human readable documentation. So those two patterns, I think, are really instructive for thinking about potentially the ways that this could look in the future. But, you know, I don't have a crystal ball, so we'll have to see.
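The two patterns mentioned above can be sketched side by side: one declarative definition is rendered as SQL (interface to code, as with reverse ETL) and as a plain-English description (code to documentation, as described for Great Expectations). This is a toy illustration under stated assumptions, not the API of Great Expectations or of any reverse ETL product; `Rule`, `to_sql`, `to_docs`, and the column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    # Hypothetical declarative condition, e.g. built by a point-and-click UI.
    column: str
    op: str      # one of the keys in _SQL_OPS
    value: int


_SQL_OPS = {"gte": ">=", "lte": "<=", "eq": "="}
_WORDS = {"gte": "at least", "lte": "at most", "eq": "exactly"}


def to_sql(table: str, rules: list[Rule]) -> str:
    """Interface -> code: render the rules as a SQL query string.

    Names are interpolated directly, so this is only safe for trusted,
    hard-coded definitions like the ones in this sketch.
    """
    where = " AND ".join(f"{r.column} {_SQL_OPS[r.op]} {r.value}" for r in rules)
    return f"SELECT email FROM {table} WHERE {where}"


def to_docs(table: str, rules: list[Rule]) -> str:
    """Code -> documentation: render the same rules as a readable sentence."""
    parts = [f"{r.column} is {_WORDS[r.op]} {r.value}" for r in rules]
    return f"Rows in {table} where " + " and ".join(parts) + "."


rules = [Rule("sessions_last_30d", "gte", 10), Rule("open_tickets", "eq", 0)]
print(to_sql("customers", rules))
print(to_docs("customers", rules))
```

Because both outputs come from the same `rules` list, the query that runs and the description a non-technical teammate reads cannot drift apart.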
Starting point is 00:19:09 We'll see. Indeed, indeed. Is that the buzzer? That went by really fast. Did I talk too much? Oh, we both talked too much. It's okay. That's why we are doing this. Indeed. All right. Well, thank you for joining us for Shop Talk. And we'll have more good tantalizing conversation for you on the next one.
