The Data Stack Show - 216: From Raw Data to Business Results: Building High-Impact Data Teams with Ethan Aaron and John Steinmetz

Episode Date: November 20, 2024

Highlights from this week’s conversation include:
Ethan's Background (0:47)
John's Background (1:16)
Data Teams vs. Engineering Teams (2:04)
Career Paths in Data (3:40)
Pressure of Large Companies (6:10)
...
Contrasting Industries: Expedia vs. Gallo (9:02)
Establishing Trust in Data (11:30)
From Sales to Data (16:30)
Understanding Success Metrics (18:58)
Creating Daily Business Value (23:03)
Aligning Data Work with Leadership Goals (29:30)
Differences Between Data and Software Engineering (31:25)
The Role of Data in Business (35:56)
Understanding Data Contracts (39:35)
Accuracy vs. Usability in Data (41:51)
Observational Skills in Data Roles (44:03)
Defining Product in Data vs. Software (47:07)
Final Thoughts and Takeaways (52:42)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcript
Starting point is 00:00:00 Hi, I'm Eric Dodds. And I'm John Wessel. Welcome to the Data Stack Show. The Data Stack Show is a podcast where we talk about the technical, business, and human challenges involved in data work. Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies. Welcome back to the Data Stack Show.
Starting point is 00:00:37 We have two guests today, Ethan Aaron of Portable and John Steinmetz of Gallo Mechanical. Gentlemen, welcome to the show. Thank you so much for the conversation. All right. Well, give us just a quick background. Ethan, why don't we start with you? Totally. So I'm Ethan Aaron.
Starting point is 00:00:49 I'm the founder and CEO of Portable. I've been working in data for almost a decade at this point. I've been the head of data at a small startup, at a thousand person company. And now for the last five years, I've been building data integrations so that data people don't have to worry about extracting data from systems and centralizing it into their warehouse. So we have 1500 different integrations. I've built hundreds of them, almost 1000 at this point. So I can speak to all the different nuances of the ecosystem. John. Yeah, John Steinmetz. Right now I'm a head of data over at Gallo. I've implemented three data teams from scratch for startups, led some of the bigger teams over at Expedia, HomeAway, and Bazaarvoice.
Starting point is 00:01:29 Started out as an engineer, worked my way up, decided to move to product. Then moved to a CTO role where I would be administrating over product, data, and engineering. And now primarily over the last five years, I've been working on startup data and focusing on that. Recently worked for ShiftKey, a startup that is now probably about $2.5 billion, or close to that. And now I'm taking my talents, if you want to call them that, to Gallo Mechanical to try to change the construction industry. Because that is a very underserved data industry. So yeah, that's me.
Starting point is 00:02:04 So guys, in our wrap here a few minutes ago, we talked about data and engineering teams and some differences specifically around product. So I'm excited to dive into that, talking about data product people versus product people on the engineering side. What are some topics you guys are excited about? Talking about that problem
Starting point is 00:02:22 in terms of the similarities and differences, I think there are a lot of differences between data teams and product and engineering teams. And then also thinking about the nuance of that when you're at a one-person data team, a company that can afford a one-person data team versus a company that can afford a
Starting point is 00:02:39 100-person data team, because it changes. Just like engineers. A one-person startup with one engineer is very different from a company with 300 engineers and how you have to trade. So I'm excited to dig into that as well. What about you, John? Yeah, very similar. I think that data is all about doing what's right for the business from a value perspective. And with any engineering task, if you don't have a business goal or a business
Starting point is 00:03:05 lead into that, you will eventually waste a lot of money. So tying all that in, I always run all my teams like a product. It's got an engineering side, a product side, and a design side as well. So you have all of that in there and leading to that business value is really critical and not all companies are the same. So you got to kind of figure out like, what does it mean for one person? Like, like Ethan said, versus, you know, what does that growth look like? What do you need right now? Versus, you know, what do you need later and making sure you don't spend a lot of money up front and the product side of that really drives that home. Love it. Well, tons to talk about. Let's, let's dig in. Right. Let's do it. All right. Both of you have really interesting careers.
Starting point is 00:03:49 You know, Ethan, you started in banking. And John, you started in, you know, sort of, let's say, traditional software engineering as a software engineer. And both have ended up in the world of data in different places. So could each of you just give us the couple minute version of your story? Where did you start in the world of data and what got you into it? And then, you know, how did you end up where you are today? We'll switch it from the intro. So John, why don't you, why don't you lead us off? Yeah, so this is great. I actually started out as an engineer for more of a marketing side where I'm building, you know,
Starting point is 00:04:25 probably 20, 30 applications a year for various brands. Loved it. Love the engineering side. Love the design side. My degree is actually a graphic design degree. Oh, wow. Yeah. And then I realized very quickly that the other side of my minor in college was sociology. So I realized quickly that I loved how groups think. And that was a really important aspect of what I was doing. And then I realized as I was building these applications that there were nuanced ways of building them for each different brand according to what goals they had. And then I saw all the marketing data behind it. And then what really inspired me to make that switch into more of a product slash data role was when I saw the effects. It was at the very
Starting point is 00:05:14 beginning of social media, right? When I saw that just some small differences that you can make in these applications, they would lead into big things on the other side, value for the businesses. And then I realized that's really where I wanted to put most of my time and effort. And that led me to leading data organizations, organizing them in such a way so that everything rolled up to the company's goals, not just the group's goals. I have a personal mission so that data no longer becomes what you see in IT, which is kind of like a we-have-to-have-this, cost center mentality, and instead becomes, this is a profit center for us. I want to move data into that space as opposed to what most people do is they just hire a
Starting point is 00:05:59 data team because they need it. And I say you do. And the reality is we need to be the bookends of everything at the start and at the end. And that's really what my mission is from where I was to where I am today. And you did some time at some really large companies, both established companies. You were at Expedia.
Starting point is 00:06:19 You managed the homepage team there. Can you describe just a little bit of the pressure of that? Because, you know, I don't have direct experience with a homepage that large, but the little experience I do have is, you know, when you push to production, it's a big deal because it can mean, you know, if you screw something up, it can cost the company a huge amount of money. Oh yeah. And I'll tell you, like being at Expedia and it was Expedia HomeAway. So it wasn't a big one, but like we integrated with Expedia because all of the data is shared between those organizations. Oh, interesting.
Starting point is 00:06:53 So my first day at Expedia, I was told you are going to be presenting to all 10,000 people in the company. My first day. I had never talked to anybody. I was like, all right, what do I need to do? I roll up my sleeves and do it. So pressure, yeah. And there's a guy over there. Well, he's not over there anymore. He's now with PayPal, John Kim. And I think I learned more in the time I was at Expedia on how to do data for leaders from listening to him give me feedback in those big product meetings, because these were all
Starting point is 00:07:25 televised over the internet for the entire company. So I was literally speaking to not only them, but the questions he asked and the way he presented those questions really got me thinking. It was the first time I had ever heard of OKRs and KPIs, right? And I thought I knew everything when I got in there. So the pressure was there. But at the end of the day, Expedia and most big companies like that, their process is essentially a workflow. The homepage leads to the next page, leads to the next page. And your job is to pass the right amount of people in the right ways to the next team. That was it. And if you think about it from a manufacturing perspective, that's a phenomenal way of thinking because you're not worried about the whole thing. It takes the pressure off.
Starting point is 00:08:15 The pages that came after that page told me what they needed. And then I designed my systems to say, let's figure out the best way to send the right people in the shortest amount of time to that next page. And then they did their thing for the same thing, right? I mean, we're talking millions of dollars of impact, especially at the homepage, because I was the tip of the spear. If I didn't drive the right people to the second page, everything's off. So yes. And it was a very data science heavy role. Very data science. That makes total sense. Yeah. So it was fun. It was a learning experience and it was really good for me to learn how big companies do it because then you can parse that
Starting point is 00:08:58 out into how smaller companies can also drive that impact as well. Okay. And then contrast that with Gallo because, you know, that sort of experience is, you know, let's just say bleeding edge, digital, you know, almost purely digital journey, tons of traffic, tons of SEO implications, you know, just the knife's edge, if you will, or the sharp end of, you know, a digital funnel. And now you are doing data in the construction industry. And those things are, those are different. So I would just love to hear a little bit about the contrast there. Different, but the challenges are still there, right? The challenge in an industry like this is, and Ethan could probably talk about this too, the challenges in the finance industry 10 years ago are what construction is going through right now. They don't necessarily have the same systems. They don't necessarily have people like me with
Starting point is 00:09:54 the experience that come into construction because there's not as much money to be made in that space. So it's typically, it's a different kind of situation, but the foundational elements are all the same, right? You got to have a data warehouse. You got to centralize all your analytics. You can't tie in directly to these systems because you can't affect production. Like all of that stuff is very similar. I would say the one biggest difference is most of the stuff I'm doing today is one way.
Starting point is 00:10:23 It's all read only. Whereas in these other systems, I'm pushing data back into these platforms. Right. Right. So that's the biggest technical difference. Yeah. But from a business perspective, it still goes, what are the company's goals? What are you looking to do?
Starting point is 00:10:39 And how can I provide those? Now, I don't know how much we're going to talk about this today, but there is a method I go through, which is determining a company's analytics intelligence. And I have a very specific formula that I use for that. And that helps me determine what path the data culture needs to follow. Companies like this here at Gallo, very smart people, but we're still stuck in Excel. Whereas some of these other companies use Excel as a sandbox tool, but never as a production tool.
Starting point is 00:11:09 Yep. So you kind of have to like figure out what's important to the business and how do we get them away from Excel and trust the data. So that trust, I would say it's more difficult to establish trust in an organization like this than it is at Expedia because the trust in our big data world comes with the territory. Right. It's accountability instead of trust.
Starting point is 00:11:31 Like you have to be accountable to the things you make. Whereas here, and I'm not saying this as a detriment to this company, it's not. But in companies that aren't served by data very frequently. Yep. Everything is an amazing thing. Sure, sure. Right? Yeah. So that's a little harder to kind of come back to. Yeah, yeah, yeah.
Starting point is 00:11:51 No, I'd love to talk about the analytics framework. But Ethan, okay, give us your backstory and then we can dig into the juicy stuff. I mean, this is already juicy, but I mean the top. My background's kind of all over the place. So I started my career at Goldman Sachs doing real estate investing. I was buying properties, office buildings, residential units, logistical warehouses, that type of stuff. But I found myself getting more excited about the spreadsheets and how do we write VBA code or how do we restructure these to be more
Starting point is 00:12:20 efficient or how do we streamline a process or how do we replace light bulbs in a building, all the operational stuff. I also hate authority and I hate being told what to do. So I was going to say, that sounds like a great cultural fit. Heavily regulated industry, maybe not the best. Yeah. So a couple of years in, I went to a 12-person data startup and I was supposed to do sales. And again, I knew nothing about any, I didn't know sales. I didn't know startups. I didn't know data, but it sounded like an amazing place to learn. And I got there day one.
Starting point is 00:12:49 The CEO was like, actually, we don't need sales right now. We need someone to build dashboards and implement customers. Do you know SQL? Do you know shell script? I was like, no, but I'll figure it out. And it gave me a really interesting perspective because it wasn't, can you go learn SQL and can you go learn data tooling? It was, I need you to know this because you need to build the stuff. I need to run this company.
Starting point is 00:13:09 It was very much in service of running the company. It was not in service of learning SQL for the sake of learning SQL. Did product, sold some data, and then we got acquired by LiveRamp. I did product management at LiveRamp for about a year. So I've been on the data side, but I've also worked with engineers at small companies, large companies, and now back at Portable. And then I set up the data team at LiveRamp. So we were a thousand person company, did not have a centralized data team. I made a case for, hey, we should have a centralized data team. Here's what it should look
Starting point is 00:13:40 like. Started interviewing all of our execs, what matters to the CRO, what matters to the CEO, what matters to the CMO, and coming up with our global list of KPIs and OKRs that we can actually measure with data. Did that for about a year. We sold off our parent company for $2 billion. And I went and I worked for our chief strategy officer at LiveRamp, trying to figure out who we should partner with or who we should acquire. So I spent a lot of time digging into data integration companies, small companies, big companies, and it gave me a very good understanding of the ecosystem of ELT tools, ETL tools, CDPs. Reverse ETL wasn't really a category yet, but
Starting point is 00:14:16 iPaaS tools. And about a year into that, I was like, I can do that. Started Portable. And what we've been doing for the last five years, our goal has always been build 10,000 integrations so that data teams don't have to. We want to build a platform on which we can build and maintain 10,000 integrations that pull data from systems and put it into data warehouses. At this point, we have 1,500 integrations. I personally probably read thousands of sets of API documentation. I've probably built 750 integrations myself at this point. And I'm also the data person at Portable. So in addition to building integrations, being the marketing face of Portable, doing sales, customer support, I'm also back to the lens of I'm the CEO. I need to run this company. What data do I want at my fingertips?
Starting point is 00:15:09 And I'm doing that, but I'm not doing it for the sake of I like data. I'm doing it because I need to run the company. It's going to be an interesting perspective on all of this. I also talk to data people all the time. I probably have 15 meetings a week with heads of data at small companies, big companies. I host events in New York and really big events at Snowflake Summit every year. But excited for the chat today. Yeah, yeah. So one quick observation
Starting point is 00:15:30 before we move on. I don't think I've ever heard anybody get hired into a sales role and a CEO go and say, hey, we don't need more salespeople. I want you to do data stuff. I don't think I've ever heard that.
Starting point is 00:15:43 That is a great point. It wasn't we don't need more salespeople. I was going to be the first salesperson. The funny part about that, though, is when I got hired, it wasn't me being like, I have a sales background. I can do that. The CEO was really looking for passion and just like, are you willing to work hard? So the entire conversation, my interview with the CEO of this company for about an hour was just us talking about building efficiency, light bulbs, energy efficiency, all the stuff I was working on at Goldman, the things I got really excited about. I went probably too deep into it, but I got hired
Starting point is 00:16:15 because of that. Like, he said, you need a salesperson. I said, I could do that. And then when he said, I need a data person, I was like, I can do that. Like when you join a 12 person company, like I would assume you're signing up for whatever needs to be done, whatever. Yeah. Okay. Yeah. You got hired based on like, Hey, this is the right guy. Like we're going to, we're going to slot him somewhere.
Starting point is 00:16:37 You got initially slotted and just, like, moved positions, basically. That makes sense. Yeah. I honestly, like, as a founder now, I view it through the lens of, like, that's probably a really good way to screen. Like, it's too late, like you should do it beforehand, but it's probably a really good way to screen out people at small startups: you thought you were going to do this; day one, I'm actually gonna, and, yeah, okay, let's do it. Yeah. And if they don't, then, yeah, it saves you some time.
Starting point is 00:17:05 I'll say this. Most of the unicorns that you see all start out like that, where you have a very small group of people that can do multiple things. I'll say this about ShiftKey. When I was hired, I was in a CTO role and I left to go take a product role, a very low-level product role, with ShiftKey because I saw the vision and they knew all the things I could do. I was the first product hire, and within three weeks they were like, we just want you to do data. Yeah, I mean, all the successful startups, I've heard tons of stories like that. Yeah. That kind of reminds me of your background. You, like, at RudderStack, did a bunch of different things. And it's like, so it was a hard
Starting point is 00:17:46 pivot day one for me. Like, look, I didn't know anything about sales and I didn't know anything about data. So it was really just like, which one am I going to get up to speed on? Yeah. Yeah, totally. We actually had someone from Braze on the show, a gentleman named Spencer Burke. Yeah. And he has been at Braze, it's 11 years? 12, yeah. 12 years? Yeah. Which is a really long run at a startup.
Starting point is 00:18:09 He was, I mean, I think they were just a couple of people. He was employee number like one or two outside of the founders. Yes, yeah. And I think, I mean,
Starting point is 00:18:17 he was like running around New York City just trying to see if people would install this SDK in their mobile app, you know, like all these mobile startups. Anyways, great episode and like really good story of like that playing out over a long period of time within the same company because his role has changed really significantly. But yeah, so great episode if anyone's interested. Okay, we've already had
Starting point is 00:18:41 such a fun conversation, but let's dig into our first topic. And I'll lead it in by mentioning an article that bubbled up on Hacker News this week. It's a great article called How I Ship Products at Large Companies, or something of that nature. And really interesting article, a number of just generally good pieces of advice. But one of the points that the author makes that really stuck out to me was how you measure success for a product. And one of the points that he makes is that one of the success criteria needs to be buy-in and excitement from your boss and from management on what you ship. And he even goes so far as to say, which I loved the provocative nature of this, he said, you're probably thinking like, oh, you need to measure like whether people use it as well. And he was like, actually not true. Like it's like, management buying into and getting excited about what you're shipping, right?
Starting point is 00:19:47 Because if they're not doing that and it is, like, numerically successful, it still doesn't matter, right? You know, which people have different opinions on that. But I loved the point that he made in drawing a really hard line on you can measure things in different ways, but there's really only one or two things that actually matter. And when we were chatting before we hit record, both of you made that point around the business value of data. And I mean, Ethan, you even alluded to this, you know, you're doing data stuff at Portable, but only to serve extremely specific needs, you know, the specific needs of the CEO, which, you know, also happens. But yeah, John, why don't you, I'd love to just hear you talk about your experience with that.
Starting point is 00:20:35 And yeah, just give us some insight into that dynamic around business value. Yeah, I think that article probably has probably some merit to it. When I get in, my number one thing is to create excitement, right? Because when you're creating a data culture, especially from scratch, that business value leads to excitement a lot of times. Most of the time, those two things intertwine because business leaders get really excited when they see the numbers going up in a very strategic way. But they're also intangibles that when you have discussions with these business leaders, especially at the C-suite levels, like, what do you care about? What is important to you? And I've actually heard C-suite members say,
Starting point is 00:21:16 I don't care what the financials are right now. I want to get people talking about our product. I want to get people understanding that we have made a mark in this industry or whatever. So that gets my wheels turning, right? Like what can I do from a business perspective that will show those things, right? I preach to my teams, well, not here, but like my previous teams, right? Daily business value. And they will tell you, I say it every single day. What have you done today to provide value to us?
Starting point is 00:21:45 Right? Some days you have better than others. And that's okay. But like, you should always strive for that. And by business value, I mean, what did you deliver that someone is using? Right? It doesn't matter how many dashboards you deliver. It doesn't matter how many APIs you create.
Starting point is 00:22:02 It doesn't matter any of that. If nobody's using it, it holds zero value. And that's why I also teach my teams, the faster you can get something to production, even if it's not perfect, the better it is for you. Because if you spend six months planning something and you push it out there and people go like, what the hell am I looking at? Like you've just wasted six months of company time. But if you put something in and you build it and you get it out there in two or three weeks, even in a partial state, and you start getting feedback,
Starting point is 00:22:31 feedback is business value because you are now learning what the business needs to be successful. So it really is that simple to me. Generating that excitement comes from really finding those champions within the business, getting those people excited about what you're doing. And that usually leads to more money for your group too. So I always try to put
Starting point is 00:22:51 that in place and try to make people understand that, you know, value for the most part equates to dollars, but it doesn't always have to. It could be just that excitement. Yeah. Can you speak to both the listener and me? Because I'm guessing that some of our listeners are having the same reaction where they hear that and the idea of daily business value sounds great. But in the back of their mind, they're thinking, things are pretty complex here where I work. There's a lot of technical debt. Shipping stuff is hard, probably for some reasons that aren't that great, but also for some reasons that are legitimate, maybe from both a technical and cultural perspective. And so shipping daily business value, I think for some people may feel like a really challenging thing to do. So can you speak to those people and maybe just give an example of if I'm feeling that, like, oh, that sounds so great. I just don't know how that would be possible. Help us think through how we can do that and like, you know, go into work
Starting point is 00:24:01 tomorrow and try to create some daily business value. It's simple. Make a list of everything you see on a daily basis that your business or your teams are struggling with. Could it be lack of data definition documents, lack of understanding of certain things? Well, you know what? Carve out some time today instead of doing development work or building out an insight. Start trying to model out things efficiently. Go into Miro, create a workflow document that generally determines
Starting point is 00:24:31 what your business is doing in this particular workflow instead of just assuming and building, lay it out, right? Like do those types of things. It doesn't always have to translate to shipping code. Sometimes it could be, I mean, it was an easy one at ShiftKey when I got in there, you know, they had already had some processes in place. And I said, you know what? Yeah, I'm going to spend some of my time building this out. But if I just went and talked to a few people, I could really, from a business perspective, model out the path from
Starting point is 00:25:05 sales all the way through BI, like all the way through, and then write it out as a business document. That's the thing that most people miss when you're building out teams to do things like stop thinking like a technical line. There's so much more to what we do than just thinking, oh, we're building a workflow, right? Ask why, what is happening here and uncover problems. Just because you can't solve them doesn those people is don't get overwhelmed by your actual job. Get overwhelmed by everything. Put your business stuff in place. Yep. I love it. I love it. All right, Ethan, do I need to re-ask the question? What was the question? Oh, focusing. Yeah. Yes. The data. okay. So tell us about the interactions between your data person and your CEO and the data person having to prioritize,
Starting point is 00:26:10 you know, the core things that the CEO needs. Yeah. So I think there's a, I got a few different things to say on this. Number one, for the last like four years, I keep going to people being like, focus on business value, not infrastructure. Focus on business value, not infrastructure. Which I think for the last four years was correct. It
Starting point is 00:26:29 was like, you went from a one-person data team to a 10-person data team, and now you're using 10 times as many tools. I think we enjoy it. That was not the right thing to do. You should focus on helping business stakeholders. I think now that most teams have realized that the 10 person teams have shrunk, the people that are really great at what they do have expanded. But it's like, I think the term business value is no longer, this is going to sound crazy because I use it all the time. We've been talking about it nonstop. It's like, I think you have to go one step further than that to actually think about what that means. And I think what John said is very much on point. Business value is not always revenue goes up and costs goes down.
Starting point is 00:27:10 Like if you do those two things, generally you are, should be okay. But like, that's not always aligned with what matters to a company. And as the data person at Portable, my CEO is the perfect example of how that's the case. So, yeah, as the CEO and the data person, if someone else was the data person at Portable, they would look at me and be like, all the stuff we did two months ago,
Starting point is 00:27:38 you threw it out and now we're starting again on something new and then two months from now it's like we just threw all that out and we're starting again on something new. And then two months from now, it's like, we just threw all that out and we're starting again on something new. Like, why did our priorities change? And our priorities are changing every two months because we got 80% of the answer to a question and that was good enough. And now we need to move on to the next question compound on each other. So I think to John's point, it's like business value is better than infrastructure. Focus on business value. But when you think about business value,
Starting point is 00:28:10 think about it through the lens of get as close to your leadership team, your board, your CEO, your C-suite as possible. They are priorities. They could be KPIs, OKRs, just like personal goals and find a way to help them with that.
Starting point is 00:28:26 So like adding business value, adding, creating business value daily. You don't have to like to a certain extent, you can ignore data. I view data as a skill more so than a role. And you're really great with data. You can go to a marketing team, be like, I'm going to have value that you didn't even know you could accomplish. And you can do that when for them it would take six months or they know they might not even realize it's possible so like thinking about your job through the lens of what are the strategic
Starting point is 00:28:55 priorities at any point in time. They are not always revenue and cost. Like, sometimes it's brand awareness, sometimes it's new logo acquisition, even if it's not money. Figuring out what those are for execs and then going and being like, I'm going to help you accomplish that all faster, is where I think we now as an ecosystem need to move the conversation. We need to, I think we've moved it now from stop focusing on infrastructure, focus on business value. But I think to your point, Eric, like there's a lot of listeners, a lot of data teams out there. They're like, cool. I hear that, business value, but like, what do I do? Yeah.
Starting point is 00:29:36 My personal recommendation is always find the leader, find the top 10 leaders in your business. Ask them their top three biggest pain points as leaders, the goals that they are being compensated on, the goals they have to report to the board, to the CEO, and find one of those 30 goals, three goals per 10 people that you can make an outsized impact on and just do it. That's my macro take on business value. I actually think I use it a lot, but I think it's the wrong term now.
Starting point is 00:30:01 I think people get that. And now we got to move on to the next one. Next one is how do you actually align yourself with strategic goals of your company and impact them? Love it. Love it. All right, John, you had, I've monopolized the conversation, but you had some really good questions around product as it relates to data and some of the differences there.
Starting point is 00:30:22 So what? Yeah, I'm excited about this. Yeah. Yeah, I think, yeah, I mean, great discussion on the business value. I do have to say that I agree and I actually really identify with John. I went from this transition of I'm a deeply technical person, data person, into an executive role, had the ability to do my own data stuff for myself, but also had a
Starting point is 00:30:46 team. So as time went on, my team did more of it than I did. And I got, sounds like similar to John, got really obsessed with OKRs of all the way down to the team at my director level, just to bring clarity of what are we doing and how are we going to measure it and really trying to align around that. So very much identify with that. So, okay. So switching gears into this data product, engineering product thing. I'm so excited about this conversation. Yeah. With Eric, head of product here. I think to start off with, I think for both of you, all of you, Ethan, maybe I'll have your take first.
Starting point is 00:31:26 Just your thoughts on, let's just start with the differences. What do you feel like fundamentally if you're a data engineer versus a software engineer, what are the crucial differences? They're both technical roles. Some of the skills can kind of overlap, but what are those crucial differences
Starting point is 00:31:38 just between the roles without talking about product yet? Let's take a typical software company, SaaS business. I know this is going to change depending on the organization. Well, let's say take a typical SaaS business. Let's say you're a 100-person company. In that SaaS business,
Starting point is 00:31:54 50 of the people could be software engineers. In a 100-person SaaS business, maybe you have three data people. Maybe you have two, maybe you have one. I would say that's probably the reason. Maybe none. I was in a company that was a thousand people and we had no centralized data people. Yeah.
Starting point is 00:32:12 But what you have to realize there is in a similar size company, the relative software company, this is going to be different in something like construction, might not have the software engineers, but you might have the data people. You need to realize that when you compare your team structure, your responsibilities, your roles in your tech stack, and your approach to development, you're not comparing your one-person data team to your 50-person software engineering team. Yeah. You need to compare your one-person data team to a company that has one engineer. So even the idea of saying, like, so if you have one data person and they call themselves a data engineer, I think even that is wrong. I would just call yourself a data person.
Starting point is 00:32:57 Like, you can't differentiate between a data engineer, a data scientist, a biologist. Like, you're just... Analytics engineer, yeah. Like, hey, you do not have the other people, so you are just a data person. Just like at Portable, for two and a half years, there were two of us. We had me as the CEO and my CTO. He was our one engineer, and I was everything else.
Starting point is 00:33:21 And when you think about the tooling, the processes, the way you have to think about the world, when you have one engineer, it's so wildly different than when you have 50 engineers. So a lot of data people today are drawing these analogies to product and engineering orgs that are 30, 50, a hundred people. There are less than a hundred companies in the world that should have a 50-person data team.
Starting point is 00:33:49 It's just not like that. That is such a massive investment in just data. That's 50 people, let alone the engineering orgs of 1,000 or 10,000 people that have to be shipping really complex things. So when you think about drawing analogies between the two, I think the best analogies and the best framework for data teams to model their tooling, processes, and software development life cycle is startup engineering, from day one with one engineer up through 30 engineers. Look at how those teams are constructed, how those teams ship, how they focus on what's production
Starting point is 00:34:29 versus what's just get it out the door fast. And unless you're at Google, Meta, Amazon, OpenAI, your data team should never be looking at an engineering org that's bigger than 50 people and saying, I need best practices from that. So that's my perspective on a lot of the analogies is there is an analogy there. It's just not the analogy most people think it is. Yeah. Yeah, I tend to agree on that. I think the one thing that will be interesting, and this, I mean, that makes a ton of sense for SaaS, but SaaS is in the business of building software. So
Starting point is 00:35:03 of course, the people that build the software is like a big chunk of the headcount. And data is more supporting that initiative or effort. But do we see more and more companies get into the business of like, data is what we do? We do, I don't know, benchmarking or something. All we do is ingest data from a bunch of different places. And we're basically just in the business of data. You would think in that scenario, like, maybe you have more of that framework where you've got data product people and, you know, data people split out into multiple roles. However, at that point you're
Starting point is 00:35:41 still kind of just mirroring the, you know, product and engineering. Like, data is just like, it's like they're engineers. Like, it kind of doesn't matter, right? So anyways, John, I want to get your take on this as well. Oh yeah, like, this is prime for what I love to talk about, right? And I'm going to take a little bit more of a technical approach to this. First and foremost, I would say that data is what most companies are doing today. The software is just something that moves data or allows entry of data. Like Netflix, you would think, oh, it's a streaming service company, but you look at their
Starting point is 00:36:15 stock price, it mimics the amount of data flowing through their system. Like, I mean, those kind of parallels. I look at the difference between data engineering and software engineering, and I've done both. So I can speak to this a little bit more closely. Think about building a house. You have very defined specs. You have very defined rules for what you're building on a house, right? That's software engineering, right? You have things that have to be done a certain way for it to work.
Starting point is 00:36:46 Data engineering is more like, how does the water move through the house, right? What temperature does it need to be over here versus over here? Hot, cold. Like, there are nuances to the workflows, but you're essentially moving data throughout this software that's there. So fundamentally it's very different because you do only have a couple of people typically focused on that. And there are more ways, the precision of software that it needs, because you can't have anything break. You can't have anything that doesn't work. Where in data, it's a little different, right? Like you're looking at data, you're looking at the outputs as opposed to the entirety. So you have a little bit more flexibility. I find data engineering is more configuration than it is anything else at this point.
Starting point is 00:37:36 We have, as a data engineer, we'll have 10 tools to build our stuff. Whereas a software engineer might have one, right? They build structured software. They put it in Jira. They push it up. We get a peer review. Like there's a specific thing that is there. Whereas us, we can choose Portable.
Starting point is 00:37:52 We can choose a different tool. We can choose. We have a myriad of tools that have no bearing on the rest of the company. They don't care. It becomes a business decision. Do we want to build this little thing? Do we want to buy it? I mean, it's truly product, to your point. Like, it is truly more product driven than, say, software
Starting point is 00:38:11 development. So that's the way I kind of look at the difference between a software engineer, I mean, you look at LookML or you look at these middleware languages, right? I mean, it's just basically all the same stuff, just a different flavor with different features. And you have to learn in software engineering, if you're, say, you know, a Java developer, you know Java and you're good at it. And that's what you build in. Whereas in our world, I mean, I may choose R for a project. I may choose Python.
Starting point is 00:38:39 I'm choosing the right tool for what I need to build for what it can do. So there's some, I think we have as data engineers, more flexibility in our day-to-day than say a software engineer would. That's a really, that's a great point, John, around, especially with some of the new architectures and some of the new, I mean, we talked about Iceberg a couple of times in the show, other technologies like that actually are really accelerating this ability to say, you know, you can actually pull this apart and kind of use whatever tool system or, you know,
Starting point is 00:39:14 whatever tools you want, and you don't even have to move the data, you know? It's really interesting from that standpoint, whereas your software is built in Java, you know, or Go, or it's like, okay, I mean, yeah, that's, guess what, software engineer, like we hired you, you're going to write Rust, you know, or whatever. So yeah, that's a super interesting point. Yeah. But data contracts, right? So just to that point, data contracts are essentially what allows software to interact at those touch points. So you think about something like microservice strategy in software development, where that came out of looking at how data did their stuff and being as flexible and simplifying it so that there's a data contract between two pieces so you could interchange. And that's why most companies are moving to a microservice
Starting point is 00:39:59 strategy because it says, hey, I'm going to use Mongo over here. Oh, I'm going to use Postgres over here because some databases have better functionality for operational purposes. Instead of using a monolith, now they're moving to these smaller. And that, in my opinion, I mean, some people might just argue about this, but I think they got those benefits from the data work. They saw what we were doing in those workflows that we could interchange and inject and move things through. And now all of a sudden they're doing that because they realize technical debt was getting to be such a burden and having to restructure every four to five years of rebuilding their entire platform. Now they don't have to do that because everything's an API. I'm going to collect data from here.
Starting point is 00:40:42 I'm going to send it over here. And those data contracts are so critical. Yeah. Yeah. I really like your analogy of the water and the plumbing in the house. My previous gig, we were in the specialty plumbing and water industry. And I've thought about that analogy a lot, right? That's right. Yeah. I thought about it a lot because it's actually really complicated. There's the basics of like, don't want it to leak, right? That's like the basics. But when you get into like what region you live in and what your water source is and what you want to filter out of the water, and then you're getting down to parts per million, you're not ever like getting perfectly pure. Very rarely do you have like perfectly clean, pure water. Like distilled water, like nobody, it tastes gross. Nobody wants
Starting point is 00:41:23 that typically. But when you're going through, you're like, all right, I'm gonna get the, like, chlorine down to this level and iron down to this level. And like, that's so much more similar to data as far as like, we're not ever going for perfect. We're going for like this manageable level of accuracy that is the right level of accuracy for the business decisions that have to be made on the data. And that's so different than like, hey, we frame this in, we use the exact spec screws and wood and drywall, et cetera. The big takeaway there is if the CEO says, I want this report to be 100% accurate, you now have license to say, nobody wants that.
Starting point is 00:42:01 It tastes gross. Yeah, exactly. So the slightly different take on the water analogy is also like data in most scenarios is an internal facing role. Like your job is to give the CEO or the CRO or the CMO information to do their job. So let's say you have an office building. It's like, if you know that 90% of the people in the office building all work on the second floor, not the first floor, you need
Starting point is 00:42:30 more faucets, you need more toilets, and you need to have them dispersed on the floor in a way that actually makes it so those people don't have to walk down two flights of stairs, right, across all the back, because now you're just wasting everyone's time and you're not doing your job. So, like, your job as a data person is to not put a thousand, like, faucets in the corner, like, all next to each other. It's like, what's the right number of faucets? Yeah. Um, given the... Is a faucet a dashboard in this analogy? Okay. But it's like, that's a really interesting point because it's one of those things where if you have too many, you just spent way too much money,
Starting point is 00:43:13 way too much time sharing that stuff, and it's not being used. If you get companies, like, now they're like, everything gets delayed. And it's like, so, like, I think a lot of data teams think about, like, oh, is it a shiny faucet, is it not a shiny faucet, what's the parts per million, when in reality they need to take a step back and be like, what are the people doing on this floor? Yeah. Yeah. Where are the desks? Where are the, like, meeting rooms? Like, when do people have lunch? Like, no, that's like, we need a faucet in the kitchen. And it's like, I think it's difficult for data people in a lot of companies to watch everyone for a little bit of time to figure out which way.
Starting point is 00:43:54 Awesome. Observation is probably the number one thing I could tell a new data person coming into a company: observe everything. Hmm. Yup. Yup. Then to John's point too, one more. I love the plumbing analogy, so we'll, like, we're, I love this one. But this is a Medium, this is like a juicy, meaty post, you know, 1,500 words on data. I've thought of this post, like, it's, somebody had somewhere, but there's the other thing with the tooling that you mentioned. Again, in like plumbing world, you've got point of use,
Starting point is 00:44:33 which is like I'm going to do an under sink filter or like a refrigerator filter. You've got whole house, like you can do your whole house. You've got the municipal level where you're doing like a whole city. And data, I mean, data can really be similar where you have this big centralized team. They're doing everything together. But maybe that's right. But sometimes it's like, hey, we just need a little point of use, like refrigerator filter, like right here. And it services, you know, five people. And that's great. And we would have way overbuilt if we made the change up at the municipal level. In the next episode, we're going to complete this analogy
Starting point is 00:45:01 by figuring out the data corollary to the guy who wheels in those big tanks on a hand truck and puts the water thing on and it's not connected to anything. My favorite part of this whole analogy, and I guess we can get off the water analogy, or we can stay on it. I hate self-service analytics. I think most companies go way overboard and build things that everyone can do. That is a very controversial take. But from a water perspective, so taking the analogy one step further, I think what you should do if you're building plumbing for an office, for a floor in an office building, give people a faucet. Let them fill up their glass and bring it back to their table. You built the faucet.
Starting point is 00:45:44 They can use it. They can go drink the water at their table. Or buy a Bevi machine. Let them pick their six different types of juice, and they can walk it themselves to their desk. Self-service analytics is like putting a miniature faucet on everyone's desk. I think that's stupid.
Starting point is 00:46:02 You don't know how many people are going to be there. You don't want to have too many faucets. You don't want to have too few faucets. Pay a little bit more for 12 packs of Diet Coke that sit in the fridge that people deal with themselves. Don't try to put a soda dispenser
Starting point is 00:46:17 on everyone's desk. We need cooks in the kitchen. What's worse is that the faucet at everyone's desk has 19 knobs that change things, and, you know, they don't really know how it works, and everybody's making decisions about the temperature of the water, and they're, right, doing different temperatures. I love it. I love it. Okay, we have time for one more question here. I feel like we could keep going for hours. We have
Starting point is 00:46:42 time for one more question. Okay, I'm going to break the analogy. You know, so I'm so sorry. But one of the things, like, if you think about like the, you know, the pipes and the actual hardware that are like interfaces for water, in some ways, those are much more similar to software in that they need to reliably handle scale, they need to be robust, and they operate largely the same way every time in an ideal world. And data is pretty different than that in that the products, and Ethan, what made me think about this was you were saying, okay, we were starting over two months ago because we got 80% to a question, right? And so the product actually can look very different at different phases, right? Now, okay, maybe you have like a KPI dashboard that is durable and there are some really good things there. But I'd love really quickly to get actually all three of your takes on this. So going for the triple threat here on the differences between the definition of product
Starting point is 00:47:50 in software and in data. So John, why don't we start with you? Back to our discussion, I think, of like, you've got the team size thing. And I think it's what I would say is similar to what Ethan said. If it's a very small team, where you've got one engineer, if you have one engineer and you have one data person, probably at different companies, because they don't typically scale one-to-one.
Starting point is 00:48:15 Say different companies, one engineer versus one data person. Similar. You have to have some sort of at least slight product ability as a software engineer or a data person. Some sort of slight design ability at least enough to communicate out to people. And those are more of your startup
Starting point is 00:48:33 small company unicorn type people. I think it's similar. It just depends on the size. Ethan? There are a few more but I think there are three main ways people are building products with data. Number one is dashboards. It's a way, and dashboards are really just helping execs make decisions.
Starting point is 00:48:56 There's two different ways to think about a dashboard. One of them is a pipe that's continuously flowing water. It's not going anywhere. It's going to be there for the next years. And then the other one is just a one-off answer to a question. But, like, I put those into the dashboard and insight bucket. That's the read-only use cases. That's the just like, hey, I'm going to get you the insights
Starting point is 00:49:17 so we can make the best decisions possible. Yep, yep. The second one is workflow automation. This is, you have a manual task where we have to move data from point A to point B. And right now it takes 10 hours a week. As a data team, you can use an iPaaS tool. You could use RudderStack.
Starting point is 00:49:34 You could use anyone to take data from point A, transform it, put it into point B. And the goal of that is remove manual tasks and productionize things. The third one, which I've actually been seeing more of recently, I find it fascinating, is marketing. I'm seeing more in-house data people either change their roles or being hired into companies that have unique data assets and use them to create public-facing insights about their own data internally to create benchmarks.
Starting point is 00:50:06 Like Carta is doing a great job of this. This guy, Peter Walker over at Carta, able to look across every startup's fundraising, et cetera, and he's using it to create insights that then drive people to Carta. And Matt Schulman over at Pave, the CEO, they do compensation and benefits for companies around the world. And they have a unique data asset of benchmark salaries, benchmark benefits. And similarly, their data team is not building internal insights for strategic decision making. They're not automating workflows.
Starting point is 00:50:38 They are creating insights that show the world that they have either data products that sell. Internal versus external, yeah. It's a real external product. But those three use cases, I would boil most data teams down to that. And if you start bleeding into most other stuff, you're either, not in a bad way, but like you're either a data team
Starting point is 00:50:59 and a software development team or you're doing something else. Or you're a marketing team that has really smart data people in it. And none of these are a problem. Just think about how you want your company structured. But I think people with data skills, it's those three use cases.
Starting point is 00:51:12 One quick comment on that that I think is a really important common thread is that each of those use cases have very well-defined consumers. The middle one, you probably have multiple consumers because you're getting data into a ton of different systems, right? Product, customer success, whatever, right?
Starting point is 00:51:34 Finance. You're getting data into a system. But each of those that you called out, it's crystal clear that there's a consumer on the other end, even for the external. So that's great. Really like that. All right, John, you get the final word. ...and what the companies are building, there's a couple of paths that I would have taken or that I have taken. Definitely running things as a product, understanding that business component of what is expected of the data team.
Starting point is 00:52:13 That is primary. You got to know coming in what the expectations are. I've seen it work where somebody comes in and they really don't even know what they're supposed to be doing because the business just says, we need somebody to do data. Um, it's very rare. That person is successful in that role, unless they take that, that approach of, Hey, yes, I can do this, but I'm also going to be aware of what the company needs. Um, now there's also something we haven't talked about, which is capitalization, right? Like if you truly want to be somebody as a data leader coming into a new company, you are immediately a cost center immediately, right?
Starting point is 00:52:53 So you have a target on your back of when cuts need to be made, you're the first one to go. Yep. Or your team is going to be primarily on that cut list and they're going to shrink it down. Now the way you target and you combat that is build one thing for a customer within the product. Start one thing and then all of a sudden your work becomes capitalized. Yep. And what I mean by that for people watching that don't understand the business aspect of this
Starting point is 00:53:19 is the government will give money to your company as a kickback, part of your salary and the work and the tools you build, as kind of like an incentive to build more stuff. Right. They incentivize that. If you are building only for internal purposes, you are 100 percent not capitalizable. So start looking for ways to get in front of customers, get your dashboards in front of customers, get your reports, right? Build what I call exception reporting so that you can create these things within the product. Instead of somebody having to look through a thousand things, they know the five things they're looking for, serve those up. All of a sudden now you could conceivably become a capitalizable asset, right? Changing your mindset from purely technical to leaning into product. You really have to do the, why am I building this? Not just,
Starting point is 00:54:11 I need to build this, right? And being able to say no is a very important part of that prioritization component. Yep. I love it. Alrighty. Well, we are at the buzzer, as we like to say. That was such a great conversation. And there were many things that we did not talk about. So we'd love to have you all back again soon. We need to talk about integrations. We need to talk about your analytics framework, John. So we'll have Brooks find another time for us to have you back on in the next couple weeks. Love it. Amazing. Really enjoyed the chat. Thanks, guys.
Starting point is 00:54:47 The Data Stack Show is brought to you by RudderStack, the warehouse-native customer data platform. RudderStack is purpose-built to help data teams turn customer data into competitive advantage. Learn more at RudderStack.com. Thank you.