The Data Stack Show - 107: Building Modern Data Teams with dbt Labs, REI, and Robinhood

Episode Date: October 5, 2022

Highlights from this week’s conversation include:
Introducing our guests (3:05)
Defining “data team” (4:40)
How data teams emerge and evolve (14:11)
The need that forces the creation of a data team ... (21:12)
The backbone of the data team (26:23)
Building a career within a data team (36:39)
Advice for new data team managers (47:35)
Question and answer time (52:38)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You'll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at rudderstack.com. Welcome to the Data Stack Show live stream. This time, it's all about teams. So I hope I won't disappoint you, but we are not going to talk about technology or we're going to try to avoid it. And we are going to focus solely on the people
Starting point is 00:00:40 behind the technology. Incredible panel today. Really incredible. So Costas, I think one of the things that I really want to ask this huge repository of Mindshare is about the best way to form a team, especially when data teams are sort of nascent, right? Because it's rare that a company, you know, just says, okay, we didn't have a data team and then poof, like now we have a data team. It happens over time and it's generally iterative, right? Like someone starts doing the job and then the role grows and then a team just kind of forms organically and it can go a lot of different places. So I want to ask about their experience, seeing that happen, managing that process and see what we can learn about how to do that really well.
Starting point is 00:01:31 How about you? Yeah. I'm, I want to ask them about the individual roles in the team. Like trying to identify like how their position, what are the, which ones like the backbone of a data team. We share so many terms and we are still like creating new roles. Like, okay, we have like analytics engineering, engineering, and we have the traditional data engineering, like we had in the past, like the DB admin, whatever, but like so many different roles. And I'd love to hear like about like the dynamics and like what the structure
Starting point is 00:02:14 of the data team and see how they understand this and also like how you, like what's like the career path for each one of them, right? Like how do you become like a data engineer and how is this different like to, I don't know, like a data scientist or whatever. Totally agree. Well, let's dig in and learn everything we can from these brilliant people about building data teams. Let's do it. Welcome to the Data Stack Show live stream.
Starting point is 00:02:34 This is one of our favorite things to do. And I have been looking forward to this one for a really long time, in part because we're going to take a break from talking about a bunch of data tooling and technology and get at what I think is probably the most important subject, which is people, the people behind the data, the people working on data. And we're going to talk about that in the context of teams. And it's going to be great. So let's just do some intros and then
Starting point is 00:03:05 we will dive right in. Paige, do you want to start off? Sure. My name is Paige Berry. I use she/her pronouns. And I'm currently enjoying a month of funemployment between jobs before I start my new role as a data analyst at dbt Labs, which I'm very excited about. Previously, I was a data analyst on the data team at Netlify, and before that I worked at New Relic as essentially a full-stack data person in a couple of different departments there. So that's me. Wonderful. Sean, you want to pick it up next? Sure. Hi, I'm Sean Halliburton. I'm currently Principal Cloud Data Engineer for REI's Consumer Insights team. And prior to this, I was a staff data engineer with CNN's data intel team. And before that, I built and managed the clickstream data engineering team for Nordstrom, as well as the test and learn platform there.
Starting point is 00:04:07 Awesome. And Sri. Yeah. Hi folks. My name is Srivatsan Sridharan. I go by Sri. I'm also in between kind of jobs right now, but most recently I was at Robinhood, building and leading out their data infrastructure organization. And prior to that, I spent close to a decade at Yelp where I was doing
Starting point is 00:04:25 similar things, data platform, data engineering. Yeah, I've been in the data space for about a decade as an engineer, then as a manager. So really excited to hang out with y'all today. Great. Well, let's just dive right in. And one thing that we love to do on the Data Stack Show is go back and look at what we kind of call first principles around terms that we tend to take for granted. And one of those terms is data team, right? And, you know, like all of you have worked on data teams, built data teams, etc. But it's one of those terms where if you just ask someone on the street, like, hey, do you know what a data team is? They would say yes.
Starting point is 00:05:04 It's like, well, could you give me a definition? And you step back and think about it and it's like, well, that can mean a lot of different things depending on the company, et cetera. So let's just start by having each one of you define what a data team is from your perspective. And then we'll dig a little deeper and we can just go in reverse order. So Sri, do you want to kick us off? Yeah, yeah, definitely. I think the debate's still out there, right? I think different people perceive this differently and different companies have the structure differently.
Starting point is 00:05:32 But I think from my perspective, fundamentally, it's a cross-functional department that is responsible for making sure that the business moves forward based on data. And so I would assume that a data team's primary responsibility is to make sure that decisions are taken in a scientific way, in a data-driven manner. And that requires a multitude of disciplines from the people who are modeling the data, from the people who are writing algorithms, from the people who are visualizing the data, from the people who are running the infrastructure that makes this data crunching and slicing possible.
Starting point is 00:06:11 And so I truly do think it's one of the disciplines that requires multiple stakeholders to be together towards a common mission. Love it. And so many questions, but we'll go. So many questions and comments. Sean, you want to go next? Yeah, I sure like the cross-functional part you threw in there, because a data team can be anywhere from just one to two people in an analyst role to analysts and data scientists. You might even have a technical program manager; often you don't. But that team really sits in the confluence of a number of other teams because you typically
Starting point is 00:06:59 have a business stakeholder asking questions that need to be data-driven. And the data team is the principal researcher to get the answers for those questions based on any data set that they can get their hands on. Frequently, just getting a hold of that data is half the challenge. But as that cross-functional person, it's on you to make those connections to be able to do the research and be able to come back with as definitive of an answer as you can. Love it. Paige? Yeah, I think a lot of what I would say has been covered already, which is awesome.
Starting point is 00:07:39 I definitely think about the data team, a data team as related to what the mission of a team doing this work would be. So when I started at Netlify, we had a mission for our data team that was like helping the business to make decisions by providing timely, accurate, and actionable insights. And so when I think about people with various skills and various talents around working with data, coming together with that focus in mind, that is a lot of what I picture a data team being. Well, let's get, let's dig down and get one sort of one click deeper and more practical and would love each of your perspectives on, you know, data team is, I think, a term that we use to describe the functions that each of you talked about. But I think a lot of times the actual team name is more specific. And then I would say even, and this is, you know, interesting to think about whether this is a good thing or a bad thing, but the functions that you talked about often can actually be
Starting point is 00:08:53 separated out into different parts of the organization, even though they technically sort of functionally roll up to like, you know, delivering data as, you know, the final product. So let's talk through that a little bit. And Paige, do you want to speak to that? Because you kind of have like, if I work on a data team, that could mean specifically like data engineering, data analytics, you know, I mean, there are data infrastructure, like there are lots of things. What are your thoughts on that? Like, are you seeing more centralization happening? Yeah, it kind of depends. I've worked in a couple of different SaaS companies and there were different models for each one. So one that I worked at, we had more of a, there's a sort of a central data team, you could say, that were really mostly focused on
Starting point is 00:09:40 a lot of the data engineering and analytics engineering kind of roles of getting the data from various sources and putting it, creating it and putting it into a central data warehouse and making models that then rest of the company could use to get insights from. And then I worked on what was almost like a subset data team that was the product analytics team. And we were focused mostly just data analysts. And we did some analytics engineering, of course, too, but mostly focused on, okay, there's the data warehouse. There's all of the data from the company. Let's see what we can use, pull out of there to make insights that help the product or make decisions. And then at Netlify, we were like a central data team where all of the data came into the data warehouse and we had a data engineer who helped bring those data sources together.
Starting point is 00:10:28 And we did analytics engineering and data analysis also as part of the team and had like a hub, which were a few of us who'd kind of worked on cross-functionally or cross-organization functions. Yeah. And then we had some data analysts who were spokes who like particularly would work with the product org or particularly work with finance. Got it. I think we talked about this actually when you were on the show previously, but I think that model is becoming more popular, but is I think the exception as opposed to the rule and fragmentation seems to be more common. Sri, you're giving a very affirmative nod there. Do you want to weigh in?
Starting point is 00:11:11 Yeah, no, I mean, what Paige said, you know, totally resonated with me because a lot of the companies do have it fragmented. And I think there are pros and cons. So at Yelp, for instance, the data science function rolled into product. And then the rest of the functions, like engineering-oriented ones, kind of rolled into engineering. And the benefit of that was the data scientists were sitting in product and therefore they were able to influence product managers to use data-driven insights for decision-making. So, you know, when you're reporting to the kind of same boss, so to speak, things move faster, right?
Starting point is 00:11:49 But then there are cons as well, which means that now we have all of these different teams with, you know, different leaders with different priorities and you have to align them together. So there are definitely pros and cons to kind of either org structure. Yeah. Sean, would love your commentary on that.
Starting point is 00:12:04 And I think maybe you can take it a step further and give your opinion on, is there an environment where one of those works better, right? Are there certain company structures or business types where those two models, because it's not a right or wrong. Right. Yeah. We talked a little bit about patterns in our initial warm-up chat this morning, right? And two dominant patterns that I see in the industry are, I guess I will call it chaos and confed, it's a lot of chaos because everybody is shifting roles and tasks so quickly. And whoever ends up in that data seat, it might be an ad hoc decision and it might be a very short term one. Whereas in an older company, and I've worked mostly with older companies, my experience has been that you have very separate and siloed components between the data engineering team and the database admin team. And then on the other side, you might have the actual analysts and data science teams.
Starting point is 00:13:27 And there's that wall. You know, some separation can be okay, and it certainly can work. I think the most important thing is trust and a spirit of empowerment and wanting to enable self-service. I think that's the most important thing. So that if either side says this isn't really working for us, then both can come together to identify a quick resolution that gets everyone back up to that, back up the acceleration curve. You all have worked in quite diverse types of companies, right? We have from CNN to really extremely fast growing startups. Do you see a difference in how data teams emerge and also how they evolve depending on, let's say, the pedigree or where the company comes from? Like, is it a company that exists for like 50 years and they have realized that,
Starting point is 00:14:28 okay, we have to go and implement a digital transformation program and let's go to Gartner and hear what the prophets there have to say about what is right to do and what not, right? And on the other hand, you have companies like Netlify or like Robinhood that probably must, I guess, from day one, they start with some kind of like data driven strategy, right? So do you think that, let's say, this is like something that affects the way that like data teams are formed or like how they evolve inside the company? And we can start with, oh, I feel very powerful that I can do that.
Starting point is 00:15:08 Paige, you go first. Okay. Yes. Yeah. So my experience has actually been in, with the SaaS companies, there are some similarities. They're not a whole lot different in terms of how long they've been around. I guess New Relic has been around longer than Netlify by some amount of time. So my understanding, I think, of how things have evolved might not be as broad as some other folks, because I'm really interested in hearing what a company that's been around for 50 years, how they put together data teams. So yeah, I do think that there is a difference between a company that starts out knowing, okay, we have a data team, we're going to put one together early in our career, or, you know, our
Starting point is 00:15:49 life cycle, and really think about data from the beginning, versus another company where I think there were a lot of ideas like, well, we have data as part of what we do with our product, and our software engineers are smart, and they can query the data, and they can kind of figure things out for themselves. For a while, that can actually work. But then there's a point, I think, where, okay, we actually need to really start codifying and really identifying: okay, what are the actual metrics, and how do we really define them? Let's sort of corral the chaos. So that's what my experience has been, like seeing how New Relic did that. Sri, you're next.
Starting point is 00:16:31 Yeah, I think some of the newer companies definitely have an advantage because the technology has advanced so much, right? Like today, if a startup were to bootstrap their data team, they have all the tools at their disposal to get going, which can be harder for a company that's been around for, let's say, 50 years because they probably have already processes and data debt and reconciling all of them. They have to build a system that scales from day one. Whereas in a startup, you don't have to build a system that scales from day one and you can organically scale it. So that makes it easier. But I think one of the challenges that smaller
Starting point is 00:17:05 startups face, and I've definitely seen this both at Yelp and Robinhood when they were growing, is you don't have mature processes for governance for data and structure and modeling of data, which a larger company might have because they've been around the block for decades. So definitely, each of these companies faces different challenges based on where they are. But I can definitely, you know, see that in the last like seven to eight years, it's been much, much easier to bootstrap a data team than before. That's great.
Starting point is 00:17:38 Sean, I left you for last for a reason, because you can tell us about how it feels like for a company that is at least, I don't know how old CNN is, but I'm pretty sure it's older than... Yeah, CNN is about 50 years old. Oh! Still only, you know, still a toddler compared to Nordstrom, but yeah. And again, that's a great point about governance and existing processes. And that's often a reason for the divide you might have between the data team and the teams that provide the tooling to that data team is just the simple idea of, well, we've always done it this way. You know, contrast that to a startup where, in a way, not having that history can be a luxury.
Starting point is 00:18:28 You can't say, well, we've always done it that way because we haven't been doing anything for very long. Again, either can work as long as there's a willingness to break things in a controlled manner and in a smart manner, a willingness to experiment and try new things versus a bias toward consulting. I think, Costas, you brought that up. That's definitely a thing at older companies where your management team might be older and have longstanding relationships with other business leaders
Starting point is 00:19:06 in the community that come in and out of the consulting scene. And again, sometimes they feel like they need to bring in that outside third party to come in and shake things up and break that history up. We've always done it this way. But why shouldn't a data team be able to ask the same question? Why shouldn't we be able to kind of pull the handbrake and say, well, wait a minute. Can we stop for a second and evaluate these other opportunities and tools that might be out there, many of them for free?
Starting point is 00:19:42 This is great. So we had a very interesting episode on the Data Stack Show, I think two weeks ago, with Benn Stancil, you probably know him, from Mode Analytics, and he said something super, super
Starting point is 00:19:57 interesting at some point he said that let's say the data industry started, we usually say that like a catalyst for this was the cloud data warehouse. But that's from the technology side of things, right? From the organizational side of things, which is also like important because if you don't have that, like technology doesn't do much anyway, there was like
Starting point is 00:20:22 this moment in history where data became, and data analysts became, let's say, business partners. We wanted in the business to make as much as possible data-driven decisions. And what is very interesting is that pioneers in that
Starting point is 00:20:39 were gaming companies, like Zynga, for example, right? I found this super interesting because it really like clicked for me that yeah, like technology is like, we focus a lot on the technology, but we keep forgetting like the human side of things that also has to be there, right? The reason I'm saying that is because yeah, like we take for granted today that data is, let's say, important for making business decisions. And we probably consider data teams as business partners, right?
Starting point is 00:21:10 But I'd like to ask, through your experience so far, what was, let's say, the first business need that forced inside the organization to form, let's say, a data team or start having people who they have to work full-time on working with the data and giving insights. And this is more about the startup environment, obviously. I think things are a little bit different when we are talking for established companies, but we'll talk soon about that. But I'd love to hear like your experiences and like what you have seen there. So Sri, do you want to start first? Yeah, I can, I can go first. I
Starting point is 00:21:51 think what I've seen in my past experience is the first set of questions really come from business metrics where the executive team is saying, I want to know kind of how the business is performing. And when your data sets are small, that's a query that an engineer can fire or a product manager can fire on top of your production database. And I think from an infrastructure, I'm kind of sharing this
Starting point is 00:22:13 from an infrastructure point of view, things start to get interesting when the scale increases and you can no longer query your production database or you start to have tons of metrics with different product lines being launched. And I think that really becomes a catalyst for a dedicated function because you need to extract the data, store the data, organize the data. Until that time, you can probably make do with just querying your database and getting what you need.
Starting point is 00:22:37 So I think it really starts from the executives wanting to know more about the health of the business and predict the future of the business. And then that ends up being the catalyst is kind of what I've seen. Makes sense. Paige? Yeah, my experience with this would probably be around when I started working more with product analytics at New Relic and there was a lot of,
Starting point is 00:23:02 there was that emphasis around product led growth and to really make progress in that we had to understand a lot about customer journeys and who was, you know, essentially like who is clicking what, where, and at what point did they decide to pay us money? So like, and that need for data was a sort of a different, it was different because that was, right, starting to drive decisions on like, what are we going to build next? What are we going to put our attention towards? How do we make this work? Like get our engine online in that way. So that was my experience and my initial experience of saying, hey, we need this data and we need it to be able to look at it this way.
Starting point is 00:23:45 And then we're going to start actually making decisions based on what we're seeing here. Yeah, that was mine. Sean, what's your experience? I have walked onto existing data teams wherever I've gone, interestingly enough. I have not put in time in that startup scene, at least not yet. But I imagine a lot of companies start out just with the simplest data integration between
Starting point is 00:24:16 back-end database and some paid SaaS platform. And so that executives can try to run their own reports best they can without having to hire a data person yet. But then inevitably, you start offering more products, you start developing more questions, and you, you know, additional questions branch off of those questions, depending on what your business is. And those third-party platforms only scale so far because they're built for the masses, right? And my experience, both as a manager and as an engineer, is that you can submit as many feature requests
Starting point is 00:25:04 to those products as you want all day long. But until enough other users and clients ask for the same thing, you're not going to get it. And if you do, you're going to pay through the nose for custom development hours. So there is an inflection point where those tools start failing to answer the questions. It takes too long to get answers or the reporting interface is too rigid and doesn't work with your proprietary data set. That's when you start scaling up a data team from, like I said, those one or two initial heads, which probably fall under the business side and not the technology side, to a more cross-functional, more empowered
Starting point is 00:25:53 team with, you know, better tooling underneath. But then again, if you don't have the right people seeking those answers and that know the data and know how to work with it, then the tools really don't matter. So I have a question because you said that you have like experience so far, like more already established like data teams out there. Is there like data teams also like, I mean, from my understanding at least, there are like a couple of different roles that you can identify as like members of the data team, right? Like, well, BI analyst to database admins, that today probably are called something else. Data engineers, like many, many different roles, right? If you had like to identify one role as the backbone of the data team, which one would it be?
Starting point is 00:26:48 Wow, great question. I think it starts with a great analyst and a great manager. And everything layers on from there. As your analyses get more complex and you start relying on more data sets, you need to bring in technical reinforcements. So you might add on your first data engineer. As your outputs get more complex, then you start making the analyst role, which is probably the right place to start because you have to answer those first fundamental questions based on the data sets that are getting you the fastest answers. Then you need a manager to help keep all those roles in sync as you scale up to, say, half a dozen analysts and then three to six data scientists or even more. And if you're really lucky, you will have a program manager not only for your analytics
Starting point is 00:27:58 work stream, but for your data platform itself. We were really lucky to have that at CNN. It was a major asset. And it's all part of that three-legged stool between your engineering management, your program management, and your product management. That to me is the ideal of a fully scaled data team with in-house technical capabilities on both the data engineering and the analytics and science side. You're taking in requests from your stakeholders on a weekly or bi-weekly basis. Your product manager is meeting with them regularly and is able to speak to all the technical issues that respects your work stream. And your engineering manager, again, is building not only the platform to support all of it, but the people and the technical careers that underpin it all.
Starting point is 00:28:59 One extra bonus question for you, Sean, because I keep thinking about that while you were talking. In these large organizations, what's the role of the traditional IT to the data engineers or the data teams out there? Yeah, so I think the answer to that is kind of twofold. It's a little bit further out to the edges. And by that, I mean enterprise message buses, your Kafka clusters, managing your more advanced database platforms. Redshift and Snowflake and similar cloud data warehouses you called out earlier. Yes, absolutely. Those were great advances and they were super empowering in the same way that AWS itself has been super empowering
Starting point is 00:30:09 in giving less technical team. And I use that term. I mean, every team to me can be a technical team. That's just what you decide you want to be. But those cloud data warehouses have really empowered
Starting point is 00:30:25 the analyst teams, the scientist teams, to take so much more into their own hands, to spin up their own servers, to deploy their own code to those servers, to run your batch transforms, to run your secondary ETL, similar tasks like that. But yeah, I think traditional IT will always have a place
Starting point is 00:30:49 in owning that very first copy of the data, so your transactional databases, and putting the governance in place so that as you make your own working copies of that transactional data, you do it in a consistent and controlled manner. Evangelism and education is a part of it. Helping to coach other teams on, you know, the best way to create your own production
Starting point is 00:31:17 clones so you configure your own blue-green deployments in your own data flows, things like that. That's awesome. All right, Paige, let's go back to the backbone question. So what's your take on that? Like, what have you seen out there? Yeah, I actually have a pretty similar answer, maybe partly because I've tended to be in the data analyst role more recently.
Starting point is 00:31:52 But there's a point of you can have so much data in your data warehouse and you can have beautiful models in there. But if you don't have that last mile, if you don't have the way to communicate to the stakeholders and the business folks what the data is saying, then you're not fully unlocking that value. And so I think that's one of the things that, to me, I really see as a necessary piece of this whole data process. Although I also have like an additional thing that I think is interesting. It's something that I've learned from Emilie
Starting point is 00:32:22 Schario, who was my manager at Netlify originally. And she talks about how when you have kind of like a new data team and it may be an earlier stage company and you're just trying to get like, how do we quickly get value from our data? She talks about starting with the operational analytics piece, or getting data from the source to the tools that the company's using to actually make their decisions. So getting data into, you know, Salesforce, or getting data into whatever marketing tool you're using, so that folks can actually act on that data quickly, and
Starting point is 00:33:00 that's something that could be done by a data engineer or analytics engineer. So that's another piece of why there's like, it's kind of a tie almost for me. So it sort of depends on what level your company is at and what you really need first. But for me, it's whatever is going to get value to the business as quickly as possible. So yeah, makes a lot of sense. Sri, what's your experience? Yeah, it's a really spicy question, right? I have to take sides now.
Starting point is 00:33:31 No, I'm just kidding. I think the way I see it, I feel like there are two types of companies, you know, broadly speaking. There are companies for whom, you know, data is their core product. And then there are companies for whom data adds business value.
Starting point is 00:33:48 So if we take the first category, think of any company that has an ads business, right? Data is their main product because they need to figure out who clicked, who are we targeting to, who are we charging for the clicks and impressions. And so in those organizations, I think influence is wielded by the people who
Starting point is 00:34:08 are making it happen. And so it could be the product leader, it could be the engineering leader, it could be the architect, because the business, you know, the technology is lagging behind and needs to catch up, right? So the technology ends up driving that. If you look at the other category of companies where data adds more value to the business, but it's not central or critical to the business, I think the most important role is the person who's sponsoring that. Because oftentimes I feel like data teams have to fight an uphill battle because the business might refuse to, you know, take their insights. You know, we're all human. We all have confirmation biases. We all want to think we are right. And so I've seen a lot of times where, you know, business leaders will say
Starting point is 00:34:50 oh, I don't care about the data, we're still going with this decision. And so for a data team to succeed, the person who can push back on the business and be the champion of truth seeking ends up being kind of the most influential. And that could be an IC, that could be a manager. In smaller startups, that could be one of the early engineers, early analysts. In a larger company, that could be a director or VP level person. But I think that's kind of who I see as the influencers who end up making sure data succeeds at an organization.
Starting point is 00:35:23 Yeah, that's super interesting. That's like, I think like a very good insight that applies not just to data teams, but to pretty much everything. We all have to be a little bit of like a salesperson in our life. Like whatever we do, we need to sell it internally. And I think that's something that like everyone who is having like a career in tech, like at some point has to learn. Even if you want to be, let's say, the most individual contributor
Starting point is 00:35:47 that you can be in engineering, you still have somehow to sell your work and like convince the people around you that there's money there. And this is like, obviously like much more important for something like data, because especially like in data, there's always, I mean, people think that data is like this binary thing. It's either like, it says you do this or do that, but that's not the case. Data is there like to help you with your own biases that you have and your own intuition. So there's always fuzziness in whatever is delivered there.
Starting point is 00:36:19 Anyway, that's the topic of another. So I think, but I'd like to chat a little bit more about like how someone can build a career in a data team, how someone can get into a data team and what you have seen. Like, first of all, it would be great to hear how you got into that personally, each one of you. Did you start by saying like after college, oh, I want to be like a professional or whatever, or something else happened there. So share your personal stories first, and then we can discuss more about like what you see today happening in the industries. Sri, let's start with you. Yes.
Starting point is 00:37:04 For me, you know, I didn't even know what a data scientist or what a data engineering role meant when I graduated. And back then, probably there weren't formal definitions of these roles anyway. So I came into the industry wanting to build software, and naturally, you know, I kind of gravitated towards where there were interesting technical problems. And it was just by accident that my manager was like, hey, you know, we have to build this data warehousing ETL solution. Are you interested in this project? I was like, yeah, that sounds cool.
Starting point is 00:37:36 And so it was really by accident, not by something that I planned. I thought the technical problems were interesting. So I jumped into that and then, you know, one thing led to another and that's kind of how it started. Okay. So we have one person who started from being an engineer getting into like data. Paige, your turn. Sure. My first jobs were customer service at coffee shops.
Starting point is 00:37:59 So the way that I ended up getting into data is very interesting. I ended up getting a job at the college that I was attending in the IT department, going and helping people with their problems with their computers because I had a good customer service attitude. And then I learned how to work on computers. I parlayed that into a job where I worked at the foundation for the college and ended up administering their database and working with their donor data. And that's where I fell totally in love with data, databases, data work. So that's really where it all started. And then I moved deeper into higher ed and data, and then I made the leap to New Relic in, I think, 2018. So, yeah.
Starting point is 00:38:44 Awesome. Sean, your turn. Yeah, a little bit of both of those for me. I came in from outside the tech industry, period. I was an English major in college and an editor right out of college. And 20 years ago, found myself unemployed or fun-employed. Too far. I am in a computer
Starting point is 00:39:11 and so started messing around with just general web development. And so also, I think unlike a lot of engineers, I came in a little unconventionally in that I came in from the front end. I specialized in capturing data, optimizing ad verticals, five pages, landing pages, and form flows. And as I became more responsible for those, I became more responsible for the data flowing through them.
Starting point is 00:39:40 You know, I got more and more questions about why am I seeing this in the data? And curiosity really took over from there. I mean, you have to be curious enough to be proactive enough to be able to get out in front of those questions. And the more I did that, the more I became dissatisfied with the tools at hand to process that data and to be able to answer those questions. And I was lucky enough that at the same time, I was offered the chance to start hiring others to help me do that and to build out the, again, first the test and learn platform at Nordstrom and then the wider Clickstream engineering platform to be more robust and tailored to our business. And I became addicted
Starting point is 00:40:34 to deprecating those big, expensive, third-party, off-the-shelf suites for cloud-based proprietary tools and solutions, frequently based on open source software. And that alone takes so much curiosity to even want to do that, let alone to know what questions to ask to lead you to that path. And all along the way, that curiosity is what I have looked for when I am hiring others to help me do just that. And I think that's how a lot of people get into data. At the same time, you know, going back to the very beginning of our conversation today, how do data teams start? Often it starts with one person that is curious and is dissatisfied with the tools that they have to answer those questions. And so they start kind of shaking the hive and looking for different ways to get at those answers and independently going to IT and saying, hey, can you at least spin me
Starting point is 00:41:41 up this so I can try deploying what I'm trying to do? Can you give me slightly elevated privileges so I can create the relations that I'm trying to create and flows that I'm trying to pull off? There's this new thing called Airflow. I really want to try things like that. That can often be hard to get traction on, either because, like I said, you're in an older company with a certain way of doing things. And that's not how we do it here. We already have tools for that. Or you're in a very young company and it's a matter of prioritization and time. And you might be, you know, you have 40 hours to dedicate to what work has been handed to you. You take an extra five hours a week on the side to coach yourself up and try those new things and then demo them to the people you work with and get traction that way and show off what you really could do if you were given more.
Starting point is 00:42:49 And I think that's where a lot of data teams start and it kind of spirals from there, from one person to two people, from one type of role to another type. You layer on complexity, like we said, that progression from analytics to data science, as your models progress and mature along with the people developing it. That's awesome. And I have to say that like, it was like super, super inspiring to hear your story. Although it made me feel a little bit bad because I think my journey is a little bit more boring. It's like just working with computers since I was like 15 years old.
Starting point is 00:43:21 Oh, 10 actually. So I mean, just doing the same thing like for the past like 30 years. So... Hey, the grass is always greener, right? Yeah. The Commodore 64 that I had when I was 10 and the two weeks I spent with it and then I put it down. Yeah.
Starting point is 00:43:42 Yeah. Actually it was kind of funny, like just to make like a comment here. These past like two days, there were like two very interesting events that happened. So yes, I think yesterday we had like the event from Nvidia where they announced like the new graphics cards of the series four. And I think like a day before or two, the latest version of a game called Monkey Island was released, which for anyone who knows was like a game back in the 90s that was like, you know, much, much more primitive.
Starting point is 00:44:15 And it's very interesting to see like the two very contradictory things like happening at the same time, and it really made me feel like a little bit nostalgic. Anyway, Paige, your turn to tell us about what's like the journey that you see for like a person who wants to have a career like in data. Yes. For a person who wants to have a career in data, I definitely echo Sean about being curious. It's a huge part of it. And I think what's exciting these days is there are a lot of public data sets that are out there and there are a lot of like free tools that can be used to,
Starting point is 00:44:59 to like work with those data sets, to kind of teach yourself SQL, to kind of look at stuff, data around a subject you're interested in, asking questions, seeing if you can see what the data says, and then even, you know, take a crack at creating some charts and then do the whole storytelling. Like, I had this question, this is the public data set I looked at. This is some of the charts that came out of that. This is what it's telling me. And having some of those examples is really, really incredibly useful
Starting point is 00:45:34 for being able to communicate to others. Like this is what I'm interested in. This is something I want to do more of. This is what I care about. Those examples are really priceless when it comes to getting into data as a career, I believe. Mm-hmm.
Starting point is 00:45:50 What about you? What do you think? Yeah, I would definitely plus one the curiosity. I mean, it's generally true for any profession, but I think with data, it's even more so because you are kind of a truth seeker there, right? I feel like, you know, the data space has become very specialized in the last several years, similar to how, you know, you had this change from being a single IT team
Starting point is 00:46:12 to now having a full stack engineer, mobile engineer, backend engineer, web engineer, and so on. And I think the same thing has happened to data. Like 10 years ago, you were just a data person, but now you could be a data infra engineer, data engineer, data analyst, data scientist, MLE, BI engineer, and so on. So, but I think for someone who is just entering the profession, it can be quite daunting to figure out what exactly, which exact role that you want to take, because there are just multitudes of them. But I think what could be helpful is to figure out which direction, you know, you want to approach this. So you could approach this from, I like building software, and I want to learn more about data. Or you could approach this from, I like solving problems for the business, I like
Starting point is 00:46:54 finding answers, and I want to learn more about the thing. And so depending on where you start, you know, as you spend the years doing this profession, you will naturally figure out what excites you more and what excites you less. And I've also seen people changing their professions a lot. I've seen data scientists become engineers. I've seen engineers become data scientists. And so there's definitely a lot of mobility in the data space. Love it.
Starting point is 00:47:22 Well, we are actually closing in on our discussion time. We might leave plenty of time for Q&A, but there's one more question that I want to hit, because I think that I'd love for you to share some insights from your experience, especially with the listeners who are in some sort of managerial role on a data team. You know, that could be early, that could be, you know, on an established team. But I'd love, and I know it can be difficult to distill this down, but what are some of the top things you would say to someone? And maybe even we can specify to sort of a new manager on a data team, what would you say to them as your top advice? And Paige, let's start with you.
Starting point is 00:48:12 All right. Awesome. This is kind of fun because there is a conference called Coalesce, and my teammate Adam Stone and I did a talk that was recorded at last year's Coalesce in December '21 called All the Data Managers I've Loved Before. And it is an entire talk that answers the question. Thank you. It's so much fun to put together. And so I think that probably to distill that down, a lot of it is really being aware of the people side of data, the challenges that this career can bring for those who are on the data team and for the stakeholders who are needing data, and just really think about how can I think about the people side of this and how people are feeling about these challenges. Because we can really dive into the numbers and the facts a lot and kind of forget
Starting point is 00:49:07 that there are people behind all of this and even all the business, it's people in the very bottom. It's always people. So yeah, thinking about everyone involved as humans and what they really need can really help. So helpful. Sri, how about you? Yeah, I definitely agree with that. I'll try to offer an additive or a different perspective. I think it's really important if you're a manager of a data team to understand what is your purpose and what value are you driving to the organization. And it's not just you or your team knowing your purpose,
Starting point is 00:49:40 but the entire organization knowing your purpose. Because this is something that I've seen as a folly of a lot of data teams. And I've certainly made those failures in the past is when you are not aligned with, let's say the rest of the executive team or other leaders on what is your role and what value you're bringing to the company.
Starting point is 00:49:57 And the other corollary to that is the problems that you're solving today, how are they going to morph over the next year to two years? Because then that determines your staffing plan, your hiring strategy, the types of specialized skill sets that you want on your team.
Starting point is 00:50:12 What do career growth opportunities for members on your team look like? And so it kind of boils down to those two foundational things. Like, what is your purpose? Is everybody aligned on that purpose? And how do you see the purpose morphing over the next one, two, three years?
Starting point is 00:50:26 Yeah, I love that. And I mean, that's really good advice for any sort of leader is evangelizing your team in general for any team. So love that advice. All right, Sean, wrap up this question and then we can move to Q&A. Yeah, I think Paige and Sri,
Starting point is 00:50:45 you've already covered it beautifully. I don't know that there's a whole lot more to add. All I can think of is to echo the importance on relationships and to make sure that you're able to get your people into the room as often as you can. I would actually suggest focusing less on the outputs and more on the inputs. That it's critical to help coach the organization on thinking about data from the beginning.
Starting point is 00:51:21 It's still too often an afterthought. And you have to ask that question from the beginning so that you can form a measurement plan so that you can know what the questions are as the product is being developed. As the product is going out the door is too late. You're going to lose so much value that way. And the product owner will just revert back to that gut intuition, which is great up until a point. You know, we're not trying to replace anyone's jobs with the data and the data tools that we're trying to provide. We're trying to, again, empower them.
Starting point is 00:52:00 There's that word again. And before we can even do that, we have to make sure that we're a part of the conversation that we're there at the beginning at product inception to ask, you know, or to say, love the idea. How are we going to measure it to know when we've achieved success? The same way you would ask of your own employees. What is your measurement plan for yourself? What is the measurement plan for our team? What are our OKRs so that we can know whether we've succeeded or missed our targets? Great question that's come up on scale, especially when you think about, you know, sort of like a hyper growth context is what I think about. But when a data team is scaling really fast, what are the main pitfalls?
Starting point is 00:52:55 And, you know, I'll add an augmented question by saying, you know, let's think about maybe kind of that new manager of a data team as well, who may be sort of experiencing this for the first time. Because, you know, in a completely controlled environment, you know, you can kind of like plan carefully every step of the way, right? But when you're moving really fast, things are harder. So Sri, do you want to kick us off with that one? Yeah, I've certainly experienced that both at Yelp and Robinhood and, you know, made a lot of mistakes along the way. So I can kind of share those learnings here. I think when teams expand very rapidly, you know, as leaders or managers, we have a tendency to put order and put structure. But I think building structure when your team is changing every three months is incredibly hard. And so one of the, it sounds counterintuitive, but when you're rapidly scaling, it's often beneficial to rely on people and delegate as much as you can and not worry about the structure and
Starting point is 00:53:56 the process. Because whatever structure and process you come up with, two months later, it's going to be a waste because your team's doubled in size, right? So I think in those situations, it's really important to make sure you've gotten the right hires because hiring is absolutely critical. But assuming you've gotten the right hires for the right roles, you know, making sure that they are empowered to run with the process, run with the problem the way they want to run with it, even if there might be some chaos that might end up being much, much more beneficial than trying to create a lot of structure along the way.
Starting point is 00:54:28 There's someone listening who's saying, yes, throw process out the window. I know that's not what you're saying. That was really unfair. But no, that is really the ability to hand stuff off, I agree, is absolutely huge. And I think that goes back to what all of you said before, which is understanding the purpose and understanding what success looks like, which are key foundations to be able to do that. Sean? Yeah, again, to kind of borrow from what the others have so rightfully called out already, two related things. One is context switching. So as your team scales up and tries to do more things fast, make sure that you put some kind of restrictor plate on how many of those things
Starting point is 00:55:14 are in play at any one time. And, okay, you've hired curious people. That's awesome. We curious people also tend to burn ourselves out. And it's super easy to do to ourselves, let alone to be driven to burnout. And I have seen terrible burnout. I've been there myself. And, you know, Sri, you mentioned recruiting. No manager really wants to be in the recruiting phase because it's super exhausting, distracting, and expensive. It is so much cheaper to keep the people you have, to keep good people in-house, and to pump the brakes once in a while when you need to. Be prepared to push back against leadership and set more realistic targets. But above all, you got to keep the good people you have.
Starting point is 00:56:07 So helpful. All right, Paige, I'm going to put a little spin on this as I ask the question to you. Generally, when a team's growing really fast, that's in response to the organizational demand for data, right? And usually, the mathematical relationship is that the demand outstrips supply, which can lead to burnout, to Sean's point. Can you talk about mitigating that on a fast scaling team? You know, because you have the team dynamics, but then the pressure that's being put on them is really with good intention coming from all around the organization.
Starting point is 00:56:42 Yeah, that is definitely a really interesting challenge. And I've been there a couple of times where really trying to figure out how to not get caught in what we call the service trap. It's also something I picked up from Emily and putting into place things that can protect the data team from that, which can include really a lot of the ability to be kind of ruthless when it comes to prioritization. So that might mean that for a little while, the data team kind of has to be their own product or project manager. I've done an exercise that was really helpful on a data team where we actually said, okay, we've got like a hundred requests. Everyone pick two that are like the top ones. You really think like these really need to get
Starting point is 00:57:30 done. There's so much value here. We're going to put that on our roadmap. We're going to actually decide, like, whatever happens, we work on these. And then as new requests come in, we can say, okay, stakeholders, you and I have said these are the top two for what we're working on, for, like, what I'm doing this quarter. Is this really more important than that? And that helps also get the stakeholders involved in helping with that prioritization. So that's really key, that kind of communication, and, yeah, deciding what's going to be priority. It helps a little bit with slowing down the constant ad hoc requests. So I've done processes like that before that have helped with that feeling.
Starting point is 00:58:10 Yeah. And then also making sure that the data team still can protect some time to do proactive insights. That also helps a lot with keeping that from feeling like we're just a bunch of, you know, folks just hitting the keyboard a ton and sending out reports. Like, the ability to really take time and explore the data and realize that we have some ideas of our own that are actually really helpful and valuable to deliver back to the business can really change that relationship that a data team can have with stakeholders. So those are- Yeah, I think the proactive side of that is such a durable way to build trust, right?
Starting point is 00:58:53 Yeah. And sort of that partnership. Well, we are right at time. This has been so wonderful, everyone. Thank you so, so much for giving us some time. And yeah, we'll have you on the show again sometime soon. Thanks. Yay, thank you.
Starting point is 00:59:08 Thanks for having us on the show. Yeah, this was wonderful. See you soon. Man, Costas, there's so much to discuss. I learned so much. I think it's funny. One of the, you know, we said data teams, but really it's very clear. I mean, one of my huge takeaways is that all of these people are very experienced and considerate and wise managers of teams in general.
Starting point is 00:59:35 And I think, you know, as expected, a lot of the wisdom that they shared with us, you could apply to, you know, almost any team structure. I think one of the things that, two things that were related that I really took away were Sri mentioned evangelizing the mission of the data team. And Paige brought up a concept of not getting caught in a service trap, which is a concept that someone she worked with in LFI came up with. Those two ideas in combination, I think, are really important because data teams can often be positioned as order takers. I need this data, or I need this insight, or I need this report, or this looks broken, or whatever, right? And if you evangelize the mission
Starting point is 01:00:25 and you create or sort of cast a vision that's bigger than just fulfilling requests, but actually pushing value back into the organization, as we've seen from people we've talked to on the show in the past, that really creates a special dynamic among the team. And so to me, that was just really, really helpful advice in general, but
Starting point is 01:00:45 also specifically for data teams. The most important thing for me and the most, let's say, what really inspired me. And that's together with something that was mentioned that in data teams, there's a lot of mobility. Like you can see people that they're data engineers, then they decide to go and like start working as data scientists and go back. And I think that's like part of the beauty of like working as part of the data team. And if you're like a curious person
Starting point is 01:01:30 and the person that really likes to learn and do new things, I think being in a data team is like probably an amazing place to be. So that's what I'm going to keep. And I think we should have more of these discussions about the people aspects and the human aspects of data. Totally.
Starting point is 01:01:53 So I'm looking forward to having more of these in the future. I agree. All right, well, subscribe if you haven't. Thank you for joining another live stream, and we'll catch you on the next one. We hope you enjoyed this episode of the Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week.
Starting point is 01:02:10 We'd also love your feedback. You can email me, Eric Dodds, at eric at datastackshow.com. That's E-R-I-C at datastackshow.com. The show is brought to you by Rudderstack, the CDP for developers. Learn how to build a CDP on your data warehouse at rudderstack.com.
