The Data Stack Show - 73: What a High Performing Data Team (and Stack) Looks Like with Paige Berry of Netlify

Episode Date: February 2, 2022

Highlights from this week’s conversation include:
- Paige’s career path (2:44)
- Paige’s role and responsibilities at Netlify (6:38)
- Sharing data insights (8:55)
- Scope in the context of delivering an insight (12:39)
- Defining “insight” (15:10)
- Where the client journey begins (16:43)
- Miscommunication because of vague terminology (20:06)
- Netlify’s internal knowledge repository (23:01)
- Breaking down Netlify’s hub and spoke model (30:45)
- What data tools to use and when (35:21)
- The metric layer and BI (44:17)
- Next steps in the data space (49:42)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcript
Discussion
Starting point is 00:00:00 Welcome to the Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You'll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at rudderstack.com. Welcome back to the Data Stack Show. Today, we are going to talk with Paige, who is a staff data analyst at Netlify.
Starting point is 00:00:34 Now, she also has engineering chops, and we're going to hear about that. Maybe even throw in a little story about Perl, about writing pipelines in Perl, which will be very exciting. But I'm interested to hear about what I would call the other side of the data stack: the people who are working with the data and delivering it, as opposed to the technology, the data itself, and the tooling around it. And I'm so excited to ask Paige about that, because she bridges the gap between the technology and the infrastructure and the processing and delivery
Starting point is 00:01:11 of actual sort of data products in the form of analytics to the team. And so I'm just really excited to hear about that because we don't get to talk about that very often. How about you, Kostas? Yeah. I mean, first of all, I'm very, very interesting to hear about how it is like to work in work in netlify it's some like it's a product that i also use for my personal blog and it has a very different way of approaching like the infrastructure internet infrastructure so i'm pretty sure and usually it's like something that you that you see that when you have companies that they innovate a lot and they are very innovative and they bring something very new, usually this is also reflected to the practices that they have, like how the organization is, like even to the people that they have there.
Starting point is 00:01:58 So yeah, I'm very happy that we have Paige with us today. I think we are going to have a very interesting conversation that's going to be both technical about technology, but also about the organization, the teams, how we work together, how we communicate analytics and data and all that stuff. So I'm really looking forward to this conversation. Well, let's dive in and talk with Paige. Let's do it. Paige, welcome to the Data Stack Show.
Starting point is 00:02:25 We're so excited to chat with you. Yay. Thank you so much, Eric and Costas. I'm really excited to be here. Okay. We always start in the same place, which is where you tell us about sort of your career path and what led you to what you're doing at Netlify. All right.
Starting point is 00:02:42 Cool. Well, my career path is, it's not super straightforward, but it all has a theme of data and it all has a theme of learning on the job. So my first job with computers was at the IT help desk at the university I worked at, where I got the job, even though I didn't really know much about computers because I answered every interview question with, I don't know, but I can learn. And it ended up proving that to be true. It's been kind of the theme of my career. After that, I had worked in a role where I got to administer databases and write reports. I taught myself SQL for that job. It was really fun. Then I worked in the college
Starting point is 00:03:21 administration at Reed College, where I got to do some programming in C and Oracle PLSQL. And I built an ETL pipeline in Perl, which I don't recommend, but it did work. That's impressive. It was a really fun challenge. Have you kept the scripts, the Perl scripts? I mean, they're work of art. Like you should keep that.
Starting point is 00:03:43 I probably do have it saved somewhere actually on an external drive uh I should frame it yes yeah yeah or create an NFT outreach oh there you go now that is a fun idea yay so yeah after that I worked at New Relic company uh here in Portland where I, as a software engineer, where I got to learn Git and processes for collaborating on building a code base. But what I realized during that job is that I still loved playing with data most of all. And that's been the case at all of these positions. So I moved into a role as a data analyst, first for the support organization and then for product, where I got to work with things like Redshift and Snowflake and Airflow and write code in Python, learn what DBT was.
Starting point is 00:04:33 And then I also got to learn a lot about hiring practices because our teams doubled in size after I joined both of them. And so I got to help with the hiring process and with building a team culture and got some good practice on how to teach an organization to be more data informed, which was really interesting. All of that's been really helpful for me as I joined Netlify in March, 2021 as a staff data analyst. So yeah, that's the career journey. Very cool. Okay. Before we talk about some of the specifics of what you do at Netlify, I have to ask about the IT desk. Okay, so is there like a weird computer issue that a student brought in with their laptop that sticks out where you're just like, that was so bizarre or like what happened here? Boy, it has been a lot of years. So I'm trying to think, I don't think anything's really jumping out. What I remember most about that job
Starting point is 00:05:32 though, was that the first few weeks, what I would have to do was I'd, I'd go to like the professor's office who was having a problem with their computer. And I'd sit down at their computer and they'd say, the computer is broken in this way. it's doing this. And I'd say, great. And then I'd pick up the phone and call my manager and say, they said, blah, blah, blah. What do I do? And he would walk me through how to troubleshoot and solve the problem. And that was a lot of how I learned how to fix computers. Yeah. That probably actually is good to sort of, in terms of working with data, troubleshooting software engineering, like that's a great sort of breaking down the process and all of that. I'm sure that there were just a lot of restarts to, to fix whatever. Yes. Yes. Have you tried turning it off and on
Starting point is 00:06:14 again? Yes. And there were plenty of unplugged as well. Oh man. Plug it in. It might work if we plug it in. If only data pipelines were like that. Okay. So what is your day-to-day job look like at Netlify? I know you probably do a lot of different things, but just help us understand what your role is and sort of your responsibilities and team and all that. Sure. Yeah.
Starting point is 00:06:37 So at Netlify, we have a data team that follows the kind of hub and spoke models. So we've got a few of us who are, we do the hub work, and then we have other people on the team who work with spokes. So different business partners. We've got someone who works with finance, a couple of folks who work with product, somebody who works with growth, et cetera. I'm a member of the hub team. So that means that I do things like I pick up tasks and projects that maybe don't fall under one of the hub team. So that means that I do things like I pick up tasks and projects that maybe don't fall under one of the spokes or, or I will pick up some spoke work. If that, if the folks on that spoke are, have more than they can get done. I also get to spend a good amount of
Starting point is 00:07:19 time doing proactive insights work. So looking through the data that I've been working with, that if I, like I'm doing a task for stakeholder who's requested data and I say, boy, there's something interesting here. Let me get this task done. And then I want to dive into this and actually see what's going on with this data. I get to do a good amount of that work. And then there's also work around like team building culture, building things like that. If we're got to build some data models to get the analysis done, even some data engineering, bringing in data from new sources, I get to do a little of all of that, which is great. So fun. That sounds like a really fun job. It is. I love it. Okay. We have definitely some questions to ask about the team structure on Hub and Spoke. But first, you wrote a post a while back around some of the ways that you share insights that come from data.
Starting point is 00:08:33 And so I love your description of sort of the freedom to go in and explore. But I think it's really interesting how you've sort of taken the results of those types of explorations and syndicated them to the team. So can you tell us there? I think our audience would just love to hear how you do that, because I certainly learned a lot reading the post. Oh, awesome. Thanks. Yes. Yeah. One of the interesting things about when you're searching through data, you come up with an insight. It doesn't really do much unless other people hear about it. So figuring out the best way to get that information shared has been a fun thing to work on at Netlify. So our norm for communication is using Slack. So we have a Slack channel that is dedicated to sharing data insights.
Starting point is 00:09:30 And when we've discovered something that's interesting in the data, it usually helps to figure out, okay, what are the things that are really important about this insight? If somebody's got, you know, five seconds or 10 seconds, and all they can do is just skim the post in the channel, what do I want them to make sure that they're seeing? So that helps to really clarify like, okay, this is what's interesting about this data and kind of get it into a few bullet points,
Starting point is 00:09:55 even like bolding some of the numbers and words so that if all they read are those bolded numbers and words, they understand what's interesting. Like the too long, don't read. Like this is the punchline. Exactly. Exactly. And then it always helps to add at least one chart because this can be really visual stuff. It's super helpful for if someone's all they're going to do is look at a chart, they can see just from that, like, oh, there's something intriguing here in this picture. And maybe that's all I'm going to look at, but I know there's something intriguing here in this picture. And maybe that's
Starting point is 00:10:25 all I'm going to look at, but I know there's something here. And then in the back of their mind, they can remember if they're working on some project and a couple weeks later, it's like, oh, there was an insight about this data. I'm going to go back and read that because it might inform the decision we're making right now. Yeah. I love that. I love the thinking about it, assets that you sort of collect over time that can inform future work, right? Because so many times we produce data or reports or whatever, and you just, you know, basically throw them away. I mean, my Google Sheets is a nightmare of ad hoc analysis. For sure. Yeah. Yeah, exactly. Sorry, go ahead. Oh yeah. So, and we also, we keep these also in an insights feed. So we have an internal essentially blog or
Starting point is 00:11:19 it's really, we have an internal handbook at Netlify and the insights feed is a part of that handbook. So it really is like a blog people can go and read through if they've been away for like on vacation for a few weeks, they can come back and say, OK, what is the data team posted since I've been out and catch up on all of the insights in the insights feed as well. One quick question for you there. And that, you know, as we think about, you know, our audience and all of us really as people who are sort of on some level trying to deliver data products, whether that's as infrastructure or insights, one of the challenging things when you're delivering insights is sort of the scope and the context, you know, cause it's, it's like, you know, if you're seeing a small slice of something, you know, not having a larger context can be challenging. How do you think about the scope and the context of delivering an insight, especially when you think about kind of the,
Starting point is 00:12:16 like the too long, don't read side of it. And I'm asking that selfishly because I'm trying to do a better job of, of this inside, you know, you know, my own role, you know, because if you just share kind of a number without any context, not that that's happening, but, you know, sometimes people are like, I, that's interesting, but I like, I don't quite get it if that makes sense. Absolutely. That makes a lot of sense. Part of the process that we go through when we are doing a proactive insight actually
Starting point is 00:12:44 is captured in a GitHub issue. So as a data team, we actually capture all of our issues, all of the work we do in GitHub. We operate kind of similar to a software engineering team in that way. And when one of us is going to start a proactive insight, we open an issue and begin with the question, what's prompted this exploration? What are we curious about? What has happened recently that made us think, oh, this would be really important to look at?
Starting point is 00:13:16 And all that context is at the beginning of the issue. And then as we do the exploration, we add comments to the issue for every step. So that includes like the SQL code we're running, some of the results from that, like intermediate charts that show part of what we're trying to get at. So we are taking someone on the journey of this exploration with us at through this issue. And certainly not everyone's going to have time to read that, but it can help a lot when we're then going to write the post about the insight to go back to that beginning, like, and include the, I was, I'm did this insight to answer this question. And this was the reason why I was looking into this.
Starting point is 00:14:12 And then this is what I found. That's often how we can incorporate some of that context to let people know why this is actually important to the business. That makes total sense. I think about the Mark Twain quote where he says, I would have written you a shorter letter, but I didn't have the time, you know? And so just hearing about that process, it makes total sense. If you've wrestled that question to the ground, you then have the ability to deliver that in a concise way that includes the context. Very cool. Okay. Costas, I've been, I've been dominating the mic here, please. I know you have questions brewing in your head. Thank you. Thank you, Eric. Yeah, I have like, okay, my first question is, I think, pretty simple, but I want to ask you, Paige, what is an insight for you, like, and for your team? Like, what
Starting point is 00:14:55 constitutes an insight? Oh, that's a good one. I'm trying to think of a good example. I had one that I was working on that was because I'd noticed that there was this particular error happening in the product. And I noticed it completely dropped off at one point. And that was my insight, like something changed here. And I don't know what it is, but I'm really curious as a data person about why this would change at this point. And getting to post that was really fun because then people in our engineering org were able to come into the channel and say, oh, I know what it was. This software engineer applied this fix and we thought it made a difference, but it's really cool to see in this chart, like the impact that that work had. That was a really fun one. And then there are definitely ones around
Starting point is 00:15:52 like connecting specific feature usage to specific outcomes. That's something that is an area that I tend to have a lot of fun with. And so a lot of the insights I dig into tend to be really around that. Like when a team is using this, this is what tends to happen within the next three months. And those are really fun. Well, that's very interesting. So the way that I hear it, like as you describe it, it feels like you're some kind of explorer, you know?
Starting point is 00:16:23 Like you're out there like in the ocean and like you are finding this island and the other island and creating a map. But how do you start this exploration? Like what's the motivation and what's the journey? Yeah, that's another really good question. Thinking about it that way. Initially is prompted by a stakeholder request. There is, there are, we get requests from anyone across the company.
Starting point is 00:16:53 And when we're looking at the data required to fulfill request, we know it's about something that somebody is currently interested in and working on, which is going to tie to our goals as a company. So that's like a good sort of initial pointer, I guess, to what could be important or interesting to know about. Usually once the request is filled there, then you've already become acquainted with the data that is going to matter for this particular thing people are curious about and at that point then you know okay well this is the data that people are asking about let me dig further into here or let me pull in this other data that can inform that in this way that is new and that's often how we get started on one of our insight explorations. No, that's cool.
Starting point is 00:17:48 And one of the issues that I always have when it comes to data and trying to be, let's say, data-informed or driven or whatever you want to call organization is how people can agree upon the definition of the problem that we are trying to solve or the insight. That's why I also ask you about what's the definition of the problem that we are trying like to solve or the insight. That's why I also ask you about like, what's the definition of insight? Because many times, you know,
Starting point is 00:18:10 like for things that we take for granted, like what is the revenue or what is like the cost or like stuff that we instinctively take for granted, they are not actually like the definition behind it might be like quite different. And usually you figure that out when you start working with data, because you have to be very precise, like with how you communicate there. You are very experienced like with that, because you have to work like with many different people.
Starting point is 00:18:38 Like, so how big of a problem is it? Is it, first of all, like a problem? I just think that this is like something that is important. And second, like, how do you deal with it? Is it, first of all, like a problem? I just think that this is like something that is important. And second, like, how do you deal with it? Just to make sure I'm understanding fully, the problem of making sure that the definitions of the metrics we're talking about are agreed upon. Is that where you're going? Yeah. You said that like someone, like a stakeholder might come with a request, right? They express like, I don't know, like a problem that like someone, like a stakeholder might come with a request, right? They express like, I don't know,
Starting point is 00:19:07 like a problem that they have, right? And then you have to start like working and like digging into data to help them with that. But first of all, the two of you, you have like to agree on what the problem is, right? Yes. So that's what I'm asking. Like how big of a problem is this?
Starting point is 00:19:24 If it is, maybe it's just my problem, to be honest. I think about active users. Like, would you say that, Kostas? It's like, hey, could you tell me if our, like, number of active users has gone up, right? And it's like, okay, the number of questions that we have to ask to, like, you know, you could kind of give any number back and it's probably not wrong. You know, it just depends on the definition. Maybe that's, maybe I'm going on the wrong path,
Starting point is 00:19:52 but that's what jumped in line for me. Yeah, no, no, no. That's exactly what I'm trying to say. So yeah, how do you feel about it? Yeah, exactly. I definitely enjoy, I think it's a joke around that of like you know somebody says can you tell me how many active users we have and the answer is well that depends on what you mean by active and users are daily or active or users yeah every word needs to be
Starting point is 00:20:19 unpacked the definition for what someone's looking for that I think is a really big and interesting and gnarly and fun opportunity to figure that out. And I feel like a lot of folks are talking about that in the data profession in different ways that we can try to have a solution for that, like defining metrics in a separate layer. And once we get that defined, that is the, then the source of truth for what that metric means, because that is definitely the part that I considered the, the most interesting, but also can be most time consuming part of responding to a data request. So yeah, there's there, I don't know if there's really, I don't know that there are many, very many times in my career where somebody has asked
Starting point is 00:21:11 me for data and I have been able to just go get it without coming back with at least one, if not several rounds of questions to really understand, okay, what, first of all, what are, what's the problem you're trying to solve, what are, what's the problem you're trying to solve? What is, what's the decision you're trying to make? Because there may be actually other things that I can give you that will help with that even more. But if it turns out that yes, what you're asking for is exactly what you need, then let's still actually figure out the definitions for the words you use to make sure that the data I'm pulling, like what I'm telling the database is what you would tell it if you could. Yeah.
Starting point is 00:21:49 Yeah. That's, that's interesting. And I think we are going to see also like probably products around that, like, okay, we have like the data catalogs, which comes like from the, I mean, the enterprise space and with like products like Colibra, but these are, again, they are like more on the, what I would call like the still on the sedactic level and not that much like on the semantic level, because, okay, we can agree like on what each of the data fields that we have mean, right?
Starting point is 00:22:17 But going from that to the point where we build, let's say, knowledge, like institutional knowledge, like what is like the KPI and how all these things connect together. I think there's a big gap there between the two. And we need more of like a knowledge, like let's say, catalog, probably something like that. And I wanted to ask you, do you feel that this internal as a knowledge repository for the organization where like the data team can communicate like all these not information knowledge actually that is generated by doing all this analysis and creating all these insights yes that's i really appreciate that you
Starting point is 00:23:00 use that word we kind of consider it a knowledge hub and it isn't necessarily only this insights feed, but there are a few things that fit into what we think of as this knowledge hub. So there's the insights feed. We use DBT and we have our DBT docs, which is another aspect of knowledge where our models, our dbt models are all very well documented. And then we have another layer we are using transform, which is one of those metric repository kind of products where we are able to define a metric in transform with an explanation around it. And then our stakeholders can run charts and even do some slicing by dimensions and stuff on these charts, knowing that the definition of the metric is something that the company has agreed on. And so there doesn't have to be that question of like, is this actually really daily active users? Because we know it is. We decided on we kind of think of it as all of this together are different facets of knowledge about the data and the business. And this knowledge repository is something that's like owned and managed by the analytics team or
Starting point is 00:24:37 some other like team of organization? Yeah, it's all owned and managed by us on the data team. So any of us are able to administer any of these systems, add to any of them, edit stuff in there. It's all essentially a big part of our code base. Oh, wow. That's super cool. Can you tell us a little bit more about how you started doing that? How you decided to structure this knowledge repository in this way? And whose idea was that to have a wiki, first of all, there? Right. That's a good one. I'm trying to remember exactly who came up with that. I think it was a combination of our director.
Starting point is 00:25:21 Emily Sherio is the director of data at Netlify. When I started or and someone else on the team I can't remember who else on the team I'd have to look it up but there was it was definitely like others on the team who said hey we could do it this way or let's put it here because we all kind of try to we know this is an issue and we are all trying to think of ways to solve it as a team. So I'd have to look in the history of our Slack to see who exactly suggested which part, but yeah. That's nice. And I want to ask because you are also work a lot with like building teams and like building the right culture. How do you, let's's say motivate a team outside of running the analysis
Starting point is 00:26:07 and working with the data also to sit down and write the right content in order to communicate with the rest of the organization like how how do you do that right that's a that's another interesting aspect of this. I think at first we weren't sure about it when Emily Sherio suggested it. She's the one who really encouraged us to start doing this as a data team, as a way to kind of keep us out of getting trapped into being a service organization and only ever being reactive. When she first brought it up, I think at first we were all like, oh, this is different. This is new. I haven't really done this before. How am I going to make time for it? Is this really going to be helpful? And with her encouragement, we went ahead and started trying it.
Starting point is 00:27:05 And the reaction, the response was so awesome. There were really great conversations in the Slack threads. Every time we would post an insight, there would be questions that came out of it. Other people in the company would have conversations with themselves in the threads. We could see so clearly the benefit that was bringing the way people were thinking about data and in these new ways that it made it easier and easier to make time to do these and get them posted. It's part of our weekly data team meeting where we talk about the insight we did the week before, plan
Starting point is 00:27:45 who's going to post their insight post when to make sure that we don't have like them all clumping up on Monday or something. And it's become part of what we do as a team. And those of us, you know, there are several of us who really try to pay attention to like, hey, if we're not starting, we're not doing as many this week, maybe people got busy. Let's make sure that we pick that up next week. We yours. So yeah, it's really become something that we've tried to weave through the whole fabric of the data team and our processes. That's amazing. One question, and this is really practical, but I think about this service organization. You mentioned that concept and having a background in marketing, I know the pain of sort of, you know, inadvertently becoming, you know, a unit that just answers questions and like try to do that as quick
Starting point is 00:28:53 as you can. And you're not actually pushing value back, you know, which can be a tough place to be in. One thing I think when you start to push value back in to other parts of the organization, especially if you can build a repository, is that the number of questions that you get starts to decrease because you can refer to this like thing where it's like, you know what, that's a great question. Like it was answered, you know, go check this out or whatever. And, you know, you kind of train people to like, maybe I should go search that repository before I ask the question or make a request or whatever.
Starting point is 00:29:32 Is that dynamic happening inside of Netlify? Yes. And it's so exciting. It's another piece that is so valuable about this. We make sure that it's part of the company onboarding, that people learn about it from their very first week. And it's been so exciting to see questions come up in our public team channel and have other people in the organization come in with the answer
Starting point is 00:29:57 because of things that we've posted and published earlier. It's so exciting. It's like, yes, we're all in this together. This is something we all get to do. It's so fun. That's great. I want to talk a little bit more about the hub and spoke model and the different roles that are on the data team. I think this is a really interesting subject. And I think our listeners would really appreciate just understanding some of the learnings that you have from the hub and spoke model. So can you just give us a quick breakdown of who's in the hub and then who are the spokes and how do those analytics engineer, me, a data analyst, and then our data evangelist. I know. I know. And it's another one that I, it's like, I'm not sure, but it's Lori Voss. He does a lot of speaking at conferences. He did the JAMstack survey and has been posting a lot about the
Starting point is 00:31:12 results. Yeah. So it's a fun role and yeah. So there's four of us in the hub and then we've got two data analysts working on the product spoke. We've got an analytics engineer working on the growth spoke, and we've got another analytics engineer working on a finance spoke and a data analyst working on finance. I think that's, I think that's it. I have to count on my fingers to see if I got it all. And then what was the other part of your question? Well, you answered it actually. It was the roles and the spokes. And so one other question there, just because we're getting back to definition of terms here. We're coming full circle on the conversation. So it's interesting to me, and this may just be semantics and role names, but that you only mentioned one data engineer, which makes me interested in sort of
Starting point is 00:32:05 the, the scope of skills and responsibility, an analytics engineer, you know, working in a specific business function. Do they sort of, do they encompass like some data engineering, you know, type tasks and some analytics tasks as well. That's correct. That's exactly right. All of us on the team are able to do work across the board. We would probably need some of our data engineers assistance with some of the data engineering pieces, but those of us who are analysts, but the analytics engineers can absolutely do really any part of that stack of work. So it works fine to have just one person in that role supporting a spoke because they really can do like what is needed. Very cool. Yeah, that that is just as we've heard about different team structures and stuff, those people, I would guess, are extremely hard to find, right? Because, you know, sort of those two skill sets, you know, someone who can really handle the technical side of it and the engineering side of it and also the analytics side.
Starting point is 00:33:14 But it makes total sense. Like that's such a high value role to be embedded in another team. And how do, I love that we're getting to like data team management stuff here. How do the embedded analytics engineers, who's their boss, I guess, is I'm not, I couldn't think of the right business buzzword. No worries. Yeah, exactly. Yeah.
Starting point is 00:33:38 So before Emily left, she managed the entire data team. So she was the manager for everyone, hub and spoke. And the person, say, who is the analytics engineer on a spoke would get their prioritization and their work tasks mainly from a specific stakeholder on their spoke team. Now, at the moment, we're actually like hiring for a director. So while we're in this position, we have folks who are on the spokes actually being managed as well as given prioritization tasks by the person that they work with on the spoke. And those of us in the hub have an interim manager. But the idea is that when we have a director of data,
Starting point is 00:34:29 again, they will again become the manager for the whole team. This is great. So Paige, let's focus a little bit more like on the technology side of things, because I think we have covered, actually it was super interesting to have this conversation. We don't have like this kind of conversations too often.
Starting point is 00:34:44 Like we focus more on the stacks and like the technologies that people use. And so that was amazing. But you mentioned that you wrote your first ETL pipeline in Perl, right? Yes. Do you still do that? I do not. That was, goodness, it was a while ago, maybe 2017. It was a little while ago. So what kind of tools you are using? You mentioned something like you mentioned,
Starting point is 00:35:13 for example, like dbt explorer, which is like a metrics repository. What kind of tools you are using today? Yeah. Yeah. We use dbt. We use transform as our metric store, we use snowflake as our data warehouse, airflow, I'm trying to think what else and then we also use, we use five tran, we use census for reverse ETL, or operational analytics. I think that's mainly it. And then in terms of how we do our work, we use Git for GitHub for our code base and stuff. What else? Are there any areas of technology that you're curious about that I haven't mentioned? Well, I'd love to learn more about Explorer because metric repositories is something that we haven't touched that much. But before we get there, I'm wondering, you mentioned like Fivetran, for example,
Starting point is 00:36:12 which is of course like helping you like with your pipelines. Do you also build your own pipelines or you rely only on Fivetran to move the data around? Our data engineer also builds pipelines. Okay, why? Oh, that's a good question, I would have to ask him. Yeah, I'm not sure that was set up before I got here. Yeah, okay. Why, I guess, why are you asking? What, what, is that a surprise or? No, it's actually
Starting point is 00:36:44 something that is happening okay first of all like i have like a very long background in data pipeline right and like for the past like seven years together with five tram like the whole industry is trying like to let's say abstract as much as possible like this eel or whatever process that we are talking and remove the need of like creating custom scripts right right? So you can use, let's say the platform and have like this SaaS cloud experience, like in connecting your data source and move your data around. But it seems that still it's not enough. And I'm trying to understand why.
Starting point is 00:37:16 And that's why I'm asking you. And I have asked like also other people, to be honest, because it's not that uncommon. Actually, it's pretty common to see that you might have multiple different vendors in the same company used, like Stitch Data together with like Fivetron, for example, or you can have one of these and at the same time have like a number of Python scripts that exist out there to do like some of the pipelines for different reasons, right? Like some people, they do it for performance, like because the specific pipeline that Fivetron has, it's slow, I don't know, or for cost reasons.
Starting point is 00:37:52 But yeah, that's why I'm asking, because maybe you could add a little bit more color and your own like experience on why we still have to build our own pipelines. I see, got it. I see. Yeah, I'd have to think about that probably from our data engineer's perspective. And that would actually be something I'd really be interested in asking him about. So I think I will do that later today. Thank you for
Starting point is 00:38:18 that prompt. But it's very possible that that's still a holdover from what we had to do before. And that when you've got that stuff figured out for your company, you've got all of the whatever logic you might need to get the data in the right shape. It could be that it's difficult to just move away from that. Or maybe there just hasn't been enough of a reason yet, but I would really need to ask him to be sure. So it's a great question. And thank you. Just let me know after you learn about it. Like I'm very curious. Yeah. You didn't mention something as part of your, your like stack. You didn't mention anything about the
Starting point is 00:39:02 visualization layer and something. What do we use there thank you we use we use mode for visualizations for one of our insights and exploratory work and then we use the the visualization capabilities in transform and our metric repository for for stuff that that is a little less exploratory, for stuff that's really more for stakeholders to be able to quickly get a number or do some kind of like higher level dimension slicing. So the two kind of work together, but we're also still always try to communicate about like, this is when you would use mode and this is when you would use transform. Can you help me understand like when I should be, let's say I work for Metric, right?
Starting point is 00:39:55 Right. When should I use one and when I should use like the other? So transform you is great for metrics that we've already configured out as a company already have definitions for, and that you, that you're interested in looking at or learning more about that. We already have available the dimensions you'd like to slice by mode is going to be better. If, if you're kind of starting from more raw data. We don't have the metric really defined yet. You're kind of trying to explore how different models maybe interact with each other. That's kind of where we start for, let's say, building the definition for a metric.
Starting point is 00:40:38 Yeah. That's super interesting. I'm wondering how the process looks like when you start from the exploration part, right? Like using mode, until you reach the point where you have a metric that you can well define it and put it into Explorer. What happens in between? How does this work? Yeah. I'm very curious to hear about that it's right yeah i'm i'm thinking about work that a couple of my teammates did to define something called activation where they had a list of it was several people working on it. We had, I think two of our analytics engineers were working on it from our, the data team side. And they had a list of things you're going to look at, like different metrics, different things teams might do and at different like time sections, I guess,
Starting point is 00:41:39 from after signup and when they might do them. And so they did a bunch of work in this GitHub issue and used mode to do different visualizations and different cuts of the data to start to really understand how we can actually define like activation for a team. And they were able to kind of narrow down the funnel of options until they came up with, you know, activation is this. And when a team
Starting point is 00:42:06 has done this, that's considered activated. And now we have that. So that metric goes into transform where somebody can run a chart to say, what's the like number of activated teams weekly since the beginning of the year. And you can see, you can see where that number is going or what's the percentage of activated teams that have done X or Y. And you can look at that and transform, but that's because we have that metric that we figured out by doing all this exploration in mode with all this raw data. Can you use Explorer together with BI tool like Mode or completely disconnected? When you talk about Explorer, is that Transform? Yeah, sorry, sorry, sorry.
Starting point is 00:42:52 That's okay. Yeah. Yes. So Transform did develop recently a Mode connector. And so we have been using that. We set that up a couple of weeks ago, I think, a few weeks ago, and have been doing visualizations in mode with metrics that are defined and transform, which is really handy because we do have people kind of doing work in different areas.
Starting point is 00:43:19 So for instance, if I know I want to do an exploration that has to do with revenue, I don't have to necessarily go like talk to the analytics engineer working on finance and say, hey, what's the newest definition of revenue to make sure I'm getting it right? We have that definition in transform. So I could either look at the SQL and transform, or now I can actually just pull in that metric into mode, use it in my visualization. And I know that it is the correct definition of revenue. And so I'm getting the right numbers. Okay. And I mean, if I understood correctly, you define the metric in transform with SQL? Yes, It's a combination of SQL and YAML files. Oh, okay. I have to check transform. I was also, because I saw that you have like on your background, like at least like the conference that's happened like last week. Yes.
Starting point is 00:44:28 And I think there was a lot of discussions there about like the metric layer and like this new thing that it's called like the headless BI, whatever that means. Is this like a rebranding of BI in your opinion, or is it just like, let's say not rebranding, but iteration on the BI because the needs today are different than they were like 10 years ago when Looker started, for example. Because Looker, if you think about it,
Starting point is 00:44:54 like with these two layers that he had, like the visualization part and the LookML parts, I mean, with LookML, you were pretty much defining also, let's say, metrics, right? Okay, it wasn't called like that back then. But I'm just wondering about that. I really tried to position, let's say, the product category of the headless BI and the metric layers. And what's the differentiation at the end and why it has to be created compared to BI tools?
Starting point is 00:45:25 So you have been working on that for quite a while. So how do you feel about that? I'm trying to think. Do I feel like the talk of the metrics layer as like a rebranding of BI or just an extension of like where it needs to go next kind of? I will have to think about that I really will uh because I mean I think so much of what so much of what I feel like we're all trying to figure out
Starting point is 00:45:55 is how do we make working with data something that is easier for our stakeholders and users and also easier for data analysts. And so trying to find solutions that sort of make life better for both groups is, I think, what we're working on. And so that continues to be, because if you go one too far down one road, yeah, it may be easier for analysts, but it isn't great for stakeholders. You go too far down this other road. Yes, it's much easier for people to do a little more self-serve stuff, but it can be really, really hard to get there in a sustained way. So I guess this is a new, maybe a different way of trying to solve that problem of maybe kind of isolating a certain set of like difficulties and saying, okay, we're going to address fixing this set of difficulties
Starting point is 00:47:01 with this new, like bunch of tools or areas of creating tooling in. Yeah. I don't know if that helps. I really am thinking about this as I talk. So I may have like more to say later. I think I don't think that it's easy to answer anyway. Like, and the reason that time I asked you was more like to have this kind of discussion, not like get an answer.
Starting point is 00:47:26 And I couldn't stop thinking while you were talking about that. 10 years ago, when we were talking about BI, it was all about visual interactions, no code, no SQL. Everything should be self-serve without any technical knowledge. And when I asked you about Transform, you told me, oh, it's SQL and a bunch of YAML files, right? Which is not exactly what, let's say, a CFO would ever use, right? Right. So I see like this back and forth
Starting point is 00:47:57 between being, let's say, completely non-technical to getting to a very technical. And I think at the end like the solution is probably somewhere in between probably we need both right and we need to find like let's say the right balance there so there is a reason that all that stuff exists but yeah it's not answered yet that's what I'm trying to say like I think we're still in the process to figure it out and all these tools are just like attempts like to try and figure it out so yeah yeah that's great i just wanted to say i mean i think part of it is that stakeholders are and end users are not
Starting point is 00:48:30 a monolith and so different there are different like layers of ways that they can use data and so yeah so we're just still continuing to try to figure out that there aren't necessarily solutions that's going to work for absolutely everyone but maybe we can get like areas or groups. Okay. This is a good solution to help like this type of end user. This is a good solution to help this group of stakeholders. Yeah. Yeah. Eric, you have the microphone. I think we're, I think we're close to time here. I just need to say though, I really appreciate you distilling sort of all of the complexities of these challenges into saying, you know, we're all trying to figure out how to make, you know, working with data easier, both for ourselves and for the stakeholders that, you know, that we serve. And really when you distill down all the tooling, all the team structure, all that sort of stuff, I mean, that's, that really is the goal. So I just, I appreciate that. I think that's like one of the best concise, you know, sort of
Starting point is 00:49:34 explanations of, you know, what, what we're all trying to do. One last question for you, since we're, since we're at time here. So you have such a wide skill set. You can do data engineering. You work day-to-day as an analyst and many other things. What advice do you have for our audience? If they say, okay, I'm listening to Paige and I would love to work on a team like that, in a role like that, but I'm maybe early in my career or I'm maybe working at a company that
Starting point is 00:50:07 sort of maybe doesn't value data or the data team in the same way that, you know, it's very clear that Netlify does like, what advice do you have for them to sort of take some next steps? Oh, that is really cool. One of the things that I, I realized that has helped me get to where I've, I've been able to and get the experiences that I've had is I've managed to keep my, I guess, my excitement about working with data, regardless of what else is going on. It's like, there's this fundamental love of what working with data is and what it can do. And that no matter what else is going on, there's, I get to get into a database and look at data and figure stuff out. And that is always so much fun. It's like just enjoying the process of
Starting point is 00:51:05 discovery the data has and keeping that passion and love is something that has really helped me no matter what situation I've been in, because it's motivating. It's always been motivating me to find that joy, even when stuff around is difficult. And that's a, yeah. I think another aspect of that is always being curious and curious about what new stuff is going on in the industry, that curiosity and willingness to learn and be interested can open a lot of doors. It really can. There's a pretty awesome data community, especially around the Coalesce Data Conference I was at. There are people who love sharing.
Starting point is 00:51:58 We love sharing our love of data and working with data, especially with people who are earlier in their career. And so getting connected to a community of people who have this passion, who are continuing to try to make the industry better for all of us is really powerful, especially early in your career. That is so helpful. And I think two things here. One, it's very clear that you've maintained that sense of curiosity and love of exploration, which I think is great. And just from my own experience, I know that maintaining that, if you've gone through the process of sort of building teams and, you know, a company tripling in size, like it,
Starting point is 00:52:38 it's hard, you know, just the, the, the physical pressure of a system that's expanding that quickly, you know, can kind of snuff that, you know, the flame of curiosity out. So I think that is wonderful advice. Well, unfortunately, we are at time, but this has been such a good conversation. I learned so much and it's just been really fun to hear about sort of different sides of data that we don't normally get to hear about on the show. So thank you. Yay. It has been an absolute blast for me too. I've loved every minute of it. Thank you both so much. Okay. Costas, I, I, my takeaway is, and I kept thinking this throughout the conversation. I'm just sort of blown away by how many different things it seems like the Netlify
Starting point is 00:53:20 data team is doing that you sort of imagine as the best way to do something, you know, sort of actually having a team structure where you have embedded analytics engineers that have, you know, you know, sort of broad and deep skill sets, having a knowledge repository, you know, each person, each analyst on the Data Team delivering insights on a weekly basis to the team, managing that with version control in GitHub, the exploration process. There were just so many things I was like,
Starting point is 00:53:53 wow, they are super high functioning, it sounds like. And it was extremely impressive. Yeah, yeah. 100%, I think it was probably one of the most interesting conversations that we had so far when it comes to the organizational side of data. And when you hear, when you listen to Pages, you almost want to be part of this team. Something great is happening there.
Starting point is 00:54:26 And at the same time, they're also having fun, you know, which is amazing. I mean, that's the holy grail of like work environment. So and what I would like to add to what you said, which is something like I 100% agree with you, is that, yeah, it is like a matter of the organization. But it also like heavily starts and it's's related with also what the individual does. If you remember when we asked her about, okay, how did you convince your people there to start writing all this content as part of the everyday job? It was amazing to hear from her how determined the people were to do that
Starting point is 00:55:04 because it wasn't part of the job description anyway right how it might became at the end because yeah they show the value of this yeah but it's amazing like when you get people who love what they are doing what's what can come out of this yeah yeah it's like it's it wasn't like a result of like reacting to some sort of pain and so we're gonna like deliver this right oh yeah and or it wasn't like we brought you know like a consultant from the big four or whatever and like they told us you have to do that if you want to i feel like yeah if that came out of out of the big four i think uh we'd have to have that person on the show yeah anyway it was amazing like i really enjoyed the conversation with here and like
Starting point is 00:55:54 i don't know maybe we should have an episode where we have the whole team from netlify on the episodes and it'd be awesome let's's do that. Yeah, they all direct together. I think it's going to be fun and very interesting. Be great. All right. Well, we're out of time here. Thanks for joining
Starting point is 00:56:11 the Data Stack Show and we'll catch you on the next episode. We hope you enjoyed this episode of the Data Stack Show. Be sure to subscribe on your favorite podcast app
Starting point is 00:56:20 to get notified about new episodes every week. We'd also love your feedback. You can email me, ericdodds, at eric at datastackshow.com. That's E-R-I-C at datastackshow.com. The show is brought to you by Rudderstack, the CDP for developers. Learn how to build a CDP on your data warehouse at rudderstack.com.
