The Data Stack Show - 31: How a 160-Year-Old Publisher Is Using Data with Jenna Lemonias from The Atlantic

Episode Date: March 31, 2021

On this week's episode of The Data Stack Show, Eric and Kostas chat with Jenna Lemonias, director of data science at The Atlantic. The Atlantic, a publication that's been around since 1857, is adapting with the times and is implementing and emulating some of the data science practices seen at big tech companies. Highlights from this week's episode include:

- Jenna's background in astrophysics and how she pivoted to data science (2:14)
- Differences in dealing with data at a fintech company and then at a publication (4:40)
- The relationship between analog and digital data at The Atlantic (9:22)
- How The Atlantic structures its data science team (11:44)
- The role data engineering plays (14:42)
- Using natural language processing and machine-generated metadata (17:37)
- The Atlantic's data stack (28:22)
- The kind of data that's important to The Atlantic (29:44)
- Big projects forthcoming for the data science team (37:13)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Transcript
Starting point is 00:00:00 The Data Stack Show is brought to you by RudderStack, the complete customer data pipeline solution. Thanks for joining the show today. Welcome back to The Data Stack Show. Many of our listeners have almost surely read content from the publication The Atlantic, a very important publication that's been around for over 150 years. And today we have the privilege of talking with Jenna, who runs the data science practice at The Atlantic. My burning question is really around the fact that they have both digital data and analog data.
Starting point is 00:00:46 And I don't know if there's anything there, but I think it's just really interesting to think about the role of a data scientist who has probably an unbelievable wealth of digital data, but also has to consider that they have a lot of customers who read the analog version of their publication. Kostas, what do you want to find out? For me, it's going to be very interesting to see how exactly data science fits in a publishing organization. I think it's the first case
Starting point is 00:01:10 of a publisher that we have on this show. And there are many different aspects of data that they can really use there because they have the content itself and they also have all the rest of the different data that we usually talk about, like customer-related data and financial data and all these things. So I'm very curious to see
Starting point is 00:01:30 what's the difference between them and any other product out there that relies on data. Great. Well, let's talk with Jenna. I am so excited to talk to our guest, Jenna from The Atlantic. Jenna, welcome to the show. Thanks very much. I'm happy to be here. Okay. I always say I have so many questions for you, which is true of every guest, but the media space and data has been a fascination of mine for a while, just because it's undergone so much change. But before we get into that, Jenna, we love to learn about the backgrounds of people who work in data. And so
Starting point is 00:02:05 could you just give us a brief background of where you came from and how you ended up working in data at the Atlantic? Sure. So I went to graduate school after college intending to do scientific research and pursue a career in academia. I ended up getting my PhD in astrophysics, but decided to bow out of academia for a number of reasons. I found myself in the Bay Area after graduate school and decided to pivot to data science. Of course, data science and all things tech are ubiquitous in the Bay Area. I started at a fintech startup where I was the second member of the data science team and learned a ton there and made my way over to the Atlantic, where I'm really happy to be now. Very cool. We may need to do an episode on academic backgrounds
Starting point is 00:03:00 of people who work sort of in modern data context, because that's just been a repeated theme throughout the show. And actually, Costas, you came from an academic background as well. Yeah, yeah, that's true. I mean, it's very common to see people that are working with data these days that they are coming from a very diverse scientific background, and especially like people from physics. We had another guest who was part of his PhD. He spent a big part of the PhD program doing big data analytics with CERN. And then he went into the industry and started working with data there. Mathematicians. We had another one who was doing in neurosciences, I think. So we have seen quite a few PhDs from these more scientific, let's say, disciplines going and working with data.
Starting point is 00:03:45 So it's very exciting to see that this pattern continues. I also have like an academic background, but I've never pursued like a PhD. I avoid it. But yeah, I was working in the academic environment for quite a while doing stuff around data. So yeah, it's really very, very, very interesting. And I think we should have an episode just for that. Like, I think there are many insights like to draw from there. And it's really nice to see all these people, by the way, being part of the industry, because
Starting point is 00:04:12 they have a lot of like unique talents that can be extremely, extremely useful. Yeah, absolutely. So I'll start with a very quick question on what you mentioned, Jenna. You said that you started working in a fintech company and now you are in the Atlantic, which is publication, still working with data in both cases. How is the experience different from going from fintech to a publisher and what things are common as you work again with data? Good question.
Starting point is 00:04:41 So one thing that's different is that when I was at the fintech company in the Bay Area, I was the second member of the data science team. And so it was obviously a really small team of two, learned a ton, learned a lot about really what it meant to be a data science team within the context of a business. You mentioned how so many of us have come from the academic research side where you can spend months or years on a research project just because it's interesting. Whereas, of course, in the business side, you want to do projects that you know will have an impact on the business or at least have a high likelihood of having an impact on the business. So that's a big part of what's similar about the two. And just in general, it's been interesting to be at the Atlantic, which is a 160-year-old institution, right, that's now trying to essentially emulate some of the data science practices of tech companies. So it's been really exciting to be part of that transition.
Starting point is 00:05:56 Oh, that's very interesting. And how's your experience with that so far? Like, how is it that such a well-established organization in a very established market also, trying to adopt new methodologies, new techniques that are coming from the tech industry? How do you feel about this and how you have experienced inside The Atlantic? So much of what we're trying to do is really similar to what a lot of other companies
Starting point is 00:06:22 are trying to do and that we're trying to understand what makes someone want to pay for our product. And of course, our product in this case is journalism, but we have a lot of the same data that other companies do, right? We can see how people are interacting with our site, what they're reading and what makes people decide to convert. What articles are they reading prior to subscribing? And so a lot of it is so similar. So quick question on that, Jenna, which is interesting. So Costas and I both work in product and have sort of been around product. And when you think about a typical SaaS product,
Starting point is 00:07:05 let's just use like a new feature rollout as an example here. So you roll out a new feature, it allows a user to do X, Y, Z, right? They accomplish some sort of task, right? And so there's certainly a similar paradigm when you think about content, but a lot of times the way you measure activation on a feature and product is, I wouldn't necessarily say binary, but what you're looking for is, are people using this or are they not? Is there a different approach with content where there's a much more qualitative nature to it? I mean, the person is reading the article, they're engaging with the content, but is it harder to sort of triangulate performance, if that makes sense, relative to say like a new feature you roll out that allows a user to do X in a FinTech app? Yeah, that's an interesting question. I would challenge in some ways the idea that it's that
Starting point is 00:08:04 different. Of course, when someone is interacting with the print version of the magazine, we do not have data on that at all. But when we're thinking about how people are using the site, we can think about whether they're saving articles and whether they come back to them later. And so that's an example of a product feature that we can actually very easily quantify. And when we're thinking about depth of engagement and overall engagement with our journalism, we can look at how often are people coming back to the site. And so I think it's actually not that different in the end. Yeah, that's a really, I think, really valuable way of looking at it. And you mentioned print.
Starting point is 00:08:51 One thing that is really interesting to me is, I mean, you have some access to data around print sort of distribution and readership. How does that play into the work that you do in data science at The Atlantic? And especially, I guess I'm interested in, you know, in combination with the digital data that you have, because that's obviously more abundant and probably more accessible. But we'd just love to know about the relationship between those two types of data. Sure. The easy answer is that we really don't have that much analog data. We know who is receiving a print magazine and who isn't, of course. And so we can
Starting point is 00:09:32 take that into account when we're interpreting users' on-site behavior, right? So if someone isn't coming to the site as much, or they're not opening the daily newsletter and they have a digital only subscription, then we know that this is someone who is potentially concerning or at risk of churn and we want to try to re-engage them. If, however, we know they receive the print magazine, we know there's a chance that they are really valuing their Atlantic subscription, but just valuing it in a different way and just reading our articles on the couch, in the magazine over the weekend. And so that's something that we just don't have data on. There is a small audience research team that's separate from data science that conducts surveys and interviews where we can glean some insights from that. But it's really not something we can
Starting point is 00:10:26 quantify other than helping us to interpret on-site behavior. Very interesting. One quick follow-up question on that. Have you seen any patterns, and this is just more of sort of a, just, I have an interest in human behavior. Do you see any situations in which a print subscription actually increases online engagement for maybe types of content that you're producing online that don't make it into the print edition? Like the different formats may feed off of each other? That's interesting. I don't think that's something we've looked at in particular. Actually, the vast majority of our journalism isn't in the magazine. And so in order to really benefit from the Atlantic as a whole, we would want those people to come on site. But again, there are people who just prefer that analog reading experience.
Starting point is 00:11:23 That's pretty interesting, Jenna. My question is a little bit different, has to do more with how they work and the teams are organized inside the Atlantic. I mean, the data science department that you are. So can you give us a little bit more information about your team, your role in the team, and how the overall data science organization inside the athletic is structured? Sure. So data science is part of the growth team. There are five of us on the data science team, and we run the gamut in terms of experience. We have an analyst who's done a lot of digital marketing, and we have people who are also focusing more on predictive modeling and natural language processing. And so the data science as a part of the Atlantic started several years ago, knowing that we
Starting point is 00:12:16 were going to launch a paywall at some point. And so as such, we are mostly thinking about subscriptions and propensity to churn and propensity to subscribe. But we're also, and as part of that work, we're working closely with marketing and product and engineering, but we absolutely work across the business. We work with people in the newsroom and the advertising side as well. All right. So would you say that like your primary engagement inside the organization is with marketing? Probably not primary. I would say both marketing and product.
Starting point is 00:12:57 Because there's a lot. And of course, there's a lot of overlap between the two. Okay. Yeah, that's a very interesting distinction that I would like to learn a little bit more, to be honest, if you can share some information. So inside the Atlantic, what's the definition of product and how it differentiates between, for example, the people who write the content, right? And there's like the traditional kind of approach in publishing. So can you share a little bit more about that? Because that's especially for me, because I'm very product-oriented, it's super, super interesting to hear about.
Starting point is 00:13:28 Sure. I think you might be asking more about the editorial side in the newsroom. So all of our writers are on editorial or the newsroom, and they're completely separate from product, actually. So when we think about product, we think about the app, we think about our subscriptions product, right? So are we offering discounts or is there a way to upgrade or downgrade your subscription? And also just use of use of the website if you can, if we want to add a feature where you can follow your favorite writer or something like that oh wow so the atlantic right now is not that different than a tech company in the valley right yeah exactly yeah that's super interesting that's super interesting
Starting point is 00:14:17 and it would be like amazing to learn more about this transition from like such a well-established and old institution turning from a publisher more of, let's say, a tech company that part of the product is the content, which is, I think it's an amazing journey. You mentioned also engineering and that you work with engineering. What's your relationship with them? How is the data science team work together with them? So we have a small data engineering team. And so we think of them
Starting point is 00:14:47 as essentially providing raw data for the data science team. And so they're building out ETL pipelines. They are managing our workflow management tools. So we're moving on to Apache Airflow. And then of course, we're working with engineering in terms of setting up web analytics and just making sure that everything that we want to track is trackable so that we can ensure that we can, for example, quantify how well a new product feature increased conversion or decreased retention and what the adoption rate was. That's very interesting. So you don't like the data scientist team is actually, let's say, the consumer of the output of the data engineering team, right?
Starting point is 00:15:36 They are there to support you and make available the data that you need. So you can build your models or do your analysis and drive like business decisions and the product is this do i get it correct yeah yeah that's mostly right okay that's that's perfect so i'll get back i'm sure that also eric like has questions to ask about how you work together and the technologies used there with data engineering but do you think you can share with us how is like a typical day working with the data in the Atlantic? How does a project start in the data science team? How the work is organized? What you're doing with this data? What kind of iterations you do with them? And in general,
Starting point is 00:16:16 give us a little bit deeper insight of how data science work with the available data in the Atlantic. We have a small team, but because of that, we're all doing a little bit of everything. And so data science work runs all the way from business intelligence and dashboarding to A-B testing, deep dive analyses, all the way over to predictive modeling and natural language processing. And so we're obviously each one of us specializes in one or two of those areas. But so our day to day work really depends on which one of those things we're working on. I definitely try to emphasize that we are part of this institution that we're trying to support. And so we want to make sure that
Starting point is 00:17:08 any insights we have, any models we build are really actionable. And so I do place a lot of value on making sure that we're communicating our work to other teams. Jenna, Eric, jumping in here, could you tell us more about the types of, I know you may not be able to share everything, but what types of projects are you working on that involve natural language processing? We've built a topic model and we're also trying to build out our metadata. So metadata is an interesting thing in the journalism world that I can talk a little bit about. So metadata is essentially at a really basic level labels for each article. And so we can think about that as the author of an article or the section it was published
Starting point is 00:18:02 in, for example, science or politics, but we can use NLP to assign other metadata for each article. So we've assigned topics from a topic model to each article. We've also run some named entity recognition models as well. We can assign sentiment, of course, to each article. And the point of doing all of that is essentially to make our analyses more sophisticated, to make our understanding of readers' engagement with journalism more sophisticated. We can use it to power personalization, research engines, et cetera, et cetera. This is great, actually.
Starting point is 00:18:47 It's metadata is a very interesting topic. The use cases that you mentioned there, from what I understand, most of them are internal. You are trying to understand like the content and how the users interact with it. Do you use this metadata also as part of the product? Like the first thing that comes to my mind is how you can provide a better search functionality, right?
Starting point is 00:19:07 Let the users browse the content that you have based on these topics or provide recommendation systems for that. So what is the value of like this metadata to the product itself? So we're absolutely thinking of this as something that can lead into recirculation and recommendation engines.
Starting point is 00:19:25 There's also the idea that it can power essentially category pages so that you can click on coronavirus and it'll just automatically list everything that had the label coronavirus with it. We're not quite there yet. Of course, there's an interesting tension between the data science and then the real world product, because of course, as data scientists, we understand that not every topic assigned is going to be 100% accurate. We understand that some models are better than others. But when it comes to actually putting something on the page that a reader will see, we want to be a lot more careful and just have a higher bar for what gets
Starting point is 00:20:14 on the page. And so we're not quite there yet where we're okay servicing this type of thing to the reader. We'd want to have some sort of essentially veto power. Makes sense. Makes sense. So traditionally in the publishing world, because, okay, I guess that this kind of metadata, it's not like something new people use like to catalog content and try to come up with topics and all these things for like many, many years before technology was there in data science. So how was traditionally this done and how useful it was in the past? Of course, before we had machine generated metadata, you could have human generated metadata. And I believe this is still a team at the New York Times.
Starting point is 00:20:58 And so it absolutely can be really powerful. And we've actually also been thinking about what type of metadata we might want that we would need to be human generated, essentially. So if we want to label articles as a personal essay or as something written by a presidential candidate. We could probably develop some sort of sophisticated algorithm, but at a certain point, it's possible that it's just easier and faster to have it labeled by a human. And so that's absolutely how it used to work. That's very fascinating. So is there today any kind of cooperation between these two? Like you have the team who
Starting point is 00:21:46 is responsible for coming up with this metadata and the topics and creating categories of the content that you have. And then you also have the algorithms. I'll give you an example just to make the question more clear. Do you use or do you see using the data that is coming from these people to feed and train your models,
Starting point is 00:22:01 for example, or vice versa, use the output of the models to help them much faster come up with these topics? Is this something that's happening today or thinking of doing it in the future? So the latter, we would definitely think of doing. I think if we wanted to develop essentially category pages, we would want an editor to be able to go
Starting point is 00:22:21 into our content management system and essentially be able to say, our content management system and essentially be able to say like, no, this is wrong. We don't want this to be in this category. But I will say, I guess we're a small enough company that we don't have a lot of that human generated metadata right now. I think it's something we might do more of. But of course, that is outside of data science world. That's great. Quick question around that. So you said that you don't have right now the amount of data that you need to train
Starting point is 00:22:54 these algorithms. Can you give us like an example of algorithms that you are using? I don't think it's the quantity of data that we're lacking because we also have a large archive. So yeah, I think that isn't the problem. We're using, we've been looking at spaCy in terms of named entity recognition and some other natural language processing capabilities. That's cool. And what's your experience so far with it?
Starting point is 00:23:21 And how do you feel about the technology? I mean, have you seen it progressing in this space? And what do you anticipate in the future in terms of like improvement in this algorithm? We had originally built out one of these models a few years ago, and then recently updated it with a newer model. And it vastly outperformed what we had seen before. So we're absolutely seeing a lot of progress. And so we're excited to put it to work. Very cool. Question on models. And we've talked with multiple data scientists on the show. And one interesting subject, especially depending on different industries, is just the question around bias and models or the way that you build models.
Starting point is 00:24:11 And one thing that's interesting, and this could be not even right word, but I can't think of a better one. So the editor, in some senses, sort of saying, you know, these these topics are important for this edition of the magazine. This is what we're going to write about. Do those sorts of things influence the way you think about building models? Because it seems like there can be a significant human element in what actually gets delivered in the product as content that's an editorial decision. Yeah, that's a good point. And I guess I'll say that most of our modeling thus far has been more on the propensity modeling. So on the subscription side, so it's pretty independent of the editorial
Starting point is 00:25:13 side of the Atlantic. When we do get to the editorial side, I think we absolutely know that our editors are the experts here and we don't try to tell them what to write about. We essentially tell them what is performing well, what kind of content is really resonating with our audience. And then the final decision is with them. As for the models that do have to do with the articles. I'd say we're still in the early, early stages
Starting point is 00:25:49 of trying to figure out what we'll actually be doing there. That's really interesting. And I mean, it's really neat to hear that there's a relationship between the people making editorial decisions and data science. I just think that's a really cool, well, I mean, cool is one word for it, but I actually think a very modern way to approach operating inside of a publication.
Starting point is 00:26:14 Yeah, and I think also a very positive one. As all this time that Zena is talking about how the Atlantic is using data science and data in general. I think it's a very good counterexample of all this fear around that like ML is going to destroy jobs, that we are going to have Terminator coming and all that stuff. When you actually see inside an organization how technology can work very closely and help the professionals you have over there to focus more on the creative part of the work they are doing
Starting point is 00:26:43 and be much more, let's say, efficient and creative at the end. I think that's also the vision with technology in general. And things are not like binary. They're not like black and white. Either the technology is going to do something or the humans are going to do. At the end, I think the real value is when you combine these things together. So I think it's a very encouraging and positive example that we have here. What do you think, Jenna? What's your opinion on that? It's really exciting to be working at a company that's been around for such a long time and that can really drive the national conversation. And so I'm really just happy to be a part of it. Jenna, just out of curiosity, do you
Starting point is 00:27:23 on the data science team interact with the journalists themselves at all? Sometimes for sure. There are a few of us who interact with them more often just because they're accustomed to having those types of conversations. But yes, yes is the short answer. Very cool. That's just need to hear. Okay. We love to have philosophical conversations on the show, but I do need to ask just because I think for our listeners, your tool set and data science, and I know you talked a little bit about modeling with Costas, but could you just tell us a little bit about the tools
Starting point is 00:28:03 that your team uses, the stack, and then maybe even if you're not using them, what other tools that are sort of new and exciting to you in the data science space? So all of our data is in BigQuery. And I mentioned that our data engineering team writes some of those ETL pipelines, and then we have data connectors that's bringing in our subscriptions data and a number of other pieces of data. We use Looker for our dashboarding tool, and many of us are using Python for analysis.
Starting point is 00:28:37 A couple of people are using R as well. And then all of our models are running in Python on AWS. I have a quick question. You mentioned both BigQuery and AWS. What's the reason of using two different vendors there? That's a good question that I don't have the answer to. And I would defer it to the engineers who made that decision. Okay, so it's purely an engineering decision.
Starting point is 00:28:59 It doesn't have to do with what the data science feels more comfortable to use, right? I mean, honestly, those decisions were made before I got to the Atlantic and we've been happy with how it's been working out. Okay. That sounds great. Okay. You mentioned like the tools that you are using and we have talked so far about doing a lot of work with the data that you have, the actual text that you have from all the things that get published. What other data you are using? You mentioned subscriptions. Are there any other data sources that you are using that are important to your job? The big ones are definitely the web analytics and the subscriptions. We're really digging in on subscriptions more. We're doing a lot of attribution modeling recently.
Starting point is 00:29:47 And so we're thinking about that in two different ways. So we're trying to attribute new subscription purchases to, of course, various traffic sources. Is our paid marketing driving subscriptions? What about when articles are going viral on Facebook or people clicking on newsletters and then coming in and subscribing? And then we're also thinking about attribution in terms of articles. So what types of articles are people reading and then immediately subscribing after? No, that's super interesting. Are there any behavioral data that you're tracking?
Starting point is 00:30:25 Are you interested in how your user interacts with both the application and your content? And how do you measure that? We probably use the standard. We look at sessions. We look at how often people are looking at the homepage. We find that homepage traffic is often a pretty clear sign that someone is more likely to subscribe because they're not just reading an article, but they're actually going to the homepage to see what else we have to offer.
Starting point is 00:30:53 And then we also have, I mean, some of our articles are short, but we also are probably known for having a lot of long form articles as well. And so we definitely pay attention to scroll depth to see if people are getting to the end of the articles. Jenna, you mentioned if an article goes viral on Facebook, have you done any work around patterns of articles that tend to go viral? We have. And of course, this happened. This happened a lot in the past year because there was so many, many things to write about that everyone needed and wanted to read about. Exactly. Exactly. And I mean, it's not a case where we can learn about what happened and then replicate it, right?
Starting point is 00:31:47 Like you can't really replicate a story that goes viral because you wouldn't necessarily expect it. There are always way too many ingredients to really predict that. But what we have looked at is articles could go viral on Facebook or Google search or Twitter. Oh, interesting. And then different, so the page views could be higher in one or the other, but then in a lot of cases we'll get a higher conversion rate from a source that might not get as many page views. And so we've absolutely seen some really interesting
Starting point is 00:32:27 trends there. That is fascinating. It makes sense that you're not trying to produce listicle clickbait that goes viral, right? Because that's not the type of content that the Atlantic produces. And so it is interesting to hear you say, you know, you can't necessarily like influence, like you write this and it will go viral, but it's fascinating that the metrics around the different platforms and then the various forms of quote unquote performance, right? Page views versus subscriptions has a lot of variance. That is so interesting. So Jenna, just to move a little bit forward and talk a little bit more about you, as we are also reaching the final
Starting point is 00:33:10 part of our conversation today, can you share with us a little bit more information around your whole experience so far, going from the academic environment to a fintech company in the Bay Area and then to The Atlantic? I mean, what are the common things that you see? You mentioned something very interesting at the beginning about the difference in the focus of the work and how important the impact is in the industry compared to the scientific world. But what are other differences? And more importantly, what are the common things that you see there? I'd say the work itself is probably surprisingly similar. You know, data, to a certain extent, data is data or data are data.
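Circling back to the traffic-source pattern Jenna mentioned a moment ago (one referral source driving the most page views while another converts better), that divergence is easy to surface once page views and subscriptions are tallied per source. A minimal sketch with entirely made-up numbers:

```python
# Made-up per-source totals; the point is only that the source with the
# most pageviews need not be the one with the best conversion rate.
traffic = {
    "facebook": {"pageviews": 120_000, "subscriptions": 120},
    "google":   {"pageviews": 45_000,  "subscriptions": 180},
    "twitter":  {"pageviews": 30_000,  "subscriptions": 45},
}

def conversion_rates(traffic):
    """Subscriptions per pageview for each referral source."""
    return {src: t["subscriptions"] / t["pageviews"] for src, t in traffic.items()}
```

With these invented numbers, Facebook drives the most page views but Google search converts four times better, the kind of divergence Jenna describes.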
Starting point is 00:33:54 And so I think once you're really familiar with data and you know how to, you know, ask questions of data, you know, when something doesn't look right, and you know how to essentially be creative with analyzing data, I think you can probably use any sort of data. So yeah, I'd say that that's a really big similarity. I think one thing I've enjoyed outside of academia is being able to work with a really wide range of stakeholders. So not everyone has the same data literacy or the same goals really in understanding the data. And so it's been really, really great to flex that muscle. Yeah, that's very interesting. I remember a friend of mine who did his PhD in machine learning and computer vision, who ended up working for Facebook. And at some point, we met and we were talking together and he was, okay, he was trying to
Starting point is 00:34:59 describe to me what he's doing there. And pretty much the work was not that different between the two environments. I mean, he was still writing papers and doing publications and creating models and stuff like that. One major difference that I saw there, and what he was trying to communicate to me, is that, "I'm much more stressed now, because it's one thing to write a paper and have a peer-reviewed publication, and it's another thing to know at the same time that this model is going to be affecting the experience, in the case of Facebook for example, of billions of people." So I found this extremely interesting. That makes things, of course, more stressful, but also,
Starting point is 00:35:46 I think, much, much more exciting, in my opinion. Yeah, I 100% agree with that. I think in academia, people think about making an impact on a very, very long time scale. And so things definitely feel a lot more urgent, I would say, on a day-to-day basis. But I like that. Yeah, yeah, makes sense. So after being outside academia for, I guess, some time now, how do you feel about this decision? How happy are you, let's say, and how do you reflect back on when you had to make the decision to go into industry? I am really happy in data science. I've really enjoyed, as I said, being able to work with a wide range of people and flexing more muscles and learning how to communicate what we're doing to a wide range of people.
Starting point is 00:36:41 But I am absolutely a big proponent of basic scientific research still. All right. One last question for you, Jenna, before we wrap up. What are the big projects you have coming down the pipeline at The Atlantic from the data science perspective? You've mentioned some, but what are some that haven't started yet that you're particularly excited about? I mentioned, to a certain extent, a recommendation engine, but I can talk a little bit more about that because it has a lot of different potential use cases, from sending out personalized emails with reading recommendations to driving recirculation modules on our site. And this has been, we're really only at the beginning
Starting point is 00:37:26 of this project, but it's been fun because it's required us to partner really closely with engineering because of course our team can build out this model or this algorithm, but there's so many more steps that have to happen in order to get the results of the model onto the site and make sure the right person is seeing the right recommendation. So that's something we're really excited about seeing to fruition. Yeah, that's something that we hear about a lot. I think a really encouraging trend is that sometimes you've heard it referred to as the last mile. If you think about personalization, I mean, it's certainly a huge accomplishment to build a model that does that well,
Starting point is 00:38:12 but then you have to get the results of that model into an actual user experience and even a pre-existing user experience many times, right? So you're sort of fitting the results of a model into a pre-existing user experience. And from a technological and design perspective, that is non-trivial. Yeah. It's a great example of how we cannot do our work in a silo and we really need to collaborate with other teams pretty early on to understand exactly how we should complete this so that it can be implemented correctly. Very cool. Well, one last follow-up question on that. I lied. I have one more question. Have you developed a rhythm, and I know it works differently at different companies,
Starting point is 00:38:56 and especially just sort of based on the business model and the industry, different methodologies work better or worse depending on the situation. But collaborating early on, when something like recommendations originates, is that usually sort of the initial concept originating from data science? Or is it maybe originating from product who says, we have an idea for something that may improve the user experience that involves personalization? Or is it both? That's a good question. And I would say that we've had projects start in both places. So our attribution modeling work was definitely a request from more of the marketing side. For recommendation, this was something that data science was interested in. But then, of course, we had to essentially find a product stakeholder who would evangelize it and see it through and make sure it gets onto the roadmap of the product and engineering team.
Starting point is 00:39:58 Very cool. I mean, there are way smarter people than me that sort of do, you know, work in operational design. But I think from all of the companies that we've talked to on the show, it seems like it's very healthy to have a symbiotic relationship where data science is pushing value back into the organization in the form of ideas, but then people who are building the experience for the users are bringing needs to data science. And that seems to be sort of a relational dynamic among the teams that produces just really, really interesting and valuable experiences for users. Yeah, I completely agree.
Starting point is 00:40:38 Well, Jenna, this has been absolutely wonderful. Thank you so much for joining us on the show. We'd love to have you back on in the future, especially when you've had a chance to work on some of the personalization stuff to hear more about that. And we may even prevail upon you to ask someone from data engineering to join the show because I know Costas has burning questions about using both Google Cloud and Amazon. Sure. Thank you so much. As always, we learned so much. It goes without saying, but the academic background is something
Starting point is 00:41:13 we just need to do an episode on because it's been such a pervasive theme throughout the show. I think what was really interesting to me was to hear about the way that they seem to think about the relationship between sort of hardcore data-driven functions like data science and the sort of more subjective functions that are editorial and that are very human-based. And I think, you know, thinking back on conversations with other data scientists we've talked to, the human element continues to seem to be one of the most interesting things that data scientists deal with. And it was really encouraging to me actually to hear how the data work sits alongside the human factor inside the company. Outside of this, I was extremely excited to hear that a company as old as The Atlantic, a 150-year-old company, is actually a technology company today. It's crazy to think about what kind of transformation this organization has had to go through in these 150 years and how they are still able to adapt, which is amazing. It is pretty wild to think about the state
Starting point is 00:42:32 of basic technology around things like electricity and other things like that when The Atlantic started, and now they operate like a Silicon Valley product. That is fascinating. All right. Well, we could say so much more, but thanks again for joining us on The Data Stack Show. Be sure to subscribe on your favorite podcast network to get notified of new episodes every week. And we will catch you next time. The Data Stack Show is brought to you by RudderStack,
Starting point is 00:42:59 the complete customer data pipeline solution. Learn more at rudderstack.com.
