Microsoft Research Podcast - 077r - The productive software engineer with Dr. Tom Zimmermann

Episode Date: January 8, 2020

This episode originally aired in May 2019. If you're in software development, Dr. Tom Zimmermann, a senior researcher at Microsoft Research in Redmond, wants you to be more productive, and he's here to help. How, you might ask? Well, while productivity can be hard to measure, his research in the Empirical Software Engineering group is attempting to do just that by using insights from actual data, rather than just gut feelings, to improve the software development process. On today's podcast, Dr. Zimmermann talks about why we need to rethink productivity in software engineering, explains why work environments matter, tells us how AI and machine learning are impacting traditional software workflows, and reveals the difference between a typical day and a good day in the life of a software developer, and what it would take to make a good day typical!

Transcript
Starting point is 00:00:00 When Tom Zimmermann came on the show in May of last year, he talked about how researchers are using data to help software engineers be more productive in their daily work. Whether you heard Tom in the spring of 2019, or you've just put productivity on your list of New Year's resolutions for 2020, I know you'll enjoy Episode 77 of the Microsoft Research Podcast,
Starting point is 00:00:21 The Productive Software Engineer. If you think of a typical software engineer at Microsoft, they spend about half of a day on development-related activities. And the other half of a day is spent on other activities like coordinating with other people in meetings, sending emails. So that's actually not that much time that we can spend on writing code. And the time they spend writing code on a good day, it's actually only 96 minutes. And on a bad day, it's on average 66 minutes. And half an hour writing code actually can make the difference between a bad and a good workday. You're listening to the Microsoft Research Podcast, a show that brings you closer to
Starting point is 00:01:08 the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga. If you're in software development, Dr. Tom Zimmerman, a senior researcher at Microsoft Research in Redmond, wants you to be more productive, and he's here to help. How, you might ask? Well, while productivity can be hard to measure, his research in the Empirical Software Engineering Group is attempting to do just that by using insights from actual data, rather than just gut feelings, to improve the software development process. On today's podcast, Dr. Zimmerman talks about why we need to rethink productivity in software engineering, explains why work environments matter,
Starting point is 00:01:53 tells us how AI and machine learning are impacting traditional software workflows, and reveals the difference between a typical day and a good day in the life of a software developer and what it would take to make a good day typical. That and much more on this episode of the Microsoft Research Podcast. Tom Zimmerman, welcome to the podcast. Thank you. You have a cool nickname, Tom Z, kind of like A-Rod or J-Lo, but for research. Why do people call you that?
Starting point is 00:02:30 So it goes back to when I started at Microsoft. So my manager was Tom Ball. And so because my short name is also Tom, so basically we had two Toms and we had to find ways to distinguish between us. Well, you're a senior researcher at Microsoft Research in Redmond, working under the umbrella of programming languages and software engineering. In broad strokes, what does a typical day look like for Tom Z? I'll ask you about a good day later, but what's a typical day? What gets you up in the morning?
Starting point is 00:02:59 So basically, the work I do, it's analyzing any type of data which is related to software. And so what gets me up in the morning is to basically learn new things about software development. Because there is so many things we don't know or we don't have any data about. So really what my job is about is to basically look at such data and create interesting insights that we can use to improve software development at Microsoft. So drill in there a little bit. Where do you gather this data and what does it look like? So some of the data I collect by myself. So like a lot of the work I do is interviews and surveys with software developers. We also look a lot at existing software repositories. So what happens during software development is that there's a lot of data that's being collected. So there's data about changes, about bugs that people face, about test executions,
Starting point is 00:03:53 and also like telemetry about how the software is being used in the field. Right. And so what I do as part of my work is look at this data and do some analysis. And it can be many different things. So sometimes it's empirical studies to find out what makes software engineers productive. Sometimes it's building recommendation systems. So one of my earliest projects was to build a recommendation system
Starting point is 00:04:14 based on past changes to recommend, like, people who changed this file also should change this other file. It's mostly internal, so a lot of the data we analyze is internal Microsoft data. So it sounds like you've got to be sort of cross-pollinating with developers and engineers in Microsoft proper. And you're here in Microsoft Research working with them.
Starting point is 00:04:36 How does that work? So we meet with them. We just send emails or schedule meetings with developers and talk to them. And in fact, actually, that's like one big piece of the work I do is really figuring out what challenges people face. And the best way to find out these challenges is by talking to people. And then after talking to people, it's basically can we use additional data to basically find support for the observations we made. Well, let's start the podcast kind of generally and set the stage for some of the other things
Starting point is 00:05:07 we'll be covering by talking a bit about ICSI or the International Conference on Software Engineering. Since it's happening in Montreal about the same time this podcast drops, I'd love for you to tell us what this conference is about and why it's important for people in your field. Yeah, so ICSI is the largest software engineering conference. So everyone who is passionate about improving software development goes to ICSI and finds out what everyone else is working on.
Starting point is 00:05:32 So you will find a lot of professors who are doing research on software engineering. You will find a lot of people from industry who want to find out what the latest research is about and how it can help them in what challenges they face right now. You will find a lot of students who basically want to start their careers. So it's really good for recruiting. And in fact, a lot of the interns I had over the past few years, I found at XE. Well, you're part of a group called ESE or Empirical Software Engineering.
Starting point is 00:06:01 Tell us why this group exists and what you hope to achieve through it. So one of the reasons this group exists is because there's still a lot of decisions being made just based on gut feeling. So like statistics that 40% of management decisions people just make based on what we think is right, but we don't use any data to inform the decisions. Or what you often also hear is that people get to see some data, but then the data doesn't match the gut feeling. They question the data until they either invalidate the data or the finding matches what we were expecting. So that's really what ESE wants to change.
Starting point is 00:06:42 So what we want to have is that people have a more data-driven culture when making decisions, because there's so much data available about software development. By doing some analysis on top of this data, we can learn things about software engineering, and that can help people to make better decisions. We can also build tools to make people more productive, and so that's like another mission of the ESE group. Yeah, and we're going to get way into productivity shortly. Give me an example of a decision that somebody would make on a gut feeling that could have disastrous consequences or maybe it'd work out just fine, but you don't know, right? Yeah, so like one very simple example
Starting point is 00:07:21 is like just management structures, like how you compose your teams, who is put in a management role and who is assigned which manager. That can have like big implications on the productivity and satisfaction of software engineers. So that's some of the questions that you're asking in ESC? colleagues, Nachi Nagapan. So he actually did one study where he looked into how the organizational hierarchy affects software quality. And so he actually found that like a misalignment in organizational hierarchy, that's one of the biggest predictor of bugs in software. Wow. And so another study we did in this space was one of my colleagues, Chris Bird, he was
Starting point is 00:08:02 looking into what makes effective software engineering managers, because software engineering management has so much impact on the software we build. So it's really important for us to understand what are characteristics of effective management techniques. You know, we're going to talk about the management impact on productivity of workers. It sounds like that's not just a software engineering workplace question, but could have broad implications. Absolutely, yeah. And I think that's what's unique about empirical software engineering is that it's not just software engineering, it's not just programming languages. It covers much more than
Starting point is 00:08:40 just these two disciplines. Like a lot of the work we do, it's very related to human-computer interaction, CSCW style of work. We do a lot of AI in our research as well because we often build like tools which learn from past data to support decision making. In a recent paper entitled Today Was a Good Day, The Daily Life of Software Developers, I can't wait till that movie comes out. You asked thousands of professional developers what constitutes a typical day at work and what constitutes a good day in hopes that the answers would help you understand how to make a good day a typical day. Yeah. Yeah. So what did
Starting point is 00:09:32 you learn from the research behind this paper? Yeah, so we've done actually quite a few things. So some of the findings confirmed what we already knew, which is also important in empirical research, because often you think you know something but you don't have any evidence and that's why you do empirical studies to support what your assumptions are. So one of the findings we had was and that's probably not surprising to many people
Starting point is 00:09:56 but software development involves many other activities than just writing code. So actually if you think of a typical software engineer at Microsoft, they spend about half of a day on development-related activities. And the other half of a day is spent on other activities like coordinating with other people in meetings, sending emails.
Starting point is 00:10:18 So that's actually not that much time that they can spend on writing code. And the time they spend writing code on a good day, it's actually only 96 minutes. And on a bad day, it's on average 66 minutes. And half an hour writing code actually can make the difference between a bad and a good workday. So everything you're saying has broader contextual application as far as I'm concerned. I mean, I think the average worker in any occupation has to do email, has to do meetings, has to do, you know, all the mundane things. And then, you know, if they're a writer or if they're, you know, an artist or whatever, it's how can I get to my work? So the next question is then, with what you found that did confirm what you think, how do you get back those hours or those minutes, that time?
Starting point is 00:11:13 Basically, the way we approached this in this particular paper is we wanted to understand what makes good workdays and what makes typical workdays. And so we built different conceptual models. So the way we approached it is we had this big survey with close to 5,000 responses. And in each of the responses, people could say why the day was good and why the day was typical. And so we analyzed all of these 5,000 responses and came up with frameworks.
Starting point is 00:11:41 And one of the frameworks was basically for a good workday, things that matter is that people see that they created some value. So at the end of the day, if they want to go home and say, hey, I spent a certain amount of time coding, I spent some time helping others to do their work, or I spent some time finding a bug. So they want to be able to say that they accomplished something. So I noticed how you able to say that they accomplished something. So I notice how you're framing this as a typical day and a good day. You're not calling a typical day a bad day. Is there something below a typical? I mean, there's days that could be really bad, right?
Starting point is 00:12:18 Yeah, absolutely. And I mean, an untypical workday doesn't have to be bad. Right. Because it can be untypical because you just had like 10 hours coding time. Sure. And that comes down to looking into productivity. It's actually a very complicated construct. Like there are many factors which come into play that can change if you're productive, if you have a good workday, if you feel happy about what you accomplished.
Starting point is 00:12:43 Well, let's dig in on this idea of productivity, especially as it pertains to programmers and developers, because that's what matters here. One of your papers, you have so many cool papers, Tom. It's really impressive. One of your papers is titled The Effective Work Environments on Productivity and Satisfaction of Software Engineers. So tell us about this research and the paper that came from it,
Starting point is 00:13:06 and what were the big takeaways on the study of work environments in relation to work productivity? Yeah, so it comes down to, again, many factors can influence productivity. We previously talked about management, and just like management, work environment is like another factor which can influence how productive software engineers feel. So the goal of this project was to understand what software engineers expect from the work environment. So what are things that they consider to be important about the work environment and how can they help us understand productivity and satisfaction. So what we did is we started off with interviews, like as we do typically in our studies.
Starting point is 00:13:51 We did some analysis of interviews, and we followed up with a survey among software engineers at Microsoft. And so what we found was basically that there's like a set of different themes which are really important for software engineers and probably for everyone else working in office environments. So it's not necessarily saying software engineering is so much different, but it's confirming also what other people found about what makes good work environments.
Starting point is 00:14:17 So we found that personalization is very important for people. We also found that social norms and signals are important in work environments. Like when you're working in an open office, certain signals people use to say, hey, don't disturb me. I have to focus now. For example, wearing headphones is one of the signals people often do. Then also like how the room composition looks like. So who's in the room sitting with you? Are you in the same room as your team members or are you in in the room sitting with you? Are you in the same room as your team members or are you in a separate room? If you can work without
Starting point is 00:14:49 interruptions and can focus on the work that's also very important. And so what we did is we collected these different aspects and when we did the survey and this helped us to basically prioritize which are the most important factors when it comes to productivity. And then you present these findings to people who are in charge of making the environments one way or the other. I would imagine there's a spread there because there are some people who can work in completely noisy environments, no problem, and other people who say, just get out of my face. Yeah, so absolutely. Like people have very different opinions or feelings and work differently in different environments. So then is one of the takeaways to say, make space for these different kinds of work preferences?
Starting point is 00:15:36 Yeah. So that's one of the takeaways. And I would even say it's a takeaway of entire productivity research we've done is that there's not a single way to make everyone more productive. Right. Right. There's so many personal differences. You really need to understand how everything plays together and then you can make recommendations for individual people. Right. It's very similar to like going on a diet or when you go to a gym, like your exercise routine. Everyone has their own techniques which work best for them. So it's not a single one-size-fits-all. All right. Well, we talked a bit about the movement toward incorporating data science
Starting point is 00:16:14 to help programmers achieve more, and you have a seriously delightful presentation called Software Productivity Decoded. I wish our listeners could see it, especially the part about Alice in Data Land. They'll have to come to one of your talks, I guess. Anyway, this is important and speaks to your unique approach to software engineering research. So how does data science impact productivity and help software engineers achieve more? So I think first it helps us to really understand more effectively how software developers work. So you can think of applying data science work to software engineering.
Starting point is 00:16:50 And so I've mentioned all these different data sources that we can leverage. And we also talked about ICSI. There's actually a conference just dedicated to the analysis of software data at ICSI. We talked about acronyms. It's actually called the MSR Conference, but it's not for Microsoft Research. It's for mining software repositories. Oh, that's funny.
Starting point is 00:17:12 And so what this conference does is basically people look at all the data that is available for software and software development and they try to do cool things with it. So I will come up with interesting findings that help us understand software engineering more or come up with tools that allow people to be more productive. So some people use data science techniques to improve code review processes.
Starting point is 00:17:35 For example, you can assign code reviewers to your changes and then your change hopefully gets reviewed faster. So that's one area where data science can help us to discover more about soft engineering. It can also help soft engineers to understand more about how they work themselves. It's sort of like Fitbit or Apple Watch, right, which keeps track of your steps, your heart rate, and at the end of the day, you can look at it and reflect on how active you were. You can do similar things with data science. So you can build tools which allow developers to look at the end of the day to see how much time they spend on coding, how much time they spend on
Starting point is 00:18:17 emails, on meetings, and then they can use this to become more productive and also to understand how their own productivity relates to the team productivity. Because you always have this tension between your personal productivity, but also productivity as a team. So sometimes what's best for your own personal productivity is maybe not best for your team, right? So one thing we find is that people don't want to be interrupted. So we can increase focus time, that helps people to become more productive, but it can also block someone who really needs your help right now. If they cannot interrupt you, they might be stuck and not make any progress. And so what data science can help us do is
Starting point is 00:19:01 to really understand this tension between individual productivity and team productivity and find effective ways to balance it. Those are fascinating questions and fascinating research. It seems like there's a really beautiful mix of qualitative and quantitative in your work. And it also is over into the HCI space. I know that there's researchers here, Dan McDuff and Mary Cherwinski, who are monitoring and then saying, how can we improve our tools to make us understand when I'm in the flow and don't get me out of the flow or when I could be interrupted? Yeah. Do you interact with them quite a bit? Yeah. So we work together on some things and it's very related for what we do.
Starting point is 00:19:42 Yeah. All right. Well, let's circle back to ICSI, because you're presenting a really interesting paper there this year called Software Engineering for Machine Learning, a Case Study, which I love. Tell us about this paper and the research behind it. How have you found AI, because that'll be in there, to be sort of a forcing function for an evolution of workflows in the software development process. Yeah. So what's happening at the moment is that more and more teams are using AI in their software. And so what we're trying to do is to basically, because there's this whole AI workflow, how you build machine learning models, you collect some data, you change the models, you tune the models, you test the model. And we are basically trying to fit this workflow into the traditional software engineering workflow. And it is possible, but it's
Starting point is 00:20:31 not a natural fit. For example, one very simple difference between traditional software development and modern software development with AI is in traditional software development, like software testing, it's usually fairly easy to tell if a test has failed or passed. So there's some exceptions, but usually you know if something has worked or not. If you type in an Excel, like add up two numbers, you can tell if the sum is the same. And if it's not, you know you have found a bug. And this is a little bit harder for AI models because testing is not as straightforward. Because you don't know what is a good AI model. It's sometimes hard to tell.
Starting point is 00:21:13 Right. Because you cannot test them online. You have to test them offline first. And it also might change over time. If you use new training data, the model might look completely different. So it's really hard to tell when you're getting worse results than you did before. So that's really one of the challenges people have to figure out. And that's what we've been looking into in this paper. So we started off with interviews and we did surveys and collected data and did analysis
Starting point is 00:21:42 of lots of open text responses. Well, tell us a little bit about the case study. What did you look at, and how did you get your subject or your case? So the case study is basically Microsoft. To find people, we used something called Snowball Sampling, where we basically identified a few people knowledgeable in this area. And they recommend two friends, And they recommend two friends. They recommend two friends. And we kept doing this until we pretty much thought we heard everything.
Starting point is 00:22:14 So that's called saturation, when you don't get to hear anything new. And that's when we started working on a survey to really understand the landscape of AI at Microsoft. The interviews allow us to go deep because we can talk to people one-on-one and it gets us more richer inside. And then we want to confirm what we observed in the interviews with a large-scale survey. And it also allows us to look at some other things. Like we looked into how much time people spend on different activities
Starting point is 00:22:41 and get questions about challenges we face. It makes me laugh because I, you know, what was your case? Microsoft. That's a big case. It is. And you're lucky you got access to this place. You're a busy guy, Tom. In addition to all the research and papers you're doing, you're one of the editors of a new book called Rethinking Productivity in Software Engineering. I'm detecting a theme here.
Starting point is 00:23:22 The description says that it collects the wisdom of software engineering thought leaders in a form digestible for any developer. I'm going to resist my urge to make a comment about that and just ask you to give us some highlights on what these thought leaders had to say. So basically the book is the outcome of a Darkstool seminar that I helped co-organize. And so we organized a seminar on rethinking productivity in software engineering. And so that's where all these thought leaders met. And at the end of the seminar, we decided we want to do a book to basically summarize what we discussed at the seminar. So one of the takeaways, and this is again not going to surprise everyone,
Starting point is 00:24:03 but it's really hard to measure productivity. And often when people think about productivity, they have like one or two measurements in mind. But it's important to take away that you need to measure many different things if you talk about productivity. If you just focus on one or two aspects, you're not going to get a complete picture. Because productivity depends on so many different aspects, like management, work environments, which tools you use, what processes you use, how good your team is, how much experience you have, so many factors which can affect productivity. Basically, what we were doing is discussing these factors and coming up with ideas how to get at a more complete picture of software productivity. Do you ever factor in things like life events or illnesses or the kinds of unpredictable edge cases?
Starting point is 00:24:57 Yeah. So actually when we did this work on a typical soft engineering workday, some of these came up. So if you have to go to a doctor and family emergencies. So that comes into play. But you can't predict any of those, nor can you prevent them necessarily. You cannot predict or prevent, but you can basically deal with how you react to these cases. So having contingencies and thinking ahead for the unexpected. Yeah. And the end, it also comes down, and that's also part of the book is, so we have one chapter on happiness of software engineers.
Starting point is 00:25:30 So one thing that has been found in research is that happy software engineers are more productive. Are productive software engineers more happy? I think so too. Well, I like to ask all my guests to share their version of what could possibly go wrong, because pure research is way upstream of unintended consequences. So tell us, in the context of your work, is there anything that concerns you or keeps you up at night? And if so, what are you doing about it? Yeah, so what's happening right now, I think, is software is getting more and more important. So you cannot do much without software.
Starting point is 00:26:12 And we're working really hard to make people more productive, more happy, more satisfied when they use software. But I think what's also really important is to think about the people who cannot use software for various reasons, or who are not as skilled using software. So that's one thing which concerns me.
Starting point is 00:26:30 So give me an example. So I see it a little bit with my parents. Oh, yeah. So there are certain things they're not as comfortable doing on a computer, which is totally natural for me. And I'm also going to get older, and I'm going to have the same challenges. And I sometimes see it already, like certain things I'm like, oh, it's getting a little bit too complicated. So that's one thing which worries me, like how we can really tailor to everyone with software.
Starting point is 00:26:55 Yeah. Is there anything you can do about that? Because that's an interesting conundrum that's not new and it's going to get worse, I think, in a cloud era AI world? Yeah, so I think it's a tricky question and it might get worse for a little bit, but over time, I think it's probably also going to get better because right now you need to interact with a computer. If you want to do something on a desktop computer, you need to use mouse and keyboard. It's much more complicated. But again, with my parents, like once I gave them like touch devices, it was much easier for them to interact with it. Right.
Starting point is 00:27:31 And if you think of what's coming next, if you do more voice control, it's probably also going to get easier for people to interact. So designing much more natural interfaces will probably assuage some of the problems that people have encountered with an engineered environment for what we use now. So I like to know the personal stories and career paths of the people I interview. They're often really interesting and inspiring, and sometimes they're actually entertaining. So regardless of your genre, tell us how you landed here at Microsoft Research in Redmond, all the way from Germany, with a stop, I understand, in Calgary, Alberta, Canada? Yes, yeah.
Starting point is 00:28:09 Tell us your story. I did my PhD in Saarbrücken, in Saarland, in Germany. And so during my PhD, I ended up doing an internship at Microsoft Research, actually working on empirical studies. And it was actually very eye-opening for me at that moment because until then I only looked into open source projects. And so when I came here, I learned really like what's important for Microsoft, for company, for large-scale software development. What year was that? It was 2006. And I really enjoyed my time here. So when I finished my PhD, and I ended up going to University of Calgary as assistant professor for one year,
Starting point is 00:28:52 and then the opportunity opened up at Microsoft for me to become a researcher here, and I jumped on it because most of my research was very data-driven. I had a good time here during my internship, and I thought, what better place to be for data than Microsoft? And since then I've been here, and I've done data analysis of many different things. I looked into games telemetry, like how we can increase player engagement on Xbox, and lots of software data about software teams, software productivity.
Starting point is 00:29:25 And so it's been really interesting. Well, it seems like the research that you do has broad application in any division around the company. Do they let you out of 99? Yeah, we get out of 99 a lot. So that's really nice about being in research because we can get to work with many different product groups on many different challenges. And they're usually all very interesting. Well, that's great because that's where research becomes practical and helpful in immediate situations. Yeah.
Starting point is 00:29:58 And the other thing which was really great about coming to Microsoft is, so initially my motivation was to come because of the data, but what I learned once I came here is you also get the people who you can talk to about the data, which is a very big difference compared to more open source setting. Come for the data, stay for the people. Yeah, it's all about the people. Along those lines, here's a new question I've been asking. What's one interesting thing, whether it's a trait about you, a characteristic, a life event, that people might not know about you?
Starting point is 00:30:34 And how has it influenced or impacted your career as a researcher? Okay, so I've always been fascinated by numbers. And so here's one thing I think only my parents and my sister know. When I was like little, like eight or nine, I actually did my own stock index. So and it was the old days. It was before internet. I didn't have a computer. So basically every morning the newspaper came in, I took the newspaper, was looking at the stock listings, looking up the numbers, computing my own stock index, writing it down in like a notebook, and I kept it for quite a long time.
Starting point is 00:31:15 And so it kind of started my interest in doing data analysis. Were you a good analyst? For eight. For eight, I would say yes. I mean, it was not a very clever stock index. And I just liked adding up numbers, doing the average and keeping track of it. Do you follow the stock market now? I still follow the stock market. And do you analyze how they're analyzing? So I don't do analysis on the stock market. I think the people who are smarter.
Starting point is 00:31:47 See what I'm saying? Yeah. I like it. As we close, Tom, I'd like to give you the chance to say something profound to our listeners. If there's anything you believe would-be researchers in programming languages and software engineering ought to know, now's your chance to say it. I have something very short and concise. It's that software is more than bits and bytes. It's all about the people.
Starting point is 00:32:14 How can you close with anything better than that? Tom Zimmerman, Tom Z, thanks for coming on the podcast today. Thank you for having me. To learn more about Dr. Tom Zimmerman and the science of programmer productivity, visit microsoft.com slash research.
