Microsoft Research Podcast - 059 - Building contextually intelligent assistants with Dr. Paul Bennett
Episode Date: January 16, 2019
The entertainment industry has long offered us a vision of the perfect personal assistant: one that not only meets our stated needs but anticipates needs we didn’t even know we had. But these uber-assistants, from the preternaturally prescient Radar O’Reilly in the TV show M*A*S*H to Tony Stark’s digital know-and-do-it-all Jarvis in Iron Man, have always lived in the realm of fiction or science fiction. That could all change, if Dr. Paul Bennett, Principal Researcher and Research Manager of the Information and Data Sciences group at Microsoft Research, has anything to say about it. He and his team are working to make machines “calendar and email aware,” moving intelligent assistance into the realm of science and onto your workstation. Today, Dr. Bennett brings us up to speed on the science of contextually intelligent assistants, explains how what we think our machines can do actually shapes what we expect them to do, and shares how current research in machine learning and data science is helping machines reason on our behalf in the quest to help us find the right information effortlessly.
Transcript
If you think about having your own executive assistant, a great executive assistant is
prepared to answer all of your questions.
And that's where a lot of the assistants live right now.
They're responsive to you.
They know the answers.
They come back.
An even better assistant starts anticipating your needs.
And that's where we go with things like the Calendar Aware Proactive Email Recommendation.
You're listening to the Microsoft Research Podcast,
a show that brings you closer to the cutting edge of technology research
and the scientists behind it.
I'm your host, Gretchen Huizinga.
The entertainment industry has long offered us a vision of the perfect personal assistant,
one that not only meets our stated needs, but anticipates needs we didn't even know we had. But these uber-assistants, from the
preternaturally prescient Radar O'Reilly in the TV show M*A*S*H, to Tony Stark's digital know-and-do-it-all
Jarvis in Iron Man, have always lived in the realm of fiction or science fiction. That could all
change if Dr. Paul Bennett has anything to say about it. As principal researcher and the research manager of the Information and Data Sciences group at
Microsoft Research, he and his team are working to make machines calendar and email aware,
moving intelligent assistants into the realm of science and onto your workstation.
Today, Dr. Bennett brings us up to speed on the science of contextually intelligent assistants,
explains how what we think our machines can do actually shapes what we expect them to do,
and shares how current research in machine learning and data science
is helping machines reason on our behalf in the quest to help us find the right information effortlessly.
That and much more on this episode of the Microsoft Research Podcast.
Paul Bennett, welcome to the podcast.
Thank you.
You're a principal researcher and you're the research manager for information and data sciences at Microsoft Research.
So you wear at least two hats, maybe more.
Yes, that's right.
Tell us more generally than specifically right now
what you do for a living.
What big problems are you trying to solve?
What gets you up in the morning?
Within the Information and Data Sciences group,
we think about two things.
First off, information science is about how people work with
and use information in every aspect of their lives. With data science,
we think about how we analyze the data and behavioral traces we have from people
interacting with data. How can we predict their actions? How can we understand what they're trying
to do? What we try to do in the group is actually bring these two together and understand how people
are working with information, how their behavioral traces say what they're doing and what they'd like
to do, and what software may not even support them in doing right now that they'd like to do
even better. And so, you know, really, when I think about what the group does at a very high level,
it's about taking science fiction and making it reality. So things that
you've seen in movies that, you know, 15, 20 years ago you would have said were not possible, first off it's
about making those real, and then it's about envisioning the future and laying the groundwork for how we get to the science fiction of the future.
Before we get into the meat of the interview, I want to set the stage by talking a little bit about recommendation engines writ large.
It's sort of a huge part of our life now and a part of the science. And we humans seem seriously invested in developing tools that
will predict our next movie, our next purchase, our next turn, our next relationship. So I'd love
to get your take. What's our core problem here in making choices and what are researchers doing
about it? Right. Well, let me unpack that a little because I think you addressed two different things
in there. One is predicting what actions you might take. Another is helping people make decisions. And there are actually two interesting parts that come into artificial intelligence. In terms of recommendation, the variety of recommendation engines that have been developed, you're right, have been targeted at things like advertising or selling you products. Even going back to my days as an undergraduate, my first project was actually
on content-based recommendation engines.
Turns out, looking back at it at the time,
I was actually writing a web crawler
to scrape Amazon and recommend books.
This was back in 97 and 98,
when Amazon was developing their own recommendation engine
at the same time.
At that point in time, that seemed very novel
because people hadn't thought of applying it to a task like that.
Now, people have applied it to a variety of things to sell you products. But the next question is, you know, for intelligence,
even when we interact with people, you're thinking about what's going to happen next. How will they
react to what you're going to say? You're trying to predict the future. And in some sense, that's
what recommendation engines are about. Can we predict the future and actually guess what your
next action will be? But more importantly, and this comes to the decision-making question, often the
challenge is you don't know all the possibilities and when people are trying
to interact with a set of items or a set of possibilities, they don't know all the
different capabilities that are out there.
So an example of this, you know, just from search engine point of view, if you went
and searched for a flight on a search engine and said, you want to fly from
Seattle to Austin, and it came back to you with a single result at your price preferences at the single
best time and showed you this one result, the question is, are you happy? And if you stop and
think for a while, what most people come to the conclusion of is, no, I'm not happy. And the reason
is you don't know what else is possible. And that's where decision-making comes in.
Decision-making is not only about finding what matches your interests,
it's about knowing what the other possibilities are
and that this is a good decision for you to go forward with.
And so when we think about presenting people with information,
we have to give them enough supporting information
that they understand where they're going from there.
I made a note of a statement I read in a blog on this topic that email and calendar recommendations
are underexplored in AI research. So what was the aha moment or inflection point where
you decided to do an AI research project to recommend emails instead of shoes to me? So I think there were a couple of small moments
leading up to things. One was exactly looking at this question of how people make decisions and
what supporting information they need. But then we were doing work specifically trying to understand
where the gaps are in intelligent assistance: when you interact with a person, what types of
common sense does the person have that the system is lacking? And one of the key conclusions that we came to was that
it was lacking a lot of information about context. And there are a variety of things
that go into context, but some of the key variables are time, people, and what it's about.
If I know when something is going to happen and where you think it's going to happen,
if I know the people that are involved and I know what it's about, those things always exist
with email and with calendars or nearly always.
The data's there.
Sometimes you have very useless descriptions of calendar items, like "sync" or "talk."
How many sync emails do you get?
It's frighteningly high actually.
But you know, they were just creating it because it was in the path of communications
of what they're sharing. But the information that it gives the system then to reason on their behalf
and actually help them proactively, that just offered a huge opportunity.
Reason on their behalf. That's such a big statement. Can machines do that?
Absolutely. In very small, limited ways right now. And of course, what we do in research is
push that boundary even further. But, you know, when I think about reasoning on someone's behalf,
what I think about is, first off, you want to be able to predict what they're going to need
and help them out. So if you think about having, you know, your own executive assistant,
a great executive assistant is prepared to answer all of your questions. And that's where a lot of
the assistants live right now. They're responsive to you. They know the answers. They come back. An even better
assistant starts anticipating your needs. And that's where we go with things like the calendar
aware proactive email recommendation, which is I have this meeting. I didn't actually remember
what information was relevant. You were able to determine it and you were able to predict I was
going to search for this. And instead of wasting my time during the meeting, boom, it's just there.
But then when you really talk about reasoning on someone's behalf, you start anticipating
what are they going to wish they would have thought of even after this event where they
came to realize they missed an opportunity, the things that you could have actually put in there.
So for example, someone sends me a design document for some other
project. Doesn't seem relevant at the time. I kind of ignore it, go on with things. I have some other
planning meeting for a new project, completely different. Turns out that that other design
document that I ignored actually has relevant information. Worse yet, it's going to create a
blocking situation six months down the line. I have all the right information at my hands, but
of course I didn't have the time. And at the time that I received the first document,
there was no reason to think that it was relevant. And when you talk about reasoning on someone's
behalf, it's about saying, look, that information is there. You already have access to it. Can we
bring that to your attention and actually say, hey, maybe there's a little part here that if
you dig into can prevent huge problems in the future.
So I'm intrigued by the technical underpinnings and research methods that might help people like me find, as you phrase it, the right information effortlessly.
Take us into the technical weeds for a minute.
Are you using existing methodologies in new areas or are there some fundamental explorations in ML methods?
Yeah, so we make use of both.
And we also even go beyond that within the group.
We try to first take an HCI perspective and say,
if we were to try to identify what it is that people need,
sometimes how they interact with the system
is heavily influenced by its current capabilities.
And so how do you even get after what they would use
if those capabilities were there or what those needs are?
So for example,
coming back to this question I mentioned about time and not understanding time, we said, what is
it that people want a system to understand in terms of time? And so we said, well, what would
it be like if you interacted with another person? So we can go out and do surveys like that. The
problem is there's all sorts of biases to that. So we say, what other types of data are available?
So here we had email as a huge opportunity where you're communicating with another person.
So we can actually look at how you communicate time to another person and what types of language you use.
As opposed to when you go and set a reminder on, you know, whatever intelligent assistant you use right now, you constrain your language because you think the system is only capable of certain types of things.
And so here's the interesting thing. When you go to an intelligent assistant,
the first thing that they want to do
when you tell them something is say,
what time should I remind you?
And it's just this problem, which is,
I don't know what time.
And so the interesting part of this is that
we can actually capture even this kind of soft information
so that we can come back to you proactively.
And it turns out as you get near that point in time,
you can often resolve that further. More interestingly, if you actually do
interviews, which we followed up from looking at how people communicated time in email in this way,
it's not just about that item. Turns out when you made that note, you probably were doing something
else. And your first goal was actually to take the note quickly: let me continue to focus on this
item. Right. And so that's one of the things that some of the researchers in our group look at is how do we allow people to focus and be in the
moment of what they're doing and realize that what they may currently be trying to take a quick note
on or do something else is not actually the key thing. So that's a set of looking from a very
user-facing perspective, understanding where the gaps are. Now let's talk predictively. What are
we trying to do in terms of prediction?
So there we have information traces that we can work with.
So for example, you interact with certain types of meetings
in different ways.
You send a meeting invite out to somebody.
You may reference different links in it,
have different attachments.
Those are all indications of what's relevant
to that meeting.
And so that gives you an opportunity to automatically say,
here are some examples of things
that are relevant to this meeting.
Then we can look across all of your meetings,
emails and documents and say,
can we predict on a regular basis then what is relevant?
And so there we would come
with typical machine learning techniques
where you can start with very simple techniques
like simple linear logistic regression.
You can move on to more complicated techniques,
nonlinear methods like gradient boosted decision trees,
neural networks.
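To make that concrete, here is a minimal sketch (not the team's actual system) of the simpler end of that spectrum: a logistic regression that scores candidate emails for an upcoming meeting using a few hand-built contextual features. The feature choices, field names, and toy data are illustrative assumptions.

```python
# Minimal sketch (not the production system): score candidate emails for an
# upcoming meeting with a simple logistic regression over contextual features.
# The features, labels, and data below are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

def featurize(email, meeting):
    """Hypothetical features relating an email to a meeting: shared people,
    subject/title word overlap, and how long before the meeting it was sent."""
    shared_people = len(set(email["people"]) & set(meeting["people"]))
    word_overlap = len(set(email["subject"].lower().split())
                       & set(meeting["title"].lower().split()))
    days_before = max((meeting["start"] - email["sent"]) / 86400.0, 0.0)
    return [shared_people, word_overlap, days_before]

# Toy training data: past (email, meeting) pairs labeled relevant (1) or not (0),
# e.g. derived from attachments or links referenced in the meeting invite.
X_train = np.array([[2, 3, 1.0], [0, 0, 30.0], [1, 2, 2.5], [0, 1, 60.0]])
y_train = np.array([1, 0, 1, 0])
model = LogisticRegression().fit(X_train, y_train)

# Rank new candidate emails for a meeting by predicted relevance.
meeting = {"title": "capers design sync", "people": {"paul", "abhishek"},
           "start": 1_700_000_000}
candidates = [
    {"subject": "CAPERS design doc", "people": {"paul"}, "sent": 1_699_900_000},
    {"subject": "Cafeteria lunch menu", "people": {"cafe"}, "sent": 1_699_990_000},
]
scores = model.predict_proba([featurize(e, meeting) for e in candidates])[:, 1]
for email, score in sorted(zip(candidates, scores), key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {email['subject']}")
```

The same features could feed a gradient boosted tree or a neural ranker; as Bennett notes next, the modeling choice is usually not the hard part.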
It turns out that in this space,
the key challenges though are not so much the architecture
of the model that you're looking at,
but how you deal with the privacy challenges,
which is, again, very different
from things like advertising and selling you books.
In that case, everybody is buying the same books.
In this case, only you have access to your information, which means the traces
that we learn from are only relevant to you. And how we think about generalizing across users
versus within a particular user, those are key challenges.
Yeah, so I'm going to come back to the
what keeps you up at night and what are your challenges with privacy a little bit later in
our interview,
but I'm hearing two kinds of research going on.
One is the user research, which is super important
to find out what the needs are and what the pain points are.
And then you're over here on algorithms and machine learning
and the deep technical weeds of how you take
what you've learned and make a machine be able to do it.
Yes, that's right.
So part of it is HCI, or human computer interaction focused. And even within that
field, there are huge challenges in understanding the biases that you deal with when you try to
elicit from people what their preferences are, how they'd interact with things. But there's also
actually an interesting part that's in between the two that connects them. So if you look at a current
system, again, how you expressed a reminder to an assistant for a certain time, you constrained yourself based on what you expected it could handle.
And we see the same thing in every single system.
So, for example, in search engines, quite often people don't type questions that look like a question, for two reasons.
First off, typing it takes a while.
But there's a second reason that's more interesting.
They don't actually issue those queries because they don't expect the system to answer them.
Because they know from experience that those don't lead to success. And so as we add capabilities so that the systems can answer those kinds of questions, there are
challenges in getting people to use them and in actually, you know, teaching them the behavior.
It's all about what if, what if something different would have happened? And so we see the same thing
in any given system.
What if we would have given you a different set of documents and said,
these were relevant to this meeting, would you have interacted with one of these or not?
And so there's a line of research within the group looking at counterfactual or
causal reasoning that tries to say, okay, this is the data we have, but what we want
to reason about is the system we're going to deploy, which is going to behave differently
and how you're going to react to that.
So how do we take those past behavioral traces and deal with the different biases that
are introduced in terms of how you reacted because of that system, as opposed to how you will react
to the new system? So how are you doing that? So there are an interesting set of techniques,
and this is, again, you know, new novel techniques being developed as well. Some of the simplest methods look at
actually doing small amounts of randomization in terms of minor things. So if you think about
the order of documents you're presented, if you add just the slight amount of randomization
so that you don't put them in order of your predicted relevance, but a slightly different
order, you're now actually able to reason over a broader set of orders that you didn't put in front of people, right?
And even small tweaks like that allow you to make changes.
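One common way to realize that "slight amount of randomization" (a generic sketch, not necessarily the group's exact method) is to occasionally swap adjacent items in a ranked list and log the probability of the decisions that produced the shown order, so later analysis can reweight by it.

```python
# Generic sketch of lightly randomized ranking for counterfactual learning:
# with small probability, swap adjacent items so the logs cover orderings
# beyond the model's single predicted order. Parameters are illustrative.
import random

def lightly_randomized_ranking(ranked_items, swap_prob=0.1, seed=None):
    """Return (shown_order, propensity): each adjacent pair is swapped with
    probability swap_prob, and propensity is the probability of the swap
    decisions that produced the shown order, logged for later reweighting."""
    rng = random.Random(seed)
    items = list(ranked_items)
    propensity = 1.0
    for i in range(len(items) - 1):
        if rng.random() < swap_prob:
            items[i], items[i + 1] = items[i + 1], items[i]
            propensity *= swap_prob
        else:
            propensity *= 1.0 - swap_prob
    return items, propensity

shown, p = lightly_randomized_ranking(["doc_a", "doc_b", "doc_c"], seed=7)
print(shown, p)  # a slightly perturbed order plus the probability we log with it
```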
Now, one of the interesting challenges is that where we've made progress deals with very limited situations like that:
rankings, choosing an item from a small set of items.
But as we go more and more to rich interfaces, there are all sorts of questions of how the presentation affects things. So even take something simple like email: when you do a search, it may look
like a ranking of items, but then it turns out there are conversations. These conversations can
be collapsed or not. They can have different types of interface markup. All of these affect
your likelihood of interacting with those items based on more attentional factors,
but that influences the data that we learn from. And so being able to separate out how your attention influences your
interaction versus what we really want to estimate, the relevance and pertinence of the information
to you in order to update the models is one of those key challenges. You have to be willing to
kind of understand that there may be a suite of different things that are most appropriate to the
situation, as well as being willing to push on the boundaries and develop new techniques
when appropriate, and finding out where you can actually deal with these different kinds of
challenges by adding a slight amount of noise to the process. In other
cases, we've done research where we looked at how much randomization we can introduce
without creating an experience that you'd hate, right? Because you could imagine that you actually went
into a restaurant to look at a meal. I could put in front of you five choices that you really liked
where I could add a random choice just to learn about what you might like that you didn't know
that you liked, right? You might be willing to tolerate that one out of five. Now you walk in
and you have five random choices in front of you and a horrible dinner, right? I may have learned a lot about you,
but also, you know, I can't run that risk. And so being able to understand the limits of how
we can do that online exploration, but also sometimes we can't always do online exploration.
We have just what we've observed in the past. And so there we try to identify what are natural experiments,
where there's sort of small amounts of variation
due to random differences in the system.
A lot of times we have distributed systems in the cloud
where things are returned slightly differently
because of latency.
And those offer natural experiments
that allow you to say,
can I learn from what just happened to be random chance
and better predict the future? And a lot of it comes down to when you learn a model
that is influenced by these presentation or selection biases that happen, you deploy this,
and then what you actually find out is it doesn't work. And so it's a false start,
and you may repeat and repeat and repeat. And so when you learn from the logs to more effectively
estimate what will happen once you deploy it, it's also about rolling out innovation to our customers much more quickly.
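A standard way to do that kind of offline estimation from logs is inverse propensity scoring. The sketch below is a generic illustration under assumed toy data, not the team's pipeline: each logged interaction records what the old system showed, the propensity with which it showed it, and the observed reward, and the estimator reweights those rewards by how likely the new policy would have been to show the same item.

```python
# Generic sketch of offline (counterfactual) evaluation with inverse propensity
# scoring. The logs and the "new policy" are toy assumptions, not real data:
# each log entry records what the old system showed, the propensity with which
# it showed it, and the observed reward (e.g. whether the user opened the item).

def ips_estimate(logs, new_policy_prob):
    """Estimate the average reward the new policy would get, by reweighting
    logged rewards with new-policy probability / logging propensity."""
    total = 0.0
    for entry in logs:
        weight = new_policy_prob(entry["context"], entry["shown"]) / entry["propensity"]
        total += weight * entry["reward"]
    return total / len(logs)

logs = [
    {"context": "meeting_prep", "shown": "design_doc", "propensity": 0.5, "reward": 1.0},
    {"context": "meeting_prep", "shown": "status_email", "propensity": 0.5, "reward": 0.0},
    {"context": "follow_up", "shown": "meeting_notes", "propensity": 0.8, "reward": 1.0},
]

def new_policy_prob(context, item):
    """A hypothetical new policy that favors design docs during meeting prep."""
    if context == "meeting_prep":
        return 0.9 if item == "design_doc" else 0.1
    return 0.8 if item == "meeting_notes" else 0.2

print(f"Estimated reward of the new policy: {ips_estimate(logs, new_policy_prob):.2f}")
```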
Efficiency is a huge thing in computer science research, making us more efficient.
And when I talked to you before, you said it's not specific enough.
You know, efficiency is a step on the path of where we want to go. It's certainly not the end goal.
Ultimately, intelligence is about changing outcomes for people, giving them capabilities
they wouldn't have otherwise had, or allowing them to essentially be augmented by the system.
And so when I say that, what I mean is that when we make intelligent predictions for people,
we assume that there has to be some reason for them to select what we've put in front of them. The first initial case is usually in terms of efficiency. So, for example, you know, if I'm retrieving this information relevant for your meeting, this was going to be information you're going to search for anyway. However, in order to issue that search, you would have to say, you know, from Paul about some particular topic. And then it turns out we've exchanged
email about a number of things and you have to find out a way to cut that down.
And which email from Paul is the one I remember?
And so I've made it efficient for you to identify that, right? But when we think about changing
outcomes, so this is the case, you know, where I mentioned the design document, that design
document, you didn't even know was relevant. You wouldn't have searched for this. If I actually
can put that in front of you
and allow you to ask questions
that you wouldn't have otherwise asked,
more generally, when we think about augmenting capabilities,
we think about things like, what do our expert users do?
How do they interact with the system?
How do we take a novice and turn them into an expert, right?
So you imagine just starting out your job,
you're a fresh college graduate,
you're interacting with Visual Studio, maybe a few other tools you've been searching around.
You know, you have your favorite websites for development that you go to, but there's a wealth of other resources you don't know.
When we can connect you with those and suddenly accelerate your learning curve as a developer because of the patterns of other people, that really changes your life. It's not just about efficiency.
It's about changing what information you have at your hands based on how you're interacting.
Let's talk now about this project that you're working on called CAPERS.
It's Calendar Aware Proactive Email Recommender System.
So tell us about CAPERS.
What's new, cool, different about it?
How does it work?
Who's using it?
How's it been received?
Yeah.
So in terms of the system itself, it's just this notion that when we look at how you go
throughout life, there's a lot of different meetings that you may have. The typical professional has
three or four meetings a day. And one of the complaints, of course, is they have too many
meetings. But even if we can't just get rid of their meetings, the question is, how can we
actually make their information life better with respect to their meetings? And so part of the
things that we see is that people have trouble finding information to prepare for a meeting. They
have trouble during a meeting, finding information on demand that they didn't anticipate needing.
They have trouble after a meeting, referencing back to things relevant to the meeting.
And when we look at meetings, we also see again that there's context, right? There's the context
of the people that were involved, the topic, which usually isn't just the meeting name, and the time. And so all of this gives us an
idea of when you'll need this, what phase are you in? Is it before the meeting and you're likely
preparing? Is it during the meeting and you're really trying to respond in the second, which
means that you might be looking for different types of information. Again, it's information
you were less likely to anticipate. Is it after? In which case you're looking for follow-ups
that were sent out after the time of the meeting
more likely than before.
And so this really offers us the capability
to understand the phase,
the types of information you want,
and to predict those things seamlessly.
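As a rough illustration of the phase idea (purely a sketch; CAPERS' actual logic isn't described here), one could classify the current time relative to a meeting's start and end and prioritize different candidate sources per phase. The prep window and the source lists below are assumptions.

```python
# Rough sketch (not the CAPERS implementation) of the "phase" idea: classify
# where we are relative to a meeting and prioritize different content sources.
# The prep window and the source lists are illustrative assumptions.
from datetime import datetime, timedelta

def meeting_phase(now, start, end, prep_window=timedelta(hours=24)):
    """Return 'before', 'during', 'after', or 'not_yet' relative to the meeting."""
    if now > end:
        return "after"
    if start <= now <= end:
        return "during"
    if start - prep_window <= now < start:
        return "before"
    return "not_yet"  # too far ahead to start preparing

def candidate_sources(phase):
    """Illustrative mapping from phase to the content prioritized in that phase."""
    return {
        "before": ["attachments on the invite", "recent threads with attendees"],
        "during": ["documents attendees have shared", "prior meetings in the series"],
        "after": ["follow-up email sent after the meeting", "action items"],
    }.get(phase, [])

start = datetime(2019, 1, 16, 10, 0)
end = start + timedelta(hours=1)
phase = meeting_phase(datetime(2019, 1, 16, 9, 0), start, end)
print(phase, candidate_sources(phase))
```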
In terms of where it is right now,
the Microsoft Search and Assistance Group
is looking at productizing this.
This particular project is being led by Abhishek Arun, but he's also working tightly with the user experience folks
in Microsoft Office and FAST, which is a group that we have in Norway. So right now, this system
is being used internally by a small set of people. We roll things out in phases and you get to start
seeing, you know, how much are your assumptions right as people interact with things? How often
do you need to update the model because what people need evolves?
Some of these questions I mentioned in terms of presentation and selection biases actually do come up again, because if you have only certain types of positive information, like what was relevant to this meeting, that's biased by what people's search capabilities were at the time.
And so there's a tendency to go just towards what they currently were able to find.
And so being able to expand that set, all of those things come up as you push it further and further.
But yes, the system is usable right now.
And hopefully in the future, we'll have it actually in customers' hands as well.
And so you've used it.
I've used it, yes.
What's your experience?
I mean, is it Radar-esque for you now?
It's Radar-esque.
I think there's certainly times where I'm just wowed by it, where I didn't expect something and it was there.
You know, actually, the more amusing thing, and it's a matter of using the future too:
I had this colleague of mine who's leading this up, Abhishek Arun, come give a talk, and I asked him for the slides that he presented afterwards. And he said, why are you asking me? Just open up your meeting.
And so, you know, you also have a, how do you change your own habits as well?
But even serendipitous cases where that was something where he had shared access with me,
but not sent me the file, but it was there in terms of being relevant to the context. And so
those capabilities where you can wow people, those are really fun chances. What will be fascinating for
me if I use it is whether you can sort through the number of emails that say "quick question" that I sent, or
"follow up." And that's the entire subject, right? And I can't remember what I said or what I was
following up on. So how you present content in terms of giving
people a quick summary, that is a key challenge. There's, you know, a part of this that I haven't
talked about that Ryan White and Eric Horvitz and I have worked on with Cortana and again,
folks in Microsoft Search and Assistance, the Cortana suggested reminders where it looked at
your email and said things that you would take action on in the future. So, you know, this is
the kind of case where my wife sends me an email and says, I got the theater tickets for Saturday night,
can you get the babysitter? And I respond and say, yes, I'll get the babysitter. Somebody walks into
my office, I forget about it, Saturday night comes, and I didn't get the babysitter.
Exactly.
And so if you pop up a small reminder and say to people, hey, it looks like you said you'd do this. Do you need to block out some time or is it already done?
Just that simple item there can actually make them take action.
But coming back to your question of what kind of information do you need to give them,
you have to give them enough context where they know what it's about.
Speaking of that, CAPERS is just one example of where contextual assistance can come into play.
Where else and with whom else are you digging around
in this research bin where you say good ideas
and deep research understandings
turn into product-ready features?
Yes.
So I mentioned kind of the Cortana suggested reminders
as another example,
and that's tied into the same space
of how we understand how communications relates
to the work you're trying to get done.
And that offers key opportunities where we partner with both other groups in research,
like the Language Information Technologies Group and the Knowledge Technologies and Experiences
Group, as well as our product partners to make those things happen.
More generally, even going back to old assistants in the research space like Lumiere that Eric Horvitz was driving
back in, I think, the late 90s. One of the lessons we learned is that in order to provide people
assistance, you have to be within their current workflow. If your primary application that you
have open and you're using right now is Word, then we need to think about how to integrate with Word.
And that's part of thinking about contextual assistance in general. What does it mean to provide assistance in that moment? So other kinds of cases we're looking at,
also partnering with groups in office as well to look at applications such as PowerPoint.
How do you think about and reuse content across different types of documents? So you might have
different slides that you need at one case. How do you refine them and reuse those later?
These kinds of opportunities come up in application after application. I mentioned understanding
how people search within the office suite. How do we go from search directly to actions?
How do we recommend those features in general? Those come up across the whole different suite
of products.
What about other verticals like education or, you know, these other big chunks, healthcare,
et cetera?
Yes. Great question.
For the most part, we haven't done much within the group; with healthcare, a little bit.
Education comes a little bit more closely to things that we've been looking at.
So we have partners in Windows and Office together with external partners at Harvey Mudd and the University of Michigan looking at what will the reading experiences of the
future look like?
And in particular, how does information about people's attention that we can take from things like gaze
tracking or cursor movement, which is a proxy for gaze, help us understand where people's attention
is, where they're reading, what they're learning or what they may not be learning? How can we
change that reading experience to enhance their learning? And I think that offers an interesting
capability in general. I mean, even right now, if you look at teachers, again, teachers don't really know
what their students are or are not paying attention to. And so this is the case of really
thinking about how can we augment those capabilities.
That, Paul, was a perfect segue into my next topic here, which is, as I think about all the
data you'll need to collect about me in order to anticipate my needs, including needs I don't even know I have, and all the ways you're going to
collect that data, my eye movements, my cursor movements, my documents, my emails, et cetera,
I get a little nervous about what I'm trading off and who's looking at it in exchange for this
amazing convenience of effortlessly finding my information. So is there anything about what you
do that keeps you up at night? I think that there are a host of different
privacy challenges and the kinds of methods that we can develop within these spaces versus what we
know is the best possible just in terms of prediction. So privacy is always foremost in
our minds. We don't look at customers' data. Algorithmically, we may run over the data,
but how we even generalize from that data is limited based on our privacy guarantees.
So, you know, think of it in terms of the algorithm running, where I learn from your documents what's important to you. I have to be careful that if I'm going to learn about
particular words or phrases that are important to you that I don't transmit to anyone else indirectly through the model that those words or phrases are important to you, right? And so even
simple cases like imagine that I knew "acquire LinkedIn" was a more common phrase in Satya's
email box. That would be terrible, right? It doesn't matter even if it was available within
Microsoft. It would be terrible. And so, you know, we think about how do we construct guarantees on privacy between our different customers
that prevent anything from going in between, right?
Even indirectly.
And so, yes, absolutely, the first thing we do is say,
you know, we will limit ourselves
in terms of the predictive capabilities
to make sure that we guarantee that first.
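One generic illustration of that kind of guarantee (not Microsoft's actual mechanism) is to only let a phrase into any cross-user model if it appears for many distinct users, with noise added to the count so no single mailbox can be inferred from the aggregate. The threshold and noise scale below are arbitrary assumptions.

```python
# Generic illustration (not Microsoft's actual mechanism) of keeping one user's
# phrases out of a shared model: a phrase enters the cross-user vocabulary only
# if many distinct users use it, and the count is blurred with noise so a single
# mailbox can't be inferred from the aggregate. k and noise_scale are arbitrary.
import random
from collections import defaultdict

def shared_vocabulary(per_user_phrases, k=20, noise_scale=2.0, seed=0):
    """per_user_phrases: dict mapping user -> set of phrases from their data.
    Returns the phrases considered safe to use in a cross-user model."""
    rng = random.Random(seed)
    user_counts = defaultdict(int)
    for phrases in per_user_phrases.values():
        for phrase in set(phrases):      # count each user at most once per phrase
            user_counts[phrase] += 1
    safe = []
    for phrase, count in user_counts.items():
        noisy_count = count + rng.gauss(0, noise_scale)   # blur the exact count
        if noisy_count >= k:
            safe.append(phrase)
    return safe

# "acquire linkedin" appears in only one mailbox, so it never clears the
# threshold and can't leak into the shared model, even indirectly.
users = {f"user{i}": {"weekly sync", "status update"} for i in range(50)}
users["ceo"] = {"weekly sync", "acquire linkedin"}
print(sorted(shared_vocabulary(users)))
```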
Which is an interesting promise and perspective. So how can you educate users,
customers, et cetera, that this is actually true, what you're saying? It is safe. You're
not going to look in my underpants. You know, I think we just have to,
first off, keep communicating that, but then also just be true to our word,
not just in what we do, but in who we partner with. You know, we work with a lot of external partners. We ask
our partners, how are you working with your data? Who agreed to this? Was this a broad enough
agreement? Why would we work with any partner that can't meet our own ethics and guidelines
on what we think is acceptable? And so I think, you know, being very clear across the board that this is not just a subject of policy that we work within because we've been told,
but deep beliefs we hold is foremost. Tell us a bit about Paul Bennett.
What's your background?
How did you end up working in information and data sciences at Microsoft Research?
I started quite a while ago, originally looking first at just philosophical problems of how we reason
about things. And I started... Was this when you were a child? How far back are we going?
Not that long ago. But going back to undergrad, I was actually looking at how people
reason about court cases? So I not only did philosophy, it was more from the theorem proving
side, the logical side of things. And so in undergrad, I did both philosophy and a computer science degree where I looked
more at the computer science side again of doing this.
And you know, the computer science part of the project I mentioned was to actually build
this recommender system.
But more generally, as I got deeper and deeper into AI, it was seeing how AI could really
transform situations and moments. And for me, probably the biggest moment
I had with artificial intelligence was I was working on a machine translation project to
translate from English to French. And I just finished coding up this particular algorithm
we'd been talking about. I'd worked out what I knew were the software system's bugs, and it ran.
I typed in English, and I got French out, but I looked at it
and I said, I don't know if it's right. This was interesting. And I was actually sitting in an
international conference in Finland at the time. And because of the international conference,
there's a number of people around. And I raised my hand and said, does anybody speak French here?
Somebody said, yes. And I walked over to them and said, does this French sentence say what the English sentence says? And they said yes. And then I typed another one. Does this say the same thing? They said yes. And I did
another sentence. Yes. And then I realized I'd written a program that was capable of doing
something that I couldn't do. And that is simply magical. When I can just from an algorithmic specification and data create something that, you know, I wasn't capable of, at least in the moment, without lots of study, that kind of capability to be able to make it available for all the moments in our life, that's just a huge opportunity.
So how did you end up here? After undergrad at the University of Texas, I went to graduate school at Carnegie Mellon. During my time in grad school at Carnegie Mellon, I actually came out as an intern. And so, you know, if you're an undergrad or a graduate student listening, internships are a hugely important part of your career development.
Who did you work with?
With Eric Horvitz and Susan Dumais. At the time I was looking at how do we actually do ensembles
of machine learning methods. That gave me an opportunity to start engaging with them. And then,
you know, when I went back to Carnegie Mellon, I continued to engage with them throughout the
time of my dissertation. And then after I was finished with grad school, I came out and started
as a researcher. Carnegie Mellon seems like an incubator for MSR in a lot of ways. A lot of
people have come. It's also,
I don't think it's completely a coincidence, in that going back several decades, Carnegie Mellon said,
computing cannot just be about computer scientists. It has to be about how computing changes the rest of the world. And so they started actually bringing in other disciplines at the time
and saying, what happens if we put people together from different disciplines? How would it apply to this portion of the world? I like to ask all my
guests when we come to the end of our conversation to offer some parting thoughts to our listeners.
And it can take a number of forms, whether it's, you know, advice or challenges or inspiration or
next steps. How might some of the brightest minds think about joining the efforts in this area, Paul? I think there are a number of different ways that people can contribute.
First off, you know, in terms of us, we're open to working with people. We bring visiting
researchers through, we start different partnerships. And certainly if you have someone
that you'd like to reach out to here, feel free to reach out to them and more than likely they'll
get back to you and kind of talk about opportunities. But I think that more generally, part of the computing mindset is to say,
where could computing be that it's not? You know, what are the opportunities? Something that you
look at and you say, this is not an opportunity for computing. Usually that's not the case. It's
a question of how would you see it differently? How would you turn it into an opportunity?
Where is the current friction? You know, I even think about
simple things like, during undergrad, I worked for the government for a while and
we had these forms, everything had to be on an official form. And you know, at the time
they didn't even have this where they could print from a software system to this form
within the office. And one of my great innovations at the time was I created templates where you could photocopy
onto these official forms most of the things
that you needed for each person in the office.
And that again was just an example
of the computing mindset, right?
It didn't look like a computing opportunity.
I couldn't create a word processor for everything there
because they wouldn't be the official forms.
But it was a question of how do you actually recognize
the generality and reapply it?
And that can happen in any field,
regardless of what you work in, regardless of whether
you're an AI or not, you know, dreaming about the possibilities and then finding a few people
who might actually have what you don't have in those other areas to bring it together.
And you say, hey, I can bring this from this field.
What do you have from yours?
Let's talk about where we can actually go together.
Paul Bennett, saving time and changing outcomes.
Thanks for coming on the podcast today.
It's been great.
Thank you, Gretchen.
To learn more about Dr. Paul Bennett
and how researchers are building
contextually intelligent assistants,
visit microsoft.com slash research.