Microsoft Research Podcast - 074 - CHI squared with Dr. Ken Hinckley and Dr. Meredith Ringel Morris
Episode Date: May 1, 2019. If you want to know what's going on in the world of human computer interaction research, or what's new at the CHI Conference on Human Factors in Computing Systems, you should hang out with Dr. Ken Hinckley, a principal researcher and research manager in the EPIC group at Microsoft Research, and Dr. Merrie Ringel Morris, a principal researcher and research manager in the Ability group. Both are prolific HCI researchers who are seeking, from different angles, to augment the capability of technologies and improve the experiences people have with them. On today's podcast, we get to hang out with both Dr. Hinckley and Dr. Morris as they talk about life at the intersection of hardware, software and human potential, discuss how computers can enhance human lives, especially in some of the most marginalized populations, and share their unique approaches to designing and building technologies that really work for people and for society.
Transcript
On today's episode, we're mixing it up, moving the chairs, and adding a mic to bring you the perspectives of not one, but two researchers on the topic of human-computer interaction.
We hope you'll enjoy this first in a recurring series of multi-guest podcasts where we dig into HCI research from more than one angle, offering a broader look at the wide range of ideas being explored within the labs of Microsoft Research.
You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen
Huizenga. If you want to know what's going on in the world of human-computer interaction research,
or what's new at the CHI conference on human factors in computing systems, you should hang
out with Dr. Ken Hinckley, a principal researcher and research manager in the EPIC group of
Microsoft Research, and Dr. Merrie Ringel Morris, a principal researcher and research manager
in the Ability group.
Both are prolific HCI researchers who are seeking, from different
angles, to augment the capability of technologies and improve the experiences people have with them.
On today's podcast, we get to hang out with both Dr. Hinckley and Dr. Morris as they talk about
life at the intersection of hardware, software, and human potential, discuss how computers can
enhance human lives, especially in some of the most marginalized populations,
and share their unique approaches to designing and building technologies
that really work for people and for society.
That and much more on this episode of the Microsoft Research Podcast. So I'm excited to be hosting two guests in the booth today, both working under the
big umbrella of HCI, or Human-Computer Interaction. I'm here with Dr. Ken Hinckley from MSR's
EPIC group, which stands for Extended Perception, Interaction, and Cognition, and Dr. Meredith
Ringel Morris, aka Merrie,
of the Ability Group. Ken and Merrie, welcome to the podcast.
Thank you.
Thank you.
So I've just given our listeners about the shortest possible description of the work you do.
So before we dive deeper into your work, let's start with a more personal take on what gets
you up in the morning. Tell us in broad strokes about the research you do, why you do it,
and where you situate it. Merrie, why don't you give us a start here? Sure. One of the reasons that I'm
excited about working in the space of HCI every day is that I think computers are cool, but what
I think is most interesting about computers is how people use computers and how computers can help
people and how computers can enhance our
lives and capabilities. And so that's why I think HCI is a really important area to be working in.
When I did my undergraduate degree in computer science at Brown, they actually didn't
have any faculty in HCI. And the first time I heard of HCI was when Bill Buxton, who is actually
a researcher now at Microsoft, came and gave a guest lecture in my computer graphics course. And so we had that one lecture
where Bill talked about HCI. And when I heard that, I was like, wow, this is the most interesting
thing I've heard about in all my courses. So then I found a way to work on that topic area
more in graduate school. Ken, what about you? What gets you up in the morning?
Sure.
Well, I guess the first answer is my dog and my kids, often earlier than I would prefer.
But beyond that, I was interested in technology at a young age, but it was not for the sake
of computers themselves, but like, what is this technology actually good for?
And I remember going through a phase early, maybe sophomore year in college, I was very
interested in artificial intelligence and so forth.
But I started thinking about it.
I was like, well, why would you actually want to do that?
Like, why would you want to make a computer that's like a human?
It's really more about the humans and how can we use technology to augment
their thinking and their creativity to do amazing things.
So that's what excites me about being at Microsoft Research
and having research that's getting out in terms of devices and experiences
where people can do amazing things with the tools and the techniques that we're building.
I want to set the stage for the research we'll be talking about by getting a better understanding
of the research groups your work comes out of. Ken, you work in the EPIC group. We referred to
what that stands for. And you called your interdisciplinary team a pastiche of different expertise. Since
we're using French words, tell us more about EPIC and its raison d'être.
Yeah, I should have been cautious with my use of French words. I did study French in high school,
but I've forgotten all of it. And my accent is terrible. But the core sort of mission of EPIC
is to innovate at the nexus of hardware, software, and human potential.
So there's really sort of three pillars there. There's the hardware, so we do sensors and devices and kind of new form factors. We explore that like in terms of new technologies coming down the pike.
We also look in terms of the software, like what can we do in terms of new experiences?
And actually, Merrie mentioned Bill Buxton earlier. He's actually one of the people on the team. And
so we think a lot about what is happening at this, you know, as new technologies come forward, like how do they complement one another? How do they work together?
How do we have not just a single experience, but experience that sort of flows across multiple
devices and everything, all the technology you're touching in your day. It's still this incredibly
fragmented world that we're trying to figure out interesting ways that we can sense or provide
experiences that give you this flow across all the technology you're touching in your day.
And then on sort of the human side, we also study the foundations of
perception and motor behavior and like these very fine grain things about how people actually
perceive the world. And it turns out it's, you know, there's some interesting things that go
on there in terms of how you form sensory impressions when you're hearing something
and seeing something and feeling something all at the same time, or maybe not in a virtual
environment, for example. So there's tricks you can play sometimes. So we sort of explore that
as well. I want to drill in just a little before we go on to Merrie. I've looked at the scope of your
team. It's a really diverse group of people in terms of where they come from academically and
in their research interests. Talk a little bit about the kinds of people on your team and that pastiche
that you referred to. Sure. So we have people, you know, people like Bill Buxton, who come more from
the design world. He started in electronic music, right? So that was his motivation for technology
back in the 1970s. We have people like Mar Gonzalez-Franco, who comes from a neuroscience
background and gives us some of that, you know, real interesting insight in terms of how the human
brain works. We have people from computer vision, like E Ofec, who's leading a lot of work in virtual environments and haptic perceptions.
So, I mean, it really kind of spans all of those areas.
We also have people from information and visualization, people from multiple countries, you know, Korea, France, Switzerland.
So it really is like kind of people from all over the world.
And we really value that diversity of perspective that it brings in terms of what we bring to our work.
Right. So Merrie, last May, we talked about what was then the newly minted Ability Group on this podcast. It's episode 25 in the archives. But as it's been about a year now since that came out, let's review. Tell us what's been going on over the last 12 months and what the Ability Group is up to now.
Absolutely. So as you mentioned, we've already talked about the Ability team on another podcast,
and our primary mission is to create technologies that augment the capabilities of people with
disabilities, whether that's a long-term disability, a temporary disability, or even a
situational disability. And we have a lot of new projects
that we've started in the past year. A big one is the Ability Initiative, which is a collaboration
that spans not only many groups within Microsoft Research, but also externally to a partnership
with UT Austin. And this project is focused on how we can develop improved AI systems that can combine computer vision and
natural language generation to automatically caption photographs to be described to people
who are blind or in the situational disability context, maybe someone who's sighted but is using
an audio-only interface to their computer, like a smart speaker. And so we're trying to tackle problems like how do we include additional kinds of detail
in these captions that are missing from today's systems?
And also the really important problem of how do we not only reduce error, but how do we
convey that error to the end user in a meaningful and understandable way?
So that's a really big new initiative that we've been working on. Another big initiative that we've just started up
with some new personnel is starting to think about sign language technologies.
So there's a new postdoc, Danielle Bragg, who joined the Microsoft Research New England lab,
and her background and
experience is in sign language technologies like sign language dictionary development for people
who are deaf and hard of hearing. So actually together with Danielle and also Microsoft's new
AI for Accessibility funding program, we organized a workshop here at Microsoft Research this winter that brought together experts in deaf
studies, sign language linguistics, computer vision, machine translation, to have a summit
to discuss kind of the state of the art in sign language technologies.
What are the challenge areas?
What is a plan of action going forward?
How can the Microsoft AI for Accessibility Initiative accelerate the state of the art in this area?
So that's a really exciting new area of work that we're interested in.
Give me an example of where sign language technology plays.
I'm actually trying to envision how it would manifest for a person who has a hearing disability.
Right now, the state of the art is very far from realizing automatic translation between English and sign language.
It's a very complicated computer vision problem and a very complicated machine translation problem because, since sign language doesn't have a written form, there are no existing paired corpora. So for example, you know, many
famous works of literature or newspaper articles, you know, exist online in say both English and
French, since we're talking about French today. And so that gives you data from which you can
begin to learn a translation. But since there's no written data set of American Sign Language, that becomes very difficult to do.
And so how you might generate training data is one of the challenges that we discussed at the workshop.
But if you could have a computer vision-based system, a machine translation system that could recognize sign language and provide English captioning, that could be an important communication aid for many people. And then, of course, vice versa in the other direction. If you could go
from English to generate a realistic animated avatar that signed that same content, that would
actually be really important for many people because many people who speak sign language,
sign language is their primary language. English is a second language for them. And so their English literacy skills are often lower than people who learned English
as a first language. So just English captions on videos or English language content on the web
is often not very accessible to many people who speak sign language. And signing avatar
translations of that content would open up web accessibility. But developing these avatars
to sign in a realistic fashion is difficult, because signing involves not only the hands, but facial expressions. And
so actually generating that nuance in an avatar is a very challenging research problem.
I've had a sizable number of your colleagues on the show, and all of them have stressed the importance of
understanding humans before designing technology for them. And to me, this makes sense. You're in
human-computer interaction, human being the first word there. But it's remarkable how often it seems
like nobody asked when you're using a technology. So give us your take on this user-first or
user-centric approach that comes out of this umbrella of HCI, but particularly within your groups.
Why it's important to get it right and how you go about that.
Ken, why don't you start?
Sure.
I do subscribe to the user-centric approach to interaction design, because you really have to understand people before you can design technology that bridges this impedance mismatch between what you can do with technology and sort of the level at
which people operate and think about concepts.
However, in my own research, I often take sort of a, I call it a pincer maneuver strategy,
right?
So you think about, you know, what are the fundamental human capabilities?
How do people perceive the world?
How do you interact socially with other people?
But then you can sort of couple it with, you see things coming down the road in terms of,
oh, there's these new technologies coming out, there's new sensors
coming down the pike. And so what I often try to do is I try to match these trends that I see
converging. And when you can find something where, you know, say a new sensor technology
meets some need that people have in terms of how they're interacting with the devices,
and you can sort of change the equation in terms of making that more natural or
making the technology sink into the background so you don't even
have to pay attention to it, but it just seems like it does the right thing.
I like to do that.
So I really do play both sides of the fence there where I study people and I try to understand
as deeply as possible what's going on there, but I also study the technologies.
So sometimes I do have work where it's more technically motivated first, so it's a bit
contrary in that sense, but it always ends up meeting like these real world problems and real world abilities that
people have and trying to make technology as transparent as possible.
Merrie, how about you?
I can give actually a great example from the sign language workshop that we were just talking about
of the importance of this user-centered design. So one of the important components of this workshop
was having many people who are themselves deaf, who are sign language communicators, attend and participate in this ideation session.
And one of the themes that came up several times that speaks to the importance of user-centered design was the example of sign language gloves. There have been a few examples of very well-intentioned technologists who are not themselves signers
and who didn't necessarily follow a user-centered process, who have invented and reinvented gloves
that someone would wear that could measure their finger movements and then produce a
translation into English of certain signs. But of course, this approach doesn't take into account
the fact that sign involves many aspects of the body besides just the hands,
for example. So that's one pitfall. And then, of course, also from a more sociocultural perspective,
another pitfall of that approach is it thinks about sign language technology from the perspective
of someone who's hearing. So it's placing the burden of wearing this very awkward glove
technology on the person who is deaf. Whereas, for example,
maybe a different perspective would be people who are hearing should wear augmented reality
glasses that would show them the captions and the people who are deaf wouldn't be burdened by
wearing any additional technology. And so I think that kind of perspective also is something that
you can only gain by a really inclusive design process that maybe goes beyond just involving people as interview or user study participants,
but also actually having a participatory design process that involves people from the target communities
directly as co-creators of technology.
Yeah. Ken, did you have something to add?
Yeah, so maybe one more thing I'd add in terms of my own perspective.
So I sort of mentioned how it's really important to understand people and sort of what goes
on.
But I think one of the interesting or unique attributes of how we're trained here at Microsoft Research, in terms of understanding people, is that we sort of become these acute observers of the seen but the unnoticed.
So there's lots of these sort of things that we just take for granted in terms of interpersonal
interactions. Like if you're at a dinner party and you're talking to a group of people, like, well, you're probably forming a small circle and probably there's five or less of you in the circle, right?
And there's a certain distance that you stand and you're facing each other in certain characteristic ways.
None of this is something that we ever notice.
Another example I like to use from my own work and sort of working on tablets and pen and touch interactions is actually just looking at how people use everyday implements with their hands. And in my talks I often ask people like, okay,
well, which hand do you write with? And of course, you know, 75% of the audience will raise their
right hand because they're right-handed. And of course I'll say, well, actually you're wrong
because you use both hands to write, because first you've got to grab the piece of paper and then you
orient it with your left hand and then the right hand does the writing. So there's examples like
that in terms of what people actually do with their behavior
that because we just take them for granted
and it's just part of our everyday experience,
you don't actually notice them.
But as a technologist, you have to notice them and say,
oh, if we could actually design our interactions
to be more like that, it'd be natural that technology
would be transparent. Well, let's talk about research now. Ken, I'll start with you, and maybe the best approach
is to talk about the research problems you're tackling around devices and form factors and
modalities of interaction, and how that research is playing out in academic paper and project form,
especially with some of the new work you're doing with posture-aware interfaces. So
talk kind of big picture about some of the projects and papers that you're doing.
Yeah. So in terms of the global problem we're trying to solve, we've been thinking for a long
time about these behaviors that are seen but unnoticed. Everyone goes through the day using
maybe their mobile device, or if you have a tablet, you interact with that in certain ways. And there's always sort of these little irritations
that maybe you don't really notice them, but over time they build up. And using technology,
we can mediate some of these. So to go back, you know, actually almost 20 years now, we were looking
at mobile devices, and you had to go through certain settings. You could go into the settings and say, oh, I want it to be in portrait mode now if you're taking some notes. Or maybe you need to look at a chart, and you had to go back into the settings and say, oh, I want it to be in landscape orientation.
And so we started looking at, oh, if we had some sensors
like a little accelerometer that was on the device,
maybe we could actually just sense
which way you're holding the device
and the screen could automatically rotate itself.
So that was something that you could publish
as an award-winning paper in the year 2000.
And now it's sort of an everyday use.
So taking that same perspective now to the modern era of interacting with tablets,
how do you actually understand how the person's using their device and what they're trying to do
with it? So tablets have this interesting attribute where you can use them directly on your desktop.
Maybe you're doing some very focused work. You can kind of be leaning over it and writing with
your pen, maybe marking up some document, you know, or maybe you're just kicked back on your
couch and watching, like, YouTube videos of cats chasing laser pointers, right? So it kind of spans that whole spectrum. But obviously it's particular to the situation, right? So if you're kicking back on your couch, you're probably not trying to mark up your Excel spreadsheet, and likewise, if you're kind of hunched over your desk, you're probably not watching the cat video. So by sensing those contexts and actually adapting the behavior of the tablet, it understands how you're holding it, which hand you're grasping it with. Are you holding it on the bottom? Are you holding it on the left or the right?
Are you holding the pen in your hand? Are you reaching for the screen? By being able to sense
all those attributes, we can actually simplify the experience and sort of bring things that are
useful to you at those moments. Well, and it's interesting that you referred to that toggle
between landscape and portrait mode, which is actually annoying to me. I sometimes lock my
phone because I don't want it to go there.
I want it to stay there.
But anyway, that's a side note.
It's a side note, but it's funny
because we actually noticed that
when we first built that technique
and we kind of knew that was a problem with it.
But actually in our most recent research,
now we can also sense how you're gripping the phone.
And so you can understand that if you lie down and your grip on the device hasn't changed, we know that you didn't intend to reorient it. So we can actually now suppress screen rotation, finally, 20 years later.
That's exactly the problem, because I would lie down and want to read it sideways, and then it would go like that with my head.
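(A minimal illustrative sketch, in Python, of the grip-aware rotation logic Ken describes: follow the accelerometer as classic auto-rotation does, but hold the current orientation when the grip sensing reports that the way you're holding the device hasn't changed. The class, field names, and sensor vocabulary here are hypothetical assumptions for illustration, not the actual implementation from this research.)

```python
# Hypothetical sketch of grip-aware auto-rotation (illustration only).
from dataclasses import dataclass

@dataclass
class SensorFrame:
    orientation: str  # "portrait" or "landscape", derived from the accelerometer
    grip: str         # e.g. "left-edge", "bottom", "two-handed", from grip sensing

class RotationController:
    def __init__(self, initial: SensorFrame):
        self.screen_orientation = initial.orientation
        self.last_grip = initial.grip

    def update(self, frame: SensorFrame) -> str:
        # If the device tilted but the grip is unchanged, the user probably just
        # lay down or leaned back, so keep the current screen orientation.
        if frame.orientation != self.screen_orientation and frame.grip == self.last_grip:
            return self.screen_orientation
        # Otherwise follow the accelerometer, as classic auto-rotation does.
        self.screen_orientation = frame.orientation
        self.last_grip = frame.grip
        return self.screen_orientation

# Usage: lying down with the same two-handed grip keeps portrait; a new grip rotates.
controller = RotationController(SensorFrame("portrait", "two-handed"))
controller.update(SensorFrame("landscape", "two-handed"))  # stays "portrait": grip unchanged
controller.update(SensorFrame("landscape", "left-edge"))   # rotates to "landscape"
```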
Well, Merrie, let's talk about the work you're doing in the accessibility space,
particularly for people with low or no vision. We might expect to see this manifested in real life,
I like to say IRL, or on the web or other screen applications.
That's just sort of how I think about it.
But you've taken it even further to enhance accessibility in emerging virtual reality or VR technologies, which is fascinating to me.
So give us the big picture look at the papers and projects you've been working on.
And then tell us about Seeing VR, which is both the paper and the project, as I understand it.
I think this project initiative around accessible virtual reality is trying to think, OK, virtual reality isn't that commonly in use right now, but in 10 years, it's going to be a big deal.
And we want to think about accessibility from the beginning so that we can design these systems right the first
time instead of trying to have post-hoc accessibility fixes later. So part of the aim of
this work, beyond the specific prototypes like the Canetroller or Seeing VR, is really to start a
conversation and just raise people's awareness and be provocative and have people think, oh,
we need to make VR accessible. So if
people only remember one thing, that's what I want them to remember, separate from the details of the
specific projects. But yes, so last year, we presented at the CHI conference the Canetroller, which was a novel haptic device that allows people who are blind and are white cane users IRL
to transfer their real life skills into the virtual environment
in order to navigate and actually physically walk around a VR environment
with haptic sensations that mimic the sensations they'd get from a white cane.
And then that project was led by our fabulous intern,
Yuhang Zhao from Cornell Tech,
and also with strong contributions from our intern Cindy Bennett
from the University of Washington, who is herself a white cane user.
So she had some really important design insights for that project.
And so Yuhang returned last summer for a second internship, and she wanted to extend the accessibility
experience in VR based on her passion around interfaces for people with low vision.
So low vision affects more than 200 million people worldwide,
and it refers to people who are not fully blind,
but who have a visual condition that can't be corrected by glasses or contact lenses.
And so working together with Yuhang and also several people from Ken's team,
like Eyal Ofek and Mike Sinclair, as well as Andy Wilson and Christian Holz from MSR AI, and also Ed Cutrell from my team. This was really a big
effort. We developed Seeing VR. And so Seeing VR is a toolkit of 14 different options that can be
combined in different ways to make VR more accessible. And we went with this toolkit
approach because low vision encompasses a wide
range of abilities. And so we wanted people to be able to select and combine tools that best met
their own visual needs. And the great thing about seeing VR is that most of these tools can actually
be applied post hoc to any Unity VR application. So Unity is the most popular platform for developing VR. And so even
if the application developer hadn't thought about accessibility beforehand, we can still apply these
tools. And so, for example, the tools do things like increasing the magnification of the VR scene,
changing the contrast, adding edge detection around things so that you can more easily tell
the borders of different objects, being able to point at objects and hear them described out loud to you, special tools for
helping you measure the depth of different objects in case you have challenges with depth perception.
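(As a rough sketch of the toolkit idea Merrie describes: accessibility tools as composable post-processing passes that a user can select and stack over each rendered frame, independently of the original application. Seeing VR itself is built as tools for Unity; the Python below, and the function names and frame representation in it, are hypothetical assumptions for illustration only.)

```python
# Hypothetical sketch of composable, post-hoc accessibility "tools" (illustration only).
from typing import Callable, Dict, List

Frame = Dict[str, object]        # stand-in for a rendered frame and its metadata
Tool = Callable[[Frame], Frame]  # each tool transforms the frame before display

def magnify(factor: float) -> Tool:
    def apply(frame: Frame) -> Frame:
        out = dict(frame)
        out["zoom"] = float(out.get("zoom", 1.0)) * factor
        return out
    return apply

def boost_contrast(amount: float) -> Tool:
    def apply(frame: Frame) -> Frame:
        out = dict(frame)
        out["contrast"] = float(out.get("contrast", 1.0)) + amount
        return out
    return apply

def highlight_edges() -> Tool:
    def apply(frame: Frame) -> Frame:
        out = dict(frame)
        out["edges_highlighted"] = True
        return out
    return apply

def apply_tools(frame: Frame, tools: List[Tool]) -> Frame:
    # Users pick and order the tools that match their own visual needs.
    for tool in tools:
        frame = tool(frame)
    return frame

# One user's configuration: 2x magnification, extra contrast, edge highlighting.
my_tools = [magnify(2.0), boost_contrast(0.3), highlight_edges()]
enhanced = apply_tools({"zoom": 1.0, "contrast": 1.0}, my_tools)
```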
And so we'll be presenting Seeing VR at CHI this year, and we'll not only be presenting the paper,
but we'll also present a demonstration. So people who want to actually come and try on the VR headset and experience the tools directly will be able to do
so. Well, that's a perfect segue into talking about CHI, the conference. It's a big conference
for HCI, maybe the biggest. Yes, it is. And Microsoft Research typically has a really big
presence there. So talk about the conference itself and why it's a big deal. And then give
us an overview of what MSR is bringing to the party in Scotland this year in the form of people,
papers, and presentations. Sure. So in terms of, you know, some of the papers that people on my
own team touched, yeah, we have a couple of the Honorable Mention Awards, which sort of recognizes the top 5% of papers appearing at the conference. So one in particular, we talked about the, you know, sensing of posture awareness in tablets.
Another one that's a really fun effort is using ink
as a way to think with data, right?
So think by inking kind of thing.
So if you have some sort of visualization
that you're looking at in your screen,
what if you could just sort of mark up some data points, simply marking them up as you're thinking about and just sort of glancing at these visualizations?
So then you take sort of these simple marks
that people would do anyway in terms of using very pen
and paper-like behaviors, but now translated
to a digital context on your tablet, for example.
I can just mark something up with my pen,
and then I can use those marks as ways
to actually link the data points back to the underlying data.
Can I actually split my data set just
by drawing a line across it as opposed
to doing some complicated formula? And that's our Active Ink project. So instead of just
having sort of dead ink on paper, you can actually imbue it with this digital life. It's just naturally how people think, but then you can just start going deeper and deeper in terms of how it actually touches live data that's underneath.
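(As a toy illustration of linking simple marks back to the underlying data, here is a minimal Python sketch that splits a set of 2D data points according to which side of a user-drawn line they fall on. This is a hypothetical simplification of the idea, not the Active Ink implementation; the function and names are assumptions for illustration.)

```python
# Hypothetical sketch: split a data set by a line drawn across it (illustration only).
from typing import List, Tuple

Point = Tuple[float, float]

def split_by_stroke(points: List[Point], stroke_start: Point, stroke_end: Point):
    """Partition data points by which side of the drawn line they fall on."""
    (x1, y1), (x2, y2) = stroke_start, stroke_end
    one_side: List[Point] = []
    other_side: List[Point] = []
    for (px, py) in points:
        # The sign of the 2D cross product tells us which side of the line the point is on.
        side = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        (one_side if side > 0 else other_side).append((px, py))
    return one_side, other_side

# A roughly horizontal stroke splits the points into "above" and "below" groups.
above, below = split_by_stroke([(1, 2), (3, -1), (0, 5)], (0, 0), (4, 0))
```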
Go ahead, Merrie.
What I wanted to point out was the paper at CHI from Microsoft Research that I find most exciting this year is one of the best paper award winners, the Guidelines for Human-AI Interaction work that was led by Saleema Amershi, who's a researcher here in the Redmond Lab.
I'm really glad this paper won the best paper award because it has a lot of immediate practical value for people in the HCI and AI communities.
So if you're going to read one paper
from CHI this year, listening audience, I'd suggest you read this one. It has a list of 18 very
concrete, actionable guidelines for developing AI systems that are meant to be used by real people
in a way that's pleasing to the end user. And it has lots of good examples and counter examples of
applying each of these guidelines.
So I think it's a really great read, and I'm glad it was recognized with the award.
Any other highlights that you want to talk about?
I mean, you've got colleagues.
Steven Drucker has a couple of papers, I think, in there, and Cecily Morrison, who's out of the UK lab.
I mean, there's tons of great stuff. And in a sense, it's an embarrassment of riches
because even myself being at Microsoft Research,
I haven't had a chance to look at all this work.
I haven't looked at the HCI AI paper
that Merrie just mentioned.
So now it's like,
I learned something from this podcast too.
I need to go read this.
Yeah, in terms of other areas that we're addressing,
there's actually quite a bit of work
on virtual environments coming from Microsoft Research.
So just looking at different ways
that we can manipulate things in VR
in very unique and sort of crazy and creative ways.
So we have work exploring that.
We also have work exploring in terms of
if you're in a virtual environment,
you actually want to reach out and touch something,
your hand is just an empty space.
So how do you give the illusion
that there's actually objects you can interact with
and give you sort of dexterous ways to manipulate them?
So we've been doing a series of technologies
around how to simulate the grasping motions, for example. So you can actually feel an object and you can squeeze it and it feels
like it has compliance. We have other work looking at personal productivity when you have wearable
devices and just numerous other topics, even in terms of like, oh, can you use the language of
graphic novels as sort of a way to present visualizations to people? So you sort of have
these data comics or data toons, we call them. Just exploring that as sort of another language for interacting with data. So there's
all kinds of great stuff going on. So you mentioned Cecily Morrison. So Cecily and her
collaborators from the UK lab will be presenting a demo at CHI this year of Code Jumper, which I
believe you interviewed her about in another podcast. It was Project Torino then. It's been renamed. But that's a tangible toolkit for children who are blind that they can use to actually learn programming skills themselves.
And so that will be on demo at CHI for people to try out.
I think besides all the papers, another thing to point out is that many people from Microsoft Research are involved in organizing workshops. And so one that I'll call out is there's a workshop.
It's called the Workshop on Hacking Blind Navigation
that discusses the challenges of developing indoor and outdoor navigation technology
for people who are visually impaired.
And Ed Cutrell from the Ability team is one of the co-organizers of that workshop.
But that workshop is going to feature a panel discussion that includes Melanie Neisel,
who is an engineer on the Soundscape team here. And it may also be featuring some research from an intern, Annie
Ross from the University of Washington, who was co-advised between the Soundscape team and the
Ability team. And what Annie was looking at was a very different take on accessible virtual reality.
So she was thinking about designing an audio-only virtual reality experience that would allow a Soundscape user to practice
walking a route in the comfort of their own home or office before they actually walk that route
in the real world. And it's a very challenging design problem. And so she'll be discussing that
at the workshop also. So it's about now that I always ask the infamous what keeps you up at night question.
Your work has direct human impact, perhaps more directly than other research that we might encounter, and often with people who are marginalized when it comes to technology.
Merrie, I'd like you to start this one since I know you're involved in work around AI ethics and some of the work that's going on in the broader community about that.
So what are your biggest concerns right now and how and with whom are you working to address them?
Yes. So there's been increased awareness both in the industry and in the public consciousness over
the past year or so about some of the fairness and ethics challenges of AI systems. But a lot
of that discussion, rightfully so, has focused on issues around gender and race. But one thing that I've noticed in these conversations is that there hasn't been any discussion around these issues as far as how they might impact people with disabilities. And one goal of mine for the coming year is to really elevate awareness and conversation around this issue. So for example, are people with disabilities being included
in training data? Because if they're not, AI systems are only as good as the data. So as an
example, if you think about smart speaker systems like an Alexa or a Siri or a Cortana, do they
recognize speech input from someone with a speech disability? So if you have dysarthria or stutter or some other
atypical speech pattern, and the answer in general is no, because people with those speech patterns
weren't included in the training data set. And so that locks them out of access to these AI systems.
Well, you might want to add children. There's a video about a little girl asking Alexa to play Baby Shark who never, ever gets it until the mom comes in and says it in the right way.
Well, absolutely. And that goes back to a longer historical issue. The original
speech recognition systems were trained only on men, so they didn't even recognize women's
voices at all, if you want to look back several decades. So yes, there are many groups,
people with accents, children, that all come to mind. And so that's a big issue.
I think another example of an ethical issue, particularly for sensitive populations like people with disabilities, is expectation setting around the near term capabilities of AI systems.
I think a lot of the language that's used by researchers and marketing people when we talk about AI leads people to overestimate the capabilities of AI systems.
I mean, even just the word artificial intelligence suggests a human-like semantic understanding, whereas really today's AI systems are all about deep learning.
It's all just statistical pattern recognition with no semantics at all. And so I think if we want to go back to the sign language example, I feel like every month I see some kind of headline that says, you know, researchers have invented a sign
language recognition system. But then if you actually read between the lines, you know,
they've invented a system that can recognize 50 isolated signs, which is nowhere near recognizing
a whole language. And I think for people whose lives may be fundamentally changed by these kinds of technologies, being careful about how we communicate the capabilities of AI to those audiences is really important.
So in terms of going forward with this issue of raising awareness around issues around disability and AI fairness, I've been collaborating with Megan Lawrence, who's from the Office of the Chief Accessibility Officer, as well as Kate Crawford from the New York City Lab. We've been organizing some external events to raise awareness
around this issue. Megan led a panel discussion at the recent CSUN conference, and the three of us
co-hosted a workshop last month at the AI Now Institute. So I think you'll be hearing more
about this topic in the future. Ken, aside from having a French test that you didn't study for, is there anything that keeps you up at night?
Anything that we should be concerned about?
In my research, it doesn't keep me up at night.
It's kind of, I'm really looking for what are some unique ways that we can add to human potential.
So for me, it's a very satisfying and very exciting topic to think about. So it doesn't
keep me awake. It kind of gets me excited at night, or maybe it keeps me up because I'm just,
my mind keeps buzzing with creative new ways that we can use technology to kind of amplify
human thought to make it so people can express new and better or maybe more expressive concepts
than they would with pen and paper and physical materials. I often like to return to, you know, Kranzberg's laws as sort of a way to think about this,
and particularly the first one where, you know, basically the gist of it is that technology is
not inherently good, nor is it bad, but nor is it neutral, right? So it just kind of exists out
there and it's often, you know, human desires and needs and those shape how technologies get
applied. And we need to think really carefully about that when we're bringing something new into the world.
So yeah, so I don't really like the approach of like,
oh, let's just build something and kind of hope it works out.
You really need to think carefully through like,
what is the human impact?
And is this actually going to empower people
and make it so they can do new things?
I love stories.
And one I'd like to hear is how you two know each other.
And another is how you each ended up at Microsoft
Research. So Ken, tell us your story. And then we'll have Merrie talk about how she, in her own
words, wormed her way in here. Yeah, so I was a graduate student at the University of Virginia.
And I was a computer science student, but my desk was actually situated in the neurosurgery
department. And most of my funding was through some grants that the neurosurgery department had,
because they had these really interesting medical image data sets of, you know,
MRI scans of patients that were coming in. And the lab I was working with was actually working
on ways to create new tools for neurosurgeons so they could actually access these things directly
and look at them. So they were looking for tools to visualize and plan surgeries. So that was my
entree into the world of HCI and research. So George Robertson
had been sort of one of the pioneers of 3D interaction at Xerox PARC. He had moved over
to Microsoft Research. He was starting up a team here. And so he was like, hey, Ken,
you want to come work on these things? So we started doing that. And just over time,
because there's so many great things going on here in Microsoft Research, there's lots of
opportunities to jump to different topics. So I started working with Mike Sinclair, and he was saying like, hey, look, I got this new accelerometer sensor and
maybe you should try putting that in your phone stuff that you're looking at. And so that's where
this automatic screen rotation came from, for example. So that's sort of how I ended up here
on this track of exploring devices and so forth. That's actually a great segue into my story,
the automatic screen rotation. So that's how I met Ken. I don't even know if Ken knows this story
because it was very important to me at the time. So I was a student at Stanford in the computer
science department doing my PhD, and it was my first year of school. And Ken was an invited
speaker for the seminar series. And Ken spoke about this work that he and Mike had done on
attaching the sensors to the phone with the rotation. And it was an amazing
talk, an amazing project, but also it was the first time I had heard of Microsoft Research
because I was a brand new grad student. And so I was really impressed. And after that talk, I thought,
oh, I really want to get an internship at Microsoft Research. But it's a very competitive internship,
and usually they want students who are further along in their career, who already have a track
record of publication. So I wasn't really a competitive applicant on paper. And so then a couple months later, I attended the HCIC workshop. And at this workshop, I met a different Microsoft employee who invited me to come interview for a more product-oriented internship. And so they were going to fly me up from California to Seattle.
So I took advantage of the fact that I was going to be in Seattle and I emailed Ken.
And I said, Ken, you know, I'm a student at Stanford. I heard your talk. It turns out that
I'm going to be in the Seattle area next week. While I'm there, could I come talk to you and
your group about internship opportunities? And so I kind of snuck my way in the back door to having this on-site interview at the
time.
And I did get the internship.
And I ended up actually spending my time that summer working with Eric Horvitz and Susan
Dumais, who were part of the same team.
But it was because of seeing Ken's talk that I managed to worm my way into Microsoft Research.
And then you went back to finish your studies and then came back here?
Right. So because I had such a great experience at that internship, I was very focused on becoming
a professor of computer science. But I had a great experience at my Microsoft Research internship.
And so Eric and Sue and all the people that I worked with really encouraged me to come and also interview with Microsoft Research.
And so that's how I ended up back here.
Ken, what do you remember of that?
That's fantastic.
I do remember talking with Merrie after I gave this talk.
Her advisor, Terry Winograd, had invited me down to give a talk.
Another sort of wonderful person in our field.
But yeah, I was definitely impressed by Merrie.
She asked me a lot of great questions after the talk.
And you know, sometimes you get students coming up afterwards,
but then they don't really say anything
that kind of gets to the next level.
So I remember Merrie asking a lot of really good questions.
And so I was very interested in bringing her in as an intern.
But as it turned out, I didn't actually
have an intern slot to offer anybody that summer.
So it's great that she sort of managed
to work her way into actually getting in the building.
And I think Sue was the one who actually
had the internship to offer.
And Sue Dumais, I don't know if she's been
on these podcasts or not, but she's also wonderful
and sort of one of the people here at Microsoft Research
that I look up to the most.
So it was even better than the opportunity with me,
but working with Sue is wonderful.
And Merrie's just continued to blaze new trails
in just every topic that she touches.
So it's been great.
Merrie, what's fascinating about what you just said,
and you're not the first one who's told a story about a kind of bold move.
It's like, take a risk.
Why not?
Right?
Absolutely.
That's how I wormed my way into grad school, too.
I had heard Bill Buxton's lecture about HCI.
We didn't have HCI at Brown.
I did some searching about HCI on the internet and came across one
of Terry Winograd's projects at Stanford. And so just out of the blue, I emailed him and said,
I read about your project on the internet. I'm an undergraduate in computer science.
I really want to get to know about HCI research. Can I come spend the summer with you working?
And he said, yes. And that was how I got involved in HCI research. I did a good job, and he offered to have me come back for graduate school. But if I hadn't sent
that email, it never would have happened. So I guess the lesson is be bold and advocate for
yourself. Well, you kind of just answered the last question I'm going to ask you, which is at the end
of every podcast, I give my guests a chance to talk to our listeners with some parting thoughts, maybe wisdom advice,
and stories about how being bold got you what you wanted. So here's your chance to say something
you believe is important to researchers who might be interested in making humans and computers have
better interactions. Ken, what do you think? I think for me in terms of like what I
see working well and having good research outcomes, I see lots of people, they try to sort of plan out
this grand scheme of like, oh, I'm going to do this and this and this in my PhD. And you kind of
outline everything to death. But where the really exciting sort of nonlinear breakthroughs come in
is you just start walking down this path. And at some point you're kind of beyond the area
where the lights are from the city, right? And you're kind of out in the wilderness in the
darkness and you're like, well, I'm not quite sure where to go, but you just start trying things,
right? You basically just try ideas, see what works, see what doesn't work. And then from that,
you take the next step. And if you're kind of willing to just take this approach, where you don't try to plan everything out but are willing to walk off into the darkness in terms of just exploring unknown terrain, that's where the really interesting things come up.
Merrie, what do you say?
I guess my advice would be around just being curious.
And as part of that, asking questions.
I know that for years, I think it was at least five years after I had my PhD,
before I felt confident enough to ask a question during a talk,
instead of just waiting afterwards
to talk to the speaker one-on-one because I was worried like, oh, if I ask this question,
will people think I'm stupid or they'll think I don't understand? And after I asked questions,
you know, I realized that like lots of other people had the same question as me. So not only
did I learn new things, but I helped enrich dialogue at the conference and the conference
experience for other people. And a lot of times asking these questions leads to really good conversations and collaborations and new research ideas afterwards.
So I think my advice is to ask questions.
That's what research is all about, isn't it?
Ken Hinckley, Merrie Ringel Morris, thank you so much for joining us today.
It's been really fun to have two people who play off each other and have great stories. Thanks for coming on the podcast. Thanks for having us. Thanks very much.
To learn more about Dr. Ken Hinckley and Dr. Merrie Ringel Morris,
and their up-to-the-minute research in human-computer interaction, visit microsoft.com
slash research.