Microsoft Research Podcast - 025 - Advancing Accessibility with Dr. Meredith Ringel Morris
Episode Date: May 23, 2018. With 7 billion people on the planet, you might be surprised to learn that approximately a billion of those people experience some form of disability. Enter Principal Researcher and Research Manager, Dr. Meredith Ringel Morris, and the Ability Group at Microsoft Research. They’re working to remove accessibility barriers both to and through technology, empowering people with disabilities to better perform their daily tasks. Today, Dr. Morris gives us some fascinating insights into the world of “ability,” talks about how technology is augmenting not only sensory and motor abilities, but cognitive and social abilities as well, and shares how Microsoft, through its AI for Accessibility initiative, is committed to extending the capabilities and enhancing the quality of life for every person on the planet.
Transcript
In the past, there was what we call a medical model of disability.
And this model proposes that people with disabilities are ill and need to be fixed.
More recently, thinking has really evolved toward a more social model of disability.
And so the social model emphasizes that many people with disabilities are only disabled by the way that society treats them.
So that philosophy of the social model rather than the medical model of disability is something that the Ability and Enable teams really embrace in our thinking about technology.
You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga.
With 7 billion people on the planet, you might be surprised to learn that approximately a billion
of those people experience some form of disability. Enter Principal Researcher and Research Manager,
Dr. Merrie Ringel Morris and
the Ability Group at Microsoft Research.
They're working to remove accessibility barriers both to and through technology,
empowering people with disabilities to better perform their daily tasks.
Today, Dr. Morris gives us some fascinating insights into the world of ability, talks about how technology is augmenting not only sensory and motor abilities, but cognitive and social abilities as well, and shares how Microsoft, through its AI for Accessibility initiative, is committed to extending the capabilities and enhancing the quality of life for every person on the planet.
That and much more on this episode of the Microsoft Research Podcast.
Merrie Ringel Morris, welcome to the podcast.
Thank you. Glad to be here.
So your research falls under the general umbrella of human-computer interaction, or HCI, but you're a principal researcher and a research manager of a relatively new group in Microsoft Research. Tell us what it is and what gets you up in the morning.
The Ability team is a new group that's focused on harnessing research at the forefront of HCI,
human-computer interaction, and also AI, in order to augment the capabilities and extend the
abilities of people everywhere, and in particular, the approximately 20% of the population that has
some form of disability. So let's drill in there a minute. You have referred
to some pretty surprising statistics. Give us a picture of the ability spectrum across differing levels, categories, and even durations of disabilities. So the most recent U.S. Census
indicates that about one in five Americans experience some form of disability in their
daily lives. And statistics from the World Health Organization indicate that about one billion
people worldwide are disabled in some way. And disability refers to a spectrum of different
conditions that include sensory disabilities, such as challenges with vision and hearing.
It includes motor disabilities; it can include speech and language disabilities, as well as
cognitive disabilities, such as learning challenges, autism, attention challenges, etc.
And then, of course, in our own research, I think it's really motivating to think about not only how our research advances can specifically assist those disability constituencies, but actually how advances in research for people with disabilities tend to benefit the usability of technology for everyone, because all of us at some point in our lives experience temporary or situational disabilities.
A temporary disability in the motor space might be if someone breaks their leg. So they might be
using crutches or a wheelchair for a short period of time and of course would then benefit
from technologies that assist people who have more lifelong motor disabling conditions. But even situational impairments that are much more fleeting affect all of us.
If someone is pushing a grocery cart or pushing a baby stroller,
just in that moment, they also, in a sense, have a motor disability.
You know, that's interesting, because I don't think most of us think about disability in terms of it being temporary, and that all of us at some point experience some kind of disability. So I think it's interesting to broaden the vision of what that means and how technology can help us there. So talk a bit about how your technologies are
aiming to help us. So the Ability Group has projects focusing in several different
areas. In the cognitive space, we've been doing some work on how to make search engines easier
to use for people with dyslexia. So dyslexia is a spectrum disorder. It impacts about 15 to 20%
of English language speakers. The incidence rate differs across languages, but 15 to 20 percent of the population is quite substantial.
And people with dyslexia experience a range of challenges with spelling and reading comprehension.
And of course, search engine use to find information online is, I would argue, one of the most important modern literacy skills today.
And so we've been doing some basic studies to understand, first, what are the challenges
that people with dyslexia experience at different stages of web search? So choosing their query
terms, triaging the information on the search engine results page, actually finding specific nuggets
of information within a web page once they've arrived at it, and then thinking about how we
might change both the ranking algorithms for search results themselves, as well as the interface for
presenting those results to make it easier to use for that group. And I think that that's also
another really interesting example of where thinking about a particular disability, such as dyslexia, might yield a user interface that impacts a much larger group of people.
Children who are still just learning to read or people learning English as a second language might also benefit from many of these same interventions that would benefit users with dyslexia.
Would you look at those as applied to a broad audience? I mean,
if you're using algorithms that are on your search engine, everyone's going to experience them,
right? From the person who's an early reader to someone who's a PhD. Well, I think that's still a design decision to be made, about whether you would want these interface designs to be
applied generally across all users of the service
or whether it might be another form of personalization. So today, major search engines
generally apply different types of personalization. So if you and I both were to search for
hamburger restaurant and I was searching from Seattle, according to the geolocation of my
browser, and you were searching from London, we would actually see different search results
because the results are actually being tailored to something about us, in this case, our location.
So you could imagine situations where search results might be tailored to someone's cognitive
profile. Of course, that also raises potential privacy concerns.
Yeah, we'll talk about that in a bit.
It's definitely something we're aware of. And so I think there are interesting trade-offs in how you might personalize or generalize these kinds of interface designs.
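To make that trade-off concrete, here is a minimal sketch, assuming a hypothetical opt-in profile flag and a precomputed per-page readability score. It is not the team's actual ranking algorithm, just an illustration of how the same results could be ordered differently for different users, the same way geolocation reorders restaurant results:

```python
# A minimal sketch (hypothetical, not a shipping ranker): re-rank search
# results for users who have opted in to a "dyslexia-friendly" profile.
# The pattern mirrors location personalization: the same query yields a
# different ordering depending on a per-user signal.

from dataclasses import dataclass

@dataclass
class Result:
    url: str
    relevance: float      # score from the base ranker, 0..1
    readability: float    # estimated ease of reading, 0..1 (assumed precomputed)

def rerank(results, profile):
    """Blend base relevance with readability when the user opts in."""
    if profile.get("prefers_readable_pages"):
        w = profile.get("readability_weight", 0.3)  # assumed tunable weight
        key = lambda r: (1 - w) * r.relevance + w * r.readability
    else:
        key = lambda r: r.relevance
    return sorted(results, key=key, reverse=True)

# The same results order differently under different profiles.
results = [
    Result("dense-journal-article.example", relevance=0.9, readability=0.2),
    Result("plain-language-guide.example", relevance=0.8, readability=0.9),
]
print([r.url for r in rerank(results, {})])
print([r.url for r in rerank(results, {"prefers_readable_pages": True})])
```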
There's another group within MSR, under the umbrella of human-computer interaction, called Enable, that you collaborate with quite often. Why two groups in sort of the same
zone? And how would you characterize the research going on in each group?
Absolutely. So the Ability team and the Enable team work together closely.
The Ability group's mission is in more foundational academic work, more fundamental scientific research that then lays the basis for advanced engineering efforts and shipping more public-facing products by the Enable team. So for example, the Enable
team shipped the hands-free keyboard, which was inspired by their collaborations with Steve Gleason
and Team Gleason, which is a foundation for people with ALS or Lou Gehrig's disease. And so the
hands-free keyboard is an improvement on eye gaze interaction for typing and then speech synthesis.
And so the Enable team led the engineering efforts on that product.
But behind the scenes, they were collaborating with myself and other members of the Ability team.
And we did more underlying academic research that we shared broadly with the community that influenced that product. We did research to understand what are some of the limitations of eye trackers for different users and different lighting
conditions and how might interfaces adapt to those changes. We looked at new algorithms for
mitigating error in eye typing that can now let users eye type more quickly because they have to spend less time correcting
errors. And we also did some fundamental research about what end users want from an AAC device in
terms of how these communication devices can let users more authentically express their personality.
So we were out on the ground doing interviews with people with ALS to learn about their current technology needs. For
example, how these devices currently don't let people express humor, sarcasm, artistic expression,
you know, wanting to be able to sing through your device. So we document those needs. And then we
also propose innovations in terms of how these devices can express richer prosody and other
channels of feedback that might support these needs in the future. So that's how the Enable
team and the Ability team work together towards shared goals. So back to the accessibility
technology and the terminology, when we think of disabilities, we most often think of sensory and
motor because they're the most obvious ones.
And you referred to some of the work you're doing with search engines for cognitive challenges
as well.
Is there anything else happening in the cognitive range that is interesting in this research?
So another area that our team has been working on is technologies for people on the autism
spectrum.
So Microsoft is, I think, a leader in beginning to think more actively about this space, particularly
through our inclusive hiring program.
So Microsoft is now part of a consortium of companies that are actively working to change interview, hiring, and retention practices to better include neurodiverse talent, but also to think about how
our software engineering tools and practices as a tech company might need to be altered and updated
to better support neurodiverse talent. That's the first time I've heard that phrase.
Neurodiversity. Well, I think there's some interesting thinking in the accessibility
space in the area of disability studies that's really changed over the past few years. So in the past, there was what we call a medical model of disability. And this model proposes that people with disabilities are ill and need to be fixed and that medical intervention is the focus. And more recently, thinking in that space has really evolved
toward a more social model of disability.
And so the social model of disability emphasizes that many people with disabilities
are disabled not by their own state of being,
but by the way that society treats them.
The onus is really on society to adapt and be more inclusive.
And the neurodiversity movement is one
manifestation of that. So I think that philosophy of the social model rather than the medical model
of disability is something that the ability and enable teams really embrace in our thinking about
technology. Again, back to when we think of disability, we often think of sensory and motor
and so on. And we also think of solutions that are meaningful and important. And so thinking of
something like social media and how teens like to participate and be part of the group and have
their friends share pictures and things like that. Tell me a little about what you're doing for the limited-vision population in the teen group.
Yes. So this past summer, we did some really interesting
research led by our intern, Cindy Bennett, who's a student nearby at the University of Washington,
to understand how teenagers with vision impairment are using photography and social media. And we included
in that study teens with a range of visual impairment. So all of the teens who participated
were legally blind; some of them completely lacked vision, but most would be classified as having low vision and relied to some degree on their residual vision. Both of these groups of
teenagers, whether they completely lacked vision or had some residual vision, were very interested
in using visual social media tools, just as any teen would be, in order to communicate and stay
in touch with their friends and participate in pop culture. So applications like Facebook, Instagram, and Snapchat were all part of these teens' daily lives, and observing the usability challenges that this group experienced was really interesting
for thinking about how we can redesign these kinds of photography capture and editing applications
to make them more accessible. It's really important to all teenagers to be able to take flattering selfies. So this was particularly challenging for people with low vision because
in order to take a selfie, you need to hold the camera at arm's length away from your face.
That made it impossible for this group to see on the screen whether they were framed in a
flattering way on the camera. Or, even more challenging, if they were using some apps that try to overlay different lenses on you while taking the selfie, they couldn't tell what those looked like or if they were aligned.
So thinking about how you might have better audio feedback in selfie cameras to give people
information to help them position.
Like move to the left or focus your face.
Right, or you're completely in the frame now or you're out of the frame,
would be really helpful to that audience.
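As a rough sketch of what such audio feedback might look like, assuming normalized face-box coordinates from any off-the-shelf face detector; the direction words are illustrative, and a real selfie app would also account for front-camera mirroring:

```python
# A minimal sketch (hypothetical, not a shipped feature): turning a face
# bounding box into spoken framing hints for a selfie camera.
# face_box = (left, top, right, bottom), normalized to [0, 1].

def framing_hint(face_box, margin=0.15):
    """Return a short spoken hint describing how to reframe the face."""
    if face_box is None:
        return "No face detected. Hold the phone at arm's length."
    left, top, right, bottom = face_box
    hints = []
    if left < margin:
        hints.append("move the camera left")    # face cut off at left edge
    if right > 1 - margin:
        hints.append("move the camera right")
    if top < margin:
        hints.append("tilt the camera up")
    if bottom > 1 - margin:
        hints.append("tilt the camera down")
    if not hints:
        return "You're completely in the frame."
    return "Almost there: " + " and ".join(hints) + "."

print(framing_hint((0.05, 0.2, 0.5, 0.8)))  # face near the left edge
print(framing_hint((0.3, 0.2, 0.7, 0.8)))   # well framed
```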
Another thing we saw the teens doing that I was really surprised by
was they used filters in an interesting way.
What we saw was that teenagers with low vision,
in addition to using filters for artistic purposes to kind of
fit in with their friends, they were also using filters to actually improve the visual clarity
of images for themselves. So people would go through and choose a filter because it best
matched their visual abilities to increase their own perception of a photo. And they would even do things like download other people's photos so that then they could filter them in order to see them better.
And I think that suggests a lot of, again, potential for technological innovation in
terms of actually designing filters specifically for different visual conditions that might be
optimized to improve visual clarity for particular subsets of users
rather than only designing filters for aesthetic purposes.
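Here is a minimal sketch of that idea, assuming the Pillow imaging library is installed; the enhancement values are illustrative placeholders, not clinically validated settings for any particular visual condition:

```python
# A minimal sketch: a "clarity" filter tuned for a low-vision profile
# rather than for aesthetics, using Pillow (pip install Pillow).

from PIL import Image, ImageEnhance

def clarity_filter(image, contrast=1.6, sharpness=2.0, color=1.3):
    """Boost contrast, edge sharpness, and color saturation."""
    image = ImageEnhance.Contrast(image).enhance(contrast)
    image = ImageEnhance.Sharpness(image).enhance(sharpness)
    return ImageEnhance.Color(image).enhance(color)

# Usage: filter a downloaded photo to make it easier to perceive.
photo = Image.open("friends_photo.jpg")          # hypothetical file
clarity_filter(photo).save("friends_photo_clear.jpg")
```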
Is that something that your research team is looking at and studying?
It's on our list of things to tackle going forward. Yeah, so we have, I think, an interesting pipeline in our research. Human-computer interaction, more broadly, encompasses several different research methods.
So it includes research to understand users' needs.
So this is things like ethnography and interviews and surveys and observations to understand
what users really want and need in technology.
So our team does a great deal of that kind of work.
And then the next layer is to build on that, right? Now that we understand the needs in technology and the potential for innovation, we actually build new prototypes, new algorithms,
new interfaces that enable these kinds of expression that we see a need for. And that's
also, again, something that our team does.
So we build prototypes and then actually test out these prototypes with users to understand what
works and what doesn't and refine them. And then, of course, the next stage is actually
building on that knowledge to engineer real systems that might be shippable to an end user.
So it sounds like you use the full spectrum of research methodology
from qualitative to quantitative and then actual lab testing of this stuff
and then getting it out into the world.
Yeah, so our team has people with backgrounds in different areas.
We have one researcher with a background in psychology,
others like myself with backgrounds in computer science.
So we really have different kinds of skills on the team.
I think it's important to point out now that your team conducts research at, as you call it, the intersection of HCI and AI.
And the AI part is super important. Without it, the research would look a lot different, I think.
So tell us how AI is changing the game for innovation in your field, in ability, enabling, and human-computer interaction? Absolutely. I think there are a lot of AI advances
that are going to enable new experiences for people with disabilities. But I think it's also
vitally important that the needs of that user population be considered by people involved in
the AI research directly. There's been an explosion recently, thanks to advances in deep learning, in the ability for computer vision systems to label photographs. And that's wonderful. And of course, to me, an obvious application for a system that can automatically label photographs is to caption images for people who are visually impaired.
Because when browsing the web right now,
a screen reader can only describe an image if the author of a web page has supplied an alt text for
it. That's right. And about 50% of images on the web right now on major websites lack alt text
completely. So a blind user wouldn't receive any description. So one might think,
well, we'll use these new AI technologies to caption the images. But there are some challenges
with that. So most of these technologies are not developed right now with the scenario of use by
people who are visually impaired. So there's the assumption that if a mistake is made in labeling
the image, the cost of that mistake is relatively low because
a sighted user can see that a particular image maybe doesn't match the retrieval terms and just
ignore that mistake. But for someone who's visually impaired, the cost can actually be
quite high. And that, I think, is a more fundamental problem in AI research, one that I think our perspective from HCI and working with end users can really inform through collaboration.
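One way to act on that asymmetry, sketched minimally here against a hypothetical captioning model's output (this is not Microsoft's captioning system), is to hedge or suppress captions when model confidence is low:

```python
# A minimal sketch: generate alt text only when the model is confident,
# and hedge the wording otherwise, because a blind user cannot visually
# double-check a wrong caption the way a sighted user can.

def alt_text(caption: str, confidence: float) -> str:
    """Turn a (caption, confidence) pair from any captioning model
    into screen-reader-friendly alt text. Thresholds are assumed."""
    if confidence >= 0.9:
        return caption
    if confidence >= 0.5:
        # Signal uncertainty instead of asserting a possibly wrong label.
        return f"Image, possibly of {caption}."
    return "Image; no reliable description available."

print(alt_text("a dog catching a frisbee", 0.95))
print(alt_text("a dog catching a frisbee", 0.6))
print(alt_text("a dog catching a frisbee", 0.2))
```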
Right. Let's talk about collaboration and circle back to the inception of the Enable group. It began as a partnership with Team Gleason.
I'm not sure everyone knows who Steve Gleason is and what the collaboration started as and how it's advanced
and grown. Sure. So the Enable team, which is led by Rico Malvar, who's the chief scientist of
Microsoft Research, they grew out of a partnership with Team Gleason as part of Microsoft's one-week
hackathon. And so Steve Gleason is a very well-known football player, and he was diagnosed
at a relatively young age with ALS, which is also known as Lou Gehrig's disease. And as people with
ALS progress and they lose muscle control, they need to rely on other tools such as wheelchairs
for mobility and AAC devices for communication and speech generation.
And Steve was not satisfied with the state of current technologies in that area.
Unfortunately, there's no medical cure right now for ALS.
And Steve is very well known for a quote that I'll try to reproduce accurately here.
So his approximate quote is that while there is no medical cure for ALS, technology can be the cure.
And so by that, he means that, in this social model of disability, technology plays an important role in removing barriers to access and making people less disabled in their daily interactions.
And so his team has worked closely with Microsoft on two particular projects, which were the initial projects of the Enable team. One is allowing someone to use eye gaze to drive their own wheelchair to have more autonomy. And then the other is in improved interfaces for typing with eye gaze. So typically the way people with ALS and other serious motor conditions
communicate is through using the eyes to type. So you would stare with your eyes at a particular
letter on the screen with an eye tracker for a set amount of time until the system's confident
that you're looking at that letter. And then that letter would be typed and you do that one at a
time. And then, when you're done typing something, you would hit a button that would speak that out in a
computer generated voice. And typically people achieve typing rates that are very slow,
about five to 10 words a minute, whereas conversational English speech is closer to
190 words a minute. So that is a huge impediment to participation in daily life when you're
communicating that slowly. So just letting people type letter by letter is not going to get you up
to regular rates of speech. You need prediction. And that's where AI comes in again. And so the
Enable team has been working on better user interfaces and better algorithms behind the scenes and word prediction in order to improve that kind of speech.
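A minimal sketch of the dwell-based selection at the heart of eye typing, with an assumed dwell threshold; real systems tune this per user and layer word prediction on top. This is illustrative, not the Enable team's implementation:

```python
# A minimal sketch of dwell-based eye typing: a key is "pressed" once
# gaze has rested on it for a set dwell time.

DWELL_SECONDS = 0.8  # assumed threshold; real systems tune this per user

def update(gaze_key, now, state):
    """Call on every gaze sample; returns a key when dwell completes."""
    if gaze_key != state.get("key"):
        state["key"], state["since"] = gaze_key, now   # gaze moved
        return None
    if now - state["since"] >= DWELL_SECONDS:
        state["since"] = now                           # reset for next press
        return gaze_key
    return None

# Simulated gaze samples at 0.2 s intervals dwelling on "H".
state = {}
for t, key in enumerate(["H", "H", "H", "H", "H"]):
    typed = update(key, t * 0.2, state)
    if typed:
        print("typed:", typed)   # fires once 0.8 s of dwell accumulates
```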
And of course, it's not just the words.
So another thing we've been thinking about is, again, how you let people express themselves more quickly and more richly.
And so, like they say, you know, a picture is worth a thousand words.
We've looked at adding a row of keys to the communication devices that are different emoji. So representing
the most common human emotions, anger, surprise, sadness, happiness. And if you just add one of
these emoji to your sentence as punctuation, it's just one more key press, so low input from the user, since each input is such a great effort. And we use those to
not only modify the prosody of the output behind the scenes, but we insert clips of non-speech
audio. So for example, if you add a surprised emoji to a sentence, you might get a surprised sound inserted. And if you add the angry emoji, you might get an angry one. And these communicate such a great amount of emotion and nuance for such a low effort, a single key press.
And that's been very well received by users in some of our research testing.
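A minimal sketch of the emoji-key idea, assuming a speech synthesizer that accepts standard SSML; the prosody values and audio file names are hypothetical, not the team's actual AAC software:

```python
# A minimal sketch: map a single emoji key press to an SSML prosody
# tweak plus a non-speech audio cue, so one extra input enriches the
# whole utterance.

EMOJI_STYLES = {
    "😠": {"pitch": "-10%", "rate": "110%", "cue": "angry_grunt.wav"},
    "😲": {"pitch": "+20%", "rate": "90%",  "cue": "gasp.wav"},
    "😢": {"pitch": "-5%",  "rate": "80%",  "cue": "sigh.wav"},
}

def to_ssml(sentence: str, emoji: str) -> str:
    """Wrap a typed sentence in SSML reflecting the chosen emoji."""
    style = EMOJI_STYLES.get(emoji)
    if style is None:
        return f"<speak>{sentence}</speak>"
    return (
        "<speak>"
        f'<audio src="{style["cue"]}"/>'   # non-speech cue before speech
        f'<prosody pitch="{style["pitch"]}" rate="{style["rate"]}">'
        f"{sentence}</prosody></speak>"
    )

print(to_ssml("You ate my sandwich", "😠"))
```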
You talked about what you do with universities and so on, and I know you're affiliated with the University of Washington, and you're involved in a program called DUB. Can you tell us what that
is? Sure. So DUB is a little bit of a play on words because, of course, the University of
Washington is abbreviated as U-Dub, but the DUB research group, it's spelled D-U-B, and it stands for
Design, Use, Build. And so it's an interdisciplinary consortium of faculty from several different
departments, so computer science, the School of Information, human-centered design and engineering,
arts and design, that are all interested in creating technology that is more human-centric
for end users. And myself and several other researchers from Microsoft Research are actively
involved in collaborations with DUB. So how did you end up here? Give us a little bit
about your background and what brought you to Microsoft Research? Sure. So my background is that I studied computer science
as an undergraduate, and I went to Brown University, which at the time had a very
traditional computer science department, so there was no HCI. And I learned about HCI by browsing
the web. I found the website of the Stanford Interactive Workspaces Project, which was an early project
around ubiquitous computing. So thinking about how we can design spaces where computing is embedded
in the environment. And I remember that webpage had a picture of a room that at the time looked
very futuristic to me, where there were these large wall-sized displays and a table with a
display embedded in it. And it was really very
futuristic looking. So I contacted the professor in charge of that project at Stanford, who's Terry
Winograd, to ask how I could get involved in this kind of work. And I ended up doing some volunteer
based research projects with him. And I eventually went to graduate school in the computer science
department there. And that was kind of my route into learning more about this field of HCI. And while I was at Stanford, I was fortunate to get an internship with Microsoft Research, where I worked with Eric Horvitz and Susan Dumais on thinking about how desktop search, which was very new at the time, how you could allow people
to use context that they might remember to assist in search. So instead of only searching for
something by the file name, you could specify other things that you might remember. Like,
oh, I remember it was that PowerPoint document that I wrote the day after the presidential
election.
Or, oh, it was that email someone sent me right after my son's birthday party.
And that you could use these anchors that were more meaningful and memorable to people
than file names as a way into search for information.
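A minimal sketch of that kind of context-anchored search over a toy file index, illustrative of the idea rather than the actual prototype:

```python
# A minimal sketch: filter files by remembered context -- file type plus
# a date anchored to a memorable event -- instead of by file name.

from datetime import date, timedelta

files = [  # (name, kind, modified) -- toy stand-in for a file index
    ("q3_review.pptx", "powerpoint", date(2004, 11, 3)),
    ("budget.xlsx", "excel", date(2004, 11, 3)),
    ("talk_draft.pptx", "powerpoint", date(2004, 6, 1)),
]

def search_by_context(kind, anchor_date, within_days=2):
    """Find files of a given kind modified near a remembered event."""
    window = timedelta(days=within_days)
    return [name for name, k, modified in files
            if k == kind and abs(modified - anchor_date) <= window]

election_day = date(2004, 11, 2)  # hypothetical anchor event
# "that PowerPoint document I wrote the day after the election"
print(search_by_context("powerpoint", election_day + timedelta(days=1)))
```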
And so that was a really exciting project that kind of expanded my knowledge and
got me into thinking about information retrieval. But I also learned about the culture of Microsoft
Research, and then I was excited to come and work here when I graduated.
Given the insights that you get from the work you're doing, which is both powerfully beneficial
for good, but could also have some downsides, is there anything that sort of keeps you up at night? I think one is in thinking about AI systems that are more
inspectable and understandable, not only inspectable by AI researchers, which is in and
of itself still a challenge, but inspectable by end users so that they can really understand when to place their trust in a system and what
a system is doing. I think there are challenges in developing AI systems that balance augmenting
users' capabilities with the privacy needs of the users themselves and people in the surrounding
environment. So let's say one might imagine
hypothetically that someone who's visually impaired might benefit from having an outward facing
smart camera that's a wearable that could sense things in the environment using computer vision
and describe that to someone. So that could potentially have a great benefit to the end user,
but might have a privacy cost to people in
the surrounding environment who may not be actively consenting to being captured. And I think thinking
carefully about those kinds of ethical challenges is actually a really important part of AI research.
And I know Microsoft has now formed a group called Aether, A-E-T-H-E-R, which is thinking specifically about AI and ethics.
So the A is for AI, the E is for ethics.
And I think that's one of the areas that will be really interesting to tackle.
Microsoft just announced a new initiative at the annual Build Conference.
That's pretty exciting. Tell us what it is, who it will impact,
and what it tells us about Microsoft's broader mission
for technology in the 21st century.
Yes, so you're referring to the AI for Accessibility Initiative.
So this is an exciting new program
that is designed to help support grassroots innovation
in the accessibility space.
So Microsoft is really interested in encouraging students, entrepreneurs, etc.
to think about how you can use Microsoft technologies to enable important scenarios for people with
disabilities.
So, for example, how can we use these technologies to allow people to be more productive in their work life
or participate more fully in social life outside of work?
And so the AI for Accessibility Initiative
offers funding opportunities that people can apply for.
So you describe how your project or app would fit into this vision and how Microsoft can
help support your success in this space, perhaps by donating compute time on our Azure servers
or allowing free API calls to our cognitive services APIs, which offer shortcuts to some of our advances in AI technology
like libraries for computer vision and for natural language processing. So we're really excited about
this initiative. We're excited to see what kinds of great project proposals and applications come
in. And I think it accentuates Microsoft's commitment to its mission statement of empowering all users in their lives.
And I think empowering all users really refers to all users, including the one billion people worldwide with disabilities.
And so it's great to see that being emphasized in this new initiative.
Merrie Ringel Morris, I so enjoyed our conversation today.
My eyes were opened, shall we say.
Thank you for coming in and sharing the work that you're doing and the passion behind it.
Great. Thank you.
To learn more about Dr. Meredith Ringel Morris and how AI is helping people with disabilities all over the world, visit Microsoft.com slash research.