Microsoft Research Podcast - 074 - CHI squared with Dr. Ken Hinckley and Dr. Meredith Ringel Morris
Episode Date: May 1, 2019. If you want to know what's going on in the world of human computer interaction research, or what's new at the CHI Conference on Human Factors in Computing Systems, you should hang out with Dr. Ken Hinckley, a principal researcher and research manager in the EPIC group at Microsoft Research, and Dr. Merrie Ringel Morris, a principal researcher and research manager in the Ability group. Both are prolific HCI researchers who are seeking, from different angles, to augment the capability of technologies and improve the experiences people have with them. On today's podcast, we get to hang out with both Dr. Hinckley and Dr. Morris as they talk about life at the intersection of hardware, software and human potential, discuss how computers can enhance human lives, especially in some of the most marginalized populations, and share their unique approaches to designing and building technologies that really work for people and for society.
Transcript
On today's episode, we're mixing it up, moving the chairs, and adding a mic to bring you the perspectives of not one, but two researchers on the topic of human-computer interaction.
We hope you'll enjoy this first in a recurring series of multi-guest podcasts where we dig into HCI research from more than one angle, offering a broader look at the wide range of ideas being explored within the labs of Microsoft Research.
You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen
Huizenga. If you want to know what's going on in the world of human-computer interaction research,
or what's new at the CHI conference on human factors in computing systems, you should hang
out with Dr. Ken Hinckley, a principal researcher and research manager in the EPIC group of
Microsoft Research, and Dr. Merrie Ringel Morris, a principal researcher and research manager
in the Ability group.
Both are prolific HCI researchers who are seeking, from different
angles, to augment the capability of technologies and improve the experiences people have with them.
On today's podcast, we get to hang out with both Dr. Hinckley and Dr. Morris as they talk about
life at the intersection of hardware, software, and human potential, discuss how computers can
enhance human lives, especially in some of the most marginalized populations,
and share their unique approaches to designing and building technologies
that really work for people and for society.
That and much more on this episode of the Microsoft Research Podcast. So I'm excited to be hosting two guests in the booth today, both working under the
big umbrella of HCI, or Human-Computer Interaction. I'm here with Dr. Ken Hinckley from MSR's
EPIC group, which stands for Extended Perception, Interaction, and Cognition, and Dr. Meredith
Ringel Morris, aka Merrie,
of the Ability Group. Ken and Merrie, welcome to the podcast.
Thank you.
Thank you.
So I've just given our listeners about the shortest possible description of the work you do.
So before we dive deeper into your work, let's start with a more personal take on what gets
you up in the morning. Tell us in broad strokes about the research you do, why you do it,
and where you situate it. Merrie, why don't you give us a start here? Sure. One of the reasons that I'm
excited about working in the space of HCI every day is that I think computers are cool, but what
I think is most interesting about computers is how people use computers and how computers can help
people and how computers can enhance our
lives and capabilities. And so that's why I think HCI is a really important area to be working in.
When I did my undergraduate degree in computer science at Brown, they actually didn't
have any faculty in HCI. And the first time I heard of HCI was when Bill Buxton, who is actually
a researcher now at Microsoft, came and gave a guest lecture in my computer graphics course. And so we had that one lecture
where Bill talked about HCI. And when I heard that, I was like, wow, this is the most interesting
thing I've heard about in all my courses. So then I found a way to work on that topic area
more in graduate school. Ken, what about you? What gets you up in the morning?
Sure.
Well, I guess the first answer is my dog and my kids, often earlier than I would prefer.
But beyond that, I was interested in technology at a young age, but it was not for the sake
of computers themselves, but like, what is this technology actually good for?
And I remember going through a phase early, maybe sophomore year in college, I was very
interested in artificial intelligence and so forth.
But I started thinking about it.
I was like, well, why would you actually want to do that?
Like, why would you want to make a computer that's like a human?
It's really more about the humans and how can we use technology to augment
their thinking and their creativity to do amazing things.
So that's what excites me about being at Microsoft Research
and having research that's getting out in terms of devices and experiences
where people can do amazing things with the tools and the techniques that we're building.
I want to set the stage for the research we'll be talking about by getting a better understanding
of the research groups your work comes out of. Ken, you work in the EPIC group. We referred to
what that stands for. And you called your interdisciplinary team a pastiche of different expertise. Since
we're using French words, tell us more about EPIC and its raison d'être.
Yeah, I should have been cautious with my use of French words. I did study French in high school,
but I've forgotten all of it. And my accent is terrible. But the core sort of mission of EPIC
is to innovate at the nexus of hardware, software, and human potential.
So there's really sort of three pillars there. There's the hardware, so we do sensors and devices and kind of new form factors. We explore that like in terms of new technologies coming down the pike.
We also look in terms of the software, like what can we do in terms of new experiences?
And actually, Merrie mentioned Bill Buxton earlier. He's actually one of the people on the team. And
so we think a lot about what is happening at this, you know, as new technologies come forward, like how do they complement one another? How do they work together?
How do we have not just a single experience, but experience that sort of flows across multiple
devices and everything, all the technology you're touching in your day. It's still this incredibly
fragmented world that we're trying to figure out interesting ways that we can sense or provide
experiences that give you this flow across all the technology you're touching in your day.
And then on sort of the human side, we also study the foundations of
perception and motor behavior and like these very fine grain things about how people actually
perceive the world. And it turns out it's, you know, there's some interesting things that go
on there in terms of how you form sensory impressions when you're hearing something
and seeing something and feeling something all at the same time, or maybe not in a virtual
environment, for example. So there's tricks you can play sometimes. So we sort of explore that
as well. I want to drill in just a little before we go on to Merrie. I've looked at the scope of your
team. It's a really diverse group of people in terms of where they come from academically and
in their research interests. Talk a little bit about the kinds of people on your team and that pastiche
that you referred to. Sure. So we have people, you know, people like Bill Buxton, who come more from
the design world. He started in electronic music, right? So that was his motivation for technology
back in the 1970s. We have people like Mar Gonzalez-Franco, who comes from a neuroscience
background and gives us some of that, you know, real interesting insight in terms of how the human
brain works. We have people from computer vision, like E Ofec, who's leading a lot of work in virtual environments and haptic perceptions.
So, I mean, it really kind of spans all of those areas.
We also have people from information and visualization, people from multiple countries, you know, Korea, France, Switzerland.
So it really is like kind of people from all over the world.
And we really value that diversity of perspective that it brings in terms of what we bring to our work.
Right. So Merrie, last May, we talked about what was then the newly minted Ability Group on this podcast. It's episode 25 in the archives. But as it's been about a year now since that came out, let's review. Tell us what's been going on over the last 12 months and what the Ability Group is up to now.
Absolutely. So as you mentioned, we've already talked about the Ability team on another podcast,
and our primary mission is to create technologies that augment the capabilities of people with
disabilities, whether that's a long-term disability, a temporary disability, or even a
situational disability. And we have a lot of new projects
that we've started in the past year. A big one is the Ability Initiative, which is a collaboration
that spans not only many groups within Microsoft Research, but also externally to a partnership
with UT Austin. And this project is focused on how we can develop improved AI systems that can combine computer vision and
natural language generation to automatically caption photographs to be described to people
who are blind or in the situational disability context, maybe someone who's sighted but is using
an audio-only interface to their computer, like a smart speaker. And so we're trying to tackle problems like how do we include additional kinds of detail
in these captions that are missing from today's systems?
And also the really important problem of how do we not only reduce error, but how do we
convey that error to the end user in a meaningful and understandable way?
So that's a really big new initiative that we've been working on. Another big initiative that we've just started up
with some new personnel is starting to think about sign language technologies.
So there's a new postdoc, Danielle Bragg, who joined the Microsoft Research New England lab,
and her background and
experience is in sign language technologies like sign language dictionary development for people
who are deaf and hard of hearing. So actually together with Danielle and also Microsoft's new
AI for Accessibility funding program, we organized a workshop here at Microsoft Research this winter that brought together experts in deaf
studies, sign language linguistics, computer vision, machine translation, to have a summit
to discuss kind of the state of the art in sign language technologies.
What are the challenge areas?
What is a plan of action going forward?
How can the Microsoft AI for Accessibility Initiative accelerate the state of the art in this area?
So that's a really exciting new area of work that we're interested in.
Give me an example of where sign language technology plays.
I'm actually trying to envision how it would manifest for a person who has a hearing disability.
Right now, the state of the art is very far from realizing automatic translation between English and sign language.
It's a very complicated computer vision problem and a very complicated machine translation problem because, since sign language doesn't have a written form, there are no existing paired corpora. So for example, you know, many
famous works of literature or newspaper articles, you know, exist online in say both English and
French, since we're talking about French today. And so that gives you data from which you can
begin to learn a translation. But since there's no written data set of American Sign Language, that becomes very difficult to do.
And so how you might generate training data is one of the challenges that we discussed at the workshop.
But if you could have a computer vision-based system, a machine translation system that could recognize sign language and provide English captioning, that could be an important communication aid for many people. And then, of course, vice versa in the other direction. If you could go
from English to generate a realistic animated avatar that signed that same content, that would
actually be really important for many people because many people who speak sign language,
sign language is their primary language. English is a second language for them. And so their English literacy skills are often lower than people who learned English
as a first language. So just English captions on videos or English language content on the web
is often not very accessible to many people who speak sign language. And signing avatar
translations of that content would open up web accessibility. But developing these avatars
to sign in a realistic fashion is difficult, because signing involves not only the hands, but facial expressions. And
so actually generating that nuance in an avatar is a very challenging research problem.
I've had a sizable number of your colleagues on the show, and all of them have stressed the importance of
understanding humans before designing technology for them. And to me, this makes sense. You're in
human-computer interaction, human being the first word there. But it's remarkable how often it seems
like nobody asked when you're using a technology. So give us your take on this user-first or
user-centric approach that comes out of this umbrella of HCI, but particularly within your groups.
Why it's important to get it right and how you go about that.
Ken, why don't you start?
Sure.
I do subscribe to the user-centric approach to interaction design, because you really have to understand people before you can design technology that bridges this impedance mismatch between what you can do with technology and sort of the level at
which people operate and think about concepts.
However, in my own research, I often take sort of a, I call it a pincer maneuver strategy,
right?
So you think about, you know, what are the fundamental human capabilities?
How do people perceive the world?
How do you interact socially with other people?
But then you can sort of couple it with, you see things coming down the road in terms of,
oh, there's these new technologies coming out, there's new sensors
coming down the pike. And so what I often try to do is I try to match these trends that I see
converging. And when you can find something where, you know, say a new sensor technology
meets some need that people have in terms of how they're interacting with the devices,
and you can sort of change the equation in terms of making that more natural or
making the technology sink into the background so you don't even
have to pay attention to it, but it just seems like it does the right thing.
I like to do that.
So I really do play both sides of the fence there where I study people and I try to understand
as deeply as possible what's going on there, but I also study the technologies.
So sometimes I do have work where it's more technically motivated first, so it's a bit
contrary in that sense, but it always ends up meeting like these real world problems and real world abilities that
people have and trying to make technology as transparent as possible.
Merrie, how about you?
I can give actually a great example from the sign language workshop that we were just talking about
of the importance of this user-centered design. So one of the important components of this workshop
was having many people who are themselves deaf, who are sign language communicators, attend and participate in this ideation session.
And one of the themes that came up several times that speaks to the importance of user-centered design was the example of sign language gloves. There have been a few examples of very well-intentioned technologists who are not themselves signers
and who didn't necessarily follow a user-centered process, who have invented and reinvented gloves
that someone would wear that could measure their finger movements and then produce a
translation into English of certain signs. But of course, this approach doesn't take into account
the fact that sign involves many aspects of the body besides just the hands,
for example. So that's one pitfall. And then, of course, also from a more sociocultural perspective,
another pitfall of that approach is it thinks about sign language technology from the perspective
of someone who's hearing. So it's placing the burden of wearing this very awkward glove
technology on the person who is deaf. Whereas, for example,
maybe a different perspective would be people who are hearing should wear augmented reality
glasses that would show them the captions and the people who are deaf wouldn't be burdened by
wearing any additional technology. And so I think that kind of perspective also is something that
you can only gain by a really inclusive design process that maybe goes beyond just involving people as interview or user study participants,
but also actually having a participatory design process that involves people from the target communities
directly as co-creators of technology.
Yeah. Ken, did you have something to add?
Yeah, so maybe one more thing I'd add in terms of my own perspective.
So I sort of mentioned how it's really important to understand people and sort of what goes
on.
But I think one of the interesting or unique attributes of how we're trained here at Microsoft Research, in terms of understanding people, is that we sort of become these acute observers of the seen but the unnoticed.
So there's lots of these sort of things that we just take for granted in terms of interpersonal
interactions. Like if you're at a dinner party and you're talking to a group of people, like, well, you're probably forming a small circle and probably there's five or less of you in the circle, right?
And there's a certain distance that you stand and you're facing each other in certain characteristic ways.
None of this is something that we ever notice.
Another example I like to use from my own work and sort of working on tablets and pen and touch interactions is actually just looking at how people use everyday implements with their hands. And in my talks I often ask people like, okay,
well, which hand do you write with? And of course, you know, 75% of the audience will raise their
right hand because they're right-handed. And of course I'll say, well, actually you're wrong
because you use both hands to write, because first you've got to grab the piece of paper and then you
orient it with your left hand and then the right hand does the writing. So there's examples like
that in terms of what people actually do with their behavior
that because we just take them for granted
and it's just part of our everyday experience,
you don't actually notice them.
But as a technologist, you have to notice them and say,
oh, if we could actually design our interactions
to be more like that, it'd be natural that technology
would be transparent. Well, let's talk about research now. Ken, I'll start with you, and maybe the best approach
is to talk about the research problems you're tackling around devices and form factors and
modalities of interaction, and how that research is playing out in academic paper and project form,
especially with some of the new work you're doing with posture-aware interfaces. So
talk kind of big picture about some of the projects and papers that you're doing.
Yeah. So in terms of the global problem we're trying to solve, we've been thinking for a long
time about these behaviors that are seen but unnoticed. Everyone goes through the day using
maybe their mobile device, or if you have a tablet, you interact with that in certain ways. And there's always sort of these little irritations
that maybe you don't really notice them, but over time they build up. And using technology,
we can mediate some of these. So to go back, you know, actually almost 20 years now, we were looking
at mobile devices, and you had to go through certain settings. You could go into the settings and say, oh, I want it to be in portrait mode now if you're taking some notes. Or maybe you need to look at a chart, and you had to go back into the settings and say, oh, I want it to be in landscape orientation.
And so we started looking at, oh, if we had some sensors
like a little accelerometer that was on the device,
maybe we could actually just sense
which way you're holding the device
and the screen could automatically rotate itself.
So that was something that you could publish
as an award-winning paper in the year 2000.
And now it's sort of an everyday use.
So taking that same perspective now to the modern era of interacting with tablets,
how do you actually understand how the person's using their device and what they're trying to do
with it? So tablets have this interesting attribute where you can use them directly on your desktop.
Maybe you're doing some very focused work. You can kind of be leaning over it and writing with
your pen, maybe marking up some document, you know, or maybe you're just kicked back on your
couch and watching, like, YouTube videos of cats chasing laser pointers, right? So it kind of spans that whole spectrum. But obviously it's particular to the situation, right? So if you're kicking back on your couch, you're probably not trying to mark up your Excel spreadsheet, and likewise, if you're kind of hunched over your desk, you're probably not watching the cat video. So by sensing those contexts and actually adapting the behavior of the tablet, it understands how you're holding it, which hand you're grasping it with. Are you holding it on the bottom? Are you holding it on the left or the right?
Are you holding the pen in your hand? Are you reaching for the screen? By being able to sense
all those attributes, we can actually simplify the experience and sort of bring things that are
useful to you at those moments. Well, and it's interesting that you referred to that toggle
between landscape and portrait mode, which is actually annoying to me. I sometimes lock my
phone because I don't want it to go there.
I want it to stay there.
But anyway, that's a side note.
It's a side note, but it's funny
because we actually noticed that
when we first built that technique
and we kind of knew that was a problem with it.
But actually in our most recent research,
now we can also sense how you're gripping the phone.
And so you can understand that if you lie down and your grip on the device hasn't changed, we know that you didn't intend to reorient it. So we can actually now suppress screen rotation, finally, 20 years later.
That's exactly the problem, because I would lie down and want to read it sideways, and then it would go like that with my head.
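(A minimal illustrative sketch, in Python, of the grip-aware rotation logic Ken describes: follow the accelerometer as classic auto-rotation does, but hold the current orientation when the grip sensing reports that the way you're holding the device hasn't changed. The class, field names, and sensor vocabulary here are hypothetical assumptions for illustration, not the actual implementation from this research.)

```python
# Hypothetical sketch of grip-aware auto-rotation (illustration only).
from dataclasses import dataclass

@dataclass
class SensorFrame:
    orientation: str  # "portrait" or "landscape", derived from the accelerometer
    grip: str         # e.g. "left-edge", "bottom", "two-handed", from grip sensing

class RotationController:
    def __init__(self, initial: SensorFrame):
        self.screen_orientation = initial.orientation
        self.last_grip = initial.grip

    def update(self, frame: SensorFrame) -> str:
        # If the device tilted but the grip is unchanged, the user probably just
        # lay down or leaned back, so keep the current screen orientation.
        if frame.orientation != self.screen_orientation and frame.grip == self.last_grip:
            return self.screen_orientation
        # Otherwise follow the accelerometer, as classic auto-rotation does.
        self.screen_orientation = frame.orientation
        self.last_grip = frame.grip
        return self.screen_orientation

# Usage: lying down with the same two-handed grip keeps portrait; a new grip rotates.
controller = RotationController(SensorFrame("portrait", "two-handed"))
controller.update(SensorFrame("landscape", "two-handed"))  # stays "portrait": grip unchanged
controller.update(SensorFrame("landscape", "left-edge"))   # rotates to "landscape"
```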
Well, Merrie, let's talk about the work you're doing in the accessibility space,
particularly for people with low or no vision. We might expect to see this manifested in real life,
I like to say IRL, or on the web or other screen applications.
That's just sort of how I think about it.
But you've taken it even further to enhance accessibility in emerging virtual reality or VR technologies, which is fascinating to me.
So give us the big picture look at the papers and projects you've been working on.
And then tell us about Seeing VR, which is both the paper and the project, as I understand it.
I think this project initiative around accessible virtual reality is trying to think, OK, virtual reality isn't that commonly in use right now, but in 10 years, it's going to be a big deal.
And we want to think about accessibility from the beginning so that we can design these systems right the first
time instead of trying to have post-hoc accessibility fixes later. So part of the aim of
this work, beyond the specific prototypes like the Canetroller or Seeing VR, is really to start a
conversation and just raise people's awareness and be provocative and have people think, oh,
we need to make VR accessible. So if
people only remember one thing, that's what I want them to remember, separate from the details of the
specific projects. But yes, so last year, we presented at the CHI conference the Canetroller, which was a novel haptic device that allows people who are blind and are white cane users IRL
to transfer their real life skills into the virtual environment
in order to navigate and actually physically walk around a VR environment
with haptic sensations that mimic the sensations they'd get from a white cane.
And then that project was led by our fabulous intern,
Yuhang Zhao from Cornell Tech,
and also with strong contributions from our intern Cindy Bennett
from the University of Washington, who is herself a white cane user.
So she had some really important design insights for that project.
And so Yuhang returned last summer for a second internship, and she wanted to extend the accessibility
experience in VR based on her passion around interfaces for people with low vision.
So low vision affects more than 200 million people worldwide,
and it refers to people who are not fully blind,
but who have a visual condition that can't be corrected by glasses or contact lenses.
And so working together with Yuhang and also several people from Ken's team,
like Eyal Ofek and Mike Sinclair, as well as Andy Wilson and Christian Holz from MSR AI, and also Ed Cutrell from my team. This was really a big
effort. We developed Seeing VR. And so Seeing VR is a toolkit of 14 different options that can be
combined in different ways to make VR more accessible. And we went with this toolkit
approach because low vision encompasses a wide
range of abilities. And so we wanted people to be able to select and combine tools that best met
their own visual needs. And the great thing about seeing VR is that most of these tools can actually
be applied post hoc to any Unity VR application. So Unity is the most popular platform for developing VR. And so even
if the application developer hadn't thought about accessibility beforehand, we can still apply these
tools. And so, for example, the tools do things like increasing the magnification of the VR scene,
changing the contrast, adding edge detection around things so that you can more easily tell
the borders of different objects, being able to point at objects and hear them described out loud to you, special tools for
helping you measure the depth of different objects in case you have challenges with depth perception.
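(As a rough sketch of the toolkit idea Merrie describes: accessibility tools as composable post-processing passes that a user can select and stack over each rendered frame, independently of the original application. Seeing VR itself is built as tools for Unity; the Python below, and the function names and frame representation in it, are hypothetical assumptions for illustration only.)

```python
# Hypothetical sketch of composable, post-hoc accessibility "tools" (illustration only).
from typing import Callable, Dict, List

Frame = Dict[str, object]        # stand-in for a rendered frame and its metadata
Tool = Callable[[Frame], Frame]  # each tool transforms the frame before display

def magnify(factor: float) -> Tool:
    def apply(frame: Frame) -> Frame:
        out = dict(frame)
        out["zoom"] = float(out.get("zoom", 1.0)) * factor
        return out
    return apply

def boost_contrast(amount: float) -> Tool:
    def apply(frame: Frame) -> Frame:
        out = dict(frame)
        out["contrast"] = float(out.get("contrast", 1.0)) + amount
        return out
    return apply

def highlight_edges() -> Tool:
    def apply(frame: Frame) -> Frame:
        out = dict(frame)
        out["edges_highlighted"] = True
        return out
    return apply

def apply_tools(frame: Frame, tools: List[Tool]) -> Frame:
    # Users pick and order the tools that match their own visual needs.
    for tool in tools:
        frame = tool(frame)
    return frame

# One user's configuration: 2x magnification, extra contrast, edge highlighting.
my_tools = [magnify(2.0), boost_contrast(0.3), highlight_edges()]
enhanced = apply_tools({"zoom": 1.0, "contrast": 1.0}, my_tools)
```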
And so we'll be presenting Seeing VR at CHI this year, and we'll not only be presenting the paper,
but we'll also present a demonstration. So people who want to actually come and try on the VR headset and experience the tools directly will be able to do
so. Well, that's a perfect segue into talking about CHI, the conference. It's a big conference
for HCI, maybe the biggest. Yes, it is. And Microsoft Research typically has a really big
presence there. So talk about the conference itself and why it's a big deal. And then give
us an overview of what MSR is bringing to the party in Scotland this year in the form of people,
papers, and presentations. Sure. So in terms of, you know, some of the papers that people on my
own team touched, yeah, we have a couple of the Honorable Mention Awards, which sort of recognizes the top 5% of papers appearing at the conference. So one in particular, we talked about the, you know, sensing of posture awareness in tablets.
Another one that's a really fun effort is using ink
as a way to think with data, right?
So think by inking kind of thing.
So if you have some sort of visualization
that you're looking at in your screen,
what if you could just sort of mark up some data points, simply marking them up as you're thinking about and just sort of glancing at these visualizations?
So then you take sort of these simple marks
that people would do anyway in terms of using very pen
and paper-like behaviors, but now translated
to a digital context on your tablet, for example.
I can just mark something up with my pen,
and then I can use those marks as ways
to actually link the data points back to the underlying data.
Can I actually split my data set just
by drawing a line across it as opposed
to doing some complicated formula? And that's our Active Ink project. So instead of just
having sort of dead ink on paper, you can actually imbue it with this digital life. It's just naturally how people think, but then you can just start going deeper and deeper in terms of how it actually touches live data that's underneath.
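(As a toy illustration of linking simple marks back to the underlying data, here is a minimal Python sketch that splits a set of 2D data points according to which side of a user-drawn line they fall on. This is a hypothetical simplification of the idea, not the Active Ink implementation; the function and names are assumptions for illustration.)

```python
# Hypothetical sketch: split a data set by a line drawn across it (illustration only).
from typing import List, Tuple

Point = Tuple[float, float]

def split_by_stroke(points: List[Point], stroke_start: Point, stroke_end: Point):
    """Partition data points by which side of the drawn line they fall on."""
    (x1, y1), (x2, y2) = stroke_start, stroke_end
    one_side: List[Point] = []
    other_side: List[Point] = []
    for (px, py) in points:
        # The sign of the 2D cross product tells us which side of the line the point is on.
        side = (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)
        (one_side if side > 0 else other_side).append((px, py))
    return one_side, other_side

# A roughly horizontal stroke splits the points into "above" and "below" groups.
above, below = split_by_stroke([(1, 2), (3, -1), (0, 5)], (0, 0), (4, 0))
```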
Go ahead, Merrie.
What I wanted to point out was the paper at CHI from Microsoft Research that I find most exciting this year is one of the best paper award winners, the Guidelines for Human-AI Interaction work that was led by Saleema Amershi, who's a researcher here in the Redmond Lab.
I'm really glad this paper won the best paper award because it has a lot of immediate practical value for people in the HCI and AI communities.
So if you're going to read one paper
from CHI this year, listening audience, I'd suggest you read this one. It has a list of 18 very
concrete, actionable guidelines for developing AI systems that are meant to be used by real people
in a way that's pleasing to the end user. And it has lots of good examples and counter examples of
applying each of these guidelines.
So I think it's a really great read, and I'm glad it was recognized with the award.
Any other highlights that you want to talk about?
I mean, you've got colleagues.
Steven Drucker has a couple of papers, I think, in there, and Cecily Morrison, who's out of the UK lab.
I mean, there's tons of great stuff. And in a sense, it's an embarrassment of riches
because even myself being at Microsoft Research,
I haven't had a chance to look at all this work.
I haven't looked at the HCI AI paper
that Merrie just mentioned.
So now it's like,
I learned something from this podcast too.
I need to go read this.
Yeah, in terms of other areas that we're addressing,
there's actually quite a bit of work
on virtual environments coming from Microsoft Research.
So just looking at different ways
that we can manipulate things in VR
in very unique and sort of crazy and creative ways.
So we have work exploring that.
We also have work exploring in terms of
if you're in a virtual environment,
you actually want to reach out and touch something,
your hand is just an empty space.
So how do you give the illusion
that there's actually objects you can interact with
and give you sort of dexterous ways to manipulate them?
So we've been doing a series of technologies
around how to simulate the grasping motions, for example. So you can actually feel an object and you can squeeze it and it feels
like it has compliance. We have other work looking at personal productivity when you have wearable
devices and just numerous other topics, even in terms of like, oh, can you use the language of
graphic novels as sort of a way to present visualizations to people? So you sort of have
these data comics or data toons, we call them. Just exploring that as sort of another language for interacting with data. So there's
all kinds of great stuff going on. So you mentioned Cecily Morrison. So Cecily and her
collaborators from the UK lab will be presenting a demo at CHI this year of Code Jumper, which I
believe you interviewed her about in another podcast. It was Project Torino then. It's been renamed. But that's a tangible toolkit for children who are blind that they can use to actually learn programming skills themselves.
And so that will be on demo at CHI for people to try out.
I think besides all the papers, another thing to point out is that many people from Microsoft Research are involved in organizing workshops. And so one that I'll call out is there's a workshop.
It's called the Workshop on Hacking Blind Navigation
that discusses the challenges of developing indoor and outdoor navigation technology
for people who are visually impaired.
And Ed Cutrell from the Ability team is one of the co-organizers of that workshop.
But that workshop is going to feature a panel discussion that includes Melanie Neisel,
who is an engineer on the Soundscape team here. And it may also be featuring some research from an intern, Annie
Ross from the University of Washington, who was co-advised between the Soundscape team and the
Ability team. And what Annie was looking at was a very different take on accessible virtual reality.
So she was thinking about designing an audio-only virtual reality experience that would allow a Soundscape user to practice
walking a route in the comfort of their own home or office before they actually walk that route
in the real world. And it's a very challenging design problem. And so she'll be discussing that
at the workshop also. So it's about now that I always ask the infamous what keeps you up at night question.
Your work has direct human impact, perhaps more directly than other research that we might encounter, and often with people who are marginalized when it comes to technology.
Merrie, I'd like you to start this one since I know you're involved in work around AI ethics and some of the work that's going on in the broader community about that.
So what are your biggest concerns right now and how and with whom are you working to address them?
Yes. So there's been increased awareness both in the industry and in the public consciousness over
the past year or so about some of the fairness and ethics challenges of AI systems. But a lot
of that discussion, rightfully so, has focused on issues around gender and race. But one thing that I've noticed in these conversations is that there hasn't been any discussion around these issues as far as how they might impact people with disabilities. And one goal of mine for the coming year is to really elevate awareness and conversation around this issue. So for example, are people with disabilities being included
in training data? Because if they're not, AI systems are only as good as the data. So as an
example, if you think about smart speaker systems like an Alexa or a Siri or a Cortana, do they
recognize speech input from someone with a speech disability? So if you have dysarthria or stutter or some other
atypical speech pattern, and the answer in general is no, because people with those speech patterns
weren't included in the training data set. And so that locks them out of access to these AI systems.
Well, you might want to add children. There's a video about a little girl asking Alexa to play Baby Shark who never, ever gets it until the mom comes in and says it in the right way.
Well, absolutely. And that goes back to a longer historical issue. The original
speech recognition systems were trained only on men, so they didn't even recognize women's
voices at all, if you want to look back several decades. So yes, there are many groups,
people with accents, children, that all come to mind. And so that's a big issue.
I think another example of an ethical issue, particularly for sensitive populations like people with disabilities, is expectation setting around the near term capabilities of AI systems.
I think a lot of the language that's used by researchers and marketing people when we talk about AI leads people to overestimate the capabilities of AI systems.
I mean, even just the word artificial intelligence suggests a human-like semantic understanding, whereas really today's AI systems are all about deep learning.
It's all just statistical pattern recognition with no semantics at all. And so I think if we want to go back to the sign language example, I feel like every month I see some kind of headline that says, you know, researchers have invented a sign
language recognition system. But then if you actually read between the lines, you know,
they've invented a system that can recognize 50 isolated signs, which is nowhere near recognizing
a whole language. And I think for people whose lives may be fundamentally changed by these kinds of technologies, being careful about how we communicate the capabilities of AI to those audiences is really important.
So in terms of going forward with this issue of raising awareness around issues around disability and AI fairness, I've been collaborating with Megan Lawrence, who's from the Office of the Chief Accessibility Officer, as well as Kate Crawford from the New York City Lab. We've been organizing some external events to raise awareness
around this issue. Megan led a panel discussion at the recent CSUN conference, and the three of us
co-hosted a workshop last month at the AI Now Institute. So I think you'll be hearing more
about this topic in the future. Ken, aside from having a French test that you didn't study for, is there anything that keeps you up at night?
Anything that we should be concerned about?
In my research, it doesn't keep me up at night.
It's kind of, I'm really looking for what are some unique ways that we can add to human potential.
So for me, it's a very satisfying and very exciting topic to think about. So it doesn't
keep me awake. It kind of gets me excited at night, or maybe it keeps me up because I'm just,
my mind keeps buzzing with creative new ways that we can use technology to kind of amplify
human thought to make it so people can express new and better or maybe more expressive concepts
than they would with pen and paper and physical materials. I often like to return to, you know, Kranzberg's laws as sort of a way to think about this,
and particularly the first one where, you know, basically the gist of it is that technology is
not inherently good, nor is it bad, but nor is it neutral, right? So it just kind of exists out
there and it's often, you know, human desires and needs and those shape how technologies get
applied. And we need to think really carefully about that when we're bringing something new into the world.
So yeah, so I don't really like the approach of like,
oh, let's just build something and kind of hope it works out.
You really need to think carefully through like,
what is the human impact?
And is this actually going to empower people
and make it so they can do new things?
I love stories.
And one I'd like to hear is how you two know each other.
And another is how you each ended up at Microsoft
Research. So Ken, tell us your story. And then we'll have Merrie talk about how she, in her own
words, wormed her way in here. Yeah, so I was a graduate student at the University of Virginia.
And I was a computer science student, but my desk was actually situated in the neurosurgery
department. And most of my funding was through some grants that the neurosurgery department had,
because they had these really interesting medical image data sets of, you know,
MRI scans of patients that were coming in. And the lab I was working with was actually working
on ways to create new tools for neurosurgeons so they could actually access these things directly
and look at them. So they were looking for tools to visualize and plan surgeries. So that was my
entree into the world of HCI and research. So George Robertson
had been sort of one of the pioneers of 3D interaction at Xerox PARC. He had moved over
to Microsoft Research. He was starting up a team here. And so he was like, hey, Ken,
you want to come work on these things? So we started doing that. And just over time,
because there's so many great things going on here in Microsoft Research, there's lots of
opportunities to jump to different topics. So I started working with Mike Sinclair, and he was saying like, hey, look, I got this new accelerometer sensor and
maybe you should try putting that in your phone stuff that you're looking at. And so that's where
this automatic screen rotation came from, for example. So that's sort of how I ended up here
on this track of exploring devices and so forth. That's actually a great segue into my story,
the automatic screen rotation. So that's how I met Ken. I don't even know if Ken knows this story
because it was very important to me at the time. So I was a student at Stanford in the computer
science department doing my PhD, and it was my first year of school. And Ken was an invited
speaker for the seminar series. And Ken spoke about this work that he and Mike had done on
attaching the sensors to the phone with the rotation. And it was an amazing
talk, an amazing project, but also it was the first time I had heard of Microsoft Research
because I was a brand new grad student. And so I was really impressed. And after that talk, I thought,
oh, I really want to get an internship at Microsoft Research. But it's a very competitive internship,
and usually they want students who are further along in their career, who already have a track
record of publication. So I wasn't really a competitive applicant on paper. And so then a couple months later, I attended the HCIC workshop. And at this workshop, I met a different Microsoft employee who invited me to come interview for a more product-oriented internship. And so they were going to fly me up from California to Seattle.
So I took advantage of the fact that I was going to be in Seattle and I emailed Ken.
And I said, Ken, you know, I'm a student at Stanford. I heard your talk. It turns out that
I'm going to be in the Seattle area next week. While I'm there, could I come talk to you and
your group about internship opportunities? And so I kind of snuck my way in the back door to having this on-site interview at the
time.
And I did get the internship.
And I ended up actually spending my time that summer working with Eric Horvitz and Susan
Dumais, who were part of the same team.
But it was because of seeing Ken's talk that I managed to worm my way into Microsoft Research.
And then you went back to finish your studies and then came back here?
Right. So because I had such a great experience at that internship, I was very focused on becoming
a professor of computer science. But I had a great experience at my Microsoft Research internship.
And so Eric and Sue and all the people that I worked with really encouraged me to come and also interview with Microsoft Research.
And so that's how I ended up back here.
Ken, what do you remember of that?
That's fantastic.
I do remember talking with Merrie after I gave this talk.
Her advisor, Terry Winograd, had invited me down to give a talk.
Another sort of wonderful person in our field.
But yeah, I was definitely impressed by Merrie.
She asked me a lot of great questions after the talk.
And you know, sometimes you get students coming up afterwards,
but then they don't really say anything
that kind of gets to the next level.
So I remember Merrie asking a lot of really good questions.
And so I was very interested in bringing her in as an intern.
But as it turned out, I didn't actually
have an intern slot to offer anybody that summer.
So it's great that she sort of managed
to work her way into actually getting in the building.
And I think Sue was the one who actually
had the internship to offer.
And Sue Dumais, I don't know if she's been
on these podcasts or not, but she's also wonderful
and sort of one of the people here at Microsoft Research
that I look up to the most.
So it was even better than the opportunity with me,
but working with Sue is wonderful.
And Merrie's just continued to blaze new trails
in just every topic that she touches.
So it's been great.
Merrie, what's fascinating about what you just said,
and you're not the first one who's told a story about a kind of bold move.
It's like, take a risk.
Why not?
Right?
Absolutely.
That's how I wormed my way into grad school, too.
I had heard Bill Buxton's lecture about HCI.
We didn't have HCI at Brown.
I did some searching about HCI on the internet and came across one
of Terry Winograd's projects at Stanford. And so just out of the blue, I emailed him and said,
I read about your project on the internet. I'm an undergraduate in computer science.
I really want to get to know about HCI research. Can I come spend the summer with you working?
And he said, yes. And that was how I got involved in HCI research. I did a good job, and he offered to have me come back for graduate school. But if I hadn't sent
that email, it never would have happened. So I guess the lesson is be bold and advocate for
yourself. Well, you kind of just answered the last question I'm going to ask you, which is at the end
of every podcast, I give my guests a chance to talk to our listeners with some parting thoughts, maybe wisdom advice,
and stories about how being bold got you what you wanted. So here's your chance to say something
you believe is important to researchers who might be interested in making humans and computers have
better interactions. Ken, what do you think? I think for me in terms of like what I
see working well and having good research outcomes, I see lots of people, they try to sort of plan out
this grand scheme of like, oh, I'm going to do this and this and this in my PhD. And you kind of
outline everything to death. But where the really exciting sort of nonlinear breakthroughs come in
is you just start walking down this path. And at some point you're kind of beyond the area
where the lights are from the city, right? And you're kind of out in the wilderness in the
darkness and you're like, well, I'm not quite sure where to go, but you just start trying things,
right? You basically just try ideas, see what works, see what doesn't work. And then from that,
you take the next step. And if you're kind of willing to just take this approach, where you don't try to plan everything out but are willing to walk off into the darkness in terms of just exploring unknown terrain, that's where the really interesting things come up.
Merrie, what do you say?
I guess my advice would be around just being curious.
And as part of that, asking questions.
I know that for years, I think it was at least five years after I had my PhD,
before I felt confident enough to ask a question during a talk,
instead of just waiting afterwards
to talk to the speaker one-on-one because I was worried like, oh, if I ask this question,
will people think I'm stupid or they'll think I don't understand? And after I asked questions,
you know, I realized that like lots of other people had the same question as me. So not only
did I learn new things, but I helped enrich dialogue at the conference and the conference
experience for other people. And a lot of times asking these questions leads to really good conversations and collaborations and new research ideas afterwards.
So I think my advice is to ask questions.
That's what research is all about, isn't it?
Ken Hinckley, Merrie Ringel Morris, thank you so much for joining us today.
It's been really fun to have two people who play off each other and have great stories. Thanks for coming on the podcast. Thanks for having us. Thanks very much.
To learn more about Dr. Ken Hinckley and Dr. Merrie Ringel Morris,
and their up-to-the-minute research in human-computer interaction, visit microsoft.com
slash research.