Microsoft Research Podcast - 054 - Soundscaping the world with Amos Miller
Episode Date: December 12, 2018
Amos Miller is a product strategist on the Microsoft Research NeXT Enable team, and he’s played a pivotal role in bringing some of MSR’s most innovative research to users with disabilities. He also happens to be blind, so he can appreciate, perhaps in ways others can’t, the value of the technologies he works on, like Soundscape, an app which enhances mobility independence through audio and sound. On today’s podcast, Amos Miller answers burning questions like how do you make a microwave accessible, what’s the cocktail party effect, and how do you hear a landmark? He also talks about how researchers are exploring the untapped potential of 3D audio in virtual and augmented reality applications, and explains how, in the end, his work is not so much about making technology more accessible, but using technology to make life more accessible.
Transcript
Until you are out there in the wind, in the rain, with the people, experiencing or at
least trying to get a sense for the kind of experience they're going through, you'll never
understand the context in which your technology is going to be used. It's not something you can
imagine or glean from secondary data or even from video or anything.
Until you're there, seeing how they grapple with the issues that they're dealing with,
it's almost impossible to really understand that context.
You're listening to the Microsoft Research Podcast,
a show that brings you closer to the cutting edge of technology research and the scientists behind it.
I'm your host, Gretchen Huizinga.
Amos Miller is a product strategist on the Microsoft Research NeXT Enable team,
and he's played a pivotal role in bringing some of MSR's most innovative research to users with disabilities.
He also happens to be blind, so he can appreciate, perhaps in ways
others can't, the value of the technologies he works on, like Soundscape, an app which enhances
mobility independence through audio and sound. On today's podcast, Amos Miller answers burning
questions like, how do you make a microwave accessible? What's the cocktail party effect?
And how do you hear a landmark?
He also talks about how researchers are exploring the untapped potential of 3D audio
in virtual and augmented reality applications,
and explains how, in the end, his work is not so much about making technology more accessible,
but using technology to make life more accessible.
That and much more on
this episode of the Microsoft Research Podcast.
Amos Miller, welcome to the podcast.
Thank you. It's great to be here.
You are unique in the Microsoft Research ecosystem.
Your work is mission-driven, your
personal life strongly informs your professional life, and we'll get more specific in a bit.
But for starters and broad strokes, tell us what gets you up in the morning. Why do you do what you do?
I've always been passionate about technology from a very young age, but really
in the way that it impacts people's lives.
And it's not a mission that I necessarily knew about when I went through my career and experiences with technology.
But when I look back, I see that those are the areas
where I could see that a person feels differently about themselves
or about the environment as a result of the interaction with that technology.
That's where I thought, okay, that is having meaning to this person. And I have this huge, wonderful opportunity to do what I do in Microsoft Research, to actually
turn that passion into my day job. I feel extremely fortunate with that. And I sometimes have to pinch myself
to kind of see that it's not a dream.
Well, tell us a little bit about your background and how that plays into what you're doing
here.
I'm very much a person that grew up in the technology world. I also moved between a number of countries over my career and my life.
I grew up in Israel.
I spent many years in the UK, in London.
I spent a few other years in Asia, in Singapore, and now I'm here.
So all of these aspects of my life have been very important to me.
I also happen to be blind. I suffer from a genetic eye condition called retinitis pigmentosa. It was diagnosed when I was five and I gradually lost my sight. university with good enough sight to manage and finish university with a service dog and any kind of technology I could find to help me read the whiteboard, to help me read the text on the
computer. And I'd say by the age of 30, I totally stopped using my sight. And that's when I really
started living life as a fully blind person.
Let's talk about your job for a second.
You're a product strategist at Microsoft Research.
So how would you describe what you do?
So I work in the part of the organization at Microsoft Research
that looks at really transferring technology ideas into impact,
into a way that they impact business, impact people.
A good idea will only have an impact when it's applied in the right way,
in the right environment, so that the social, the business, the technological context in which it
operates is going to make it thrive. Otherwise, it doesn't matter how good it is, it's not going to have an impact.
Right. So let's circle over to this previous role you had, which was in Microsoft's
digital advisory program. And I bring it up inasmuch as it speaks to how often our previous work
can inform our current work. And you referred to that time as your customer-facing life. How does it inform
your role as a strategist today?
What always energizes me is when I see and observe the
meaning and the impact that technology can really have for people. And I don't say it lightly.
Until you are out there in the wind, in the rain, with the people, experiencing or at least trying to get a sense
for the kind of experience they're going through, you'll never understand the context in which your
technology is going to be used. It's not something you can imagine or glean from secondary data or
even from video or anything. Until you're there, seeing how they grapple
with the issues that they're dealing with, it's almost impossible to really understand
that context.
And the work that I've done in actually my first nine years in Microsoft, I worked in
a customer-facing part of the business in the strategic advisory services, today known
as the digital advisory services.
It's work that we do with our largest customers around the world to really help them figure
out how they can transform their own businesses and leverage advancements in technology.
Right.
So now, as you are working in Microsoft Research as a product strategist, how does that transfer to what you do today?
First of all, I want to introduce for a moment the team that I work with, which is the Enable team in Microsoft Research.
And the Enable team is looking at technological innovations, especially with disabilities in mind.
In our case, our two primary groups are people with ALS and people who are blind. engineering, marketing, and our customer segment and really figure out and understand how we can
harness what we have from a technology perspective and as an organization to maximize and have that
impact that we aspire to have with that community. And that takes a great deal of, again, going back to my earlier point, spending time with
that community, going out there and spending time, in my case, with other people who are
blind, because I only know my own experience.
I don't have everybody else's experience.
The only way for me to learn about that is to be out there.
And in our team, every developer goes out there
to spend time with end users
because that's the only way you can really get under the covers
and understand what's going on.
Right.
So the website says you drive a research program that seeks to understand and invent accessibility in a world, this is the fun part, where AI agents and mixed reality are the primary forms of interaction.
Sounds kind of sci-fi to me.
A little bit. Let me unpack that a little bit. When we traditionally think about accessibility, we think about how do you make something accessible? So how do you make a microwave accessible?
Well, there isn't anything inherently inaccessible in putting a piece of pizza and warming it
up in the microwave.
The only reason it's inaccessible is because the microwave was designed in an inaccessible way.
It could have been accessible from the beginning.
But the world we are moving to is, it's not about me operating the microwave.
It's not about the accessibility of the microwave.
It's about me preparing dinner
for my family. That's the experience that I'm in. And there's a bunch of technologies
that support that experience. And that experience is what I am seeking to make an accessible and
inclusive experience. That means that we are no longer talking about the microwave. We're talking about a set of interactions that involve people,
that involve technology,
that involves physical things in the environment.
It's not about making the technology accessible.
It's about using technology to make life more accessible.
Whether you're going for a walk with a friend,
whether you're going to see a movie with a friend,
whether it's sitting in a meeting and brainstorming a storyboard for video, all of these are experiences
and the goal is how do you make those experiences accessible experiences.
That kind of gets you thinking about accessibility in a very different way where your interaction
is with the person that you're sitting in front of.
The technology is just there in support of that interaction.
As I'm researching the interview, I find myself thinking of the various solutions,
maybe the technical guide dog
mentality, like let's replace all these things with technology that people have traditionally used for independence.
And the technology as it enters that ecosystem, some people might think the aim is to replace those things.
But I don't think that's the point of what's going on here.
Am I right?
That's right.
But there is a tendency when you come at a problem with a technology solution to look at what you are currently doing and replace that with something that's automatic.
Oh, you're using a guide dog? How can I replace that guide dog and give you a robot?
So I work on technology that enhances mobility independence through audio and sound, which we'll talk about in a minute.
But often people ask me, how would that
work for people who can't hear? And the natural inclination to them is to say, oh, okay, well,
you'll have to deliver the information in a different way. The thing is that people get a
sense of their space and their surroundings using the senses that they have. To me, the question is not how do we shortcut that? It's how do they sense their space today? They do. They don't sit there feeling completely disconnected. And if you are going to intervene in that, because you and I talked earlier about the fundamental role that design plays in the you were just referring to. Yeah. You know, if I'm looking at you and I think, well, my solution to how you interact
with the world with technology would be braille, that's an assumption.
So I'm just going to give you free rein here.
Tell us what you want us to know about this from your perspective.
We all make assumptions about other people's experience of life.
You're referring to Bill Buxton, who was on your podcast a few weeks ago.
Right.
And he's actually been a very close friend and a mentor
throughout the work that we are doing on Soundscape,
which we'll talk about in a minute.
Yes.
And he's really brought to our attention that what we've done
of going out there and experiencing the real
situation that people are experiencing is about empathy and it's about trying to understand and
probe ideas that challenge your assumptions about what effect they will have. But really seeing, observing and understanding their experience in that particular situation
and then maybe applying from your learning some form of intervention into that experience
and observing how that affects that experience.
It doesn't have to be a complete piece of software or technology.
It's just an intervention.
It can be completely lo-fi.
That helps you to start expanding your understanding.
And you don't have to do it with 100 people.
Do it with two, three people.
You will discover a whole new world you didn't know about.
I'm sorry, but you don't need 200 data points to support
that experience. You've just seen it. And you can build on that. So can you enhance
that in any way to give them an even richer awareness of their surrounding? And those
are the kind of questions that taking design through that very experiential lens has led us to
the work that we actually do on our work on Soundscape, which is the technology that we've
been developing over the last few years to really see how far we can take this notion
of how people perceive the world and how you can enhance that so their perception is enhanced.
Well, let's talk about 3D sound
and an exciting launch earlier this year in the form of Microsoft Soundscape.
And this is such a cool technology with so many angles to talk about.
First, just give our listeners an overview of Soundscape.
What is it? Who's it for? How does it work? How do people experience it?
Soundscape is a technology that we developed in collaboration with Guide Dogs, certainly in the early stages and still do.
And the idea is very much using audio that's played in 3D
using a stereo headset,
you can hear the landmarks that are around you
and you can thereby really enrich your awareness of your surroundings, of what's where,
in a very natural, easy way. And that really helps you feel more independent, more confident
to explore the world beyond what you know.
How do you hear a landmark?
How do you hear a landmark? So, for example, if you're standing and Starbucks is in
front of you and to the right, we will say the word Starbucks, but we won't say it's in front
of you and to the right. It will sound like it is over there where Starbucks is. Okay. And that's
generated using, the technical term is, head-related transfer function, synthetic binaural audio.
So it's work that actually was developed in Microsoft Research over a number of years by Ivan Tashev and his team.
And effectively, you can generate sound to make it sound like it's not in between your ears.
You can hear it as though it's out in the space around you.
And it's really quite amazing.
And we also use non-speech audio cues.
For example, one of the ideas that we built into Soundscape is this notion of a virtual audio beacon.
Not to be confused with Bluetooth beacons.
It's completely virtual.
But let's suppose you're standing on a street corner and you are heading to a restaurant that's a block and a half away.
What you can do with Soundscape is place an audio beacon that will sound like it's coming from that restaurant. So no matter which way you're standing, which way
you're heading, you can always hear that click-click sound. So you know exactly where that
restaurant is. You can see it with your ears.
How do you do that? How do you place a beacon someplace, technically?
Binaural audio is when you have a slightly different sound in each ear,
which tricks the brain into having a sense that the sound is three-dimensional.
It's exactly the same way that 3D images work.
Audio works almost the same.
If Ivan was here, he would say it's not exactly the same.
But by generating a slightly different sound wave in each ear,
you are able to make sound sound like it's coming from a specific direction.
But by playing it in each ear slightly differently,
it will actually sound like it's coming from in front of you and to the right.
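The "slightly different sound in each ear" idea can be illustrated with a rough sketch. This is not how Soundscape renders audio (the conversation points to head-related transfer functions for that); it is a simplified textbook model, assuming a spherical head, the Woodworth approximation for the arrival-time difference between the ears, and a constant-power pan for the loudness difference. The function names and the head-radius value are illustrative assumptions.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius, in meters
SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air


def interaural_time_difference(azimuth_deg: float) -> float:
    """Seconds by which a sound reaches the far ear later than the near ear.

    Azimuth 0 is straight ahead, +90 is directly to the right.
    Uses the Woodworth spherical-head approximation; a positive result
    means the right ear hears the sound first.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    theta = math.radians(clamped)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


def stereo_gains(azimuth_deg: float) -> tuple[float, float]:
    """Constant-power (left_gain, right_gain) for a source at this azimuth."""
    clamped = max(-90.0, min(90.0, azimuth_deg))
    pan = (clamped + 90.0) / 180.0 * (math.pi / 2.0)  # 0 = hard left, pi/2 = hard right
    return math.cos(pan), math.sin(pan)


if __name__ == "__main__":
    for az in (-90, -45, 0, 45, 90):
        itd_us = interaural_time_difference(az) * 1e6
        left, right = stereo_gains(az)
        print(f"azimuth {az:+4d} deg: ITD {itd_us:+7.1f} us, L={left:.2f}, R={right:.2f}")
```

A source in front and to the right gets a slightly earlier, slightly louder signal in the right ear, which is the cue the brain reads as direction. Real binaural rendering also shapes the frequency content per ear, which this sketch leaves out.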
Now, how do we know where to place that beacon?
At present, it's largely designed to be used outdoors.
So we use GPS so we know where you are standing.
We know where that restaurant is.
So we have two coordinates to work with.
We also estimate which way you're facing.
So if you were facing the restaurant,
we would want to play that beacon right in front of you.
If you were standing at 90 degrees to the restaurant,
we'd want to make that beacon sound like it's coming
not only from your right ear,
but 100 meters away to your right.
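The geometry described here (two coordinates plus an estimate of which way you're facing) can be sketched as follows. This is an illustration under stated assumptions, not Soundscape's code: the function names and the sample coordinates are made up, and it uses the standard haversine distance and initial-bearing formulas on a spherical Earth.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points (haversine)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Compass bearing from point 1 to point 2 (0 = north, 90 = east)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    y = math.sin(dlam) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlam)
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_azimuth_deg(heading_deg: float, target_bearing_deg: float) -> float:
    """Angle of the target relative to where the listener faces, in (-180, 180].

    0 means straight ahead, +90 directly to the right, -90 directly to the left.
    """
    return (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    user = (47.6400, -122.1300)        # hypothetical listener position
    restaurant = (47.6400, -122.1287)  # hypothetical target, roughly due east

    bearing = bearing_deg(*user, *restaurant)
    print("distance (m):", round(distance_m(*user, *restaurant)))
    print("facing north:", round(relative_azimuth_deg(0.0, bearing), 1))
    print("facing east: ", round(relative_azimuth_deg(90.0, bearing), 1))
```

Facing the restaurant gives a relative angle near zero, so the beacon plays straight ahead; standing at 90 degrees to it gives an angle near +90, so the beacon plays from the right, with the distance available to scale how far away it sounds.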
Unbelievable.
And so taking all of those sensory inputs and taking the information
from the map, the GPS location, the direction, we reproduce the sound image in your stereo headset
so that you can hear the direction of the sound and where things are. And the most amazing thing
is this is all done in real time, completely dynamic.
So as you walk down the street, that restaurant may sound in front of you at 45 degrees to your right.
And as you progress, you'll hear it getting closer and closer and further and further to your right.
And if you overshoot it, it'll start to sound behind you a little bit.
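The walk described above can be traced with a small simulation. It is a sketch on a flat local grid (meters), assuming the walker heads straight along the +y axis and the restaurant sits ahead and to the right; the positions and the function name are illustrative, not taken from Soundscape.

```python
import math


def azimuth_and_distance(walker_xy, target_xy):
    """Return (azimuth_deg, distance_m) of the target for a walker facing +y.

    Azimuth 0 is straight ahead, +90 is directly to the right, and anything
    beyond +90 is behind the walker on the right-hand side.
    """
    dx = target_xy[0] - walker_xy[0]
    dy = target_xy[1] - walker_xy[1]
    return math.degrees(math.atan2(dx, dy)), math.hypot(dx, dy)


target = (50.0, 50.0)                                    # restaurant, ahead and to the right
positions = [(0.0, float(y)) for y in range(0, 81, 10)]  # walking up the street
track = [azimuth_and_distance(p, target) for p in positions]

if __name__ == "__main__":
    for (x, y), (az, dist) in zip(positions, track):
        side = "behind right" if az > 90.0 else "ahead right"
        print(f"walker at y={y:4.0f} m: azimuth {az:6.1f} deg, {dist:5.1f} m away ({side})")
```

The azimuth starts at 45 degrees, swings further right as the walker approaches, passes 90 degrees when the restaurant is level with the walker, and then moves behind once the walker overshoots. Recomputing this on every position update is what makes the beacon feel anchored to the place rather than to the listener.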
Now, why is this so important?
Because I'm not going to the restaurant on my own.
I'm there with my kid or with my wife or with my friend.
And if I were to hold a phone with the GPS instructions and all of that,
I can't hold a conversation with that person at the same time
because I'm so engaged with the technology.
And we talked earlier about how do you get the technology to be in the background.
That beacon sound is totally in the background.
You don't have to think about it.
You don't have to attend to it mentally.
It's just there. So you know where the restaurant is and you continue to have the conversation with the person you're with.
Or you can daydream.
Or you can read your emails, listen to a podcast.
And all of that happens at the same time.
Because it's played in 3D space.
Because it's non-intrusive.
We minimize the use of language. And all of these subtle aspects are absolutely crucial for this
kind of technology to be relevant to this situation. You're not sitting in front of the
computer and it's the only thing you're doing. You are outdoors. There's a ton of things happening
all the time that you have to deal with. You can't expect the person to disassociate themselves from all of that.
You know, Soundscape is one way of addressing this very, very interesting and important question.
And throughout history, technology has always changed the way that we do things.
But I think that we're starting to see that as technology developers, we really have to be much more mindful about just from the subtleties of how we design something on what is so in a way that makes the person feel empowered and
develop a new skill. Great runners learn to feel their heartbeat. But if they have a heart monitor,
they'll stop feeling that heartbeat because the device on their wrist tells them what it is. Well, that's only because that's how
it was designed. If the heart monitor, instead of telling you you're at, I don't know, 150,
it'd say, what do you think you're at? And you'd say, oh, I'm at 140. And then it'll say,
oh, you're actually at 150. You will have learned something new from that. It's exactly the same function,
but you have developed yourself as a result of that interaction. And I think that that's the
kind of opportunity that we need to start looking for.
I want to circle back to this 3D audio and
the technology behind it and something that you referred to as the cocktail party effect. Can you explain that a
little bit and how Microsoft Research is sort of leading the way here?
The cocktail party effect
is an effect in the world of psychoacoustics that is very simple. If you imagine you're sitting
around a table in a cocktail party having a very exciting conversation with somebody
and there are lots of other similar
conversations happening around you at the same time. Because all of those conversations are
happening in 3D space, you're actually able to hear all of those conversations, even though
you're attending just to yours. You're listening and you can understand and engage in your conversation.
But if your name came up in any of those other conversations, you'll immediately turn your head and say,
hey guys, what are you talking about there?
And that's an incredible capability of the brain to manage a very rich set of inputs in the auditory space that is very much underutilized today in the technology space.
We always feel that if we need to convert something into audio, it's got to be sequenced because we can only hear one thing at a time.
When it's in 3D, that's no longer the case.
And that's a huge opportunity. We play a lot of that in VR and augmented reality,
and we spend a lot of time on the visual aspect of virtual reality
and really pushing the envelope on how far we can take
the use of immersive experiences and objects in all directions.
But the same is available with audio.
Even more with audio because your eyes are no longer engaged.
Audio is in 360.
If we block our ears for a moment, all of a sudden our awareness level drops.
But we are so unaware of the power of audio because vision just takes over everything.
And I think the work that we have done,
both in the acoustic work on 3D audio,
and the application, especially in the disability space,
where we place the constraints on the team.
There is no vision.
Now let's figure it out.
And that leads to new frontiers of discovery and innovation
in this space that I think could be applicable
and would be applicable in many other spaces.
And that heads-up experience when you're out and about in the street, not focused on the screen, but engaged in your surrounding.
And that's a perfect situation where audio has huge advantages that we can look at.
I ask each of my guests some version of the question, what keeps you up at night?
Because I'm interested in how researchers are addressing unintended consequences
of the work they're doing. Is there anything that concerns you, Amos? Anything that keeps
you up at night?
I think things keep me up at night because
they are so interesting and yet unsolved. You know, we talked a bit about how do you
really express and portray the physical space around you in ways that utilize your other senses
and really maximize the ability of the brain to make sense of places without vision.
And I really think that with Soundscape we've only started to scratch the surface of that question.
Over half of the brain is devoted to perception. And I think that
when we find ways to really engage, even further engage that incredible human capability,
we will discover a whole new frontier of machine and human interaction in ways that we don't
understand today.
You said you arrived at Microsoft Research from left field.
What's your story on how you came to be working on research and accessibility at Microsoft
Research?
I started life as a developer and did a business degree and joined the strategic advisory services
in Microsoft Consulting in the UK. And I think it was a very special moment in Microsoft over the last few years when we
really started to understand the meaning of impacting every person on the planet with
technology and seeing that as our mission.
And that led to a series of conversations that opened an opportunity for us to actually get behind that statement.
And we basically joined Microsoft Research through that mission, through the work that we're doing in Soundscape.
And because we already had very strong relationships, thanks to some wonderful people in the company,
strong relationships here in Microsoft Research and in other parts of the company.
Before we close, Bill Buxton asked me to ask you about the kayak regatta that you organized.
Oh, I didn't talk about that.
Just tell that story quickly, because I do have one question I want to wrap up with before
we go.
Okay. Well, we talked about Soundscape
as a technology that really enables you
to hear signals in 3D around you.
And that was largely designed
to be used in the street, right?
And then we thought,
what would happen if we placed
that audio beacon on a lake?
So we got a bunch of people during the summer hackathon
and said, okay, well, let's try it out.
So we organized an event on Lake Sammamish.
We hacked Soundscape to work on the lake
and placed some virtual audio beacons around the lake
and invited a group of people who are blind
to come and kayak with us
and see how they enjoy it.
And they absolutely loved it.
And I think that was a real eye-opener for us.
You have to understand the difference here.
Could they kayak before?
Sure, no problem.
Because a sighted person would be with them and would tell them,
OK, now you row straight, now you row left.
But, I'm sorry, that's a very boring experience. You're not in control. You're not independent.
You're just doing the work. And by being able to hear where those beacons are, you are truly
in the driving seat. And that is a sense of independence that we've not really seen to that extent before we did this event.
I like how you called it an eye-opening event.
It was.
There's so many metaphors about vision that we just sort of take for granted, right?
Maybe it's because I have prior sight, maybe not.
But first of all, I use those metaphors all the time, and I also feel, you know,
I could close my eyes and feel that my eyes are closed and open them and feel that they're open,
and I definitely take everything in in a very different way.
Sure.
Even though the eyes don't actually do the scientific aspect of what they're designed to do.
As we close, I always ask my guests to offer some parting advice to our listeners,
whether that be in the form of inspiration for future research or challenges that remain to
be solved or personal advice on next steps along the career path, whether you have a guide dog
with you or Soundscape. What would you say to your 25-year-old self if you were just getting started in this arena?
I honestly would say get real-life experience,
especially in the areas that you're passionate about.
Be passionate about them with even more energy and see the work that you do in the context of what you're passionate about
because you can only really apply your personal experiences to what you do.
It's so great here in Microsoft Research
to see the interns coming here in the summer
and the creativity and passion and new perspectives
that they bring to our work here.
And there's a little bit of a side of me
that worries that they'll jump into the job
before they've gone out and
explored the world.
And I think it's important that they find a way to do something that gives them that
meaningful context to the work that they'll be doing here.
Amos Miller, thank you for joining us today.
It's been, can I say it, an eye-opening experience.
Sure.
My pleasure. Thanks so much for having me.
To learn more about Amos Miller and the latest innovations in audio,
sound, and accessibility technology, visit microsoft.com slash research.