Microsoft Research Podcast - Collaborators: Healthcare Innovation to Impact
Episode Date: May 20, 2025
In this discussion, Matthew Lungren, Jonathan Carlson, Smitha Saligrama, Will Guyman, and Cameron Runde explore how teams across Microsoft are working together to generate advanced AI capabilities and... solutions for developers and clinicians around the globe.
Transcript
You're listening to Collaborators, a Microsoft Research Podcast,
showcasing the range of expertise that goes into transforming
mind-blowing ideas into world-changing technologies. Despite the advancements in AI over the decades, generative AI exploded into view in 2022 when
ChatGPT became the sort of internet browser for AI and the fastest-adopted consumer
software application in history.
From the beginning, healthcare stood out to us as an important opportunity for general
reasoners to improve the lives and experiences of patients and providers.
Indeed, in the past two years, there's been an explosion of scientific papers looking
at the application first of text reasoners in medicine, then multimodal reasoners that
can interpret medical images, and now
most recently healthcare agents that can reason with each other.
But even more impressive than the pace of research has been the surprisingly rapid diffusion
of this technology into real-world clinical workflows.
So today we'll talk about how our cross-company collaboration has shortened that gap and delivered
advanced AI capabilities and solutions into the hands of developers and clinicians around the world,
empowering everyone in Health and Life Sciences to achieve more.
I'm Dr. Matt Lungren, Chief Scientific Officer for Microsoft Health and Life Sciences.
And I'm Jonathan Carlson, Vice President and Managing Director of Microsoft Health Futures.
And together we brought some key players leading in the space of AI and healthcare from across Microsoft. Our guests today are Smitha Saligrama,
Principal Group Engineering Manager within Microsoft Health and Life
Sciences, Will Guyman, Group Product Manager within Microsoft Health and
Life Sciences, and Cameron Runde, a Senior Strategy Manager for Microsoft
Health Futures. We've asked these brilliant folks to join us because
each of them represents a mission critical group
of cutting edge stakeholders scaling breakthroughs
into purpose built solutions and capabilities
for healthcare.
We'll hear today how generative AI capabilities
can unlock reasoning across every data type in medicine,
text, images, wave forms, genomics,
and further how multi-agent frameworks in healthcare
can accelerate complex workflows,
in some cases acting as a specialist team member
safely secured inside the Microsoft 365 tools
used by hundreds of millions of healthcare enterprise users
across the world.
The opportunity to save time today and lives tomorrow with AI
has never been larger.
Jonathan, you know, it's been really interesting kind of observing Microsoft Research over the decades.
I've been watching you guys in my prior academic career.
You are always on the front of innovation, particularly in healthcare.
And I find it fascinating that millions of people are using the solutions that your team
has developed over the years.
And yet you still find ways to stay cutting edge and state of the art, even in this accelerating time
of technology and AI particularly.
How do you do that?
I mean, to some level it's in our DNA.
I mean, we've been publishing in health and life sciences
for two decades here.
But when we launched Health Futures as a mission-focused lab
about seven or eight years ago,
we really started with the premise
that the way to have impact was to really close the loop between not just good ideas that get
published but good ideas that can actually be grounded in real problems that clinicians
and scientists care about that then allow us to actually go from that first proof of
concept into an incubation, into getting real world feedback that allows us to close that
loop.
And now with the HLS organization here, the product group,
we have the opportunity to work really closely with you all to not just prove
what's possible in the clinic or in the lab, but actually start scaling that
into the broader community.
And one thing I'll add here is that the problems that we're trying to tackle in
healthcare are extremely complex.
And so as Jonathan said, it's really important that we come together and collaborate across disciplines,
as well as across the company of Microsoft,
and with our external collaborators as well
across the whole industry.
So Matt, back to you though,
what are you guys doing in the product group?
How do you guys see these models getting into the clinic?
I think a lot of people think about AI as just,
maybe just even a few years old because of GPT
and how that really captured the public's consciousness, right?
And so you think about the speech-to-text technology
of being able to dictate something for a clinic note
or for a visit.
That was typically based on Nuance technology.
And so there's a lot of product understanding of the market,
how to deliver something that clinicians will use,
understanding the pain points and workflows,
and really that health IT space,
which is sometimes the third rail, I feel like,
with a lot of innovation in healthcare.
But beyond that, I mean, I think now that we have this
really powerful engine of Microsoft
and the platform capabilities,
we're seeing innovations on
the healthcare side for data storage, data interoperability, with different types of
medical data. We have new applications coming online, the ability of course to see generative
AI now infused into the speech to text and becoming Dragon Copilot, which is something
that has been tremendously received by the community.
Physicians are able to now just have a conversation
with a patient.
They turn to their computer and the note is ready for them.
There's no more of this, we call it keyboard liberation.
I don't know if you've heard that before.
And that's just been tremendous.
And there's so much more coming from that side.
And then there's other parts of the workflow
that we also get engaged in, the diagnostic workflow.
So medical imaging,
sharing images across different hospital systems,
the list goes on.
So now when you move into AI,
we feel like there's this huge opportunity to deliver
capabilities into the clinical workflow
via the products and solutions we already have.
But I mean, Will, now that we've expanded
our team to involve
Azure and platform, we're really able to now focus on the developers.
Yeah, and you're always telling me as a doctor how frustrating it is to be spending time
at the computer instead of with your patients. I think you told me, you know, 4,000 clicks
a day for the typical doctor, which is tremendous. And something like Dragon Copilot can save that five minutes per patient.
But it can also now take actions after the patient encounter.
So it can draft the after visit summary, it can order labs and
medications, the referral.
And that's incredible and we want to keep building on that.
There's so many other use cases across the ecosystem and so
that's why in Azure AI Foundry,
we have translated a lot of
the research from Microsoft Research and made that
available to developers to
build and customize for their own applications.
Yeah. And as Will was saying, in our transformation from solutions to platforms, and in scaling solutions to multiple other scenarios as we put our models in AI Foundry, we provide developer capabilities like bringing your own data and fine-tuning these models, and then applying them to scenarios that we couldn't even imagine.
So that's kind of the platform play we are scaling now.
Well, I want to do a reality check because I think to us that are now really focused
on technology, it seems like I've heard this story before.
I remember even in my academic clinical days where it felt like technology was always the
quick answer and it felt like technology was, there
was maybe a disconnect between what my problems were or what I think needed to be done versus
kind of the solutions that were kind of created or offered to us.
And I guess at some level, how, Jonathan, do you think about this?
Because to do things well in the science space is one thing, to do things well in science,
but then also have it be something that actually drives healthcare innovation and practice and translation.
It's tricky, right?
Yeah.
I mean, as you said, I think one of the core pathologies of big tech is we assume every
problem is a technology problem and that's all it will take to solve the problem.
And I think, look, I was trained as a computational biologist and that sits in the awkward middle
between biology and computation.
And the thing that we always have to remember, the thing that we were very acutely aware
of when we set out was that we are not the experts.
And we do have, you know, you as an MD, we have other MDs on the team, we have biologists
on the team.
But this is a big space.
And the only way we're going to have real impact, the only way we're even going to pick
the right problems to work on is if we really partner deeply with providers, with EHR vendors, with scientists, and really understand what's
important and again get that feedback loop.
Yeah, I think we really need to ground the work that we do in the science itself. We
need to understand the broader ecosystem and the broader landscape across healthcare and
life sciences so that we can tackle the most important problems, not just the problems
that we think are important because as Jonathan said, we're not the experts in healthcare
and life sciences.
And that's really the secret sauce.
When you have the clinical expertise come together with the technical expertise, that's
how you really accelerate healthcare. When we really launched this mission seven or eight years ago,
we really came in with the premise of, if we decide to stop,
we want to be sure the world cares.
And the only way that's going to be true is if we're really
deeply embedded with the people that matter,
the patients, the providers, and the scientists.
And now it really feels like this collaborative effort
really can help sort of extend that mission, right?
I think, you know, Will and Smitha,
that we definitely feel the passion and the innovation,
and we certainly benefit from those collaborations too,
but then we have these other partners and even customers,
right, that we can start to tap into
and have that flywheel keep spinning.
Yeah, and the whole industry is an ecosystem,
so we have our own data sets at Microsoft Research
that you've trained amazing AI models with,
and those are on the catalog.
But then you've also partnered with institutions
like Providence or Paige AI,
and those models are in the catalog with their data.
And then there are third parties like Nvidia
that have their own specialized proprietary data sets,
and their models are there too.
So we have this ecosystem of open source models,
and maybe Smitha you want to talk about
how developers can actually customize these.
Yeah. So we use the Azure AI Foundry ecosystem.
Developers can feel at home if they're using the AI Foundry,
so they can look at the model cards
that we publish with each model,
understand the use cases of these models,
how to quickly bring up these APIs,
and look at different use cases of how to apply these,
and even fine-tune these models with their own data,
and then use it for specific tasks
that we couldn't have even imagined.
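For a developer, that flow can be as simple as calling the deployed model's REST endpoint. Here is a minimal sketch; the endpoint URL, key, and payload shape are illustrative assumptions, and each model's card in the catalog documents the actual contract.

```python
# Minimal sketch of calling a model deployed from the Foundry catalog.
# The endpoint, key, and payload layout below are placeholders, not the
# real contract; consult the model card for the deployed model.
import requests

ENDPOINT = "https://<your-resource>.inference.ai.azure.com/score"  # hypothetical
API_KEY = "<your-key>"

payload = {
    "inputs": {
        # e.g., a base64-encoded chest X-ray plus a free-text question
        "image": "<base64-encoded image>",
        "text": "Does this study show any acute abnormality?",
    }
}

response = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
response.raise_for_status()
print(response.json())
```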
Yeah, it has been interesting to see,
we have these healthcare models on the catalog.
Again, some that came from research,
some that came from third parties
and other product developers.
And Azure is kind of becoming the home base,
I think, for a lot of health and life science developers.
They're seeing all the different modalities,
all the different capabilities, and then in combination
with Azure OpenAI, which as we know
is incredibly competent in lots of different use cases.
How are you looking at the use cases?
What are you seeing folks use these models for as they
come to the catalog and start sharing their discoveries
or products?
Well, the general purpose, large language models
are amazing for medical general reasoning.
So Microsoft Research has shown that they can perform super
well on, for example, like the United States medical licensing
exam.
They can exceed doctor performance
if they're just picking between different multiple choice
questions.
But real medicine we know is messier.
It doesn't always start with
the whole patient context provided as text in the prompt.
You have to get the source data and
that raw data is often non-text.
The majority of it is non-text.
It's things like medical imaging, radiology, pathology,
ophthalmology, dermatology, it goes on and on.
There's then the signal data, lab data.
So all of these diverse data types need to
be processed through specialized models,
because much of that data is not
available on the public Internet.
And that's why we're taking this partner approach,
first-party and third-party models that can
interpret all this kind of data and then connect them
ultimately back to these general reasoners to reason over that.
So, I've been at this company for a while, and I'm familiar with how long it takes generally
to get a really good research paper, do all the studies, do all the data analysis, and
then go through the process of publishing, which takes, as you know, a long time.
And it's very rigorous.
And one of the things that struck me last year,
I think we started this big collaboration.
And within a quarter, you had a Nature paper coming out
from Microsoft Research.
And that model that the Nature paper was describing
was ready to be used by anyone on the Azure AI Foundry
within that same quarter.
It kind of blew my mind when I thought about it, even though we were all working very hard
to get that done.
Any thoughts on that?
I mean, has this ever happened in your career?
And what's the secret sauce to that?
Yeah.
I mean, the time scale from research to product has been massively compressed.
And I'd push that even further, which is to say the reason why it took a quarter was because
we were laying the railroad tracks as we were driving the train.
We have examples right after that where we were launching on Foundry the same day we
were publishing the paper.
And frankly, the review times are becoming longer than it takes to actually productize
the models.
I think there's two things that are going on with that that are really converging.
One is that the overall ecosystem is converging on a relatively small number of patterns.
And that gives us as a tech company a reason to go off and really make those patterns hardened
in a way that allows not just us, but third parties as well, to really have a nice workflow
to publish these models.
But the other is actually I think a change in how we work.
And for most of our history as an industrial research lab, we would do research, and then we'd go pitch it to somebody
and try and throw it over the fence.
We've really built a much more integrated team.
In fact, if you look at that Nature paper
or any of the other papers, there's
folks from product teams.
Many of you are on the papers along
with our clinical collaborators.
Yeah, I think one thing that's really important to note
is that there's a ton of different ways
that you can have impact.
So, in Health Futures at least, I like to think about phasing
the work that we do.
So first we have research, which is really early innovation.
And the impact there is getting our technology and our tools out there and really sharing
the learnings that we've had.
So that can be through publications, like you mentioned, it can be through open sourcing our models. And then you go to incubation. So this is,
I think, one of the more new spaces that we're getting into, which is maybe that blurred
line between research and product, right, which is how do we take the tools and technologies
that we've built and get them into the hands of users, typically through our partnerships,
right?
So we partner very deeply and collaborate very deeply
across the industry.
And incubation is really important
because we get that early feedback,
we get an ability to pivot if we need to,
and we also get the ability to see what types of impact
our technology is having in the real world.
And then lastly, when you think about scale,
there's tons of different ways that you can scale.
We can scale third party through our collaborators
and really empower them to go to market,
to commercialize the things that we've built together.
You can also think about scaling internally,
which is why I'm so thankful that we've created
this flywheel between research and product.
And a lot of the models that we've built
that have gone through research,
have gone through incubation,
have been able to scale on the Azure AI Foundry.
But that scale piece is not really our expertise in research, right?
Our piece is research and incubation.
Smitha, how do you think about scaling?
So there are several angles to scaling the models,
the state of the art models we receive from the research team.
The first angle is we open source them
to earn developer trust, with very generous
commercial licenses so that developers can use them
for their own use cases.
The second is we also allow them to customize these models, fine
tuning these models with their own data. So a lot of different angles of how we
provide support in scaling these state-of-the-art models we get from
the research. And as one example, you know University of Wisconsin Health, you
know which Matt knows well, they took one of our models, which is highly versatile.
They customized it in Foundry, and they optimized it
to reliably identify abnormal chest x-rays,
the most common imaging procedure,
so they could improve their turnaround time and triage quickly.
And that's just one example, but we have other partners
like Sectra, who are doing more operational use cases,
automatically routing imaging to the radiologists, setting them up to be efficient.
And then Paige AI is doing biomarker identification for
diagnostics and new drug discovery.
So there's so many use cases that we have partners already who are building and
customizing.
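As a rough illustration of that chest X-ray triage pattern, a sketch like the following could score each study and reorder the reading worklist. The score_abnormality function is a hypothetical stand-in for a call to a customized model deployment.

```python
# Sketch of worklist triage: score each study with a fine-tuned classifier
# and move likely-abnormal studies to the top of the reading queue.
from dataclasses import dataclass


@dataclass
class Study:
    accession_id: str
    abnormality_score: float = 0.0


def score_abnormality(image_bytes: bytes) -> float:
    # Stand-in for a call to a customized chest X-ray model deployment.
    return 0.5


def triage(worklist: list[Study], images: dict[str, bytes]) -> list[Study]:
    for study in worklist:
        study.abnormality_score = score_abnormality(images[study.accession_id])
    # Highest suspicion first, so likely-abnormal studies get read sooner.
    return sorted(worklist, key=lambda s: s.abnormality_score, reverse=True)
```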
Yeah, the part that's striking to me is just that we could all sit in a room and think
about all the different ways someone might use these models on the catalog.
And I'm still shocked at the stuff that people use them for and how effective they are.
And I think part of that is, again, we talk a lot about generative AI in healthcare and
all the things it can do.
Again, in text, you refer to that earlier.
And certainly off the shelf, there's really powerful applications. But there is this tip of the iceberg effect
where under the water, most of the data
that we use to take care of our patients is not text.
It's all the different other modalities.
And I think that this has been an unlock,
taking these innovations from the community,
putting them in this ecosystem catalog, essentially, right?
And then allowing folks to build and develop applications with all these different types
of data.
Again, I've been surprised at what I'm seeing.
This has been just one of the most profound shifts that's happened in the last 12 months,
really.
Two years ago, we had general models in text that really shifted how we think about natural
language processing, got totally upended by that.
It turns out the same technology works for images as well.
It doesn't only allow you to automatically extract concepts from images, but allows you
to align those image concepts with text concepts, which means now you can have a conversation
with that image.
And once you're in that world, now you're in a place where you can start stitching together
these multimodal models that really change how you can interact with the data and how
you can start getting more information out of the raw primary data that is part of the
patient journey.
Well, and we're going to get to that because I think you just touched on something, and
I want to reemphasize, stitching these things together.
There's a lot of different ways to potentially do that, right?
There's ways that you can literally train the model end
to end with adapters and all kinds of other early fusion,
late fusion, all kinds of ways.
But one of the things is that the word of the year
is going to be agents.
And agent is a very interesting term
to think about how you might abstract away
some of the components or the tasks
that you want the model to accomplish
in the midst of a real human-to-model interaction.
Can you talk a little bit more about how we're thinking about
agents in this platform approach?
Well, this is our newest addition to the Azure AI Foundry.
So there's an agent catalog now where we have a set of
pre-configured agents for health care.
And then we also have a multi-agent orchestrator that
can jumpstart the process of developers building
their own multi-agent workflows to tackle
some complex real-world tasks that clinicians have to deal with.
And these agents basically combine a general reasoner
like a large language model like GPT-4o or an o-series
model, with a specialized model, like a model that understands
radiology or pathology, with domain-specific knowledge
and tools.
So the knowledge might be public guidelines or medical journals
or your own private data from your EHR or medical imaging
system, and then tools like
Code Interpreter to deal with all of the numeric data,
or tools that clinicians are using today,
like PowerPoint, Word, Teams, etc.
So, we're allowing developers to build and
customize each of these agents in Foundry,
and then deploy them into their workloads.
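To make that composition concrete, a healthcare agent might be declared roughly like this. The field names and values are illustrative assumptions, not the actual Foundry agent schema.

```python
# Sketch of one agent's configuration: a general reasoner, a specialist
# model, domain knowledge sources, and the tools it is allowed to call.
radiology_agent = {
    "name": "radiology",
    "reasoner": "gpt-4o",                    # general-purpose LLM
    "specialist_model": "cxr-report-model",  # hypothetical imaging model
    "knowledge": [
        "public-clinical-guidelines",
        "ehr-connector",                     # private data via your own connector
    ],
    "tools": [
        "code_interpreter",                  # numeric and lab data
        "word_export",                       # illustrative document-writing tool
    ],
}
```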
I really like that concept because as a,
from the user persona, I think about myself as a user,
how am I gonna interact with these agents?
Where does it naturally fit?
And I sort of, I've seen some of the demonstrations
and some of the work that's going on with Stanford
in particular, showing that literally in a Teams chat, I can have my clinician colleagues
and I can have specialized healthcare agents that kind of interact like I'm interacting
with a human on a chat.
It is a completely mind blowing thing for me and it's a light bulb moment for me too.
And I wonder what have we heard from folks that have tried out this healthcare agent
orchestrator in
this kind of deployment environment via Teams?
Well someone joked, you know, are you sure you're not using Teams because you work at
Microsoft?
But then we actually were meeting with one of the radiologists at one of our partners
and they said that that morning they had just done a Teams meeting where they had met with
other specialists to talk about a patient's cancer case
where they were coming up with a treatment plan.
That was the light bulb moment for us,
we realized actually Teams is already being used by
physicians as an internal communication tool,
as a tool to get work done.
Especially since the pandemic,
a lot of the meetings moved to virtual and telemedicine.
And so it's a great distribution channel for AI,
which has often been a struggle for AI
to actually get in the hands of clinicians.
And so now we're allowing developers to build
and then deploy very easily and extend it
into their own workflows.
I think that's such an important point.
If you think about one of the really important concepts in computer science is
an application programming interface, like some set of rules that allow two applications
to talk to each other.
One of the big pushes, really important pushes in medicine has been standards that allow
us to actually have data standards and APIs that allow these to talk to each other.
And yet still, we end up with these silos.
There are silos of data, there are silos of applications.
And just like when you and I work on our phone,
we have to go back and forth between applications.
One of the things that I think agents do
is it takes the idea that now you
can use language to understand intent
and effectively program an interface.
And it creates a whole new abstraction layer
that allows us to simplify the interaction between
not just humans and the endpoint but also for developers.
It allows us to have this abstraction layer that
lets different developers focus on
different types of models and yet stitch them all together in
a very natural way not just for the users,
but for the ability to actually deploy those models.
Just to add to what Jonathan was mentioning,
the other cool thing about the Microsoft Teams user interface is it's also enterprise ready.
And one important thing that we're thinking about is exactly this. From the
very early research through incubation and then to scale obviously, right? And so
early on in research we are actively working with our partners and our collaborators
to make sure that we have the right data, privacy and consents in place.
We're doing this in incubation as well, and then obviously in scale.
So I think AI has always been thought of as a savior kind of technology.
We talked a little bit about how there's been some ups and downs in terms of the ability
for technology to be effective in health care. At the same time, we're seeing a lot of new
innovations that are really making a difference. But then we kind of get, you know, we talked
about agents a little bit. It feels like we're maybe abstracting too far. And, you know, it's,
it may be if things are going too fast, almost. What makes this different? I mean, in your mind,
is this a truly a logical next step, or is it going to take some time?
I think there's a couple of things that have happened.
I think first, on just the pure technology, what led to ChatGPT?
And I like to think of really three major breakthroughs.
The first was new mathematical concepts of attention, which really means that we now
have a way that a machine can figure out which parts of the context it should actually focus
on just the way our brains do.
I mean, if you're a clinician and somebody's talking to you, the majority of that conversation
is not relevant for the diagnosis, but you know how to zoom in on the parts that matter.
That's a super powerful mathematical concept.
The second one is this idea of self-supervision.
So I think one of the fundamental problems of machine learning has been that you have
to train on labeled training data.
And labels are expensive, which means data sets are small, which means the final models
are very narrow and brittle.
And the idea of self-supervision is that you can just get a model to automatically learn
concepts.
And in language, that's just predicting the next word.
And what's important about that is that leads to models that can actually manipulate and
understand really messy text and pull out what's important about that and
then stitch that back together in interesting ways.
And the third concept that came out of those first two is just the observation of scale.
And that's the more is better, more data, more compute, bigger models.
And that really leads to a reason to keep investing and for these models to keep getting
better.
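A toy sketch of that self-supervision idea: the labels come for free from the raw text itself, since each next word is the target.

```python
# Self-supervision in miniature: every position in raw text yields a
# (context, next-word) training pair with no human labeling required.
text = "patient presents with chest pain radiating to the left arm"
tokens = text.split()

# Each training example is (context so far, next word).
examples = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, target in examples[:3]:
    print(f"context={context!r} -> predict {target!r}")
```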
So that is the groundwork.
That's what led to ChatGPT.
That's what led to our ability now to not just have rule-based systems or simple machine
learning-based systems to take a messy EHR record, say, and pull out a couple concepts,
but to really feed the whole thing in and say, OK, I need you to figure out which concepts
are in here and is this particular attribute there, for example.
That's now led to the next breakthrough, which is that all those core ideas apply to images as well.
They apply to proteins, to DNA.
And so we're starting to see models that understand images and the concepts of images and can
actually map those back to text as well.
So you can look at a pathology image and say, not just that's a cell, but it appears that
there's some sort of cancer in this particular tissue there.
And then you take those two things together and you layer on the fact that now you have
a model or a set of models that can understand intent, can understand human concepts and
biomedical concepts, and you can start stitching them together into specialized agents that
can actually reason with each other, which at some level gives you an API as a developer
to say, okay, I need to focus on a pathology model and get this really, really sound while
somebody else is focusing on a radiology model that now allows us to stitch these all together with
a user interface that we can now talk to through natural language.
I'd like to double click a little bit on that medical abstraction piece that you mentioned.
Just the amount of data, clinical data that there is for each individual patient.
Let's think about cancer patients for a second
to make this real, right?
For every cancer patient, it could take a couple of hours
to structure their information.
Why is that important?
Because you have to get that information
in a structured way and abstract relevant information
to be able to unlock precision health applications, right, for each patient.
So to be able to match them to a trial, someone has to sit there and go through all the clinical
notes from their entire patient care journey from the beginning to the end.
And that's not scalable.
And so one thing that we've been doing in an active project that we've been working
on with a handful of our partners, but Providence specifically, I'll call out,
is using AI to actually abstract and curate
that information so that gives time back
to the healthcare provider to spend with patients
instead of spending all their time
curating this information.
And this is super important
because it sets the scene and the backbone for all those precision health
applications.
Like I mentioned, clinical trial matching.
Tumor boards are another really important example here.
And maybe, Matt, you can talk to that a little bit.
It's a great example.
And it's so funny.
We've talked about this use case.
And the healthcare agent orchestrator's
initial lighthouse use case
was a tumor board setting.
And I remember we first started working with some of the partners on this, I think
we were under a research kind of lens, thinking about what new diagnoses
could it come up with or what new insights it might have.
And what was a really key moment for us, I think, was noticing that we had developed
an agent that can take all of the multimodal data about
a patient's chart, organize it in a timeline in chronological fashion, and then allow folks
to click on different parts of the timeline to ground it back to the note.
And just that, which doesn't sound like a really interesting research paper, was
mind-blowing for clinicians who, again, as you said, spend a great deal of time, often
outside of their typical work hours, trying to organize these patient records in order
to go present at a tumor board.
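A minimal sketch of that timeline behavior, with an assumed event structure: sort extracted events chronologically and keep a pointer back to the source note so every item can be grounded.

```python
# Sketch of a patient timeline: chronological order plus a reference back
# to the originating note for grounding. The event fields are illustrative.
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    when: date
    summary: str
    source_note_id: str  # lets the UI link back to the original document


def build_timeline(events: list[Event]) -> list[Event]:
    return sorted(events, key=lambda e: e.when)


timeline = build_timeline([
    Event(date(2024, 3, 2), "CT chest: 2.1 cm RUL nodule", "note-184"),
    Event(date(2024, 2, 11), "Presented with persistent cough", "note-172"),
])
for e in timeline:
    print(e.when, e.summary, f"({e.source_note_id})")
```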
And a tumor board is a critical meeting that happens at many cancer centers where specialists
all get together, come with their perspective, and make a comment on what would be the best
next step in treatment.
But the background in preparing for that is, you know, again, organizing the data. But to your
point also, what are the clinical trials that are active? There are thousands of clinical trials.
There's hundreds every day added. How can anyone keep up with that? And these are the kinds of
use cases that start to bubble up. And you realize that a technology that understands concepts, context,
and can reason over vast amounts of data with a language interface, that is a powerful tool.
Even before we get to some of the, you know, unlocking new insights and even precision medicine,
this is that idea of saving time before lives to me. And there's an enormous amount of
undifferentiated heavy lifting that happens in healthcare that these agents and these kinds of workflows can start to unlock.
And we've packaged these agents. That manual abstraction work that takes hours,
now we have an agent for it.
It's in Foundry along with the clinical trial matching agent, which I think at Providence
you showed could double the match rate over the baseline that they were using by
using the AI from multiple data sources.
So, we have that and then we have this orchestration
that is using this really neat technology from
Microsoft Research, Semantic Kernel and Magentic-One.
These are technologies that are good at
figuring out which agent to use for a given task.
So, a clinician who's used to working with
other specialists like a radiologist,
a pathologist, a surgeon,
they can now also consult these specialist agents
who are experts in their domain.
There's shared memory across the agents,
there's turn-taking, there's negotiation between the agents.
So, there's this really interesting system that's emerging.
Again, this is all possible to be used through Teams.
There's some great extensibility as well.
We've been talking about that and working on some cool tools.
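A simplified sketch of that orchestration pattern, not the actual Semantic Kernel or Magentic-One API: a router picks which specialist agent takes the next turn, and the agents read and write a shared memory.

```python
# Toy orchestrator: route each task to the agent that can handle it,
# with simple turn-taking and a shared memory of what has been said.
class Agent:
    def __init__(self, name, can_handle):
        self.name = name
        self.can_handle = can_handle            # predicate over a task string

    def act(self, task, memory):
        result = f"{self.name} handled: {task}"  # real agents call models/tools
        memory.append((self.name, result))
        return result


def orchestrate(tasks, agents, memory):
    for task in tasks:                           # simple turn-taking
        agent = next(a for a in agents if a.can_handle(task))
        agent.act(task, memory)
    return memory


memory: list[tuple[str, str]] = []
agents = [
    Agent("radiology", lambda t: "imaging" in t),
    Agent("trial-matching", lambda t: "trial" in t),
]
orchestrate(["summarize imaging history", "find matching trials"], agents, memory)
print(memory)
```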
Yeah. No, if I have to geek out a little bit on how all these agentic orchestrations are coming up:
I've been in software engineering for decades, and this is the next version of distributed systems,
where you have these services that talk to each other.
It's a more natural way, because instead of structured API ways of conversing,
LLMs give us agents that can naturally understand how to talk to each other.
So this is like the next evolution of our systems.
And the way we are packaging all of this is in multiple ways,
based on all the standards and innovations that are happening in this space.
So first of all, we are building these agents
that are very good at specific tasks,
like Will was saying, like trial matching agent
or patient timeline agents.
So we take all of these and then we package it
in a workflow or an orchestration.
We use these standards, some of these coming from research,
the Semantic Kernel, the Magentic-One.
And then all of these also allow us to extend these agents
with custom agents that can be plugged in.
So we are open sourcing the entire agent orchestration
in AI Foundry templates so that developers
can extend with their own agents and make their own workflows
out of it.
So a lot of cool innovation is happening to apply this technology to specific scenarios and workflows.
Well, I was going to ask you about that extension. So folks can go say, hey, I have maybe a really
specific part of my workflow that I want to use some of these agents for, maybe one
of the agents that can do PubMed literature search, for example. But then
there's also agents that come in from the outside.
So like I can imagine a software company or AI company
that has built an agent that plugs in as well.
Yeah, yeah, absolutely.
So you can bring your own agent,
and then we have these standard ways
of communicating with agents
and integrating with our orchestration framework, so you can bring your own agent and extend this
healthcare agent orchestrator to your own needs.
I can just think of, like, in a
group chat like a bunch of different specialist agents and I really would
want an orchestrator to help find the right tool to your point earlier because
I'm guessing this ecosystem is gonna expand quickly, and I may not know which tool is best for which question.
I just want to ask the question.
Yeah.
Yeah.
Well, I think to that point, too, I mean, you said an important point here, which is
tools, and these are not necessarily just AI tools, right?
I mean, we've known this for a while, right?
LLMs are not very good at math, but you can have it use a calculator, and then it works
very well.
And you guys both brought up the universal medical abstraction a couple times.
And one of the things that I find so powerful about that is we've long had this vision within
the precision health community that we should be able to have a learning hospital system.
We should be able to actually learn from the actual real clinical experiences that are
happening every day so that we can stop practicing medicine based off averages.
There's a lot of work that's gone on for the last 20 years about how to actually do causal
inference, say.
That's not an AI question.
That's a statistical question.
The bottleneck, the reason why we haven't been able to do that is because most of that
information is locked up in unstructured text.
And these other tools need essentially a table.
And so now you can decompose this problem and say, well, what if I can use AI not to get to the causal answer,
but to just structure the information so now I can put it
into the causal inference tool.
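A sketch of that decomposition, with extract_fields as a hypothetical stand-in for the LLM abstraction step: the model only builds the table, and existing statistical tooling does the rest.

```python
# Use a model only to turn unstructured notes into rows of a table,
# then hand that table to standard analysis tooling.
import pandas as pd


def extract_fields(note_text: str) -> dict:
    # Stand-in for an LLM-based abstraction call returning structured fields.
    return {"stage": "II", "ecog": 1, "biomarker_her2": "negative"}


notes = ["...clinical note 1...", "...clinical note 2..."]
rows = [extract_fields(n) for n in notes]
cohort = pd.DataFrame(rows)
# `cohort` is now a plain table that existing causal-inference or
# trial-matching pipelines can consume.
print(cohort)
```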
And these sorts of patterns, I think, again, become very,
not just powerful for a programmer,
but they start pulling together different specialties.
And I think we'll really see an acceleration,
really of collaboration across disciplines because of this. So when I joined Microsoft Research 18 years ago, I was doing work in computational biology
and I would always have to answer the question, why is Microsoft in biomedicine?
And I would always kind of joke saying, well, we sell Office and Windows to every healthcare system in the world.
We're already in this space.
And it really struck me to now see that we've actually come full circle and now you can actually connect
in Teams, Word, PowerPoint, which are these tools
that everybody uses every day, but they're actually
now specializable through these agents.
Can you guys talk a little bit about what that looks like
from a developer perspective?
How can provider groups actually start playing with this
and see this come to life?
A lot of healthcare organizations already use
Microsoft productivity tools as you mentioned.
So as the developers build these agents and use
our healthcare orchestrations to plug in
these agents and expose these in these productivity tools,
they will get access to all these healthcare workers.
So the healthcare agent orchestrator we have today
integrates with Microsoft Teams,
and it showcases an example of how you can
@-mention these agents and talk to them
like you were talking to another person in a Teams chat.
And then it also provides examples of these agents
and how they can use these productivity tools.
One of the examples we have there is how they can summarize
the assessments of this whole chat into a Word doc
or even convert that into a PowerPoint presentation
for later on.
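A minimal sketch of that last step, assuming the assessment text has already been collected from the agent chat, and using the python-docx package to write it into a Word document.

```python
# Write the assessment gathered from the agent conversation into a Word doc.
from docx import Document

assessment = "Tumor board summary: ..."  # collected from the agent chat

doc = Document()
doc.add_heading("Tumor board preparation", level=1)
doc.add_paragraph(assessment)
doc.save("tumor_board_summary.docx")
```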
One of the things that has struck me
is how easy it is to do.
I mean, Will, I don't know if you've worked with folks
that have gone from zero to 60.
Like, how fast?
What does that look like?
Yeah, it's funny.
For us, the technology to transfer all this context
into a Word document or PowerPoint presentation
for a doctor to take to a meeting
is relatively straightforward compared to the complicated
clinical trial matching multimodal processing.
The feedback has been tremendous in terms of,
wow, that saves so much time to have
this organized report that I can then bring to a meeting,
and the agents can come with me to that meeting,
because they're literally having a Teams meeting
often with other human specialists,
and the agents can be there and answer questions,
and fact check, and source all the right information on the fly.
So there's a nice integration into these existing tools.
We've worked with several different centers just to kind of
understand where this might be useful.
And like I think we talked about before,
the ideas that we've come up with, again,
this is a great one because it's complex, it's kind of hairy.
There's a lot of things happening under the hood
that don't necessarily require a medical license to do,
to prepare for tumor board and to organize data.
But it's fascinating, actually.
So folks have come up with ideas of,
could I have an agent that can operate an MRI machine?
And I can ask the agent to change some parameters
or redo a protocol.
We thought that was a pretty powerful use case.
We've had others that have just said,
I really want to have a specific agent that's
able to act like deep research does for the consumer side,
but based on the context of my patient
so that it can search all the literature
and pull the data and the papers that
are relevant to this case.
And the list goes on and on, from operations all the way
to clinical decision making at some level.
And I think that the research community that's
going to sprout around this will help us, guide us, I guess,
to see what is the most high impact use cases, where
is this effective, and maybe where it's not effective.
But to me, the part that makes me so, I guess, excited about this is just that I don't have
to think about, OK, well, then we have to figure out health IT.
Because we always have great ideas and research.
And it always feels like there's such a huge chasm to get it in front of the health care
workers that might want to test this out.
And it feels like, again, this productivity tool use case, again, with the enterprise
security, the possibility for bringing in third parties to contribute really does feel
like it's a new surface area for innovation.
Yeah, I love that. Let me end by putting you all on the spot. So in three years, multimodal
agents will do what? Now I'll start with you.
I am convinced that it's going to save a massive amount of time
before it saves many lives.
I'll focus on the patient care journey and diagnostic journey.
I think it will kind of transform
that process for the patient itself
and shorten that process.
I think we've seen already papers recently showing that different modalities surface
complementary information.
And so we'll see kind of this AI and these agents becoming an essential companion to
the physician, surfacing insights that would have been overlooked otherwise.
As similar to what you guys were saying, the agents will become important assistants to
healthcare workers, reducing a lot of documentation and workflow access work they have to do.
I love that.
I guess for my part, I think really what we're going to see is a massive unleash of creativity.
We've had a lot of folks that have been innovating in this space, but they haven't had a way to actually get it
into the hands of early adopters.
And I think we're going to see that really lead to an explosion
of creativity across the ecosystem.
So where do we get started?
Like, where are the developers who are listening to this,
the folks that are at labs, research labs,
and developing health care solutions,
where do they go to get started with the Foundry,
the models we've talked about, the healthcare agent orchestrator?
So, ai.azure.com is the AI Foundry.
It's a website you can go to as a developer,
you can sign in with your Azure subscription,
get your Azure account,
your own VM, all that stuff.
And you have an agent catalog,
the model catalog, you can start from there.
There is documentation and templates that you can then
deploy into Teams or other applications.
And tutorials are coming, right?
We have recordings of tutorials,
we'll have hackathons, some sessions,
and then more to come.
Yeah, we're really excited.
Thank you so much, guys, for joining us.
Yeah, it was a great conversation. Thanks for having us.
Thank you. Thanks, everyone.