Microsoft Research Podcast - Collaborators: Healthcare Innovation to Impact
Episode Date: May 20, 2025
In this discussion, Matthew Lungren, Jonathan Carlson, Smitha Saligrama, Will Guyman, and Cameron Runde explore how teams across Microsoft are working together to generate advanced AI capabilities and... solutions for developers and clinicians around the globe.
Transcript
You're listening to Collaborators, a Microsoft Research Podcast,
showcasing the range of expertise that goes into transforming
mind-blowing ideas into world-changing technologies. Despite the advancements in AI over the decades, generative AI exploded into view in 2022 when
ChatGPT became the sort of internet browser for AI and the fastest-adopted consumer
software application in history.
From the beginning, healthcare stood out to us as an important opportunity for general
reasoners to improve the lives and experiences of patients and providers.
Indeed, in the past two years, there's been an explosion of scientific papers looking
at the application first of text reasoners in medicine, then multimodal reasoners that
can interpret medical images, and now
most recently healthcare agents that can reason with each other.
But even more impressive than the pace of research has been the surprisingly rapid diffusion
of this technology into real-world clinical workflows.
So today we'll talk about how our cross-company collaboration has shortened that gap and delivered
advanced AI capabilities and solutions into the hands of developers and clinicians around the world,
empowering everyone in Health and Life Sciences to achieve more.
I'm Dr. Matt Lungren, Chief Scientific Officer for Microsoft Health and Life Sciences.
And I'm Jonathan Carlson, Vice President and Managing Director of Microsoft Health Futures.
And together we brought some key players leading in the space of AI and healthcare from across Microsoft. Our guests today are Smitha Saligrama,
Principal Group Engineering Manager within Microsoft Health and Life
Sciences, Will Guyman, Group Product Manager within Microsoft Health and
Life Sciences, and Cameron Runde, a Senior Strategy Manager for Microsoft
Health Futures. We've asked these brilliant folks to join us because
each of them represents a mission critical group
of cutting edge stakeholders scaling breakthroughs
into purpose built solutions and capabilities
for healthcare.
We'll hear today how generative AI capabilities
can unlock reasoning across every data type in medicine,
text, images, wave forms, genomics,
and further how multi-agent frameworks in healthcare
can accelerate complex workflows,
in some cases acting as a specialist team member
safely secured inside the Microsoft 365 tools
used by hundreds of millions of healthcare enterprise users
across the world.
The opportunity to save time today and lives tomorrow with AI
has never been larger.
Jonathan, you know, it's been really interesting kind of observing Microsoft Research over the decades.
I've been watching you guys in my prior academic career.
You are always on the front of innovation, particularly in healthcare.
And I find it fascinating that millions of people are using the solutions that your team
has developed over the years.
And yet you still find ways to stay cutting edge and state of the art, even in this accelerating time
of technology and AI particularly.
How do you do that?
I mean, to some level it's in our DNA.
I mean, we've been publishing in health and life sciences
for two decades here.
But when we launched Health Futures as a mission-focused lab
about seven or eight years ago,
we really started with the premise
that the way to have impact was to really close the loop between not just good ideas that get
published but good ideas that can actually be grounded in real problems that clinicians
and scientists care about that then allow us to actually go from that first proof of
concept into an incubation, into getting real world feedback that allows us to close that
loop.
And now with the HLS organization here, the product group,
we have the opportunity to work really closely with you all to not just prove
what's possible in the clinic or in the lab, but actually start scaling that
into the broader community.
And one thing I'll add here is that the problems that we're trying to tackle in
healthcare are extremely complex.
And so as Jonathan said, it's really important that we come together and collaborate across disciplines,
as well as across the company of Microsoft,
and with our external collaborators as well
across the whole industry.
So Matt, back to you though,
what are you guys doing in the product group?
How do you guys see these models getting into the clinic?
I think a lot of people think about AI as just,
maybe just even a few years old because of GPT
and how that really captured the public's consciousness, right?
And so you think about the speech-to-text technology
of being able to dictate something for a clinic note
or for a visit.
That was typically based on Nuance technology.
And so there's a lot of product understanding of the market,
how to deliver something that clinicians will use,
understanding the pain points and workflows,
and really that health IT space,
which is sometimes the third rail, I feel like,
with a lot of innovation in healthcare.
But beyond that, I mean, I think now that we have this
really powerful engine of Microsoft
and the platform capabilities,
we're seeing innovations on
the healthcare side for data storage, data interoperability, with different types of
medical data. We have new applications coming online, the ability of course to see generative
AI now infused into the speech to text and becoming Dragon Copilot, which is something
that has been tremendously received by the community.
Physicians are able to now just have a conversation
with a patient.
They turn to their computer and the note is ready for them.
There's no more of this, we call it keyboard liberation.
I don't know if you've heard that before.
And that's just been tremendous.
And there's so much more coming from that side.
And then there's other parts of the workflow
that we also get engaged in, the diagnostic workflow.
So medical imaging,
sharing images across different hospital systems,
the list goes on.
So now when you move into AI,
we feel like there's this huge opportunity to deliver
capabilities into the clinical workflow
via the products and solutions we already have.
But I mean, Will, now that we've expanded
our team to involve
Azure and platform, we're really able to now focus on the developers.
Yeah, and you're always telling me as a doctor how frustrating it is to be spending time
at the computer instead of with your patients. I think you told me, you know, 4,000 clicks
a day for the typical doctor, which is tremendous. And something like Dragon Copilot can save that five minutes per patient.
But it can also now take actions after the patient encounter.
So it can draft the after visit summary, it can order labs and
medications, the referral.
And that's incredible and we want to keep building on that.
There's so many other use cases across the ecosystem and so
that's why in Azure AI Foundry,
we have translated a lot of
the research from Microsoft Research and made that
available to developers to
build and customize for their own applications.
Yeah. And as Will was saying, in our transformation from solutions to platforms, and in scaling solutions to multiple other scenarios as we put our models in AI Foundry, we provide developer capabilities like bringing your own data and fine-tuning these models, and then applying them to scenarios that we couldn't even imagine.
So that's kind of the platform play we are scaling now.
Well, I want to do a reality check because I think to us that are now really focused
on technology, it seems like I've heard this story before.
I remember even in my academic clinical days where it felt like technology was always the
quick answer and it felt like technology was, there
was maybe a disconnect between what my problems were or what I think needed to be done versus
kind of the solutions that were kind of created or offered to us.
And I guess at some level, how, Jonathan, do you think about this?
Because to do things well in the science space is one thing, to do things well in science,
but then also have it be something that actually drives healthcare innovation and practice and translation.
It's tricky, right?
Yeah.
I mean, as you said, I think one of the core pathologies of big tech is we assume every
problem is a technology problem and that's all it will take to solve the problem.
And I think, look, I was trained as a computational biologist and that sits in the awkward middle
between biology and computation.
And the thing that we always have to remember, the thing that we were very acutely aware
of when we set out was that we are not the experts.
And we do have, you know, you as an MD, we have other MDs on the team, we have biologists
on the team.
But this is a big space.
And the only way we're going to have real impact, the only way we're even going to pick
the right problems to work on is if we really partner deeply with providers, with EHR vendors, with scientists, and really understand what's
important and again get that feedback loop.
Yeah, I think we really need to ground the work that we do in the science itself. We
need to understand the broader ecosystem and the broader landscape across healthcare and
life sciences so that we can tackle the most important problems, not just the problems
that we think are important because as Jonathan said, we're not the experts in healthcare
and life sciences.
And that's really the secret sauce.
When you have the clinical expertise come together with the technical expertise, that's
how you really accelerate healthcare. When we really launched this mission seven or eight years ago,
we really came in with the premise of, if we decide to stop,
we want to be sure the world cares.
And the only way that's going to be true is if we're really
deeply embedded with the people that matter,
the patients, the providers, and the scientists.
And now it really feels like this collaborative effort
really can help sort of extend that mission, right?
I think, you know, Will and Smitha,
that we definitely feel the passion and the innovation,
and we certainly benefit from those collaborations too,
but then we have these other partners and even customers,
right, that we can start to tap into
and have that flywheel keep spinning.
Yeah, and the whole industry is an ecosystem,
so we have our own data sets at Microsoft Research
that you've trained amazing AI models with,
and those are on the catalog.
But then you've also partnered with institutions
like Providence or Paige AI,
and those models are in the catalog with their data.
And then there are third parties like Nvidia
that have their own specialized proprietary data sets,
and their models are there too.
So we have this ecosystem of open source models,
and maybe Smitha you want to talk about
how developers can actually customize these.
Yeah. So we use the Azure AI Foundry ecosystem.
Developers can feel at home if they're using the AI Foundry,
so they can look at the model cards
that we publish with each model,
understand the use cases of these models,
how to quickly bring up these APIs,
and look at different use cases of how to apply these,
and even fine-tune these models with their own data,
and then use it for specific tasks
that we couldn't have even imagined.
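For a developer, that flow can be as simple as calling the deployed model's REST endpoint. Here is a minimal sketch; the endpoint URL, key, and payload shape are illustrative assumptions, and each model's card in the catalog documents the actual contract.

```python
# Minimal sketch of calling a model deployed from the Foundry catalog.
# The endpoint, key, and payload layout below are placeholders, not the
# real contract; consult the model card for the deployed model.
import requests

ENDPOINT = "https://<your-resource>.inference.ai.azure.com/score"  # hypothetical
API_KEY = "<your-key>"

payload = {
    "inputs": {
        # e.g., a base64-encoded chest X-ray plus a free-text question
        "image": "<base64-encoded image>",
        "text": "Does this study show any acute abnormality?",
    }
}

response = requests.post(
    ENDPOINT,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=60,
)
response.raise_for_status()
print(response.json())
```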
Yeah, it has been interesting to see,
we have these healthcare models on the catalog.
Again, some that came from research,
some that came from third parties
and other product developers.
And Azure is kind of becoming the home base,
I think, for a lot of health and life science developers.
They're seeing all the different modalities,
all the different capabilities, and then in combination
with Azure OpenAI, which as we know
is incredibly competent in lots of different use cases.
How are you looking at the use cases?
What are you seeing folks use these models for as they
come to the catalog and start sharing their discoveries
or products?
Well, the general purpose, large language models
are amazing for medical general reasoning.
So Microsoft Research has shown that they can perform super
well on, for example, like the United States medical licensing
exam.
They can exceed doctor performance
if they're just picking between different multiple choice
questions.
But real medicine we know is messier.
It doesn't always start with
the whole patient context provided as text in the prompt.
You have to get the source data and
that raw data is often non-text.
The majority of it is non-text.
It's things like medical imaging, radiology, pathology,
ophthalmology, dermatology, it goes on and on.
There's then the signal data, lab data.
So all of these diverse data types need to
be processed through specialized models,
because much of that data is not
available on the public Internet.
And that's why we're taking this partner approach,
first-party and third-party models that can
interpret all this kind of data and then connect them
ultimately back to these general reasoners to reason over that.
So, I've been at this company for a while, and I'm familiar with how long it takes generally
to get a really good research paper, do all the studies, do all the data analysis, and
then go through the process of publishing, which takes, as you know, a long time.
And it's very rigorous.
And one of the things that struck me last year,
I think we started this big collaboration.
And within a quarter, you had a Nature paper coming out
from Microsoft Research.
And that model that the Nature paper was describing
was ready to be used by anyone on the Azure AI Foundry
within that same quarter.
It kind of blew my mind when I thought about it, even though we were all working very hard
to get that done.
Any thoughts on that?
I mean, has this ever happened in your career?
And what's the secret sauce to that?
Yeah.
I mean, the time scale from research to product has been massively compressed.
And I'd push that even further, which is to say the reason why it took a quarter was because
we were laying the railroad tracks as we were driving the train.
We have examples right after that where we were launching on Foundry the same day we
were publishing the paper.
And frankly, the review times are becoming longer than it takes to actually productize
the models.
I think there's two things that are going on with that that are really converging.
One is that the overall ecosystem is converging on a relatively small number of patterns.
And that gives us as a tech company a reason to go off and really make those patterns hardened
in a way that allows not just us, but third parties as well, to really have a nice workflow
to publish these models.
But the other is actually I think a change in how we work.
And for most of our history as an industrial research lab, we would do research, and then we'd go pitch it to somebody
and try and throw it over the fence.
We've really built a much more integrated team.
In fact, if you look at that Nature paper
or any of the other papers, there's
folks from product teams.
Many of you are on the papers along
with our clinical collaborators.
Yeah, I think one thing that's really important to note
is that there's a ton of different ways
that you can have impact.
So, in Health Futures at least, I like to think about phasing
the work that we do.
So first we have research, which is really early innovation.
And the impact there is getting our technology and our tools out there and really sharing
the learnings that we've had.
So that can be through publications, like you mentioned, it can be through open sourcing our models. And then you go to incubation. So this is,
I think, one of the more new spaces that we're getting into, which is maybe that blurred
line between research and product, right, which is how do we take the tools and technologies
that we've built and get them into the hands of users, typically through our partnerships,
right?
So we partner very deeply and collaborate very deeply
across the industry.
And incubation is really important
because we get that early feedback,
we get an ability to pivot if we need to,
and we also get the ability to see what types of impact
our technology is having in the real world.
And then lastly, when you think about scale,
there's tons of different ways that you can scale.
We can scale third party through our collaborators
and really empower them to go to market,
to commercialize the things that we've built together.
You can also think about scaling internally,
which is why I'm so thankful that we've created
this flywheel between research and product.
And a lot of the models that we've built
that have gone through research,
have gone through incubation,
have been able to scale on the Azure AI Foundry.
But that scale piece is not really our expertise in research, right?
Our piece is research and incubation.
Smitha, how do you think about scaling?
So there are several angles to scaling the models,
the state of the art models we receive from the research team.
The first angle is we open source them
to earn developer trust, with very generous
commercial licenses so that developers can use them
for their own use cases.
The second is we also allow them to customize these models, fine
tuning these models with their own data. So a lot of different angles of how we
provide support in scaling these state-of-the-art models we get from
the research. And as one example, you know University of Wisconsin Health, you
know which Matt knows well, they took one of our models, which is highly versatile.
They customized it in Foundry, and they optimized it
to reliably identify abnormal chest x-rays,
the most common imaging procedure,
so they could improve their turnaround time and triage quickly.
And that's just one example, but we have other partners
like Sectra, who are doing more operational use cases,
automatically routing imaging to the radiologists, setting them up to be efficient.
And then Paige AI is doing biomarker identification for
diagnostics and new drug discovery.
So there's so many use cases that we have partners already who are building and
customizing.
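As a rough illustration of that chest X-ray triage pattern, a sketch like the following could score each study and reorder the reading worklist. The score_abnormality function is a hypothetical stand-in for a call to a customized model deployment.

```python
# Sketch of worklist triage: score each study with a fine-tuned classifier
# and move likely-abnormal studies to the top of the reading queue.
from dataclasses import dataclass


@dataclass
class Study:
    accession_id: str
    abnormality_score: float = 0.0


def score_abnormality(image_bytes: bytes) -> float:
    # Stand-in for a call to a customized chest X-ray model deployment.
    return 0.5


def triage(worklist: list[Study], images: dict[str, bytes]) -> list[Study]:
    for study in worklist:
        study.abnormality_score = score_abnormality(images[study.accession_id])
    # Highest suspicion first, so likely-abnormal studies get read sooner.
    return sorted(worklist, key=lambda s: s.abnormality_score, reverse=True)
```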
Yeah, the part that's striking to me is just that we could all sit in a room and think
about all the different ways someone might use these models on the catalog.
And I'm still shocked at the stuff that people use them for and how effective they are.
And I think part of that is, again, we talk a lot about generative AI in healthcare and
all the things it can do.
Again, in text, you refer to that earlier.
And certainly off the shelf, there's really powerful applications. But there is this tip of the iceberg effect
where under the water, most of the data
that we use to take care of our patients is not text.
It's all the different other modalities.
And I think that this has been an unlock,
taking these innovations from the community,
putting them in this ecosystem catalog, essentially, right?
And then allowing folks to build and develop applications with all these different types
of data.
Again, I've been surprised at what I'm seeing.
This has been just one of the most profound shifts that's happened in the last 12 months,
really.
Two years ago, we had general models in text that really shifted how we think about natural
language processing, got totally upended by that.
It turns out the same technology works for images as well.
It doesn't only allow you to automatically extract concepts from images, but allows you
to align those image concepts with text concepts, which means now you can have a conversation
with that image.
And once you're in that world, now you're in a place where you can start stitching together
these multimodal models that really change how you can interact with the data and how
you can start getting more information out of the raw primary data that is part of the
patient journey.
Well, and we're going to get to that because I think you just touched on something, and
I want to reemphasize, stitching these things together.
There's a lot of different ways to potentially do that, right?
There's ways that you can literally train the model end
to end with adapters and all kinds of other early fusion,
late fusion, all kinds of ways.
But one of the things is that the word of the year
is going to be agents.
And agent is a very interesting term
to think about how you might abstract away
some of the components or the tasks
that you want the model to accomplish
in the midst of a real human-to-model interaction.
Can you talk a little bit more about how we're thinking about
agents in this platform approach?
Well, this is our newest addition to the Azure AI Foundry.
So there's an agent catalog now where we have a set of
pre-configured agents for health care.
And then we also have a multi-agent orchestrator that
can jumpstart the process of developers building
their own multi-agent workflows to tackle
some complex real-world tasks that clinicians have to deal with.
And these agents basically combine a general reasoner
like a large language model like GPT-4o or an o-series
model, with a specialized model, like a model that understands
radiology or pathology, with domain-specific knowledge
and tools.
So the knowledge might be public guidelines or medical journals
or your own private data from your EHR or medical imaging
system, and then tools like
Code Interpreter to deal with all of the numeric data,
or tools that clinicians are using today,
like PowerPoint, Word, Teams, etc.
So, we're allowing developers to build and
customize each of these agents in Foundry,
and then deploy them into their workloads.
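To make that composition concrete, a healthcare agent might be declared roughly like this. The field names and values are illustrative assumptions, not the actual Foundry agent schema.

```python
# Sketch of one agent's configuration: a general reasoner, a specialist
# model, domain knowledge sources, and the tools it is allowed to call.
radiology_agent = {
    "name": "radiology",
    "reasoner": "gpt-4o",                    # general-purpose LLM
    "specialist_model": "cxr-report-model",  # hypothetical imaging model
    "knowledge": [
        "public-clinical-guidelines",
        "ehr-connector",                     # private data via your own connector
    ],
    "tools": [
        "code_interpreter",                  # numeric and lab data
        "word_export",                       # illustrative document-writing tool
    ],
}
```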
I really like that concept because as a,
from the user persona, I think about myself as a user,
how am I gonna interact with these agents?
Where does it naturally fit?
And I sort of, I've seen some of the demonstrations
and some of the work that's going on with Stanford
in particular, showing that literally in a Teams chat, I can have my clinician colleagues
and I can have specialized healthcare agents that kind of interact like I'm interacting
with a human on a chat.
It is a completely mind blowing thing for me and it's a light bulb moment for me too.
And I wonder what have we heard from folks that have tried out this healthcare agent
orchestrator in
this kind of deployment environment via Teams?
Well someone joked, you know, are you sure you're not using Teams because you work at
Microsoft?
But then we actually were meeting with one of the radiologists at one of our partners
and they said that that morning they had just done a Teams meeting where they had met with
other specialists to talk about a patient's cancer case
where they were coming up with a treatment plan.
That was the light bulb moment for us,
we realized actually Teams is already being used by
physicians as an internal communication tool,
as a tool to get work done.
Especially since the pandemic,
a lot of the meetings moved to virtual and telemedicine.
And so it's a great distribution channel for AI,
which has often been a struggle for AI
to actually get in the hands of clinicians.
And so now we're allowing developers to build
and then deploy very easily and extend it
into their own workflows.
I think that's such an important point.
If you think about one of the really important concepts in computer science is
an application programming interface, like some set of rules that allow two applications
to talk to each other.
One of the big pushes, really important pushes in medicine has been standards that allow
us to actually have data standards and APIs that allow these to talk to each other.
And yet still, we end up with these silos.
There are silos of data, there are silos of applications.
And just like when you and I work on our phone,
we have to go back and forth between applications.
One of the things that I think agents do
is it takes the idea that now you
can use language to understand intent
and effectively program an interface.
And it creates a whole new abstraction layer
that allows us to simplify the interaction between
not just humans and the endpoint but also for developers.
It allows us to have this abstraction layer that
lets different developers focus on
different types of models and yet stitch them all together in
a very natural way not just for the users,
but for the ability to actually deploy those models.
Just to add to what Jonathan was mentioning,
the other cool thing about the Microsoft Teams user interface is it's also enterprise ready.
And one important thing that we're thinking about is exactly this. From the
very early research through incubation and then to scale obviously, right? And so
early on in research we are actively working with our partners and our collaborators
to make sure that we have the right data, privacy and consents in place.
We're doing this in incubation as well, and then obviously in scale.
So I think AI has always been thought of as a savior kind of technology.
We talked a little bit about how there's been some ups and downs in terms of the ability
for technology to be effective in health care. At the same time, we're seeing a lot of new
innovations that are really making a difference. But then we kind of get, you know, we talked
about agents a little bit. It feels like we're maybe abstracting too far. And, you know, it's,
it may be if things are going too fast, almost. What makes this different? I mean, in your mind,
is this a truly a logical next step, or is it going to take some time?
I think there's a couple of things that have happened.
I think first, on just the pure technology, what led to ChatGPT?
And I like to think of really three major breakthroughs.
The first was new mathematical concepts of attention, which really means that we now
have a way that a machine can figure out which parts of the context it should actually focus
on just the way our brains do.
I mean, if you're a clinician and somebody's talking to you, the majority of that conversation
is not relevant for the diagnosis, but you know how to zoom in on the parts that matter.
That's a super powerful mathematical concept.
The second one is this idea of self-supervision.
So I think one of the fundamental problems of machine learning has been that you have
to train on labeled training data.
And labels are expensive, which means data sets are small, which means the final models
are very narrow and brittle.
And the idea of self-supervision is that you can just get a model to automatically learn
concepts.
And in language, that's just predicting the next word.
And what's important about that is that leads to models that can actually manipulate and
understand really messy text and pull out what's important about that and
then stitch that back together in interesting ways.
And the third concept that came out of those first two is just the observation of scale.
And that's the more is better, more data, more compute, bigger models.
And that really leads to a reason to keep investing and for these models to keep getting
better.
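A toy sketch of that self-supervision idea: the labels come for free from the raw text itself, since each next word is the target.

```python
# Self-supervision in miniature: every position in raw text yields a
# (context, next-word) training pair with no human labeling required.
text = "patient presents with chest pain radiating to the left arm"
tokens = text.split()

# Each training example is (context so far, next word).
examples = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

for context, target in examples[:3]:
    print(f"context={context!r} -> predict {target!r}")
```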
So that is the groundwork.
That's what led to ChatGPT.
That's what led to our ability now to not just have rule-based systems or simple machine
learning-based systems to take a messy EHR record, say, and pull out a couple concepts,
but to really feed the whole thing in and say, OK, I need you to figure out which concepts
are in here and is this particular attribute there, for example.
That's now led to the next breakthrough, which is that all those core ideas apply to images as well.
They apply to proteins, to DNA.
And so we're starting to see models that understand images and the concepts of images and can
actually map those back to text as well.
So you can look at a pathology image and say, not just that's a cell, but it appears that
there's some sort of cancer in this particular tissue there.
And then you take those two things together and you layer on the fact that now you have
a model or a set of models that can understand intent, can understand human concepts and
biomedical concepts, and you can start stitching them together into specialized agents that
can actually reason with each other, which at some level gives you an API as a developer
to say, okay, I need to focus on a pathology model and get this really, really sound while
somebody else is focusing on a radiology model that now allows us to stitch these all together with
a user interface that we can now talk to through natural language.
I'd like to double click a little bit on that medical abstraction piece that you mentioned.
Just the amount of data, clinical data that there is for each individual patient.
Let's think about cancer patients for a second
to make this real, right?
For every cancer patient, it could take a couple of hours
to structure their information.
Why is that important?
Because you have to get that information
in a structured way and abstract relevant information
to be able to unlock precision health applications, right, for each patient.
So to be able to match them to a trial, someone has to sit there and go through all the clinical
notes from their entire patient care journey from the beginning to the end.
And that's not scalable.
And so one thing that we've been doing in an active project that we've been working
on with a handful of our partners, but Providence specifically, I'll call out,
is using AI to actually abstract and curate
that information so that gives time back
to the healthcare provider to spend with patients
instead of spending all their time
curating this information.
And this is super important
because it sets the scene and the backbone for all those precision health
applications.
Like I mentioned, clinical trial matching.
Tumor boards are another really important example here.
And maybe, Matt, you can talk to that a little bit.
It's a great example.
And it's so funny.
We've talked about this use case.
And the healthcare agent orchestrator's
initial lighthouse use case
was a tumor board setting.
And I remember we first started working with some of the partners on this, I think
we were under a research kind of lens, thinking about what new diagnoses
could it come up with or what new insights it might have.
And what was a really key moment for us, I think, was noticing that we had developed
an agent that can take all of the multimodal data about
a patient's chart, organize it in a timeline in chronological fashion, and then allow folks
to click on different parts of the timeline to ground it back to the note.
And just that, which doesn't sound like a really interesting research paper, was
mind-blowing for clinicians who, again, as you said, spend a great deal of time, often
outside of their typical work hours, trying to organize these patient records in order
to go present at a tumor board.
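A minimal sketch of that timeline behavior, with an assumed event structure: sort extracted events chronologically and keep a pointer back to the source note so every item can be grounded.

```python
# Sketch of a patient timeline: chronological order plus a reference back
# to the originating note for grounding. The event fields are illustrative.
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    when: date
    summary: str
    source_note_id: str  # lets the UI link back to the original document


def build_timeline(events: list[Event]) -> list[Event]:
    return sorted(events, key=lambda e: e.when)


timeline = build_timeline([
    Event(date(2024, 3, 2), "CT chest: 2.1 cm RUL nodule", "note-184"),
    Event(date(2024, 2, 11), "Presented with persistent cough", "note-172"),
])
for e in timeline:
    print(e.when, e.summary, f"({e.source_note_id})")
```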
And a tumor board is a critical meeting that happens at many cancer centers where specialists
all get together, come with their perspective, and make a comment on what would be the best
next step in treatment.
But the background in preparing for that is, you know, again, organizing the data. But to your
point also, what are the clinical trials that are active? There are thousands of clinical trials.
There's hundreds every day added. How can anyone keep up with that? And these are the kinds of
use cases that start to bubble up. And you realize that a technology that understands concepts, context,
and can reason over vast amounts of data with a language interface, that is a powerful tool.
Even before we get to some of the, you know, unlocking new insights and even precision medicine,
this is that idea of saving time before lives to me. And there's an enormous amount of
undifferentiated heavy lifting that happens in healthcare that these agents and these kinds of workflows can start to unlock.
And we've packaged these agents. That manual abstraction work that takes hours,
now we have an agent for it.
It's in Foundry along with the clinical trial matching agent, which I think at Providence
you showed could double the match rate over the baseline that they were using by
using the AI from multiple data sources.
So, we have that and then we have this orchestration
that is using this really neat technology from
Microsoft Research, Semantic Kernel and Magentic-One.
These are technologies that are good at
figuring out which agent to use for a given task.
So, a clinician who's used to working with
other specialists like a radiologist,
a pathologist, a surgeon,
they can now also consult these specialist agents
who are experts in their domain.
There's shared memory across the agents,
there's turn-taking, there's negotiation between the agents.
So, there's this really interesting system that's emerging.
Again, this is all possible to be used through Teams.
There's some great extensibility as well.
We've been talking about that and working on some cool tools.
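A simplified sketch of that orchestration pattern, not the actual Semantic Kernel or Magentic-One API: a router picks which specialist agent takes the next turn, and the agents read and write a shared memory.

```python
# Toy orchestrator: route each task to the agent that can handle it,
# with simple turn-taking and a shared memory of what has been said.
class Agent:
    def __init__(self, name, can_handle):
        self.name = name
        self.can_handle = can_handle            # predicate over a task string

    def act(self, task, memory):
        result = f"{self.name} handled: {task}"  # real agents call models/tools
        memory.append((self.name, result))
        return result


def orchestrate(tasks, agents, memory):
    for task in tasks:                           # simple turn-taking
        agent = next(a for a in agents if a.can_handle(task))
        agent.act(task, memory)
    return memory


memory: list[tuple[str, str]] = []
agents = [
    Agent("radiology", lambda t: "imaging" in t),
    Agent("trial-matching", lambda t: "trial" in t),
]
orchestrate(["summarize imaging history", "find matching trials"], agents, memory)
print(memory)
```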
Yeah. No, if I have to geek out a little bit on how all these agentic orchestrations are coming up:
I've been in software engineering for decades, and this is the next version of distributed systems,
where you have these services that talk to each other.
It's a more natural way, because instead of structured API ways of conversing,
LLMs give us agents that can naturally understand how to talk to each other.
So this is like the next evolution of our systems.
And the way we are packaging all of this is in multiple ways,
based on all the standards and innovations that are happening in this space.
So first of all, we are building these agents
that are very good at specific tasks,
like Will was saying, like trial matching agent
or patient timeline agents.
So we take all of these and then we package it
in a workflow or an orchestration.
We use these standards, some of these coming from research,
the Semantic Kernel, the Magentic-One.
And then all of these also allow us to extend these agents
with custom agents that can be plugged in.
So we are open sourcing the entire agent orchestration
in AI Foundry templates so that developers
can extend with their own agents and make their own workflows
out of it.
So a lot of cool innovation is happening to apply this technology to specific scenarios and workflows.
Well, I was going to ask you about that extension. So folks can go say, hey, I have maybe a really
specific part of my workflow that I want to use some of these agents for, maybe one
of the agents that can do PubMed literature search, for example. But then
there's also agents that come in from the outside.
So like I can imagine a software company or AI company
that has built an agent that plugs in as well.
Yeah, yeah, absolutely.
So you can bring your own agent,
and then we have these standard ways
of communicating with agents
and integrating with our orchestration framework, so you can bring your own agent and extend this
healthcare agent orchestrator to your own needs.
I can just think of, like, in a
group chat like a bunch of different specialist agents and I really would
want an orchestrator to help find the right tool to your point earlier because
I'm guessing this ecosystem is gonna expand quickly, and I may not know which tool is best for which question.
I just want to ask the question.
Yeah.
Yeah.
Well, I think to that point, too, I mean, you said an important point here, which is
tools, and these are not necessarily just AI tools, right?
I mean, we've known this for a while, right?
LLMs are not very good at math, but you can have it use a calculator, and then it works
very well.
And you guys both brought up the universal medical abstraction a couple times.
And one of the things that I find so powerful about that is we've long had this vision within
the precision health community that we should be able to have a learning hospital system.
We should be able to actually learn from the actual real clinical experiences that are
happening every day so that we can stop practicing medicine based off averages.
There's a lot of work that's gone on for the last 20 years about how to actually do causal
inference, say.
That's not an AI question.
That's a statistical question.
The bottleneck, the reason why we haven't been able to do that is because most of that
information is locked up in unstructured text.
And these other tools need essentially a table.
And so now you can decompose this problem and say, well, what if I can use AI not to get to the causal answer,
but to just structure the information so now I can put it
into the causal inference tool.
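A sketch of that decomposition, with extract_fields as a hypothetical stand-in for the LLM abstraction step: the model only builds the table, and existing statistical tooling does the rest.

```python
# Use a model only to turn unstructured notes into rows of a table,
# then hand that table to standard analysis tooling.
import pandas as pd


def extract_fields(note_text: str) -> dict:
    # Stand-in for an LLM-based abstraction call returning structured fields.
    return {"stage": "II", "ecog": 1, "biomarker_her2": "negative"}


notes = ["...clinical note 1...", "...clinical note 2..."]
rows = [extract_fields(n) for n in notes]
cohort = pd.DataFrame(rows)
# `cohort` is now a plain table that existing causal-inference or
# trial-matching pipelines can consume.
print(cohort)
```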
And these sorts of patterns, I think, again, become very,
not just powerful for a programmer,
but they start pulling together different specialties.
And I think we'll really see an acceleration,
really of collaboration across disciplines because of this. So when I joined Microsoft Research 18 years ago, I was doing work in computational biology
and I would always have to answer the question, why is Microsoft in biomedicine?
And I would always kind of joke saying, well, we sell Office and Windows to every healthcare system in the world.
We're already in this space.
And it really struck me to now see that we've actually come full circle and now you can actually connect
in Teams, Word, PowerPoint, which are these tools
that everybody uses every day, but they're actually
now specializable through these agents.
Can you guys talk a little bit about what that looks like
from a developer perspective?
How can provider groups actually start playing with this
and see this come to life?
A lot of healthcare organizations already use
Microsoft productivity tools as you mentioned.
So as the developers build these agents and use
our healthcare orchestrations to plug in
these agents and expose these in these productivity tools,
they will get access to all these healthcare workers.
So the healthcare agent orchestrator we have today
integrates with Microsoft Teams,
and it showcases an example of how you can
@-mention these agents and talk to them
like you were talking to another person in a Teams chat.
And then it also provides examples of these agents
and how they can use these productivity tools.
One of the examples we have there is how they can summarize
the assessments of this whole chat into a Word doc
or even convert that into a PowerPoint presentation
for later on.
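A minimal sketch of that last step, assuming the assessment text has already been collected from the agent chat, and using the python-docx package to write it into a Word document.

```python
# Write the assessment gathered from the agent conversation into a Word doc.
from docx import Document

assessment = "Tumor board summary: ..."  # collected from the agent chat

doc = Document()
doc.add_heading("Tumor board preparation", level=1)
doc.add_paragraph(assessment)
doc.save("tumor_board_summary.docx")
```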
One of the things that has struck me
is how easy it is to do.
I mean, Will, I don't know if you've worked with folks
that have gone from zero to 60.
Like, how fast?
What does that look like?
Yeah, it's funny.
For us, the technology to transfer all this context
into a Word document or PowerPoint presentation
for a doctor to take to a meeting
is relatively straightforward compared to the complicated
clinical trial matching multimodal processing.
The feedback has been tremendous in terms of,
wow, that saves so much time to have
this organized report that I can then bring to a meeting,
and the agents can come with me to that meeting,
because they're literally having a Teams meeting
often with other human specialists,
and the agents can be there and answer questions,
and fact check, and source all the right information on the fly.
So there's a nice integration into these existing tools.
We've worked with several different centers just to kind of
understand where this might be useful.
And like I think we talked about before,
the ideas that we've come up with, again,
this is a great one because it's complex, it's kind of hairy.
There's a lot of things happening under the hood
that don't necessarily require a medical license to do,
to prepare for tumor board and to organize data.
But it's fascinating, actually.
So folks have come up with ideas of,
could I have an agent that can operate an MRI machine?
And I can ask the agent to change some parameters
or redo a protocol.
We thought that was a pretty powerful use case.
We've had others that have just said,
I really want to have a specific agent that's
able to act like deep research does for the consumer side,
but based on the context of my patient
so that it can search all the literature
and pull the data and the papers that
are relevant to this case.
And the list goes on and on, from operations all the way
to clinical decision making at some level.
And I think that the research community that's
going to sprout around this will help us, guide us, I guess,
to see what is the most high impact use cases, where
is this effective, and maybe where it's not effective.
But to me, the part that makes me so, I guess, excited about this is just that I don't have
to think about, OK, well, then we have to figure out health IT.
Because we always have great ideas and research.
And it always feels like there's such a huge chasm to get it in front of the health care
workers that might want to test this out.
And it feels like, again, this productivity tool use case, again, with the enterprise
security, the possibility for bringing in third parties to contribute really does feel
like it's a new surface area for innovation.
Yeah, I love that. Let me end by putting you all on the spot. So in three years, multimodal
agents will do what? Now I'll start with you.
I am convinced that it's going to save a massive amount of time
before it saves many lives.
I'll focus on the patient care journey and diagnostic journey.
I think it will kind of transform
that process for the patient itself
and shorten that process.
I think we've seen already papers recently showing that different modalities surface
complementary information.
And so we'll see kind of this AI and these agents becoming an essential companion to
the physician, surfacing insights that would have been overlooked otherwise.
As similar to what you guys were saying, the agents will become important assistants to
healthcare workers, reducing a lot of documentation and workflow access work they have to do.
I love that.
I guess for my part, I think really what we're going to see is a massive unleash of creativity.
We've had a lot of folks that have been innovating in this space, but they haven't had a way to actually get it
into the hands of early adopters.
And I think we're going to see that really lead to an explosion
of creativity across the ecosystem.
So where do we get started?
Like, where are the developers who are listening to this,
the folks that are at labs, research labs,
and developing health care solutions,
where do they go to get started with the Foundry,
the models we've talked about, the healthcare agent orchestrator?
So, ai.azure.com is the AI Foundry.
It's a website you can go to as a developer,
you can sign in with your Azure subscription,
get your Azure account,
your own VM, all that stuff.
And you have an agent catalog,
the model catalog, you can start from there.
There is documentation and templates that you can then
deploy into Teams or other applications.
And tutorials are coming, right?
We have recordings of tutorials,
we'll have hackathons, some sessions,
and then more to come.
Yeah, we're really excited.
Thank you so much, guys, for joining us.
Yeah, it was a great conversation. Thanks for having us.
Thank you. Thanks, everyone.