Microsoft Research Podcast - The AI Revolution in Medicine, Revisited: Real-world healthcare AI development and deployment—at scale

Starting point is 00:00:00 It's hard to convey the huge complexity of today's healthcare system. Processes and procedures, rules and regulations, and financial benefits and risks all interact, evolve, and grow into a giant edifice of paperwork that is well beyond the capability of any one human being to master. This is where the assistance of an AI like GPT-4 can be not only useful, but crucial. This is the AI Revolution in Medicine Revisited. I'm your host, Peter Lee. Shortly after OpenAI's GPT-4 was publicly released, Carey Goldberg, Dr. Zak Kohane, and I published The AI Revolution in Medicine to help educate the world of healthcare and medical research about the transformative impact this new generative AI technology could have.

Starting point is 00:01:03 But because we wrote the book when GPT-4 was still a secret, we had to speculate. Now, two years later, what did we get right and what did we get wrong? In this series, we'll talk to clinicians, patients, hospital administrators, and others to understand the reality of AI in the field and where we go from here. The passage I read at the top there is from chapter 7 of the book, The Ultimate Paperwork

Starting point is 00:01:32 Shredder. Paperwork plays a particularly important role in healthcare. It helps convey treatment information that supports patient care, and it's also used to help demonstrate that providers are meeting regulatory responsibilities, among other things. But if we're being honest, it's taxing for everyone, and it's a big contributor to the burnout our clinicians are experiencing today. Carey, Zak, and I identified this specific pain point as one of the best early avenues to pursue as far as putting generative AI to good work in the healthcare space.

Starting point is 00:02:06 In this episode, I'm excited to welcome Dr. Matt Lungren and Seth Hain to talk about matching technological advancements in AI to clinical challenges such as the paperwork crisis, to deliver solutions in the clinic and in the health system back office. Matt is the Chief Scientific Officer for Microsoft Health and Life Sciences, where he focuses on translating cutting-edge technology, including generative AI and cloud services, into innovative healthcare applications. He's a clinical interventional radiologist and a clinical machine learning researcher doing collaborative research and teaching as an adjunct professor at Stanford University. His scientific work has led to more than 200 publications, including work on new

Starting point is 00:02:48 computer vision and natural language processing approaches for health care. Seth is senior vice president of research and development at Epic, a leading healthcare software company specializing in electronic health record systems, also known as EHR, as well as other solutions for connecting clinicians and patients. During his 19 years at Epic, Seth has worked on enhancing the core analytics and other technologies in Epic's platforms, as well as their applications across medicine, bringing together his graduate training in mathematics and his dedication to better health.

Starting point is 00:03:25 I've had the pleasure of working closely with both Matt and Seth. Matt, as a colleague here at Microsoft, really focused on our health and life sciences business. Seth, as a collaborator at Epic, as we embark on the questions of how to integrate and deploy generative AI into clinical applications at scale. Here's my conversation with Dr. Matt Lungren.

Starting point is 00:03:58 Matt, welcome. It's just great to have you here. Thanks so much, Peter. Appreciate being here. So I'd like to just start just talking about you. I had mentioned your role as the chief scientific officer for Microsoft Health and Life Sciences. Of course, that's just a title.

Starting point is 00:04:16 So what the heck is that? What is your job exactly? And what does a typical day at work look like for you? So really, what you could boil my work down to is essentially cross-collaboration, right? We have a very large company, lots of innovation happening all over the place, lots of partners that we work with.

Starting point is 00:04:33 And then obviously this sort of a healthcare mission. And so what innovations, what kind of advancements are happening that can actually solve clinical problems, right? And sort of kind of direct that. And we can go into some examples later, but then the other direction too is important, right? So identifying problems that may benefit

Starting point is 00:04:53 from a technologic application or solution and kind of translating that over into the pockets of innovation saying, hey, if you kind of tweaked it this way, this is something that would really help the clinical world. And so that it's really a bidirectional role. So my day-to-day is, every day is a little different to be honest with you.

Starting point is 00:05:09 Some days it's very much in the science and learning about new techniques. On the other side though, it can be very much in the clinic, right? So what are the pain points that we're seeing? Where are the gaps in the solutions that we've already rolled out? And again, what can we do to make health care better broadly?

Starting point is 00:05:26 So I think of you as a technologist, and Matt, you and I actually are colleagues working together here at Microsoft. But you also do spend time in the clinic still as well. Is that right? Initially, it was very much a non-negotiable for me in taking an industry role. I think like a lot of, you know,

Starting point is 00:05:45 physicians, you know, we're torn with the idea like, hey, I spent 20 years training. I love what I do, you know, with a lot of caveats there in terms of some of the administrative burden and some of the hassle sometimes. But for the most part, I love what I do. And there's no greater feeling than using something that you trained years to do and actually see the impact on a human life. It's unbelievable, right? So I think part of me was just like, I didn't want to let that part of my identity go. And frankly, as I often say, to this day, I walk by a fax machine in our office today, like in 2025. So just to be extra clear, it really grounds me in like, yes, I love the possibilities. I love thinking about what

Starting point is 00:06:24 we can do. But also I have a very stark understanding of the reality on the ground, both in terms of the technology, but also the burnout, right? The challenges that we're facing and taking care of patients has gotten, you know, much, much more difficult in the last few years. And, um, you know, I like to think it keeps my perspective. Yeah. You know, I think some listeners to this podcast might be surprised that we have doctors on

Starting point is 00:06:52 staff in technical roles at Microsoft. How do you explain that to people? Yeah, no. Yeah, it is interesting. I would say that, you know, from the legacy Nuance world, it wasn't so far-fetched that you have physicians that were power users and eventually sort of became, hey, listen, I think this is a strategic direction, you should take it or whatever. And certainly, maybe in the last, I don't want to say five years or so, I've seen more and more physicians

Starting point is 00:07:21 who have taken the time, sometimes on their own, to learn some of the AI capabilities, learn some of the principles and concepts, and frankly some are even coding solutions and leading companies. So I do think that that has shifted a bit in terms of like, hey, doctor, this is your lane and over here, here's a technical person. And I think that's fused quite a bit more. But yeah, it is an unusual thing, I think, in sort of how

Starting point is 00:07:48 we've constructed what at least my group does. But again, I can't see any other way around some of the challenges. I think, you know, an anecdote I'd like to tell you. When I was running the AIMI Center, we were bringing the medical school together with the computer science department, right, at Stanford. And I remember one day a student, very smart, came into my office on a clinical day or something. And he's like, is there just like a book or something

Starting point is 00:08:15 where I can just learn medicine? Because I feel like there's a lot of translation you have to do for me. It really raised an important insight, which is that you can learn the medicine, so to speak, but go to med school, take the test and all that. But you don't really understand the practice of medicine until you are doing that. And in fact, I even push it a step further to say, after training those first two or three years of you are the responsible person, you can turn around and there's no one there. Like you are making a decision. Getting used to that and then having a healthy respect for that

Starting point is 00:08:53 actually I think provides the most educational value of anything in healthcare. You know, I think what you're saying is so important because as I reflect on my own journey, of course, I'm a computer scientist. I don't have medical training, although at this point, I feel confident that I could pass a Step 1 medical exam. I have no doubt. But I think that the tech industry, because of people like you, has progressed tremendously in having a more sophisticated

Starting point is 00:09:28 nuanced understanding of what actually goes on in clinic and also what goes on in the boardrooms of healthcare delivery organizations. And, of course, at the end of the day I think that's really been your role. So roughly speaking, your job as an executive at a big tech company has been to understand what the technology platforms need to be, particularly with respect to machine learning, AI, and cloud computing, to best support healthcare. And so maybe let's start pre-GPT-4, pre-ChatGPT. And tell us a little bit, you know, about maybe some of

Starting point is 00:10:14 your proudest moments in getting advanced technologies like AI into the clinic. You know, when I first started, so remember, like, you go all the way back to about 2013, right? So when I, my first faculty job and, you know, we're building a clinical program and I, you know, I had a lot of interest in public health and building large data sets for pop health, etc. But I was doing a lot of that, you know, sort of labeling to get those insights manually,

Starting point is 00:10:40 right? So like, I was the, I was the person that you'd probably look at now and say, what are you doing, right? So but I had a complete random encounter with Andrew Ng, who I didn't know at the time, at Stanford. And I, you know, went to one of the seminars that he was holding at the Gates building. And, you know, they were talking about their performance on ImageNet, you know, cat and dog and, you know, tree, bush, whatever.

Starting point is 00:11:05 And I remember sitting in kind of the back and, you know, I think I may have my scrubs on at the time and just kind of like what, like, why, like this, we could use this in healthcare, you know. But for me, it was a big moment and I was like, this is, this is huge, right? This is a, and you remember the deep learning really kind of started to show its stuff with, with, you know, Fei-Fei Li's ImageNet and stuff. So, so anyway, we started the collaboration and that actually became a nidus. And one of the first things we worked on, we just said, listen, one of the most

Starting point is 00:11:30 common medical image examinations in the world is the chest x-ray, right? Two or three billion are done every year in the world. And so is that not a great place to start? And of course, we had a very democratizing kind of mission. As you know, Andrew has done a lot of work in that space and I had the similar ambitions. And so we really started to focus on bringing the sort of clinical and the CS together and see what could be done. So we did CheXNet. And remember, this is around the time when like Geoffrey Hinton was saying things like we should stop training radiologists and all this stuff was going on.

Starting point is 00:12:05 So it was a lot of hype. And this is the narrow AI days, just to remind the audience. How did you feel about that since you are a radiologist? Well, it was so funny. So Andrew is obviously very prolific on social media. And I was, who am I? So I remember he tagged me.

Starting point is 00:12:19 Well, first he said, Matt, you need to get a Twitter account. I said, okay. And he tagged me on the very first post of our, what we called CheXNet. That was like kind of like the hello world for this work. And I remember it was a clinical day. I had set my phone, as you do, outside the O.R., I go in, do my procedure, you know, hour or so, come back, my phone's dead.

Starting point is 00:12:36 I'm like, oh, that's weird. Like I had a decent charge. So I plug it in, I turn on, I had like hundreds of thousands of notifications because Andrew had tweeted out to his millions or whatever about CheXNet. And so then of course, as you point out, I go to RSNA that year, which is our large radiology conference, and that Geoffrey Hinton quote had come out. And everyone's looking at me like, what are you doing, Matt? You know, like, are you coming after us?

Starting point is 00:13:00 Especially I'm like, no, no, that's, you know, it's a way to interpret it, but you have to take a much longer horizon view, right? Well, um, you know, we're going to, uh, just as an enticement for listeners to this podcast, listen to the very end, I'm going to pin you down toward the end on your assessment of whether Geoffrey Hinton will eventually be proven right or not, but, uh, let's take our time to get there. Now let's go ahead and enter the generative AI era.

Starting point is 00:13:29 When we were first exposed to what we now know of as GPT-4, this was before it was disclosed to the world, a small number of people at Microsoft and Microsoft Research were given access in order to do some technical assessment. And Matt, you and I were involved very early on in trying to assess what might this technology mean for medicine.

Starting point is 00:13:51 Tell us, what was the first encounter with this new technology like for you? It was the weirdest thing, Peter. Like I joined that summer. So the summer before the, you know, the actual GPT came out, I had literally no idea what I was getting into. So I started asking questions, you know, kind of general stuff, right? Just, you know, I was like, oh, all right, it's pretty good. And so then I would sort of go a little deeper. And eventually I got to the

Starting point is 00:14:16 point where I'm asking questions that, you know, maybe there's three papers on it in my community. And remember, I'm a sub-subspecialist, right? Pediatric interventional radiology. And the things that we do in vascular malformations and rare cancers are really, really strange and not very commonly known. And I kind of walked away from that. First, I said, can I have this thing? But then I, you know, I don't want to sound dramatic, but I didn't sleep that well, if I'm being honest, for the first few nights, partially because I couldn't tell anybody except for the few that I knew were involved. And partially because I just couldn't wrap my head around how we went from what I was doing in LSTMs, right, which was state of the art-ish at the time for NLP. And all of a sudden, I have this thing that has broad, domain-expert representations

Starting point is 00:15:07 of knowledge that there's no way you could think of it would be in distribution for a normal approach to this. And so I really struggled with it, honestly. Interpersonally, I would be like, well, let's not work on that. They're like, why not? We were just excited about it last week. I'm like, I don't know. I think that we could think of another approach later.

Starting point is 00:15:26 And so yeah, when we were finally able to really look at some of the capabilities and really think clearly, it was really clear that we had a massive opportunity on our hands to impact health care in a way that was never possible before. Yeah, and at that time, you were still a part of Nuance. Nuance I think was in process of being acquired by Microsoft, is that right? That's right. And so of course this was also technology that would have profound and

Starting point is 00:15:57 very direct implications for Nuance. How did you think about that? Nuance, for those in the audience who don't know, for 25 years was sort of the medical speech-to-text thing that all physicians used. But really the brass ring had always been, and I want to say going back to 2013, 2014, Nuance had tried to figure out, OK, we see this pain point. Doctors are typing on their computers while they're trying to talk to their patients. We should be able to figure out a way to get that ambient conversation turned into text

Starting point is 00:16:29 that then accelerates the doctor, takes all the important information. That's a really hard problem, right? You're having a conversation with a patient about their knee pain, but you're also talking about their cousin's wedding and their next vacation and their dog is sick or whatever. And all that gets recorded, right? And so then you have to have the intelligence slash context to be able to tease out what's important for a note. And then it has to be at the performance level that a physician who, again, 20 years of training and education plus a huge, huge amount of need to know, need to get through his cases efficiently.

Starting point is 00:17:05 That's a really difficult problem. And so for a long time, there was a human in the loop aspect to doing this, because you needed a human to say, this transcript's great, but here's actually what needs to go on the note. And that can't scale, as you know. When the GPT-4, you know, model kind of, you know, showed what it was capable of, I think it was an immediate light bulb, because there's no, you can ask any physician in your life, anyone in the audience, you know,

Starting point is 00:17:30 what is your biggest pain point when you go to see your doctor? Like, they don't talk to me. Don't look me in the eye. They're rushing around trying to finish a note. If we could get that off their plate, that's a huge unlock, Peter. And I think that, again, as you know,

Starting point is 00:17:42 it's now led to so much more, but that was kind of the initial, I think, reaction. And so maybe that gets us into our next set of questions, our next topic, which is about the book and all the predictions we made in the book. Because Carey, Zak, and I, actually we did make a prediction that this technology would have a huge impact

Starting point is 00:18:04 on this problem of clinical note taking. And so you're just right in the middle of that. You're directly hands-on creating, I think, what is probably the most popular early product for doing exactly that. So were we right? Were we wrong? What else do we need to understand about this? No, you were right. I think in the book, I think you called it like a paper shredder or something. I think you used a term like that.

Starting point is 00:18:36 That's exactly where the activity is right now and the opportunity. I've even taken that so far as to say that when folks are asking about what the technology is capable of doing, we say, well, listen, it's gonna save time before it saves lives. It'll do both. But right now it's about saving time. It's about peeling back the layers of the onion that if you put me in where I started medicine in 2003,

Starting point is 00:19:01 and then fast forward and showed me a day in the life of 2025, I would be shocked at what I was doing that wasn't related to patient care. So all of those layers that have been stacked up over the years, we can start finding ways to peel that back. And I think that's exactly what we're seeing. And to your point, I think you mentioned this too, which is, well, sure, we can do this transcript and we can turn a note, but then we can do other things, right?

Starting point is 00:19:24 We can summarize that in the patient's language or education level of choice. We can pend orders. We can eventually get to a place of decision support. So, hey, did you think about this diagnosis, doctor? Like those kinds of things. And all those things I think you highlighted beautifully. And again, it sounds like with a lot of, right, just guesswork and prediction, but those things are actually happening every single day right now.

Starting point is 00:19:50 Well, so now in this episode, we're really trying to understand where the technology industry is in delivering these kinds of things. From your perspective, in the business that you're helping to run here at Microsoft, what are the things that are actually shipping as product versus things that clinicians are doing, let's say, off label, just by using, say, ChatGPT on their personal mobile devices? And then what things aren't happening?

Starting point is 00:20:27 Yeah, I'll start with the shipping part. Cause I think, again, you know my background, right? Academic clinician did a lot of research, hadn't had a ton of product experience. In other words, like, again, I'm happy to show you what benchmarks we beat or a new technique or get a grant to do all this, or even frankly, talk about startups, but to actually have an audience that is accustomed to a certain level of performance for the

Starting point is 00:20:48 solutions that they use, to be able to deliver something new at that same level of expectation, wow, that's a big deal. And again, this is part of the learning by being around this environment that we have, which is we have this incredibly focused, very experienced clinical product team, right? And then I think on the other side, to your point about the general purpose aspect of this, it's no secret now, right? That this is a useful technology in a lot of different medical applications.

Starting point is 00:21:18 And let's just say that there's a lot of knowledge that can be used, particularly by the physician community. And I think the most recent survey I saw was from the British Medical Journal, which said, hey, doctors, are you using it? Are you willing to tell us what you're doing? And it turns out that folks are, what, 30% or so said that they were using it regularly in clinic.

Starting point is 00:21:37 And again, this is the API or whatever off the shelf. And then frankly, when they ask what they're using it for, it tends to be things like, hey, differential, help me fill in my differential or suggest. And to me, I think what that created, at least, and you're starting to see this trend really accelerate in the US especially, is, well, listen, we can't have everybody pulling out their laptops and potentially exposing patient information by accident or something to a public API. We have to figure this out. And so brilliantly, I think NYU was one of the first now,

Starting point is 00:22:08 I think there's 30 plus institutions that said, listen, OK, we know this is useful to the entire community in the health care space. We know the administrators and nurses, and everybody thinks this is great. We can't allow this to be a very loosey-goosey approach to this, given this environment.

Starting point is 00:22:26 So what we'll do is we'll set up a HIPAA compliant instance to allow anyone in the health system to use the models. Then whatever, the newest model comes, it gets hosted as well. What's cool about that, and that's happened now a lot of places, is that at the high level, first of all, people get to use it and experiment and learn. But at the high level, they're actually seeing what are the common use cases because you could ask 15 people and you might get, you know, super long list and it may not help you

Starting point is 00:22:53 decide what to operationalize in your house. But let me ask you about that. When you observe that, are there times when you think, oh, some specific use cases that we're observing in that sort of organic way need to be taken into specialized applications and made into products? Or is it best to keep these things sort of open chat interface types of general purpose platform? Honestly, it's both. And that's exactly what we're seeing. I'm most familiar with Stanford, kind of the work that Nigam Shah leads on this, but he basically, you know,

Starting point is 00:23:31 there's a really great paper that is coming out in JAMA, but basically saying, here's what our workforce is using it for. Here are the things in the literature that would suggest what would be popular. And some of those line up, like helping with a clinical diagnosis or documentation, but some of them don't.

Starting point is 00:23:47 But for the most part, the stuff that flies at the top, those are opportunities to operationalize and productize, et cetera. And I think that's exactly what we're seeing. So let's get into some of the specific predictions. We've, I think, beaten note taking to death here, but there's other kinds of paperwork, like filling out

Starting point is 00:24:05 prior authorization request forms or referral letters, and after visit note or summary to give instructions to patients and so on. These were all things that we were making guesses in our book might be happening. What's the reality there? I've seen every single one of those. In fact, I've probably seen a dozen startups too, right, in doing exactly those things. We touched a little bit on translation into the actual clinic. That's actually another thing that I used to kind of under appreciate, which is that,

Starting point is 00:24:40 listen, you can have a computer scientist and a physician or a nurse or whatever, and you have the domain expertise and you think you're ready to build something. The health IT is another part of that Venn diagram that's so incredibly critical. And then exactly how are you gonna bring that into the system? That's a whole new, it's a whole new ball game. And so I do wanna do a call out

Starting point is 00:25:01 because the collaboration that we have with Epic is monumental. Because here you have the system of record that most physicians, at least in the US, use. And they're going to use an interface and they're going to have an understanding of, hey, we know these are pain points. And so I think there's some really, really cool new innovations that are coming out of the relationship that we have with Epic. And certainly the audience may be familiar with those that I think will start to knock off a lot of the things that you predicted in your book relatively soon.

Starting point is 00:25:30 I think most of the listeners to this podcast will know what Epic is. But for those that are unfamiliar with the health industry and especially the technology foundation, Epic is probably the largest provider of electronic health record systems. And of course, in collaboration with you and your team, they've been integrating generative AI quite a bit. Are there specific uses that Epic is making and deploying that get you particularly excited? First of all, the ambient note generation, by the way, is integrated into Epic now.

Starting point is 00:26:07 So it's not another screen, another thing for physicians. So that's a huge, huge unlock in terms of the translation. But then Epic themselves, so they have, I guess, on the last roadmap that they talked, more than 60. But the one that's kind of been used now is this inbox response. So again, maybe someone might be familiar with why is this such a big deal? Well, if you're a physician, you already have 20 patients to see that day, and you got all

Starting point is 00:26:33 those notes to do. And then Jevons paradox, right? So if you give me better access to my doctor, well, maybe I won't make an appointment. I'm just going to send him a note and this is kind of this inbox, right? So then at the end of my day, I got to get all my notes done. And then I got to go through all the inbox messages I've received from all of my patients and make sure that they're not like having chest pain and they're blowing it off or something. Now that's a lot of work and the cold start problem of like, okay, I have to respond to

Starting point is 00:27:00 them. So Epic has leveraged this system to say, let me just draft a note for you, understanding the context of what's going on with the patient, et cetera. And you can edit that and sign it, right? So you can accelerate some of those. So that's probably one I'm most excited about, but there's so many right now. Well, I think I need to let you actually state the name

Starting point is 00:27:20 of the clinical note-taking product that you're associated with. Would you like to do that? Sure. Yeah, it's called DAX Copilot. And for the record, it is the fastest growing copilot in the Microsoft ecosystem. We're very proud of that. Five hundred institutions are using it.

Starting point is 00:27:36 And millions of notes have already been created with it, and the feedback has been tremendous. So you sort of refer to this a little bit, you know, this idea of AI being a second set of eyes. So a doctor makes some decisions in diagnosis or kind of working out potential treatments or medication decisions. And in the book, you know, we surmise that, well, AI might not replace the doctor doing those things. It could, but might not. But AI could possibly reduce errors if doctors and nurses are making decisions by just looking at those decisions and just checking them out. Is that happening at all?

Starting point is 00:28:28 And what do you see the future there? Yeah, I would say that's kind of the jagged edge of innovation, right? Where sometimes the capability gets ahead of the ability to operationalize that. Part of that is just related to the systems. And the evidence has been interesting on this. So like, you know this,

Starting point is 00:28:44 a colleague Eric Horvitz has been doing a lot of work in sort of looking at physician, physician with GPT-4, let's say, and then GPT-4 alone for a whole variety of things. You know, we've been saying to the world for a long time, particularly in the narrow AI days, that AI plus human is better than either alone. We're not really seeing that bear out really that well yet in some of the research.

Starting point is 00:29:08 But it is a signal to me, and to the use case you're suggesting, which is that if we let this system in the right way kind of handle a lot of the safety net aspects of what we do, but then also potentially take on some of the things that maybe are not that challenging or at least somewhat simple. And of course, this is really an interesting use case in my world, in the vision world, which is that we know these models are multimodal, right?

Starting point is 00:29:36 They can process images and text. And what does that look like for pathologists or radiologists where we do have a certain percentage of the things we look at in a given day are normal, right, or as close to normal as you can imagine. So is there a way to do that? And then also, by the way, have a safety net. And so I think that this is an extremely active area right now. I don't think we've figured out exactly how to have the human and AI model interact in this space yet.

Starting point is 00:30:05 But I know that there's a lot of attempts at it right now. Yeah, I think this idea of a true co-pilot, a true collaborator, I think is still something that's coming. I think we've had a couple of decades of people being trained to think of computers as question answering machines. Ask a question, get an answer, provide a document, get a summary and so on. But the idea that something might actually be this second set of eyes just assisting you all

Starting point is 00:30:36 day continuously, I think is a new mode of interaction and we haven't quite figured that out. Now, in preparation for this podcast, Matt, you said that you actually used AI to assist you in getting ready. Would you like to share what you learned by doing that? Yeah, it's very funny. So you may have heard this term coined by Ethan Mollick called the secret cyborg, which is referring to the phenomenon of folks using GPT,

Starting point is 00:31:05 realizing it can actually help them a ton in all kinds of parts of their work, but not necessarily telling anybody that they're using it, right? And so in a similar secret cyborgish way, I was like, well, listen, I haven't read your book in like a year, right? I recommend it to everybody. And I just needed a refresher. So what I did was I took your book, I put it into GPT-4, okay, and asked it to sort of talk about the predictions that you made. And then I took that and put it in the stronger

Starting point is 00:31:31 reasoning model, in this case, the deep research that you may have just seen or heard of, right, in the audience, from OpenAI, and asked it to research all the current papers and, you know, blogs and whatever else, and tell me like what was right, what was wrong in terms of the predictions. So it actually had, it was an incredible thing. Like, it's six or seven pages. It probably would have taken me two weeks, frankly, to do this amount of work.

Starting point is 00:31:55 I'll be looking forward to reading that in the New England Journal of Medicine shortly. That's right. Yeah, before this podcast comes out, I'll submit it as an opinion piece. No. But yeah, I think on balance, incredibly insightful views. And I think part of that was, you know, your team that got together really had a lot of different angles on

Starting point is 00:32:14 this. But, you know, and I think the only area that was like, which I've observed as well, it's just, man, this can do a lot for education. We haven't seen, I don't think we're looking at this as a tutor. To your point, we're kind of looking at it as a transactional in and out. But as we've seen in all kinds of data, both in low- and middle-income countries and even at Harvard, using this as a tutor can really accelerate your knowledge in profound ways. And so that is probably one area where I think your prediction was maybe slightly even further ahead of the curve because I don't think folks have really grokked that opportunity yet.

Starting point is 00:32:47 Yeah, and for people who haven't read the book, the guess was that you might use this as a training aid if you're an aspiring doctor. For example, you can ask GPT-4 to pretend to be a patient that presents a certain way, and that you are the doctor that this patient has come to see. And so you have an interaction.

Starting point is 00:33:07 And then when you say end of encounter, you ask GPT-4 to assess how well you did. And we thought that this might be a great training aid. And to your point, it seems not to have materialized. There's some sparks. Yeah, with communication, end-of-life conversations that no physician loves to have, right? It's very, very hard to train someone in those. I've seen some work done, but you're right. It's not quite hit mainstream.

Starting point is 00:33:31 Yeah, on the subject of things that we missed, one thing that you've been very, very involved in in the last several months has been in shipping products that are multimodal. So that was something I think that we missed completely. What is the current state of affairs for multimodal healthcare AI, medical AI? Yeah, the way I like to explain, and first of all, no fault to you, but this is not an area that we were just so excited about the text use case, that I can't fault you.

Starting point is 00:34:05 But yeah, I mean, so if we look at healthcare, how we take care of patients today, as you know, the vast majority of the data in terms of just data itself is actually not in text. It's going to be in pathology and genomics and radiology, et cetera. And it seems like an opportunity here to watch this huge curve just go straight up in the general reasoning and frankly medical competency and capabilities of the models that are coming and continue to come, but then to see that it's not as proficient for medical specific imaging and video and other data types.

Starting point is 00:34:39 And that gap is kind of what I describe as the multimodal medical AI gap. We're probably in GPT-2 land, right, for this other modality type versus the, you know, we're now at o3, who knows where we're going to go. At least in our view, we can innovate in that space. How do we help bring those innovations to the broader community to close that gap and see some of these use cases really start to accelerate in the multimodal world? And I think we've taken a pretty good crack at that. A lot of that is credit to the innovative work. I mean, MSR was two or three years ahead of everyone else on a lot of this.

Starting point is 00:35:14 And so how do we package that up in a way that the community can actually access and use? And so we took a lot of what your group had done in let's just say radiology or pathology in particular, and say, okay, well, let's put this in an ecosystem of other models. Other groups can participate in this, but let's put it in a platform where maybe I'm really competent in radiology or pathology. How do I connect those things together? How do I bring the general reasoner knowledge into a multimodal use case? And I think that's what we've done pretty well so far. We have a lot of work to do still,

Starting point is 00:35:46 but this is very, very exciting. We're seeing just such a ton of interest in building with the tools that we put out there. Well, I think how rapidly that's advancing has been a surprise to me. So I think we're running short on time. So two last questions to wrap up this conversation. The first one is, as we think ahead on AI in medicine,

Starting point is 00:36:11 what do you think will be the biggest changes or make the biggest differences two years from now, five years from now, 10 years from now? This is really tough. Okay, I think the two-year timeframe, I think we will have some autonomous agent-based workflows for a lot of the, what I would call, undifferentiated heavy lifting in healthcare. This is happening in the pharmaceutical industry. Every aspect is sort of looking at their operations at a macro level.

Starting point is 00:36:44 Where are these big bureaucratic processes that largely involve text and where can we shrink those down and really kind of unlock a lot of our workforce to do things that might be more meaningful to the business. I think that's my safe one. Going five years out, you know, I have a really difficult time grappling

Starting point is 00:37:04 with this seemingly shrinking timeline to AGI that we hear from people who I would respect and certainly know more than me. And in that world, I think there's only been one paper that I've seen that has attempted to say, what does that mean in healthcare when we have this? And the fact is, I actually don't know. I wonder whether there'll still be a gap in some modalities.

Starting point is 00:37:28 Maybe there'll be the ability to do new science and all kinds of interesting things will come with that. But then if you go all the way to your 10-year, I do feel like we're going to have systems that are acting autonomously in a variety of capacities, if I'm being honest. And what I would like to see, if I have any influence on some of this,

Starting point is 00:37:45 is can we start to celebrate the closing of hospitals instead of opening them? Meaning that, can we actually start to address at a personal individual level, care? And maybe that's outside the home, maybe that's in a way that doesn't have to use so many resources and frankly, really be very reactive instead of proactive.

Starting point is 00:38:06 I really want to see that. That's been the vision of precision medicine for, geez, 20-plus years. And I feel like we're getting close to that being something we can really tackle. So we talked about Geoff Hinton and his famous prediction that we would soon not have human radiologists and of course, maybe he got the date wrong.

Starting point is 00:38:29 So let's reset the date to 2028. So Matt, do you think Geoff is right or wrong? Yeah, so I'm not gonna dodge the question, but let me just answer this a different way. We have a clear line of sight to go from images to draft reports. I think that is unmistakable and that's now in 2025. How it will be implemented and what the implications

Starting point is 00:38:55 of that will be, I think will be heavily dependent on the health system or the incentive structure for where it's deployed. So if I'm trying to take a step back, back to my global health days, man, that can't come fast enough. Because you have entire health systems, in fact, entire countries that have five

Starting point is 00:39:13 medical imaging experts for the whole country, but they still need this to take care of patients. Zooming in on today's crisis in the US, we have the burnout crisis just as much as the doctors who are seeing patients and writing notes. We can't keep up with the volume. In fact, we're not training folks fast enough. So there is a push pull, there may be a flip to your point of autonomous reads across some segments of what we do. By 2028, I think that's a reasonable expectation that we'll have some form of that. Yes.

Starting point is 00:39:45 I tend to agree. And, you know, I think things get reshaped, but, you know, it seems very likely that even far into the future, we'll have humans wanting to take care of other humans and be taken care of by humans. Matt, this has been a fantastic conversation and I feel it's always a personal privilege to have a chance to work with someone like you, so keep it up. Thank you so much, Peter. Thanks for having me. I'm always so impressed when I talk to Matt and I feel lucky that we get a chance to work together here at Microsoft.

Starting point is 00:40:27 One of the things that always strikes me whenever I talk to him is just how disruptive generative AI has been to a business like Nuance. Nuance has had clinical note-taking as part of their product portfolio for a long, long time. And so when generative AI comes along, it's not only an opportunity for them, but also a threat because in a sense, it opens up the possibility of almost anyone being able to make clinical note-taking capabilities into products. It's really interesting how Matt's product, DAX Copilot, which since the time that we had our conversation

Starting point is 00:41:06 has expanded into a full healthcare workflow product called Dragon Copilot, has really taken off in the marketplace. And how many new competing AI products have also hit the market, and all in just two years, because of generative AI. The other thing that I always think about is just how important it is for these kinds of systems to work together, and especially how they integrate into the electronic health record systems. This is something that Carey, Zak, and I didn't really realize fully when we wrote our book. But when you talk to both Matt and Seth,

Starting point is 00:41:43 of course, we see how important it is to have that integration. Finally, what a great example of yet another person who is both a surgeon and a tech geek. People sometimes think of healthcare as moving very slowly when it comes to new technology, but people like Matt are actually making it happen much more quickly than most people might expect. Well, anyway, as I mentioned, we also had a chance to talk to Seth Hain. And so here's my conversation with Seth. Seth, thank you so much for joining.

Starting point is 00:42:25 Now, Peter, it's such an exciting time to sit down and talk about this topic. So much has changed in the last two years. Thanks for inviting me. Yeah. In fact, I think in a way, both of our lives have been upended in many ways by the emergence of AI. The traditional listeners of the Microsoft Research Podcast, I think for the most part, aren't steeped in the healthcare industry.

Starting point is 00:42:50 And so maybe we can just start with two things. One is, what is Epic, really? And then two, what is your job? What does the Senior Vice President for R&D at Epic do every day? Yeah. Well, let's start with that first question. So what is Epic? Most people across the world experience Epic through something we call MyChart. They might use it to message their physician. They might use it to check the lab values after they've gotten a recent test. But it's an app on

Starting point is 00:43:22 their phone, right, for connecting in with their doctors and nurses and really making them part of the care team. But the software we create here at Epic goes beyond that. It's what runs in the clinic, what runs at the bedside in the back office to help facilitate those different pieces of care. From collecting vital information at the bedside, to helping place orders if you're coming in for an outpatient visit, maybe with a kiddo with an earache, and capturing that note and record

Starting point is 00:43:55 of what happened during that encounter, all the way through back-office encounters, back-office information for interacting with payers as an example. And so we provide a suite of software that health systems and increasingly a broader set of the healthcare ecosystem, like payers and specialty diagnostic groups, use to connect with that patient at the center around their care. And my job is to help our applications across the company take advantage of those latest pieces of technology

Starting point is 00:44:33 to help improve the efficiency of folks like clinicians in the exam room when you go in for a visit, we'll get into, I imagine some use cases, like ambient conversations, capturing that conversation in the exam room to help drive some of that documentation. But then providing that platform for those teams to build those and then strategize around what to create next to help both the physicians be efficient and also the health systems. But then ultimately continuing to use those tools to advance the science of medicine.

Starting point is 00:45:06 Right. You know, one thing that I explained to fellow technologists is that I think today health records are almost entirely digital. I think the last figures I saw is well over 99% of all health records are digital. But in the year 2001, fewer than 15% of health records were digital. They were literally in folders on paper in store rooms. And if you're old enough, you might even remember seeing those store rooms. So it's been quite a journey. Epic and Epic's competitors, I think Epic is really the most important company, have really moved the entire infrastructure

Starting point is 00:45:54 of record keeping and other communications in health care to a digital foundation. And I think one thing we'll get into, of course, one of the issues that has really become, I think a problem for doctors and nurses is clerical or paperwork, record-keeping burden. And for that reason, Epic and Epic systems end up being a real focus of attention.

Starting point is 00:46:24 And so we'll get into that in a bit here. And I think that hits, just to highlight it, on both sides. There is both the need to capture documentation. There's also the challenge in reviewing it. The average medical record these days is somewhere between the length of Fahrenheit 451 and To Kill a Mockingbird. So there's a fair amount of effort going in on that review side as well.

Starting point is 00:46:53 Yeah, indeed. So much to get into there. But I would like to talk about encounters with AI. So obviously, I think there are two eras here, before the emergence of ChatGPT and what we now call generative AI, and afterwards. And so let's take the former. Of course, you've been thinking about machine learning and health data probably for decades.

Starting point is 00:47:24 Do you have a memory of how you got into this? Why did you get an interest in data analytics and machine learning in the first place? Well, my background, as you noted, is in mathematics before I came to Epic. And the sort of patterns in what could emerge were always part of what drove that. Having done development and kind of always been

Starting point is 00:47:49 around computers all my life, it was a natural transition as I came here. And I started by really focusing on how do we scale systems for the very largest organizations, making sure they were highly available and also highly responsive. Time is critical in these contexts in regards to rapidly getting information to doctors and nurses. And then really, say in the 2010s, there started to be an emergence of

Starting point is 00:48:21 capabilities from a storage and compute perspective, where we could begin to build predictive analytics models. And these were models that were very focused, right? It predicted the likelihood somebody would show up for an appointment. It predicted the likelihood that somebody may fall during an inpatient stay, as an example. And I think a key learning during that time period

Starting point is 00:48:48 was thinking through the full workflow, what information was available at that point in time, right? At the moment somebody walks into the ED, you don't have a full picture to predict the likelihood that they may deteriorate during an inpatient encounter. In addition to what information was available, it was what can you do about it? And a key part of that was how do we help get the right people at the right point in

Starting point is 00:49:19 time at the bedside to make an assessment, right? It was a human in the loop type of workflow where for example, you would predict deterioration in advance and have a nurse come to the bedside or a physician come to the bedside to assess. And I think that combination of narrowly focused predictive models with an understanding that to have them make an impact,

Starting point is 00:49:44 you had to think through the full workflow of where a human would make a decision was a key piece. Obviously there is a positive human impact. And so for sure, part of the thought process for these kinds of capabilities comes from that, but Epic is also a business and you have to worry about, what are doctors and clinics and health care systems willing to buy?

Starting point is 00:50:14 And so how do you balance those two things, and do those two things ever come into conflict as you're imagining what kinds of new capabilities and features and products to create? Sort of two aspects, I think, really come to mind. First off, generally speaking, we see analytics and AI as a part of the application. So in that sense, it's not something we license separately.

Starting point is 00:50:43 We think that those insights and those pieces of data are part of what makes the application meaningful and impactful. At the scale that many of these health systems operate and the number of patients that they care for, as well as having tens of thousands of users in the system daily, one needs to think about the compute overhead that these things cause.

Starting point is 00:51:10 And so in that regard, there is always an ROI assessment that is taking place to some degree around what happens if this runs at full scale. And in a way, that really got accelerated as we went into the generative AI era. Right. Okay, so you mentioned generative AI. What was the first encounter and what was that experience for you? So in the winter of '22 and into 2023, I started experimenting alongside you with what we at that time called DV3 or DaVinci 3 and eventually became GPT-4. And immediately a few things became obvious.

Starting point is 00:52:08 The tool was highly general purpose. One was able to, in putting in a prompt, have it sort of convert into the framing and context of a particular clinical circumstance and reason around that context. But I think the other thing that started to come to bear in that context was there was a fair amount of latent knowledge inside of it that was very, very different than anything we'd seen before. And there's some examples from the Sparks of AGI paper from Microsoft Research,

Starting point is 00:52:47 where a series of objects end up getting stacked together in the optimal way to build height. And just given the list of objects, it seems to have an understanding of physical space that it intuited from the training processes we hadn't seen anywhere. So that was an entirely new capability that programmers now had access to. Well, in fact, I think that winter of 2022, and we'll get into this, one of your projects that you've been running

Starting point is 00:53:26 for quite a few years is something called Cosmos, which I find exceptionally interesting. And I was motivated to understand whether this type of technology could have an impact there. And so I had to receive permission from both OpenAI and Microsoft to provide you with early access. When I did first show this technology to you, you must have had an emotional response, either skepticism

Starting point is 00:53:56 or I can't imagine you just trusted me to the extent of believing everything I was telling you. Well, I think there's, there's always a question of what is it actually, right? It's often easy to create demos. It's often easy to show things in a narrow circumstance. And it takes getting your hands on it and really spending your 10,000 hours digging in and probing it in different ways to see just how general purpose it was. And so the skepticism was really around

Starting point is 00:54:39 how applicable can this be broadly? And I think the second question, and we're starting to see this play out now in some of the later models, was is this just a language thing? Is it narrowly only focused on that, or can we start to imagine other modalities really starting to factor into this?

Starting point is 00:55:03 How would it impact basic sciences, those sorts of things. On a personal note, I mean, I had, at that point, now they're now 14 and 12, two kids that I wondered, what did this mean for them? What is the right thing for them to be studying? And so I remember sleepless nights on that topic as well. OK, so now you get early access to this technology. You're able to do some experimentation.

Starting point is 00:55:34 I think one of the things that impressed me is just less than four months later at the major health tech industry conference, HIMSS, which also happened timing-wise to take place just after the public disclosure of GPT-4. Epic showed off some early prototype applications of generative AI. And so describe what those were and how did you choose what to try to do there? Yeah, and we were at that point, we actually had the very first started this development in very, very late December, January of 2023, was a problem that its origins

Starting point is 00:56:37 really were during the pandemic. So during the pandemic, we started to see patients increasingly messaging their providers, nurses, and clinicians through MyChart, that patient portal I mentioned with about 190 million folks on it. And as you can imagine, that was a great opportunity in the context of COVID to limit the amount of direct contact between providers and patients while still getting their questions answered.

Starting point is 00:57:07 But what we found as we came out of the pandemic was that folks preferred it regardless. And that messaging volume had stayed very, very high and was a time-consuming effort for folks. And so the first use case we came out with was a draft message in the context of the message from the patient and understanding of their medical history using that medical record that we talked about. And the nurse or physician using the tool had two options. They could either click

Starting point is 00:57:43 to start with that draft and edit it and then hit send, or they could go back to the old workflow and start with a blank text box and write it from their own memory as they preferred. And so that was that very first use case. There were many more that we had started from a development perspective. But yeah, we had that rolling out right in March of 2023 there with the first folks. So I know from our occasional discussions that some things worked very well. In fact, these are real products now for Epic. And it seems to be really a very, very popular feature now.

Starting point is 00:58:26 I know from talking to you that a lot of things have been harder. And so I'd like to dive into that. As a developer, tech developer, what's been easy, what's been hard, what's in your mind still is left to do in terms of the development of AI? Yeah.

Starting point is 00:58:48 The first thing that comes to mind sort of starting foundationally, and we hinted at this earlier in our conversation, was at that point in time, it was kind of, per message, rather compute intensive to run these. And so there were always trade-offs we were making in regards to how many pieces of information we would send into the model and how much would we request back out of it. The result of that was that while kind of theoretically or even from a research perspective,

Starting point is 00:59:22 we could achieve certain outcomes that were quite advanced. One had to think about where do you make those trade-offs from a scalability perspective as you wanted to roll that out to a lot of folks. So- Were you charging your customers more money for this feature?

Starting point is 00:59:38 Yeah, essentially the way that we handle that is there's compute that's required. As I mentioned, the feature is just part of our application. So it's just what they get with an upgrade. But that compute overhead is something that we needed to pass through to them. And so it was something particularly given both the staffing challenges, but also the margin pressures that health systems are feeling today, we wanted to be very cautious and careful about. Let's put that on the stack,

Starting point is 01:00:10 because I do want to get into, from the selling perspective, that challenge and how you perceive health systems as a customer making those trade-offs. But let's continue on the technical side here. Yeah, on the technical side, it was a consideration, right? We needed to be thoughtful about how we used them. But going up a layer in the stack, at that time,

Starting point is 01:00:33 there's a lot of conversation in the industry around something called RAG, or Retrieval Augmented Generation. And the idea was, could you pull the relevant bits, the relevant pieces of the chart into that prompt, that information you shared with the generative AI model, to be able to increase the usefulness of the draft that was being created. And that approach ended up proving and continues to be to some degree, although

Starting point is 01:01:07 the techniques have greatly improved, somewhat brittle. Right? You have a general purpose technology that is drafting the response, but in many ways you needed to, for a variety of pragmatic reasons, have a somewhat brittle capability in regards to what you pulled into that approach. It tended to be pretty static. And I think this becomes one of the things that looking forward as these models have gotten a lot more efficient,

Starting point is 01:01:44 we are and will continue to improve upon because as you get a richer and richer amount of information into the model, it does a better job of responding. I think the third thing, and I think this is gonna be something we're gonna continue to work through as an industry, was helping users understand and adapt to these circumstances.
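A rough illustration of the retrieval-augmented generation idea described above, written as a minimal Python sketch and not as Epic's implementation. It assumes the chart is a list of short text entries, ranks them by simple word overlap with the patient message (a stand-in for a real search index or embedding model), and keeps only what fits a character budget, mirroring the trade-off about how much information to send into the model. The function names, the sample chart entries, and the `max_chars` parameter are all invented for illustration.

```python
import re


def tokenize(text):
    """Lowercased word set; a crude stand-in for a real search index or embeddings."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(message, chart_entries, max_chars=200):
    """Rank chart entries by word overlap with the message and keep what fits the budget."""
    query = tokenize(message)
    scored = [(len(query & tokenize(entry)), entry) for entry in chart_entries]
    # Highest overlap first; the sort is stable, so ties keep their chart order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    selected, used = [], 0
    for score, entry in scored:
        if score == 0:
            break  # nothing else shares a word with the message
        if used + len(entry) > max_chars:
            continue  # skip entries that would exceed the prompt budget
        selected.append(entry)
        used += len(entry)
    return selected


def build_prompt(message, chart_entries, max_chars=200):
    """Assemble the text that would be sent to a generative model (no model call here)."""
    lines = ["Relevant chart excerpts:"]
    lines += [f"- {entry}" for entry in retrieve(message, chart_entries, max_chars)]
    lines += ["", "Patient message:", message, "", "Draft a reply for clinician review."]
    return "\n".join(lines)


# Invented sample data, for illustration only.
chart = [
    "2024-03-02 lab: potassium 4.1 mmol/L, within normal range",
    "2024-02-10 visit: started lisinopril 10 mg daily for blood pressure",
    "2023-11-20 imaging: left ankle x-ray, no fracture",
]
message = "Is my potassium lab result normal?"
print(build_prompt(message, chart))
```

The static, keyword-style selection here is exactly the kind of retrieval step that can be brittle: an entry that matters clinically but shares no words with the message never reaches the prompt.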

Starting point is 01:02:09 So many folks when they hear AI think, it will just magically do everything perfectly. And particularly early on with some of those challenges we're talking about, it doesn't. If it's helpful 85% of the time, that's great, but it's not going to be 100% of the time. And it's interesting as we started, we do something we call immersion, where we always make sure that developers are right there elbow to elbow with the users of the software. And one of the things that I realized through that experience with some of

Starting point is 01:02:46 the very early organizations like UCSD or University of Wisconsin here in Madison, was that even when I'm responding to an email or a physician is responding to one of these messages from a patient, depending on the patient and depending on the person, they respond differently. In that context, there's opportunity to continue to mimic that behavior as we go forward more deeply. And so you learn a lot about kind of human behavior as you're putting these use cases out into the world.

Starting point is 01:03:23 So, this increasing burden of electronic communications between doctors, nurses, and patients is centered in one part of Epic. I think that's called your in-basket application, if I understand correctly. But that also creates, I think, a reputational risk and challenge for Epic. Because as doctors feel overburdened by this, and they're feeling burnt out, and as we know, that's a big issue, then they point to, oh, you know, I'm just stuck in this Epic system.

Starting point is 01:04:05 And I think a lot of the dissatisfaction about the day-to-day working lives of doctors and nurses then focuses on Epic. And so to what extent do you see technologies like generative AI as a solution to that or contributing either positively or negatively to this? You know, earlier I made the comment that in December

Starting point is 01:04:33 as we started to explore this technology, we realized there was a class of problems that now might have solutions that never did before. And as we've started to dig into those, and we now have about 150 different use cases that are under development, many of which are live across, we've got about 350 health systems using them, one of the things we've started to find

Starting point is 01:05:02 is that physicians, nurses, and others start to react to saying it's helping them move forward with their job. And examples of this, obviously the draft of the in-basket message response is one, but using ambient voice recognition as a kind of new input into the software so that when a patient and a physician sit down in the exam room, the physician can start a recording

Starting point is 01:05:29 and that conversation then ends up getting translated or summarized if you will, including using medical jargon into the note in the framework that the physician would typically write. In another one of those circumstances where they then review it, don't need to type it out from scratch, for example, and can quickly move forward. I think looking forward, you brought up Cosmos earlier. It's a suite of applications, but at its core is a data set of about 300 million de-identified patients.

Starting point is 01:06:06 And so using generative AI, we built research tools on top of it. And I bring that up because it's a precursor of how that type of deep analytics can be put into context at the point of care. And that's what we see this technology more deeply enabling in the future. Yeah. When you are creating, so you said there are about 150 sort of integrations of generative AI going into different parts of Epic's software products. When you are doing those developments and then you're making a decision that something

Starting point is 01:06:42 is going to get deployed, one thing that people might worry about is, well, these AI systems hallucinate. They have biases. There are unclear accountabilities, you know, maybe patient expectations. You know, for example, if there's a note drafted by AI that's sent to a patient, does the patient have a right to know what was written by AI and what was written by the human doctor? So can we run through how you have thought about those things? I think one thing that is important context

Starting point is 01:07:14 to set here for folks, because, and I think it's often a point of confusion when I'm chatting with folks in public, is that their interaction with generative AI is typically through a chatbot, right? It's something like ChatGPT or Bing or one of these other products where they're essentially having a back and forth conversation. And that is a dramatically different experience than how we think it makes sense to embed into an enterprise set of applications.

Starting point is 01:07:50 So an example use case, maybe in the back office, there are folks that are coding encounters. So when a patient comes in, right, they have the conversation with the doctor, the doctor documents it, that encounter needs to be billed for, and those folks in the back office associate to that encounter a series of codes

Starting point is 01:08:14 that provide information about how that billing should occur. So one of the things we did from a workflow perspective was add a selector pane to the screen that uses generative AI to suggest a likely code. Now, this suggestion runs the risk of hallucination. So the question is, how do you build into the workflow additional checks that can help the user do that? And so in this context, we always include a citation back to the part of the medical record that justifies or supports that code.

Starting point is 01:08:54 So quickly on hover, the user can see, does this make sense before selecting it? And it's those types of workflow pieces that we think are critical to using this technology as an aid to helping people make decisions faster, right? It's similar to drafting documentation that we talked about earlier. And it's interesting because there's a series of patterns

Starting point is 01:09:22 and, going back to the AI Revolution book you folks wrote two years ago, some of these are really highlighted there, right? This idea of things like a universal translator is a common pattern that we ended up applying across the applications. And in my mind, translation, this may sound a little bit strange, but summarization is an example of translating a very long series of information in a medical record into the context that an ED physician might care about

Starting point is 01:10:01 where they have three or four minutes to quickly review that very long chart. And so in that perspective, and back to your earlier comment, we added the summary into the workflow, but always made sure that the full medical record was available to that user as well. So a lot of what we've done over the last couple of years

Starting point is 01:10:24 has been to create a series of repeatable techniques in regards to both how to build the backend use cases, where to pull the information, feed it into the generative AI models. But then I think more importantly, are the user experience design patterns to help mitigate those risks you talked about and to maintain consistency

Starting point is 01:10:45 across the integrated suite of applications of how those are deployed. You might remember from our book we had a whole chapter on reducing paperwork and I think that's been a lot of what we've been talking about. I want to get beyond that but before transitioning let's get some numbers. So you talked about messages drafted to patients to be sent to patients. So give a sense of the volume of what's happening right now. We are seeing across the 300 and I think it's 48 health systems that are now using generative AI. And to be clear, we have about 500 health systems we have the privilege of working with,

Starting point is 01:11:28 each with many, many hospitals. There are tens of thousands of physicians and nurses using this software. That includes drafting a million-plus, for example, notes a month at this point, as well as helping to generate in a similar ballpark that number of responses to patients. The thing I'm increasingly excited about is the broader set of use cases that we're seeing folks starting to deploy now. One of my favorites has been,

Starting point is 01:12:09 it's natural that as part of, for example, a radiology workflow and studying that image, the radiologist made note that it would be worth double checking, say in six to eight months, that the patient have this area scanned of their chest. Something looks a little bit fishy there, but there's not a definitive finding at that point. Part of that workflow is that the patient's

Starting point is 01:12:40 physician place an order for that in the future. And so we're using generative AI to note that back to the physician and with one click allow them to place that order helping that patient get better care. That's one example of dozens of use cases that are now live, both help improve the care patients are getting, but also help the workforce. So going back to the translation summarization example, a nurse at the end of their shift needs to write up a summary of that shift for the next nurse, for each patient that they care for. Well, they've been documenting information in the chart over those eight or 12 hours,

Starting point is 01:13:24 right? So we can use that information to quickly draft that end of shift note for the nurse. They can verify it with those citations we talked about and make any additions or edits that they need and then complete their end of day far more efficiently. Right. OK, so now let's get to Cosmos, which has been one of these projects that I think has been your baby for many years and has been something that has had a profound impact on my thinking about possibilities.

Starting point is 01:14:01 So first off, what is Cosmos? Well, just as an aside, I appreciate the thoughtful comments. There is a whole team of folks here that are really driving these projects forward. And a large part of that has been, as you brought up, both Cosmos is a foundational capability, but then beginning to integrate it into applications. And that's what those folks spend time on. Cosmos is this effort

Starting point is 01:14:32 across hundreds of health systems that we have the privilege of working with to build out a de-identified data set with today, and it climbs every day, but 300 million unique patient records in it. And one of the interesting things about that structure is that, for example, if I end up in a hospital in Seattle and have that encounter documented at a health system in Seattle, I still, a de-identified version of me,

Starting point is 01:15:08 still only shows up once in Cosmos, stitching together both my information from here in Madison, Wisconsin, where Epic is at, with that extra data from Seattle. The result is these 300 million unique longitudinal records that have a deep history associated with them. And just to be clear, a patient record

Starting point is 01:15:32 might have hundreds or even thousands of individual, I guess, what you would call clinical records or elements. That's exactly right. And it's the breadth of information from orders and allergies and blood pressures collected, for example, in an outpatient setting to cancer staging information that might have come through as part of an oncology visit. And the key, and it's coming from a variety of sources. We exchange information about 10 million times a day between different health systems. And that

Starting point is 01:16:12 full picture is available within Cosmos in that Cosmos? Well, the real ultimate aim is to put a deeply informed in-context perspective at the point of care. So as a patient, if I'm in the exam room, it's helpful for the physician and me to know what have similar patients like me experienced in this context. What was the result of that line of treatment, for example? Or as a doctor, if I'm looking and working through a relatively rare or strange case to me, I might be able to connect with this as an example workflow we built called Look-A-Likes with another physician who has seen similar patients or within the workflow see a list of likely diagnoses based on patients that have been in a similar context.

Starting point is 01:17:18 And so the design of Cosmos is to put those insights into the point of care in the context of the patient. To facilitate those steps there, the first phase was building out a set of research tooling. So we see dozens of papers a year being published by the health systems that we work with. Those that participate in Cosmos have access to it to do research on it. And so they use both a series of analytical

Starting point is 01:17:54 and data science tools to do that analysis and then publish research. So building up trust that way. The examples you gave are like with look-alikes, it's very easy, I think, for people outside of the healthcare world to imagine how that could be useful. So now why is GPT-4 or any generative AI relevant to this? Wow. So a couple of different pieces, right? Earlier we talked about, and I think this is the most important, how generative AI is

Starting point is 01:18:27 able to cast things into a specific context. And so in that way, we can use these tools to help both identify a cohort of patients similar to you when you're in the exam room, and then also help present that information back in a way that relates to other research and understandings from medical literature to understand what are those likely outcomes. I think more broadly, these tools and generative AI techniques in the transformer architecture envision a deeper understanding of sequences of events, sequences of words. And that starts to open up broader questions about what can really be understood about

Starting point is 01:19:20 patterns and sequences of events in a patient's journey, which if you didn't know the name Epic, just like a great long nation's journey is told through an epic story, is a patient's story. So that's where it came from. So we're running up against our time together. And I always like to end with the more provocative question. And so, for you, I wanted to raise a question that I think we had asked ourselves in the very earliest days that we were sharing DaVinci 3, what we now know of as GPT-4, with each other, which is, is there a world in the future, because of AI, where we don't need electronic health records anymore? Is there a world in the future without EHR?

Starting point is 01:20:11 I think it depends on how you define EHR. I see a world coming where we need to manage a hybrid workforce where there is a combination of humans and something folks are sometimes calling agents working in concert together to care for more and more of the country and of the world. And there is and will need to be a series of tools to help orchestrate that hybrid workforce. And I think things like EHRs will transform into helping that be operationally successful. But as a patient, I think there's a very different

Starting point is 01:20:59 opportunity that starts to be presented. And we've talked about kind of understanding things deeply in context, there's also a real acceleration happening in science right now. And the possibility of bringing that second and third order effects of generative AI to the point of care, be that through the real world evidence we were talking about with Cosmos,

Starting point is 01:21:26 or maybe personalized therapies that really are well matched to that individual. These generative AI techniques opened the door for that, as well as the full life cycle of managing that from a healthcare perspective all the way through monitoring after the fact. And so I think we'll still be recording people's stories. Their stories are relevant to them and they can help inform the bigger picture. But I think the real question is, how do you put those in a broader context? And these tools open the door for a lot more. Wow, that's really a great vision for the future. Seth, I always really learn so much talking to you and thank you so much for this great chat.

Starting point is 01:22:16 Thank you for inviting me. I see Seth as someone on the very leading frontier of bringing generative AI to the clinic and into the healthcare back office and at the full scale of our massive healthcare system. It's always impressive to me how thoughtful Seth has had to be about how to deploy generative AI into a clinical setting. And you know, one thing that sticks out, and he made such a point of this, is, you

Starting point is 01:22:45 know, generative AI in the clinical setting isn't just a chatbot. They've had to really think of other ways that will guarantee that the human stays in the loop. And that's, of course, exactly what Carey, Zak, and I had predicted in our book. In fact, we even had a full chapter of the book entitled Trust but Verify, which really spoke to the need in medicine to always have a human being directly involved in overseeing the process of healthcare delivery. One technical point that Carey, Zak, and I completely missed, on the other hand, in our book was the idea of something that Seth brought up called RAG, which is retrieval-augmented generation.

Starting point is 01:23:26 That's the idea of giving AI access to a database of information and allowing it to use that database as it constructs its answers. And we heard from Seth how fundamental RAG is to a lot of the use cases that Epic is deploying. And finally, I continue to find Seth's project called Cosmos to be a source of inspiration. And I've continued to urge every healthcare organization that has been collecting data to consider following a similar path. In our book, we spend a great deal of time focusing on the possibility that AI might be able to reduce or even eliminate a lot of the clerical drudgery that currently exists in the delivery of healthcare. We even had a chapter entitled The Paperwork Shredder, and we heard from both Matt and Seth that that has indeed been the early focus of their work. But we also saw in our book the possibility that AI could provide diagnoses, propose treatment options, be a second set of eyes to reduce

Starting point is 01:24:33 medical errors, and in the research lab be a research assistant. And here in Epic's Cosmos, we are seeing just the early glimpses that perhaps generative AI can actually provide new research possibilities in addition to assistance in clinical decision making and problem solving. On the other hand, that still seems to be for the most part in our future rather than something that's happening at any scale today. But looking ahead to the future, we can still see the potential of AI helping connect healthcare delivery experiences to the advancement of medical knowledge.

Starting point is 01:25:13 As Seth would say, the ability to connect bedside to the back office to the bench. That's a pretty wonderful future that will take a lot of work and tech breakthroughs to make it real. But the fact that we now have a credible chance of making that dream happen for real, I think that's pretty wonderful. I'd like to say thank you again to Matt and Seth for sharing their experiences and insights.

Starting point is 01:25:41 And to our listeners, thank you for joining us. We have some really great conversations planned for the coming episodes, including a look at how patients are using generative AI for their own healthcare, as well as an episode on the laws, norms, and ethics developing on AI and health and more. We hope you'll continue to tune in. Until next time.
