ACM ByteCast - Regina Barzilay - Episode 44

Episode Date: October 4, 2023

In this episode, part of a special collaboration between ACM ByteCast and the American Medical Informatics Association (AMIA)’s For Your Informatics podcast, hosts Sabrina Hsueh and Adela Grando welcome Regina Barzilay, a School of Engineering Distinguished Professor of AI & Health in the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology and the AI Faculty Lead at the MIT Jameel Clinic. She develops machine learning methods for drug discovery and clinical AI; earlier in her career she worked on natural language processing. Her research has been recognized with a MacArthur Fellowship, an NSF CAREER Award, and the AAAI Squirrel AI Award for Artificial Intelligence for the Benefit of Humanity. She is a member of the National Academy of Engineering and the American Academy of Arts and Sciences. Regina describes her career journey and how a personal experience with the healthcare system led her to work on an AI-based system for the early detection and prediction of breast cancer. She explains why entering the interdisciplinary field of clinical AI is so challenging and offers valuable advice on how to overcome some of these challenges. Regina also weighs in on new models for using AI, including the promise of ChatGPT in healthcare. Finally, she talks about inequity in medicine and offers actionable insights on how to mitigate these shortfalls while moving the field of clinical AI forward.

Transcript
Starting point is 00:00:00 This episode is part of a special collaboration between ACM ByteCast and AMIA's For Your Informatics podcast, a joint podcast series of the Association for Computing Machinery, the world's largest educational and scientific computing society, and the American Medical Informatics Association, the world's largest medical informatics community. In this new series, we talk to women leaders, researchers, practitioners, and innovators at the intersection of computing research and practice who apply AI to healthcare and life sciences. They share their experiences along their interdisciplinary career paths, the lessons they have learned for health equity, and their own visions for the future
Starting point is 00:00:45 of computing. Okay, hello, and welcome to the ACM-AMIA joint podcast series. This joint podcast series aims to explore the interdisciplinary field of medical informatics, where the practitioners of AI and real-world solution builders and the stakeholders in the healthcare ecosystem share common interests. I'm Dr. Sabrina Hsueh with the Association for Computing Machinery's ByteCast series. My co-host today is Dr. Adela Grando from the For Your Informatics podcast with the American Medical Informatics Association. In addition, we have the pleasure of speaking
Starting point is 00:01:25 with our new series guest, Dr. Regina Barzilay. Well, thank you so much for joining this podcast. And today we have a very special guest. We have Dr. Regina Barzilay, and she has a very impressive CV. She's a School of Engineering Distinguished Professor of AI and Health in the MIT Department of Electrical Engineering and Computer Science. She's also a member of the MIT Computer Science and Artificial Intelligence Laboratory. And in addition, she's the AI Faculty Lead for the Jameel Clinic, which is an MIT center for machine learning in health. Her research interests are in the application of deep learning to chemistry and oncology, and she has received many, many awards, including NSF, MIT, Microsoft, and UNESCO recognitions. So Dr. Barzilay,
Starting point is 00:02:19 thank you so much for joining us today. Thank you very much for having me. Really excited to be here. Hi. Yeah. So Dr. Barzilay, you're one of the nation's top AI leaders in computational linguistics, chemistry, and health AI. First you made breakthroughs for breast cancer and now for lung cancer. You seem to have a magic finger for making things happen in these interdisciplinary fields. Our audience here comes from both AMIA and ACM. As you know, they are scientists, clinicians, health IT practitioners, and students. I'm wondering if you can let our audience know what contributed to the magic that made you the successful interdisciplinary contributor you are today. Are there any inflection points in your journey that led you here that you feel are worthwhile sharing with our audience? So, first of all, thank you very much for the kind words.
Starting point is 00:03:15 And I wanted to say that, you know, I did my PhD, and I worked for maybe 15 or 16 years out of my PhD, when I was a faculty member at MIT, only on natural language processing. In fact, I really stuck to the subject and didn't even publish in any other application areas of natural language processing. And in 2014, I was diagnosed with breast cancer. And for really the first time in my life, I truly encountered the healthcare system, because, you know, I gave birth obviously in the hospital, but it's not like you really understand in detail how the system works. And here, when you are treated for a disease, you actually have a long-term encounter with the system, starting with the diagnostics and going through the different options for care and post-care surveillance. And at that point, even though 2014 was kind of before, you know,
Starting point is 00:04:13 everybody on the street knew what AI is, and it was just the beginning, it was really troubling to see how very little AI is in the healthcare system. In fact, there was none. And I was treated in one of the most prominent centers in the country, which is Massachusetts General Hospital. And, you know, they did a good job clinically, but it was really disheartening to see that despite the fact that a lot of the questions both physicians and patients have are by nature prediction problems, there was no technology to solve them. And, you know, going through it, I kind of felt that there was a unique mission for me, once I had recovered, to really try to go and change something,
Starting point is 00:04:59 at least in one area, which I understood reasonably well after going through my own treatment. And at that point, I started slowly transitioning into this area. What's also interesting is that I had looked at medical informatics maybe 20 years earlier, and I didn't find a lot of medical informatics particularly interesting. I was saying, why do you need to do something special for medical informatics if you can just take an NLP tool and apply it to medical informatics? And, you know, there were a lot of issues with the tools, so I didn't feel that we could actually deliver something that was going to, at
Starting point is 00:05:40 the time, dramatically change care. But what was really clear to me in 2015 was that AI had a lot, a lot of things to offer medical informatics. So the field of AI had matured enough that it could really make a big difference in patient care on one hand. And on the other hand, I understood the field better, and I understood that there are unique needs, and that the general tool that you develop for other things cannot be straightforwardly applied, because it interfaces with a patient at the end. And that has motivated a lot of my research to this day. And also, you know, it was really funny: when I started working on chemistry, I actually
Starting point is 00:06:27 didn't see, again, after my own treatment, you know, I was kind of in an exploratory state, and somebody asked me to join a grant on utilizing machine learning. It was not even for chemistry; it was for extracting from the chemical literature. I said, okay, fine, maybe. I didn't even see the connection to health. So at the time, what happened was, you know, I started doing this grant. I didn't see much connection to drug discovery, but MIT is surrounded by pharmaceutical companies. In fact, right now I can see Moderna from my window, and I can see Amgen. And at the time, I started talking to the people around me, and I realized, wow, this technique can really change drug discovery.
Starting point is 00:07:08 And then slowly I started doing more and more work on molecular modeling and drug discovery. So in some ways, you can say that this path was deliberate, because I deliberately decided to switch from NLP to clinical AI and drug discovery, but there was some randomness in the process. Yeah, and this process sometimes takes a long time to fulfill, right? Like you started that breast cancer journey a long time ago, in 2014.
Starting point is 00:07:38 But just lately, we have seen more approvals of AI-based digital pathology diagnostic tests for breast cancer. I think of that as already six to eight years until it comes to a full realization of the dream you had before. I was wondering, how did you feel about that process? And in that process, what keeps you working, and what is your fundamental goal, do you think? So in some ways, I'm very excited about the progress that we have made. But on the other hand, I feel a lot of frustration. The whole process of getting into clinical AI is very challenging for an outsider. It challenges you at every single step. Like, I developed tools with
Starting point is 00:08:28 my student Adam Yala, who is now a professor at Berkeley and UCSF, and other students in my group; we developed tools that can take an image of the breast, a mammogram, and predict the outcomes for the patient for the next five years. And the reason you can do it is that today, when we're diagnosing somebody, the tumor has to be large enough that the eye can actually see it. But as we all know from biology, cancer is not a disease that just happens from one day to the next. So there is a long process in the tissue that needs to happen before the tumor becomes cancerous. And what the machine can do, when you train it on a large number of images where you know the outcome for the patient over five years, is actually look at an image where
Starting point is 00:09:10 today the doctor would say this patient is fine, and it can actually say, no, this patient is likely to develop the disease within a very short time. And it's important, because the treatment of the disease is very different whether you discover it when it has already progressed or when it is in the earliest stages. So this way we can actually move medicine from treating when the symptoms are apparent to much, much earlier stages, where treatment is much easier and more likely to succeed. However, again, if you look at this whole process, how did we get in? It took me two years to get mammograms. To this day, in this country, there is not one
Starting point is 00:09:45 publicly available data set of mammograms such that, if you are a computer science researcher who wants to contribute, you can actually go and try it. It just doesn't exist. And if you're thinking about the amount of money that is in NCI and in NIH, it's, you know, mind-boggling. How could that be? And, you know, through the two years of this very, very challenging process, we finally got the data. Then, you know, the next thing that you need to do as an academic is go and get funding to support your students to do this work.
Starting point is 00:10:21 And again, at the very beginning of this process, I applied to NCI and to DOD, which has a special program for breast cancer, and we didn't get any funding. And when you read the reviews that we got at the time, people were asking, why do you use neural models and deep learning instead of using SVMs? So it really shows you that the reviewers were completely unaware of the state of the art. And at that point, I was so mad that I decided I'm not even going to submit to them again. There is no way. They don't even understand where the field is. You cannot add a tutorial together with your grant.
Starting point is 00:10:57 So, you know, I was very lucky, because multiple foundations supported this work, and we were actually able to complete it. But it shouldn't be the case that a computer scientist who is interested in contributing, either full-time or with part of their time, cannot bring their talents and their energy to this field because the barriers are so extremely, extremely high. And then, once we passed those barriers, we built the models and we managed to publish them in, you know, top clinical venues. And then you're looking at the process of implementation. Again, it's very, very challenging. And it's not only true for breast cancer, because if you think about the last time you went to the doctor, did you see any AI there? And the answer will be none. Maybe there is some AI that helps them schedule you
Starting point is 00:11:48 or do this kind of thing. And of course it is important not to underestimate that, but this is not something that will directly change patient outcomes. So the clinical system today, for various reasons, does not find a good way to really take this innovation and this novelty and bring it in to improve patient care. On one hand, again, as I said, I'm very excited, because we
Starting point is 00:12:15 developed this technology and there are already institutions that are using it. But on the other hand, the slow speed of adoption and the amount of non-research energy you have to put in to see it through are still very disappointing to me. Well, thank you so much for sharing your personal story. We could feel your passion, but also your frustration, in your answers to the first questions. And we want to continue talking about your journey. So we wonder, were you confronted with interdisciplinary challenges? You mentioned some of them already. What did you do to overcome them? So the big challenge, the big interdisciplinary challenge, is really a cultural challenge. And whenever you are, like, I'm talking about myself and my colleagues at MIT, you know,
Starting point is 00:13:07 you are kind of, that's how your scientific career works. You know, you're working in a field X, you're taught how to write papers in field X, how to obtain funding in that field, you know about the standards. And whenever you start to branch out into this new field, well, this field didn't even exist. We were among the first ones that started working on deep learning for chemistry. This field was pretty much built in the last seven, eight years.
Starting point is 00:13:30 Then a lot of questions arise. And as I already mentioned, funding is one of them, because the funding agencies, in places like NIH, specialize in reviewing more traditional statistics rather than deep learning algorithms,
Starting point is 00:13:45 because this, at the time at least, was not the reviewers' training. This is problem number one. Problem number two is when you start publishing papers in the journals. I remember the first time we started writing a paper, you know, they are even structured differently than computer science papers. So you just need to relearn. After 16 years of being a professor,
Starting point is 00:14:10 I was relearning how to write these papers so that they are acceptable to the audience. A very different style. And finally, after you've done all of that, you need to push it through the reviewing process, because unless you publish it, you know, it's not available to the scientific community.
Starting point is 00:14:27 And again, there is a lot of challenge for the reviewers in these clinical journals: how do they read and how do they separate the papers which really make a contribution from the papers which do not? And this is a challenging growth process for the whole community. So I think that my advice to somebody who is considering doing it is really to be patient and be aware that there will be a lot of failures. It's not going to be like your standard submission to ICML, where you kind of know how to make it work and there is a good likelihood that, if the paper is reasonable, it will be accepted on the first or second try. You
Starting point is 00:15:11 just really need to explore and be patient. And then there is, of course, another thing: how do you communicate with the medical professionals who need to adopt it? You know, these are people who are very busy; they have a lot of responsibility to make sure that whatever they bring to the hospital is very safe and compliant with the regulators. It's not like you're just going to give it to somebody and say, oh, try it. And I remember, in the beginning, I was extremely naive. It was like, oh, I have this great thing, why won't you take it? And it's like, nobody wants to take it. So you really need to learn and understand how you talk to these people and find the
Starting point is 00:15:41 right people who can actually help you to make the transition. And I'm still learning. I cannot say that I have a magic power to do it, but, you know, let's put it this way: now I am better than I was even a year ago. So, you know, it's an ongoing growth process. Yeah.
Starting point is 00:16:01 Patience and communication are definitely important, and also communicating the value of the impact with clinicians to see how we can have a common future together. Needless to say, we cannot leave the room today without talking about ChatGPT and generative AI, as that has certainly been billed as an iPhone moment, right? Did you feel so? Do you feel that it has brought more opportunities for communication and eased that burden of communication, with all this generative AI hype suggesting there is a chance to improve efficiency and situational awareness here? Yeah, I think it is extremely
Starting point is 00:16:47 exciting that, you know, we now have these tools that are broadly used, and it's truly amazing. And it continues to amaze me, because I remember when I had just started doing my master's in language processing and I took my first class, there was not even one example. It was in, like, '95 or '96. There was not even one example of a program doing anything, like nothing. It was a kind of theoretical class where you learn grammar and stuff. So seeing that, and then, you know, when I started at MIT, roughly around the time Google put out their first machine translation system, it was kind of funny, because people would try different things and see how it works.
Starting point is 00:17:25 People treated it as really a new technology, and it was amazing to see. And it's amazing to see that people now use it the same way they use, you know, electrical devices. You know, we don't doubt it. However, it's also interesting, because it didn't come in the same way as electrical devices. Somebody made that analogy.
Starting point is 00:17:46 So it works many times, but sometimes it goes boom and something is totally wrong. So I think that really understanding how it can be utilized in healthcare, which is, again, extremely regulated, in a way that is safe and secure and where we have control over the outcomes, is really important. And I think right now we have this wonderful, in some ways untamed beast that needs to go through some process of how we can, in a proper way, bring it into healthcare. But I think a lot of tasks, like, for instance, writing notes, which, you know, every doctor
Starting point is 00:18:22 detests and which takes so much time, and the problem is not only that it takes so much time; the problem is that many times, and there are a lot of discussions about this, especially in the context of health equity, the doctor summarizes whatever sticks in their mind and may not necessarily summarize the whole story, bringing their personal bias, like we all do, into the summary of the encounter. So bringing in technologies that can make it much more equal, of course with human input, I think is really going to be an important way to improve outcomes.
Starting point is 00:19:04 Yeah, I think in the Stanford AI Index they already found that the current model is like 20% more toxic than the state-of-the-art version we had just three years ago. So that says a lot about bringing in bias. Are there other limitations you think people should be aware of? Or, among all these pressing issues, which one do you feel most needs to be addressed? To be addressed in, can you be more specific about which area? This new trend of using language models for generative AI. So I think one of the central challenges is, again, to improve the models. And we've worked on it at MIT, and there are people who work on it, of course, at other universities.
Starting point is 00:19:48 How do you really quantify the uncertainty? Because we always have this misconception, and it relates to interpretable AI and other things, where we say, oh, humans, if they see it, they can decide what is true or not true, what is right and not right. But the problem is that our capacity, our knowledge, is limited. Our capacity to recognize patterns is limited. So it can fool you completely. And you know, when one of my students defended, as a joke he asked GPT to produce my publication list, and it put up
Starting point is 00:20:17 a list of publications, and they looked reasonable, just none of them were mine. And some of them had nothing to do with me; my name was just added. But you can see that the ability to hallucinate is really amazing. So if it were not me in the audience, there would be somebody else who would say, yeah, that is plausible. And we need to have machine learning tools
Starting point is 00:20:37 that can actually control other machine learning tools. We need to have stronger uncertainty estimation tools so that we can highlight to the human and say, here, the uncertainty of the model is really high, rather than thinking, oh, the human can check it. Because at the end, if the human needs to check every single sentence, the human might as well
Starting point is 00:20:56 write it themselves. So you really need to say, this is the part where it's not clear, or identify other tools for monitoring. Because again, if you think about any other device that we bring into our home, consumer devices, they all come with some safety mechanism, like, you know, buttons or whatever on your microwave or on anything; they will tell you, you know, when something is off. So we need to create for these models a mechanism to tell us when they are unsure, and make it part of this decision making, and stop relying so much on the human mind, because
Starting point is 00:21:32 it is limited. How far along do you think we are before we might see something with real potential here? I think that we are already seeing a lot, and I think there are a lot of devices and applications that are already being placed in the healthcare system. People are really exploring the question of how to utilize it for taking physician notes. I think it's an extremely promising application. I think, for instance, even answering questions, like for a patient who is preparing for a procedure or who has a question, even this type of information. Because patients today are doing searches on Google when they have questions, because they don't have access to a clinician 24-7, and having a machine help you can really
Starting point is 00:22:17 be important for that. Yeah. And also humanity, right? You mentioned in your previous interview with Lex Fridman that one of your favorite science fiction books is Flowers for Algernon, where we see that the augmented intelligence didn't necessarily lead the human to a happier life ever after. How do you see that we, in the middle of computer science and medicine, can have humility here?
Starting point is 00:22:41 I think it's a slightly different point. I think there are two different points. One is that there are a lot of tasks that we have today which are very mundane, which take a lot of effort, and we are relying on humans, and human resources are limited. Even if you are in the best healthcare system, unless you have, like, a private physician whom you can call 24-7, which 99.9% of us don't have, you really can benefit from getting information without being the one to find it. Everybody jokes about Dr. Google, and clinicians are very skeptical about it. But on the other hand, think about, you know, any patient
Starting point is 00:23:26 before you have a procedure: you want to get more information, you go to Google. What is the alternative? Of course, if there were a doctor whom you could call all the time, you would use that, but we don't have one. So I think that providing
Starting point is 00:23:39 better information services would really make a difference. Like it made a difference, you know, how our shopping experience got improved with Amazon, or how our booking of reservations did, or even, you know, your cell phone, which recognizes your face so you don't need to type the password. So there are millions of things that got significantly improved with this new technology. Now, going back to the question, what I meant when I mentioned Flowers for
Starting point is 00:24:07 Algernon: there is always the assumption that, you know, the smarter you are, the happier you are in life. And as a person, it's not about the services that are provided to you being smarter. I think the book made a very good point: there are many other aspects of humanity, not only your smartness, that contribute to your overall happiness. I see it as a faculty member at MIT. You know, I see it as a person. But I think that the places where we are bringing this AI, especially in areas like healthcare, are really there to help us do better at tasks that we are already doing, or where we feel it would be great to have automation.
Starting point is 00:24:49 Thank you for sharing. ACM ByteCast and AMIA's For Your Informatics podcast are available on Apple Podcasts, Google Podcasts, Spotify, Stitcher, and other services. If you're enjoying this episode, please subscribe and leave us a review on your favorite platform. Well, let's continue talking a little more about combining AI and medicine. You touched a lot on that already, but I wonder what would you recommend to female professionals, especially those who want to start working in this interdisciplinary field? Were there any career moves that you made in the beginning that you found helpful and would
Starting point is 00:25:30 like to share with female professionals? So one thing that I actually think is extremely important for female professionals: a lot of the topics that I selected to work on relate to diseases that affect women. And I'm not saying that we shouldn't study diseases that affect the male population, but we know there are lots and lots of studies that show that, due to years of inequities in where the funding was available and who was running the studies, a lot of diseases that affect women are actually not studied well at all. My colleague and friend Professor Linda Griffith from MIT, for instance, was affected all her life very severely by endometriosis.
Starting point is 00:26:12 And she started a program on endometriosis. Despite the fact that it affects arguably 15% of women, you know, causing severe pain and affecting fertility, this disease was not properly studied, the biology of it. And if you try to look at many areas of medicine, the answers are extremely unsatisfactory. So to me, a big driving factor of why I do something, and what gives me the power to go and listen to this rejection again and again, is that I really care about making change in a certain area. And I think that this is really important, because, you know, at the end, if you are not doing it, would you really
Starting point is 00:26:53 assume that somebody else is going to do it? So I think that having something that motivates you and helps you stand up against the expected rejection is really important. And also, what helped me, I think: I firmly believe that there are lots of people in the world who want to help you, even though you may not know who they are, especially when you're looking in from the outside and, you know, there is this whole mass of people at MGH or wherever. Part of it is really going out there and talking to people
Starting point is 00:27:25 and trying to find the ones with whom there will be a productive collaboration and relationship. So it's kind of a challenging balance. On one hand, you need to be very, very strict with your time, because you can't just be sitting and talking, and you need to find collaborators where it's really worth going; but it's worth exploring. So a lot of the research that I did really came from just starting a conversation
Starting point is 00:27:48 and learning and pursuing it further. I didn't have this skill very well developed when I was a traditional NLP researcher, because most of my research and communications were at MIT with my students and very close colleagues. But when we moved out of this realm, I think that really going out there, communicating, and finding the right supporters really helped. Yeah, and many of our audience are in the same situation as you were, switching from one field
Starting point is 00:28:17 to another interdisciplinary field and trying to pick up how to communicate better in that field. So in that process, did you find any useful resources you can recommend to those who are entering this interdisciplinary career now, either clinicians learning more about AI or AI scientists learning more about medicine? Is there any advice you have there? So I didn't really find any resources to read, as such. I think that since it's so interpersonal, in terms of connecting to people, you really need to try to find the group that will work for you and the type of collaborators that work for you.
Starting point is 00:29:00 You know, as computer scientists, we are in a lucky position. There are many more clinicians than there are computer scientists who are interested in this field. So there is a huge imbalance here. So I think just trying to connect to your local hospital and identifying whether they have researchers can be the first step. I never actually heard of anybody from MIT who said, I'm interested in field X, and they couldn't find a collaborator.
Starting point is 00:29:23 It typically happens in the other direction. There are a lot of clinicians from various disciplines who want to collaborate, but I have limited bandwidth, and unfortunately, to most of them, I say no unless I can find some colleagues. So my advice would be: just go ahead and try, and don't be afraid, because while you do indeed need to understand something about the disease, there are lots of applications and useful things you can do without, you know, going and taking a course in physiology.
Starting point is 00:29:50 So there are some cases where you need to understand more, but because the field is so open, you can find something where you are contributing and making a difference without, you know, becoming a specialist in some clinical area. But I have to tell you, and this may be my first time sharing it publicly, that there were some interactions that were very, very surprising. And, you know, at the end, as an MIT professor, or as faculty at any other institution, you kind of have the freedom to interact with people who have similar views, who are similar to you in many ways. So it's easy.
Starting point is 00:30:29 But, you know, when I was reaching out at the beginning of my career in this area, I had a variety of very colorful encounters, and some of them still stay in my mind. I just want to share one, so that when one of the listeners encounters whatever difficulty they will encounter, which is a normal part of the process, they can maybe remember the story, and it will tell them that, you know, there are other crazy stories that have happened. So I remember that at one point I went to a hospital, to MGH, and I was discussing with them, you know, where the servers with the data would be located.
Starting point is 00:31:11 And they decided they wanted to move them to some city nearby, which, you know, was fine. So I asked them, what's the problem? It's fine with me. And it was me, my female clinical collaborator, and, you know, another four men who deal with these matters. They knew that I'm a professor of electrical engineering and computer science. And what happened next was really interesting. The person told me, oh, the machines are going to be working slowly. I said, why? They said, because now electrons need to fly from that city, from Needham, to MGH.
Starting point is 00:31:40 And I was a bit confused. I said, what do you mean? And then he drew me a picture of a bath. I mean, imagine yourself in a bath. He drew me a picture of a bath and said, the reservoir of water can be located, like, in Needham, and then the water needs to come to your bath, so it takes time.
Starting point is 00:31:56 So I was in total shock. You know, it was so bizarre that I couldn't even comment, which saved my face, because I didn't, you know, engage in negative energy there. And I was surprised that I didn't say anything. But, you know, at the end, my thought was, if I'm going to upset this person, I'm not going to get the data, and I'm not going to get the service.
Starting point is 00:32:18 So it's better that I shut up and just keep the story in my head. I would never have such an encounter at MIT. But it was a funny story. And I think that whenever you're entering this interdisciplinary field, you will have your own set of stories. And you need to have good humor, to laugh at them and, you know, keep your eyes on the goal, which is to ensure that we can get the data and resources we need. Because it requires communicating and interacting with different types of people.
Starting point is 00:32:54 Well, you mentioned a little about health equity before. So, I mean, this is a hot topic. There has been a lot of discussion about the potential of AI to both drive health equity and also do harm. So, lots of examples of bias in AI, unfortunately. So we wanted to ask you, what is health equity to you? And what do you think is the most pressing issue now? I first want to say that the issue of equity has been there for decades. If you look even at very traditional statistical models that are currently used, that are part of FDA-approved protocols, they are biased. And there is a lot of documentation about it. Let me give you an example from breast cancer.
Starting point is 00:33:33 Take Tyrer-Cuzick, which predicts, based on your family history and other things, your risk of breast cancer. You can utilize the prediction of this model to recommend patients for MRI, to give them chemopreventive treatments, and other things. This model, according to the authors, was trained decades ago on white women in London. It is used across the whole United States. It is known that it doesn't work well on a variety of populations: it is close to random on the Asian population, and doesn't work very well on the African American population. So these things were there, and we now have a chance to kind of revisit them and be very open about it. Now, what is needed? We know, of course, that the bias doesn't come from the model per se; it's how you trained the model. And the key aspect is to
Starting point is 00:34:23 ensure that when we're training and testing these models, they are trained and tested on a sample that is representative of the population. Or, if you're deploying the model and you're applying it to somebody you've never seen for whatever reason, the model should say, you know, I don't know, do whatever you would do without me. And we're not quite there yet, both from the technical perspective and from the data perspective. If you look at the data set that we used to develop our newer model called Sybil, which predicts lung cancer risk very well, that model was trained on a data set called NLST, which comes from the National Lung Screening Trial, funded by NCI. That data set, which is an amazing data set, a really great resource, widely shared with the public, has almost no African American population, if I'm not mistaken, less than
Starting point is 00:35:13 5%. So whatever model you develop is not guaranteed to work well on a population that is not represented. We actually went out, and we have now tested it, for instance, on the Asian population in Taiwan, and it worked quite well. But there's an element of surprise; you really need to validate and test. But you can say that whenever these types of resources are released by federal bodies, we need to ensure that they are really representative of all of us, and not just a subset of the population. This is point number one. Point number two relates to FDA regulations and to deployment.
Starting point is 00:35:50 And I know that AMIA is very much involved in the process, but there is a big question: when the tool is already in production, how can we ensure that it doesn't make mistakes when it runs on a person who is different? And right now the implicit assumption of the regulation is that if I publicize for the tool where it was trained and how it was tested, the doctor can make this decision. But how can the doctor make the decision? Computer scientists who develop the model cannot make this decision. We cannot just look at the statistics and say, yeah, this patient is different, because in addition to the known things like gender, age, and race, there are many other ways one can be different. And we really need, in the regulation,
Starting point is 00:36:35 again, to remove human intuition as the main way to decide whether it is good or bad, and to have very strong statistical models that say, no, you cannot be using it on this patient. To summarize, I think there are more technologies that need to be developed to make it safe, but a lot also rests on collecting data which is representative of all of us. Yeah, so in the middle, do you feel there is anything that people like us,
Starting point is 00:37:01 in the middle of medicine and computer science, can do to help in this area, to move this field forward? I think that there are several things that we as computer scientists, or people who work in medical informatics, can do. The first one, as I said: there are a lot of technologies that are currently lacking. Like uncertainty estimation, which for a very long time in deep learning was more of an afterthought rather than a primary area. Because whenever you are, I don't know, trying to recognize your face on an iPhone, or doing many other things, even if something is translated incorrectly, it's not a big deal. Well, here we're talking about high-stakes applications where certainty is really important. So really producing more tools in this area would be, I think, influential for the field.
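To make the abstention idea discussed here concrete, below is a minimal sketch of selective prediction ("abstain when unsure"), assuming a generic classifier that outputs class probabilities. The 0.9 threshold, the coverage target, and the function names are illustrative assumptions, not from any system mentioned in the episode.

```python
# A minimal sketch of selective prediction: answer only when confident,
# otherwise defer the case to a human reviewer.
import numpy as np

def abstaining_predictions(probs: np.ndarray, threshold: float = 0.9):
    """Return predicted classes, abstaining (-1) when confidence is below threshold."""
    confidence = probs.max(axis=1)      # top-class probability as a crude confidence score
    preds = probs.argmax(axis=1)        # most likely class per case
    preds[confidence < threshold] = -1  # -1 marks "defer this case to a human"
    return preds, confidence

def pick_threshold_for_coverage(val_probs: np.ndarray, target_coverage: float = 0.8) -> float:
    """Choose a threshold so roughly `target_coverage` of validation cases are auto-answered."""
    confidence = val_probs.max(axis=1)
    return float(np.quantile(confidence, 1.0 - target_coverage))

# Example with made-up probabilities for three cases:
probs = np.array([[0.97, 0.03], [0.55, 0.45], [0.20, 0.80]])
preds, conf = abstaining_predictions(probs, threshold=0.9)
print(preds)  # prints 0, -1, -1: the two uncertain cases are routed to a human
```

In practice the raw confidence score would itself need to be calibrated or replaced with a stronger uncertainty estimate, which is exactly the gap described in this part of the conversation.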
Starting point is 00:37:49 There are separate, kind of unique needs for medical applications that I think are still unmet by the technical community. But I also think that we, as professionals who understand technology and are interested in applying it to this important area, need to be very active in the regulatory process, because we are trying to regulate technology that develops really fast. The FDA regulations that
Starting point is 00:38:14 come out are antiquated by the time the technology is published. So there is a huge lag. So we should really not be trying to regulate now something that is going to change in half a year. And, as you know, the regulatory process is very slow. How do we think about it? I think we as professionals really need to be part of those conversations from the very beginning. Yeah, I know the FDA is taking a very community-based approach now, calling for comments on their discussion papers on AI. Absolutely, absolutely. And actually, I'm writing one right now.
Starting point is 00:38:51 But I think that it's really important for our community. It's not a natural thing for us to do. But I think it's really important for our community, because, you know, none of us has the full picture. As technology people, we understand technology well, but we maybe don't really understand the regulatory landscape very well, or the clinical landscape. On the other hand, people who are strong in those areas may not understand,
Starting point is 00:39:16 you know, what technology can and cannot do. So I think that really bringing the communities together is really important. Yeah. What I've noticed also in these two years is that the attitude of technologists has changed a lot, from not wanting regulation to hoping to have regulation in place, right? So a lot more movement will certainly be seen in this area. Absolutely. And the problem is that we see very little, again, as I said, AI in healthcare. And the problem is that today, when an individual doctor gives a pill to the patient,
Starting point is 00:40:06 they know that it was the FDA's job to make sure that it's safe and secure, and they feel more comfortable moving to a new treatment. If you're not really sure what it will do, the natural reaction when you are responsible for patient safety is to say, thank you very much, let me use something else. So, in my understanding, regulatory science will really increase in significance and will also be essential to the translation.
Starting point is 00:40:39 Well, you touched a little on health equity, and I know you have done a lot of work on that, so we wanted to give you the opportunity to talk about it. What have you done to facilitate health equity in your work? When we started, even at our beginning, when we started working on breast cancer, health equity in AI was not such a big topic at the time,
Starting point is 00:40:58 but, you know, when we started doing what we always do in computer science, just running baselines, I couldn't believe that when you run these models, like Tyrer-Cuzick and other traditional models that are used today, on some populations they're close to random. And those are models that are used in clinical care.
Starting point is 00:41:13 And it became very clear to us that for any tools we develop, besides, you know, training and testing on the same population, which is your first step, we really have to do broad testing in other populations. So, for instance, in the Journal of Clinical Oncology, when we published the results of our risk assessment model for breast cancer, we tested the model in seven different hospitals. Some of them were hospitals which primarily treat an African American population, like Emory in Atlanta. Others were in Taiwan, which clearly focuses on an Asian population, in Israel, and in other places. And we really felt that before we put the model out, we needed to ensure that it works across all these different types of people. And in this case, we demonstrated it. Similarly, when we published Sybil, you know, it was trained and tested on the data from the trial, but then we tested it in Taiwan, and we tested it on a held-
Starting point is 00:42:15 out MGH population. We are now focusing on further expanding the reach of the testing. But in parallel, we started working on tools that really tell you whether the model is calibrated on a population, the trust and certainty that the model gives you, and how you teach the model to abstain. And also on all the sort of ad hoc decisions that we are making: if I give you the model now and you're in hospital X, how many samples do you need to collect to ensure with a certain probability that the model is calibrated well? And, you know, when we started this, it was, let's just do 10,000. And in many hospitals, 10,000 is a lot.
Starting point is 00:42:57 But maybe in some hospitals you need 20,000. Who knows? Because it really depends on how different that population is. So creating appropriate machine learning tools, so that instead of just making a decision from my head there is a really proper mathematical mechanism, is really important. And we are working on developing these types of tools.
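For readers who want to see what the simplest form of such a "proper mathematical mechanism" could look like, here is a minimal sketch that sizes a local calibration check with a standard Hoeffding bound and compares predicted versus observed risk per bin. The tolerances, the binning, and the function names are illustrative assumptions, not the tooling Barzilay's group describes.

```python
# A minimal sketch of turning "how many samples does hospital X need?" into a
# calculation rather than a guess: size the sample needed to estimate the event
# rate in each predicted-risk bin to within +/- epsilon at confidence 1 - delta.
import math
import numpy as np

def samples_needed(epsilon: float = 0.02, delta: float = 0.05) -> int:
    """Hoeffding bound: n >= ln(2/delta) / (2 * epsilon^2) labeled cases per bin."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))

def calibration_gaps(pred_risk: np.ndarray, outcomes: np.ndarray, n_bins: int = 5) -> dict:
    """Mean predicted risk minus observed event rate (and sample count) within each risk bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    gaps = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (pred_risk >= lo) & (pred_risk < hi)
        if mask.any():
            gap = float(pred_risk[mask].mean() - outcomes[mask].mean())
            gaps[(round(float(lo), 2), round(float(hi), 2))] = (gap, int(mask.sum()))
    return gaps

print(samples_needed(epsilon=0.02, delta=0.05))  # about 4612 labeled cases per bin
```

Tighter tolerances or rarer outcomes drive the required sample size up quickly, which is one reason the answer differs from hospital to hospital rather than being a fixed 10,000 or 20,000.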
Starting point is 00:43:17 Yes, we need to wrap up now, but before we break, we want to mention that at AMIA, we have started hosting a series of AI Evaluation Showcase events to bring out more health AI practitioners to discuss how to evaluate health AI. And at ACM KDD, for example, we are also hosting a series of events to hopefully bring those two groups together to discuss more, to encourage self-regulation so that industry standards will emerge.
Starting point is 00:43:48 Yeah, we're hoping to see more participation from your group as well. Great. And in the meantime, do you have any parting words for us here before we break? Thank you very much for having me. And I think that for many of us, you know, healthcare is something very personal, because there is no person who has never gone to the hospital and who has never been in a situation where they wished they had extra information or extra predictive tools. And I think it's a great area, even though it has, as we discussed, a lot of challenges.
Starting point is 00:44:25 It's very rewarding to see how the change happens. And I'm hoping that for some listeners who are considering expanding their interest into this field, this would encourage them to explore the opportunity. Thank you. And Adela, did you have any last questions? No, no more questions. That was awesome. I really enjoyed your talk. You brought together all these really hot topics that everyone is talking about, from the technical but also the clinical perspective. It was excellent. Thank you so much. Thank you. Thank you very much. Really enjoyed it. Thank you. Thank you for listening to today's episode.
Starting point is 00:45:07 ACM ByteCast is a production of the Association for Computing Machinery's Practitioner Board, and AMIA's For Your Informatics is a production of Women in AMIA. To learn more about ACM, visit acm.org. And to learn more about AMIA, visit amia.org. For more information about this and other episodes, please visit learning.acm.org slash b-y-t-e-c-a-s-t. And for AMIA's For Your Informatics podcast, visit the news tab on amia.org.
