Microsoft Research Podcast - AI Testing and Evaluation: Learnings from genome editing
Episode Date: June 30, 2025
In this episode, Alta Charo, emerita professor of law and bioethics at the University of Wisconsin–Madison, joins Sullivan for a conversation on the evolving landscape of genome editing and its regulatory implications. Drawing on decades of experience in biotechnology policy, Charo emphasizes the importance of distinguishing between hazards and risks and describes the field's approach to regulating applications of technology rather than the technology itself. The discussion also explores opportunities and challenges in biotech's multi-agency oversight model and the role of international coordination. Later, Daniel Kluttz, a partner general manager in Microsoft's Office of Responsible AI, joins Sullivan to discuss how insights from genome editing could inform more nuanced and robust governance frameworks for emerging technologies like AI.
Transcript
Welcome to AI Testing and Evaluation,
Learnings from Science and Industry.
I'm your host, Kathleen Sullivan.
As generative AI continues to advance,
Microsoft has gathered a range of experts from genome editing to
cybersecurity to share how
their fields approach evaluation and risk assessment.
Our goal is to learn from their successes and
their stumbles
to move the science and practice of AI testing forward. In this series, we'll explore how these
insights might help guide the future of AI development, deployment, and responsible use.
Today I'm excited to welcome R. Alta Charo, the Warren P. Knowles Professor Emerita of
Law and Bioethics at the University of Wisconsin–Madison, to explore testing and risk assessment in
genome editing.
Professor Charo has been at the forefront of biotechnology policy and governance for
decades, advising former President Obama's transition team on issues of medical research
and public health, as well as serving as a senior policy advisor at the Food and Drug Administration.
She consults on gene therapy and genome editing for various companies and organizations, and
has held positions on a number of advisory committees, including for the National Academy
of Sciences.
Her committee work has spanned women's health, stem cell research, genome editing, biosecurity, and more.
After our conversation with Professor Charo, we'll hear from Daniel Kluttz, a partner general manager in Microsoft's Office of Responsible AI,
about what these insights from biotech regulation could mean for AI governance and risk assessment,
and his team's work governing sensitive AI uses and emerging technologies. Alta, thank you so much for being here today.
I'm a follower of your work and have really been looking forward to our conversation.
It's my pleasure. Thanks for having me.
Alta, I'd love to begin by stepping back in time a bit before you became a leading figure in bioethics and legal policy.
You've shared that your interest in science was really
inspired by your brother's interest in the topic and that your upbringing really
helped shape your perseverance and resilience. Can you talk to us about
what put you on the path to law and policy? I think it's true that many of us
are strongly influenced by our families and certainly my family had kind of a sciencey techy orientation. My father
was a refugee escaping the Nazis. And when he finally was able to start working in the
United States, he took advantage of the GI Bill to learn how to repair televisions and
radios, which were really just coming in in the 1950s. So he was kind of technically oriented.
My mother retrained from being a talented amateur artist to becoming a math teacher.
And not surprisingly, both my brothers began to aim toward things like engineering and
chemistry and physics.
And our form of entertainment was to watch PBS or Star Trek.
And so the interest comes from that background coupled with, in the 1960s, this enormous surge of interest in the so-called nature versus nurture debate
about the degree to which we are destined by our biology or shaped by our environments.
It was a heady debate and one that perfectly combined the two interests in politics and science.
For listeners who are brand new to your field of genome editing,
can you give us what I'll call a 90-second survey of the space in perhaps plain language
and why it's important to have a framework for ensuring its responsible use?
Well, you know, genome editing is both very old and very new.
At base, what we're talking about is a way to either delete sections of the genome,
our collection of genes, or to add things or to alter what's there. The goal is simply to be able to
take what might not be healthy and make it healthy, whether it's a plant, an
animal, or a human. Many people have compared it to a word processor where
you can edit text by swapping things in and out. You could change the letter G to the letter H in every word,
and in our genomes, you can do similar kinds of things.
But because of this,
we have a responsibility to make sure that whatever we change
doesn't become dangerous,
and that it doesn't become socially disruptive.
Now, the earliest forms of genome editing were very inefficient and so we didn't worry that
much.
But with the advances that were spearheaded by people like Jennifer Doudna and Emmanuelle
Charpentier, who won the Nobel Prize for their work in this area, genome editing has become
much easier to do.
It's become more efficient.
It doesn't require as much sophisticated laboratory equipment.
It's moved from being something that only a few people can do
to something that we're going to be seeing in our junior high school biology labs.
And that means you have to pay attention to who's doing it, why are they doing it, what
are they releasing, if anything, into the environment, what are they trying to sell,
and is it honest and is it safe?
How would you describe the risks?
And are there, you know, sort of specifically inherent risks in the technology itself, or
do those risks really emerge only when it's applied in certain contexts like CRISPR in agriculture or CRISPR for human therapies?
Well, to answer that, I'm going to do something that may seem a little picky, even pedantic, but I'm going to distinguish between hazards and risks.
So there are certain intrinsic hazards. That is, there are things that can go wrong.
You want to change one particular gene
or one particular portion of a gene,
and you might accidentally change something else,
a so-called off-target effect.
Or you might change something in a gene,
expecting a certain effect,
but not necessarily anticipating
that there's going to be an interaction
between what you changed and what was there, a gene-gene interaction that might have an
unanticipated kind of result, a side effect essentially.
So there are some intrinsic hazards, but risk is a hazard coupled with the probability that
it's going to actually create something harmful.
And that really depends upon the application.
If you are doing something that is making a change in a human being that is going to
be a lifelong change, that enhances the significance of that hazard.
It amplifies what I call the risk because if something goes wrong,
then its consequences are greater. It may also be that in other settings, what you're doing is
going to have a much lower risk because you're working with a more familiar substance, your
predictive power is much greater, and it's not going into a human or an animal or
into the environment. So I think that you have to say that the risk and the benefits, by the way,
all are going to depend upon the particular application.
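To make the hazard-versus-risk framing concrete, here is a minimal illustrative sketch in Python. The severity and probability values, and the application labels, are hypothetical, not from the conversation; the point is only that the same intrinsic hazard yields different risk in different applications.

```python
# Illustrative sketch: risk couples an intrinsic hazard (severity)
# with the probability it actually causes harm in a given application.
# All numbers below are hypothetical.

def risk_score(severity: float, probability: float) -> float:
    """Risk = hazard severity weighted by probability of harm."""
    return severity * probability

off_target_severity = 0.8  # hypothetical severity of an off-target edit

# The same hazard carries different risk depending on context.
applications = {
    "heritable human edit (lifelong change)": 0.30,
    "contained lab cell line (no release)": 0.02,
}

for context, probability in applications.items():
    print(f"{context}: risk = {risk_score(off_target_severity, probability):.3f}")
```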
Yeah. I think on this point of application, there's many players involved in that, right? Like,
we often hear about this puzzle of who's actually responsible for ensuring safety and
a reasonable balance between risks and benefits or hazards and benefits, to quote you.
Is it the scientists, the biotech companies, government agencies?
And then, if you could touch upon as well: given the nature of genome editing
risks, how do those responsibilities get divvied up?
Well, in the 1980s, we had a very significant policy discussion about whether we should
regulate the technology, no matter how it's used or for whatever purpose,
or if we should simply fold the technology in with all the other technologies that we currently have
and regulate its applications, the way we regulate applications generally. And we went for the second,
the so-called coordinated framework. So what we have in the United States is a system in which
if you use genome editing in purely laboratory-based work, then you will be regulated the way we regulate laboratories.
There's also, at most universities because of the way the government works with this,
something called Institutional Biosafety Committees, IBCs. If you want to do research that involves
recombinant DNA and modern biotechnology, including but not limited to genome editing, you
have to go first to your IBC, and they look
and see what you're doing to decide if there's a danger there that you have not anticipated that
requires special attention. If what you're doing is going to get released into the environment,
or it's going to be used to change an animal that's going to be in the environment, then there
are agencies that oversee the safety of our environment, predominantly
the Environmental Protection Agency and the U.S. Department of Agriculture.
If you're working with humans and you're doing medical therapies, like you're doing the gene
therapies that have just been developed for things like sickle cell anemia, then you have
to go through a very elaborate regulatory process that's overseen by the Food and Drug Administration
and also, at the research stages, overseen locally by institutional review boards that
make sure the people who are being recruited into research understand what they're getting
into, that they're the right people to be recruited, etc.
So we do have this kind of Jenga game of regulatory agencies.
And on top of all that, most of this involves professionals who've had to be licensed in
some way.
There may be state laws specifically on licensing.
If you are dealing with things that might cross national borders, there may be international
treaties and agreements that cover this.
And of course the insurance industry plays a big part because they decide whether or not
what you're doing is safe enough to be insured. So all of these things
come together in a way that is not at all easy to understand if you're not kind of
working in the field. But the bottom line thing to remember, the way to really think
about it is we don't regulate genome editing. We regulate the things that use genome editing.
Yeah, that makes a lot of sense. Actually, maybe just following up a little bit on this
notion of a variety of different, particularly like government agencies being involved, you
know, in this multi-stakeholder model, where do you see gaps today that need to be filled
with some of the pros and cons to keep in mind?
Just as we think about distributing these systems at a global level, what are some of
the considerations you are keeping in mind on that front?
Well, certainly there were times where the way the statutes were written that govern
the regulation of drugs or the regulation
of foods did not anticipate this tremendous capacity we now have in the area of biotechnology
generally or genome editing in particular. And so you can find that there are times where
it feels a little bit ambiguous and the agencies have to figure out how to apply their existing rules. So an example, if you're
going to make alterations in an animal, right, we have a system for regulating drugs, including
veterinary drugs. But we didn't have something that regulated genome editing of animals.
But in a sense, genome editing of an animal is the same thing as using a
veterinary drug. You're trying to affect the animal's physical constitution in some fashion. And it took a long
time within the FDA to sort of work out how the regulation of veterinary drugs would apply if you think about
the genetic construct that's being used to alter the animal as the same thing as injecting a chemically based drug. And on that basis they now
know: here's the regulatory path, here are the tests you have to do, here are the
permissions you have to get, here's the surveillance you have to do after it
goes on the market. Even there, sometimes it was confusing: what happens when it's
not the kind of animal you're thinking about when you think about animal drugs?
Like, we think about pigs and dogs, but what about mosquitoes?
Because there, you're really thinking more about pests. If you're editing the mosquito so that it can't, for example, transmit dengue fever, it feels more like a public health measure than
a drug for the mosquito itself, and it kind of fell in between the agencies that
possibly had jurisdiction, and it took a while for the USDA, the Department of Agriculture,
and the Food and Drug Administration to work out an agreement about how they would share
this responsibility. So you do get those kinds of areas in which you have at least ambiguity.
We also have situations where, frankly, the fact that some things can move across national
borders means you have to have a system for harmonizing or coordinating national rules.
If you want to, for example, genetically engineer mosquitoes so they can't
transmit dengue, mosquitoes have a tendency to fly. Now, they can't fly very far. That's good.
That actually makes it easier to control. But if you're doing work that's right near a border,
then you have to be sure that the country next to you has the same rules for whether it's permitted
to do this and how to surveil what you've done in order to be sure that you got the results you wanted to get
and no other results. And that also is an area where we have a lot of work to be done
in terms of coordinating across government borders and harmonizing our rules.
Yeah, I mean, you've touched on this a little bit, but there is such a striking balance
between advancing technology and ensuring public safety, and sometimes I think it feels just like you're
walking a tightrope where, you know, if we clamp down too hard, we'll stifle innovation,
and if we're too lax, we risk some of these unintended consequences.
And on a global scale, like you just mentioned as well, how has the field of genome editing
found its balance?
It's still being worked out, frankly.
But it's finding its balance application by application.
So in the United States, we have two very different approaches to regulating things
that are going to go into the market.
Some things can't be marketed until they've gotten an approval from the government.
So you come up with a new drug, you can't sell that until it's gone through FDA approval.
On the other hand, for most foods that are made up of familiar kinds of things, you can
go on the market and it's only after they're on the market that the FDA can act to withdraw it if a problem arises.
So basically we have either pre-market controls, you can't go on without permission, or post-market controls.
We can take you off the market if a problem occurs.
How do we decide which one is appropriate for a particular application?
It's based on our experience. New drugs typically are both less familiar
than existing things on the market and also have a higher potential for injury if they
in fact are not effective or they are in fact dangerous and toxic. If you have foods, even
bioengineered foods that are basically the same as foods that are already here, it can go on the market with notice but without a prior approval.
But if you create something truly novel, then it has to go through a whole long process.
And so that is the way that we make this balance.
We look at the application area and we're just now seeing in the Department of Agriculture a new
approach on some of the animal editing, again, to try and distinguish between things that are
simply a more efficient way to make a familiar kind of animal variant and those things that are
genuinely novel and to have a regulatory process that is more rigid, the more unfamiliar it is, and the more that we
see a risk associated with it.
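As a purely illustrative sketch of that decision logic (the function and flags below are hypothetical; the real agency rules are far more nuanced), the pre-market versus post-market split might look like this:

```python
# Hypothetical sketch of the pre-market vs. post-market split described
# above. Real regulatory criteria are far richer than two booleans.

def oversight_path(is_familiar: bool, high_injury_potential: bool) -> str:
    """Route an application to pre-market or post-market oversight."""
    if not is_familiar or high_injury_potential:
        # e.g., a new drug: cannot be marketed without prior approval
        return "pre-market approval required"
    # e.g., a food equivalent to ones already on the market:
    # go to market with notice; withdraw later if a problem arises
    return "post-market controls (notice, surveillance, possible withdrawal)"

print(oversight_path(is_familiar=False, high_injury_potential=True))   # new drug
print(oversight_path(is_familiar=True, high_injury_potential=False))   # familiar food
```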
I know we're at the end of our time here, and maybe just a quick lightning-round
question.
For students, young scientists, lawyers, or maybe even entrepreneurs listening who are
inspired by your work, what's the single piece of advice you give them if they're interested in policy, regulation, the ethical side of things in genomics or other fields?
I'd say be a bio-optimist and read a lot of science fiction because it expands your
imagination about what the world could be like.
Is it going to be a world in which we're now going to be growing our buildings
instead of building them out of concrete?
Is it going to be a world in which our plants will glow in the evening
so we don't need to be using batteries or electrical power from other sources,
but instead our environment is adapting to our needs? You know, expand your imagination
with a sense of optimism about what could be and see ethics and regulation not as an obstacle,
but as a partner to bringing these things to fruition in a way that's responsible and helpful
to everyone. Wonderful. Well, Alta, this has been just an absolute pleasure.
So thank you.
It was my pleasure.
Thank you for having me.
Now I'm happy to bring in Daniel Kluttz.
As a Partner General Manager in Microsoft's Office of Responsible AI, Daniel leads the
group's Sensitive Uses and Emerging Technologies program.
Daniel, it's great to have you here.
Thanks for coming in.
It's great to be here, Kathleen.
Yeah.
So maybe before we unpack Alta Charo's insights,
I'd love to just understand the elevator pitch here.
What exactly is the Sensitive Uses and Emerging Technologies
program, and what was the impetus for establishing it?
Yeah, so the Sensitive Uses and Emerging Technologies program sits within our Office of Responsible
AI at Microsoft, and inherent in the name there are two real core functions.
There's the Sensitive Uses and Emerging Technologies.
What does that mean?
Sensitive Uses, think of that as Microsoft's internal consulting and oversight function
for our higher risk,
most impactful AI system deployments. And so my team is a team of
multidisciplinary experts who engage, in sort of a white-glove way,
with product teams at Microsoft that are designing, building, and deploying
these higher risk AI systems. And where that sort of consulting journey culminates is in a set of bespoke requirements
tailored to the use case of that given system that really implement and apply our more standardized,
generalized requirements that apply across the board.
Then the emerging technologies function of my team faces a little bit further out, trying to look around corners
to see what new and novel and emerging risks are coming out of new AI
technologies, with the idea that we work with our researchers, our engineering partners, and of course product
leaders across the company, understand where Microsoft is going
with those emerging technologies, and develop rapid, quick-fire, early-steer
guidance that implements our policies ahead
of that formal internal policy-making process, which
can take a bit of time.
So it's designed to both afford that innovation speed
that we like to optimize for at Microsoft,
but also integrate our responsible AI commitments and our AI principles into
emerging product development.
That segues really nicely, actually, as we met with Professor Charo and she was talking
about the field of genome editing and governing at the application level.
I'd love to just understand how similar or not is that to managing the risks of AI in
our world?
Yeah, I mean, Professor Charo's comments were music to my ears because, you know, where
we make our bread and butter, so to speak, in our team is in applying oversight to use cases.
AI systems, especially in this era of generative AI, are almost inherently multi-use, dual-use.
And so what really matters is how you're going to apply
that more general purpose technology,
who's going to use it, and in what domain
it's going to be deployed,
and then tailor that oversight to those use cases.
Try to be risk proportionate.
Professor Charo talked a little bit about this,
but if it's something that's been done before
and it's just a new spin on an old thing, maybe we're not so concerned about
how closely we need to oversee and gate that application of that technology.
Whereas if it's something new and novel or some new risk that might be posed by that
technology, we take a little bit closer look and we are overseeing that in a more sort
of high-touch way.
Maybe following up on that, how do you define a sensitive use or, maybe,
like, a high-impact application? And once that's labeled, what happens? Like, what
kind of steps kick in from there? Yeah, so we have this sensitive uses program
that's been at Microsoft since 2019. I came to Microsoft in 2019 when we were
starting this program
in the Office of Responsible AI. It had actually been incubated in Microsoft Research with
our Aether community of colleagues who are experts in socio-technical approaches to responsible
AI as well. Once we put it in the Office of Responsible AI, I came over. I came from academia.
I was a researcher myself.
At Berkeley, right?
At Berkeley, that's right. Yeah, sociologist by training and a lawyer in a past life.
But that does help sort of bridge those fields for me.
But sensitive uses: we force all of our teams, when they're envisioning their system design, to think about
whether the reasonably foreseeable use or misuse of
the system that they're developing in practice
could result in three really major sorts of risk. One is, could that deployment result in a consequential
impact on someone's legal position or life opportunity? Another category we have is could
that foreseeable use or misuse result in significant psychological
or physical injury or harm?
And then the third really ties in with a long-standing commitment we've had to human rights
at Microsoft.
And so could that system and its reasonably foreseeable use or misuse result in human
rights impacts and injurious consequences
to folks along different dimensions of human rights.
Once you decide it might, we have a process
for reporting that project into my office.
And we will triage that project, working with the product team,
for example, and our responsible AI champs community,
which are folks who are dispersed
throughout the ecosystem of Microsoft
and educated in our Responsible AI program, and then determine, okay, is it in scope for
our program?
If it is, we say, okay, we're going to go along for that ride with you.
And then we get into that whole sort of consulting arrangement that then culminates in this set
of bespoke use-case-based requirements applying our AI principles.
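As a very rough illustration of that intake logic, here is a hypothetical sketch in Python. The type and category names paraphrase the three risk types described above; this is not Microsoft's actual tooling or process.

```python
# Hypothetical triage sketch paraphrasing the three sensitive-use risk
# categories described above. Not Microsoft's actual implementation.

from dataclasses import dataclass

@dataclass
class ProjectAssessment:
    # Could foreseeable use or misuse consequentially affect someone's
    # legal position or life opportunity?
    affects_legal_position_or_life_opportunity: bool
    # Could it result in significant psychological or physical injury or harm?
    risks_significant_injury_or_harm: bool
    # Could it result in human rights impacts?
    risks_human_rights_impact: bool

def in_scope_for_sensitive_uses(a: ProjectAssessment) -> bool:
    """A hit on any category routes the project to review and, ultimately,
    bespoke use-case-based requirements."""
    return (a.affects_legal_position_or_life_opportunity
            or a.risks_significant_injury_or_harm
            or a.risks_human_rights_impact)

project = ProjectAssessment(True, False, False)
if in_scope_for_sensitive_uses(project):
    print("Report to the sensitive-uses program for triage and consulting.")
```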
That's super fascinating. What are some of the approaches in the governance of genome editing
that you're maybe seeing happening in AI governance, or maybe just, like, bubbling up in conversations
around it? Yeah, I mean, I think we've learned a lot from fields like genome editing that
Professor Charo talked about, and others. And again, it gets back to this sort of risk-proportionate approach.
It's a balancing test.
It's a trade off of trying to sort of foster innovation and really look for the beneficial
uses of these technologies.
I appreciated her speaking about that.
What are the intended uses of the system, right?
And then getting to, okay, how do we balance trying to, again,
foster that innovation in a very fast-moving space,
a pretty complex space and a very unsettled space,
in contrast to other sorts of professional fields
or technological fields that have a long history
and are relatively settled from an oversight
and regulatory standpoint.
This one is not.
And for good reason.
It is still developing.
And I think there are certain oversight and policy regimes
that exist today that can be applied.
Professor Charo talked about this
as well, where maybe you have certain policy and oversight
regimes that, depending on how that
technology is applied, apply there versus some horizontal, overarching
regulatory sort of framework. And I think that applies from an internal governance
standpoint as well. Yeah, that's a great point. So what isn't being explored from
genome editing that, you know, maybe we think could be useful to AI governance,
or, as we think about the evolving
frameworks, what maybe we should be taking into account from what Professor Charo shared
with us?
So, one of the things I've thought about and took from Professor Charo's discussion was
she had just this amazing way of framing up how genome editing regulation is done.
And she said, you know, we don't regulate genome editing,
we regulate the things that use genome editing.
And while it's not a one-to-one analogy with the AI space,
because we do have this sort of very general model level
distinction versus application layer and even platform layer
distinctions, I think it's fair to say,
we don't regulate AI writ large.
We regulate the things that use AI, in a very similar way. And that's how we think of our
internal policy and oversight process at Microsoft as well. And maybe there are things that we
regulated and oversaw internally in the first instance, the first time we saw it come
through, and it graduates
into more of a programmatic framework for how we manage that.
So one good example of that is some of our higher risk AI systems that we offer out of
Azure at the platform level.
When I say that, I mean APIs that you call, which developers can then build their own applications
on top of.
We were really deep in evaluating and assessing mitigations on those platform systems in the first instance.
But we also graduated them into what
we call our limited access AI services program.
And some of the things that Professor Charo discussed
really resonated with me.
She had this moment where she was mentioning,
you want to know who's using your tools
and how they're being used. And these are the same concepts. We want to have trust in our
customers, we want to understand their use cases, and we want to apply technical
controls that sort of enforce those use cases or give us signal post-deployment
that use cases are being done in a way that may give us some level of concern
to reach out and understand what those use cases are. Yeah, you're hitting on a great point. And I love this kind of layered approach
that we're taking and that Alta highlighted as well. Maybe to double
click a little bit just on that post-market control and what we're
tracking kind of once things are out and being used by our customers, how do we
take some of that deployment data and bring it back
in to maybe even better inform upfront governance,
or just how we think about some of
the frameworks that we're operating in?
It's a great question.
The number one thing is for us at Microsoft,
we want to know the voice of our customer.
We want our customers to talk to us.
We don't want to just understand telemetry and data.
But it's really getting out there and understanding from our customers and not just our customers. I would say our
stakeholders is maybe a better term because that includes civil society
organizations, it includes governments, it includes all of these sort of
non-customer actors that we care about and that we're trying to sort of
optimize for as well. It includes end users of our enterprise customers. If we
can gather data about how our products are being used, and try to understand
maybe areas we didn't foresee in how customers or users might be using those
things, we can tune those systems to better align with both what customers
and users want and our own AI principles and policies and programs.
Daniel, before coming to Microsoft, you led social science research
and socio-technical applications
of AI-driven tech at Berkeley.
What do you think some of the biggest challenges are
in defining and maybe even just kind of measuring
at like a societal level, some of the impacts of AI
more broadly?
Measuring social phenomena is a difficult thing.
And one of the things that as social scientists you're very interested in is scientifically
observing and measuring social phenomena.
Well, that sounds great.
It also sounds very high level and daunting.
What do we mean by that?
You know, it's very easy to say that you're collecting data and you're measuring, I don't
know, trust in AI, right?
That's a very fuzzy concept.
It is a concept that we want to get to, but we have to unpack that and we have to develop
what we call measurable constructs.
What are the things that we might observe that could
give us an indication toward what is a very fuzzy and general concept? And
there's challenges with that everywhere and I'm extremely fortunate to work at
Microsoft with some of the world's leading socio-technical researchers and
some of these folks, very steeped in measurement
theory, literally PhDs in these fields, who are thinking about how to both measure and allow for a scalable way
to do that at a place the size of Microsoft.
And that means trying to develop frameworks that are scalable and repeatable and put into our platform, which
then serves our product teams.
Are we providing as a platform a service to those product teams that they can plug in
and do their automated evaluations at scale as much as possible?
And then go back in over the top and do some of your more qualitative targeted testing
and evaluations.
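As a toy sketch of that measurement idea (the constructs, values, and weights below are all hypothetical, not an actual framework), a fuzzy concept like "trust in AI" might be decomposed into observable constructs and aggregated:

```python
# Toy sketch: decomposing a fuzzy concept ("trust in AI") into measurable
# constructs. All constructs, values, and weights are hypothetical.

observed = {
    "suggestion_acceptance_rate": 0.62,  # share of AI suggestions accepted as-is
    "repeat_usage_rate": 0.71,           # share of users returning week over week
    "self_reported_reliance": 0.55,      # mean survey response on a 0-1 scale
}

weights = {
    "suggestion_acceptance_rate": 0.4,
    "repeat_usage_rate": 0.2,
    "self_reported_reliance": 0.4,
}

# Aggregate the constructs into one indicator. The point is not the number
# itself but that each input is concretely observable, so the evaluation
# can be automated, repeated, and run at scale.
trust_indicator = sum(weights[k] * observed[k] for k in observed)
print(f"Composite 'trust in AI' indicator: {trust_indicator:.2f}")
```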
Yeah, it makes a lot of sense.
Before we close out, if
you're game for it, maybe we do a quick lightning round, just 30-second answers
here. Favorite real-world sensitive use case you've ever reviewed? Oh gosh. Well,
this is where I get to be the social scientist. Define "favorite," Kathleen.
Most memorable? Most painful? Let's do most memorable.
We'll do most memorable. You know, I would say the most memorable project I worked on
was when we rolled out the new Bing chat, which is no longer called Bing chat, because that was the
first really big cross-company effort to deploy GPT-4, which was the next step up in AI innovation from
our partners at OpenAI. And I really value working hand-in-hand with engineering teams and with
researchers. And that was us at our best and really sort of turbocharged the model that we have.
Wonderful. What's one of the most overused phrases that you have in your AI governance meetings?
If I hear "we need to get aligned" or "we need to align on this" one more time.
Right.
But, you know, it's said for a reason, and I think it sort of speaks to that collaborative nature.
That's one that comes to mind.
That's great.
And then maybe last one, what are you most excited about in the next, I don't know, let's say three months?
This world is moving so fast.
You know, the pace of innovation, as you just said,
is just staggering, is unbelievable.
And sometimes it can feel overwhelming in my space.
But what I'm most excited about is how we are building up
this emerging technologies program I mentioned. In my team, as a sort of formal
program, it's relatively new, and I really enjoy being able to take a step back and think
a little bit more about the future and a little bit more holistically. And I love working
with engineering teams and sort of strategic visionaries who are thinking about what we're
doing a year from now or five years from now or even 10 years from now. And I get to be a part of those
conversations, and that really gives me energy and helps keep me grounded
and not just dealing with the day-to-day
and, you know, various fire drills that you may run.
It's thinking strategically and having that foresight about what's to come
and it's exciting. Great. Well, Daniel, just thanks so much for being here. I've had such a wonderful discussion with
you. And I think the thoughtfulness in our discussion today, I hope resonates with our
listeners. And again, thanks to Alta for setting the stage and sharing her really amazing,
insightful thoughts here as well. So thank you. Thank you, Kathleen. I appreciate it. It's been fun.
And to our listeners, thanks for tuning in. You can find resources related to this podcast in the show notes.
And if you want to learn more about how Microsoft approaches AI governance,
you can visit microsoft.com slash rai.
See you next time!