In The Arena by TechArena - Unleashing Generative AI for the World with Microsoft’s Nidhi Chappell
Episode Date: September 14, 2023
Allyson Klein chats with Microsoft GM Nidhi Chappell about her team's amazing journey delivering the compute power to train ChatGPT and integrate generative AI power for the world.
Transcript
Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators
and our host, Allison Klein.
Now, let's step into the arena.
Welcome to the Tech Arena.
My name's Allison Klein, and today I was looking
forward to this interview all day long. I've got my friend and former colleague Nidhi Chappell
with me. She's the GM of Azure AI Infrastructure at Microsoft. Welcome to the program, Nidhi. How
are you doing? I'm doing well. Thank you for having me. I'm excited to be here. I was looking forward to it myself. Nidhi, you and your colleagues at Microsoft have
really shocked the world this year with delivery of new capabilities on AI. And I was trying to
figure out where to start this interview. But I think that what I'll start with before even
introducing what you do at Microsoft is what is this progress?
I know you've been involved in AI for a long time.
How do you see generative AI transforming?
And what do you see in terms of the response from your customer base to the new capabilities that are coming with the solutions that you and your team are delivering?
So I'll start off by saying every now and then you have these pivotal moments in your career,
in your life, in your journey of delivering technology, where you look at something and you're like, wow, this is it. And I still remember, I was not the first one to get a preview of ChatGPT.
It was a very, very small group.
From what I heard from them, it was like, oh my God, if this is really true, there will be a pre-ChatGPT period and a post-ChatGPT period.
And once that started to happen, you could see it clearly. I have to constantly remind our customers that ChatGPT came to the world on November 30th, 2022, right?
It's not been that long, but none of us can stop talking about it.
And really, it isn't that AI hasn't had its moment.
I think AI has had many critical milestones. This one is definitely one of the biggest ones, because you can now see a level of
versatility that you couldn't see before in language models, in understanding, in
the summarization, and you can start to see how this could affect and influence
every use case, and we can talk a lot about those use cases, but
really, once you start to play with it, you realize that this is a whole different ballgame.
It's a whole different capability that never existed before. So it's definitely a very high moment for AI.
Yeah, I'm excited about it.
And I've been writing about it a lot.
And that gets me to your actual job, which is the general manager of Azure AI infrastructure.
What has it been like to build Microsoft's AI engine?
And when you think about that pre-moment and post-moment, what has that been like to start
building out the infrastructure that's going to power
this amazing transformation?
Well, frankly, it's been super humbling and super exciting at the same time, right?
So most people think of ChatGPT, but we've been actually on this journey for at least four-plus years.
I have personally been on this journey, had a front-row seat, for the last four years.
And while folks have been looking at the output of that,
we started working on this when we started the partnership with OpenAI in 2019.
And at that time, we didn't know what the outcome would look like.
We had some vision, we had some investments in this.
And we were really trying to build infrastructure that could unblock innovation at a completely different scale.
We have aspirations to get to certain milestones in AI.
But the last three and a half years were these small achievements before the big reveal, if you will.
And once ChatGPT got trained and we started to see how you could use it, how you could run those models, it made the three and a half years it took to get to that moment all worth it.
So my team is basically involved in building all of the supercomputers for training these
models.
And we've built many generations.
And then once you build the models, to use them, to infer from them, again,
you need infrastructure that is optimized worldwide for it.
So in some regards, I have had the privilege and the honor of having the front-row seat. But it's also interesting because I get to be in the kitchen when the cake is being baked. I'm helping in this, but there are so many layers to it. So to finally see the finished product that you have some contribution to, it's just very rewarding.
I joke around that I have no issues with retention on my team, because where else would you get to work on something so cool?
But in all fairness, though, being part of that journey, being able to deliver this level of capability, and then being on the journey to the next milestone and the milestone after that, it's really exciting.
I was thinking about you in preparation for this interview, and I know that you've been in AI for a long time. You actually won't remember this, but you were the person who first told me what machine learning was, many years ago. So I thank you for that.
But you have been working at Microsoft on HPC configurations,
and you used the word supercomputers a little while ago.
Tell me about the parallels between building an HPC cluster
and building a supercomputer for AI training.
Yeah. Think of it as large-scale AI training
needing thousands of GPUs,
sometimes hundreds of thousands of GPUs,
working collectively on one single job
for a long period of time.
We have lots of research that shows that
models that can accurately predict things
are trained on lots of GPUs for a long period of
time on a lot of data, right? Now to actually enable those GPUs to work well, it is actually
a supercomputing problem. So AI at large scale, AI training at large scale is a fundamental high
performance computing job. You have a job that is a single synchronous job
spread across so many compute units, right?
So we borrowed a lot of our learnings from the supercomputing world
and brought them into AI training.
And that is what allowed us to build these supercomputers
at a cadence that is much faster
than a lot of supercomputing systems worldwide are built, right?
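The synchronous, single-job character of large-scale training described here can be sketched in a few lines. This is a toy illustration of the data-parallel pattern, not Microsoft's actual stack; the function names and numbers are invented for the example. Each worker computes a gradient on its own data shard, then all workers average their gradients (an all-reduce) before any of them can take the next optimization step, which is why one slow node stalls the entire job.

```python
# Toy sketch of synchronous data-parallel training: the HPC-style
# "single job spread across many compute units" pattern.

def local_gradient(worker_id, step):
    """Stand-in for a forward/backward pass on this worker's data shard."""
    return float(worker_id + step)

def all_reduce_mean(values):
    """Average a value across all workers. In a real cluster this is a
    collective operation over a high-bandwidth interconnect, not a list."""
    return sum(values) / len(values)

def train(num_workers, num_steps):
    weight = 0.0
    lr = 0.1
    for step in range(num_steps):
        # Each worker computes a gradient on its own shard...
        grads = [local_gradient(w, step) for w in range(num_workers)]
        # ...then everyone synchronizes on the same averaged gradient,
        # so all model replicas stay identical after every step.
        g = all_reduce_mean(grads)
        weight -= lr * g
    return weight

print(train(num_workers=4, num_steps=3))
```

Because every step ends in this collective synchronization, the reliability of the interconnect and of every individual node dominates overall throughput, which is exactly where the supercomputing learnings apply.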
So we have been building supercomputers almost every year and a half to two years,
each many times bigger than the last one.
And that comes from a deep understanding of how you get performance and reliability at scale.
That comes from having a background in high-performance computing.
And it's not just that; you also have to work with the ecosystem to make sure
the components are reliable and the software stack is reliable. So it's just been this
constant question of how you bring the high-performance computing paradigm to benefit AI training, and allow AI training to
actually happen at a very, very large scale.
Now, I know that you've accomplished it, because I use
it all the time. But one question that I have for you as you talk to customers is, where do you see
the use cases going with this core capability? And what do you think are going to be the
industries that really transform rapidly based on this? I know this is just months old, but we know
that change is coming. Yeah. And so I actually, when I first started on this, the obvious answers
were like anywhere where you have a chat interface, you know, customer support is a great example.
But that actually is not the only space
where these large language models have been used.
In fact, there's no industry left now
where LLMs, large language models,
have not been applied.
From simulation,
where you're like, what?
To car manufacturing,
to human productivity discussions,
to enterprise-level chat,
to healthcare.
So I do think all of the verticals
are really looking to see
how do we benefit from large language models,
really understanding the context of what we do
and helping us with our next problem, right?
And for me, I have actually stopped speculating where all verticals will go because I've always
been surprised.
Like every time you talk to a customer, they have a brand new use case.
One example I use is a car manufacturer that is talking to us about how they would use large
language models to predict what parts
to manufacture. That blew my mind, like, how would you use an LLM for this? And I still don't
understand it, but they have figured it out. They want to do it. So, you know, there are the
traditional use cases where you are interfacing with a human being, whether you're being a
co-pilot to them. But then there are all these other use cases that are coming up
where you're supplementing our understanding,
our understanding of the context,
our understanding of what we want to do next.
And in all of these scenarios, you know,
you're ultimately looking to see how you improve the productivity
of the system, of the processes that you're building.
So lots of industries,
lots of different use cases that are coming up. And I think we'll see a lot more use cases come
forward because as people become more and more familiar with this and start to have access to
a lot of these models, their creativity will really kick in. So it's definitely a place where
we'll see a lot of innovation.
It's interesting, when you're talking about the car parts, it gives a perfect example:
to really tap the full innovation of this technology, you need to understand the
businesses that are going to be transformed. You and I in the compute space would never know
why car parts and ChatGPT go together. But I think the other question that I have for you
is you've built this incredible tool.
You've built this incredible engine.
And Microsoft is known for partnering with the ecosystem
to actually take advantage of the powerful tools that you have.
How are you looking to partner with ecosystem players
in this space and with customers in this space?
Yeah, so there are a couple of ways we're doing it.
One, we have OpenAI's models available through OpenAI itself. And then we have an enterprise version of that, Azure OpenAI, which actually addresses the data concerns and privacy concerns that customers want to control.
So we make this available through an API for any Azure customer.
It's called Azure OpenAI.
And the intent, again, is that instead of somebody trying to train from scratch,
we have these Azure OpenAI models available that you can use
directly in your environment.
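As a concrete illustration of the plug-and-play API described here, below is a minimal sketch of how a chat-completions request to the Azure OpenAI REST endpoint is typically constructed. The resource name, deployment name, and API version are placeholders rather than real values; check the Azure OpenAI documentation for current endpoint details. The request is built but not sent, since sending requires real credentials.

```python
import json
import urllib.request

def build_chat_request(resource, deployment, api_key, messages,
                       api_version="2024-02-01"):
    """Construct (but do not send) an Azure OpenAI chat-completions request.
    All identifiers here are illustrative placeholders."""
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/chat/completions?api-version={api_version}")
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )

req = build_chat_request(
    resource="my-resource",          # placeholder Azure resource name
    deployment="my-gpt-deployment",  # placeholder model deployment name
    api_key="<your-key>",            # supplied by your Azure subscription
    messages=[{"role": "user", "content": "Hello"}],
)
print(req.full_url)
```

The point of the design is visible in the sketch: the model is addressed as a named deployment behind a stable HTTPS endpoint, so switching models is a config change rather than a retraining effort.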
The other thing that we are doing is embedding them in
all of our Microsoft products.
So you will hear a lot about Copilots, M365 Copilots.
Again, the idea is that if this can improve the productivity of our
customers using the tools, why not?
But beyond that, I also think that we are building an ecosystem in the
sense that we are making other models available.
So for example, Meta has the Llama models.
Now we have those available for our customers too.
So we will have a gallery of AI models available for customers to choose from.
We will obviously make sure that Azure OpenAI is the easiest way for them to
take an API, take a pre-built model.
You don't have to worry about
the config and you can actually plug and play with it. But if customers for a wide variety of
reasons want to use other tools or other models, for example, we actually have a whole gallery of
those available online too. Now, I was thinking about the Microsoft Suite integration, and I'm
a little worried now that you're going to arrive with an OpenAI podcast co-host for me.
I don't know if it's a podcast co-host for you,
but it is actually interesting
because you can start to do real-time
sentiment analysis to see
like how the comments are going.
Is there a particular interesting topic
that you can...
It'll be a co-pilot in some regards,
to help you steer your conversation.
We've got to talk about that after the show, then. That's fantastic. You know that I'm going to ask
you about this. I'm very passionate about uses of technology for good, for society and for
the furtherment of humanity. Where have you seen people tap this technology for those purposes?
And where do you think the biggest opportunity is?
Yeah.
And so I think, again, I wouldn't speculate on where the biggest use case is, because I do
think my window into that is so narrow.
I'm a provider of technology, of capability, and our customers are the ones in different
sectors who come up with the widest examples of use cases.
So I wouldn't speculate on what is the most effective use case. The thing I would just say is,
whether you look at healthcare, whether you look at financial transactions, really any of these
industries, what the capability allows you to do is augment the human capability to identify things. For example,
there are places where we saw that when doctors are working on diagnosing really rare diseases,
sometimes having a co-pilot actually really improves the likelihood that they will find
this rare disease, because the space of possibilities is so big that
sometimes you want that augmentation with AI
to actually help you narrow down the scope
or think about possibilities
that you may not have thought of before.
So we have seen use cases where, with the use of AI,
doctors are actually able to diagnose things more accurately.
Now, that is just one example.
And I'm not saying this is the only way to do it.
But to me, this is a great example of where a capability like generative AI could really augment our current way of doing things
and can really result in the betterment of everybody, right?
So I think there are a lot of
patients who suffer from not knowing, or not having a diagnosis, for a long period of time. And if
we can reduce that timeline or get more accurate results, it benefits everyone.
Well, I don't think I'm alone in saying that I'm really looking forward to seeing what you and your team do
and how you're delivering something that is compelling and powerful for the tech sector
and for the world.
I'm excited to see it.
And I'm so excited to see it from a friend of mine.
Congratulations to you and your team for the work that you're doing.
I can't wait to see what comes next.
And to that point, if folks want to get going, if they want to use all of the tools that you talked about today,
where would you send them to get more information and to engage with your team?
Yeah, I would say I think the fastest way to get access to any of this is through Azure OpenAI
models, right? On Azure's website, you can actually find access to Azure OpenAI.
And that's the fastest way to have the latest models, right?
This is a place where we are constantly updating,
so you don't have to worry about chasing the latest models.
And I do also want to just say, while I get to be one layer in this layer cake, I think the real innovation is also happening with how people use this technology, right?
I think technology at the end of the day is a capability and it really depends at the end of the day how you use it.
And this is where we as a society have to use it responsibly, but also have the chance
to really open up certain use cases that were not possible before.
So to me, it's very exciting to see that capability.
And I'm hugely thankful that Microsoft is taking an approach of using this as a co-pilot,
making sure the human is always in the loop, making sure the human is always the deciding
factor.
And that's the way to go about enabling this at large scale.
So I'm excited to be part of this journey and to be enabling this,
at a moment where I do think we'll look back and say, wow, I can't remember a world before ChatGPT.
It's so exciting.
Thank you so much for your time, Nidhi.
It's been a real pleasure.
I would love to have you back on to hear what you and your team come up with next.
Well, it's great to talk to you again, Allison.
Thank you so much for talking to me and I look forward to connecting again.
Thanks for joining the Tech Arena.
Subscribe and engage at our website, thetecharena.net.
All content is copyright by the Tech Arena. Thank you.