In The Arena by TechArena - Delivering a Foundation for AI, 5G and Edge with NVIDIA
Episode Date: February 16, 2023
TechArena host Allyson Klein chats with NVIDIA’s Rajesh Gadiyar about his company’s strategy to accelerate 5G and edge adoption including cloud native vRAN....
Transcript
Welcome to the Tech Arena, featuring authentic discussions between tech's leading innovators
and our host, Allyson Klein.
Now let's step into the arena.
Welcome to the Tech Arena.
My name is Allyson Klein, and today is going to be a great
episode. We've got Rajesh Gadiyar, VP of Telco and Edge Architecture from NVIDIA with us.
Welcome to the program, Rajesh. Yeah. Hi, Allyson. Good morning. It's great to be here today talking
to you. It's fantastic to hear your voice. Why don't you just start and introduce yourself and
your role at NVIDIA and how it relates to the topic of the day, which is 5G deployment.
Yeah, awesome. So at NVIDIA, I'm vice president of telco and edge architecture,
and I'm relatively new at NVIDIA. I took on this role in September. A key focus area for me here in this role is 5G and
the radio access network and the RAN virtualization, including cloud RAN design and development.
I'm also looking to help communication service providers or the telco operators
modernize their infrastructure and accelerate the transition to a modern software-defined cloud
based on accelerated compute infrastructure so they can reduce their
costs and improve the utilization of their infrastructure. Now, as you know, before NVIDIA,
I was at Intel for many years. I was the VP and CTO of Intel's networking business. My team built
DPDK, the Data Plane Development Kit, that enabled high-speed networking on standard server platforms
and the resultant disaggregation of hardware from
software. And much of that work also enabled the NFV and SDN transformation for the last decade.
Now, more recently, I led the 5G platform architecture, including the development of
key cloud-native technologies for network applications. Now, I'm new at NVIDIA, but I'm
super excited to be at a company that is changing the computing landscape. We are innovating and delivering accelerated compute platforms for AI and cloud computing and also networking in 5G.
And by the way, it's great to connect with you again after a couple of years.
So it's good to be talking.
Now, Rajesh, the topic, as I said, was 5G deployments and the status of 5G.
Obviously, 5G has been talked about in the industry for a long time, and you have been
in your various roles very involved in the development of 5G. Where are we at today,
in your mind, with the deployment of 5G around the world? And how is the confluence of 5G,
cloud computing, and AI coming together in your mind?
Yeah, that's a great question, Allyson. The 5G
deployments have been accelerating around the globe, and many telco operators have already
rolled out 5G services, and they're expanding rapidly. In addition to the telco operators,
there is significant interest among the enterprises to use 5G to set up their private networks,
leveraging the capabilities of 5G, like higher bandwidth,
lower latency, new technologies such as network slicing, and then millimeter wave and CBRS spectrum.
Now, in my view, the 5G build-out comes at an interesting time. If you look at over the last
two decades, cloud computing has matured and has become the playground of choice for developers to
build their applications. Cloud offers many advantages: mature software tools, automation and orchestration,
business agility, and lower TCO.
Furthermore, applications in every segment, such as industrial robotics, cloud gaming,
smart cities, autonomous driving, smart farming,
they're all increasingly using artificial intelligence, AI, to enable transformative experiences.
This confluence of 5G, cloud computing, and AI is super exciting.
In my view, it will drive many new innovations over the next decade.
But I would contend that the 5G radio access network is somewhat of a weak link today, and it has not kept pace with AI and cloud computing.
We need to innovate faster and build a more performant, scalable, programmable,
and automated radio access network. And I think the timing is right. Virtualization of RAN,
the technology is maturing, and now we are at a tipping point to drive faster adoption of VRAN, virtualized RAN, and perhaps even move faster towards deploying the entire radio access network in the cloud, which I'll talk about later.
Now, a quick word about NVIDIA.
We have a strategic investment in 5G RAN platforms.
We've developed a platform called Aerial.
And the NVIDIA Aerial platform is a software-defined full 5G layer 1 offload that's implemented as an inline acceleration in an NVIDIA GPU.
So this NVIDIA Aerial platform, which we'll talk about some more in this podcast, is a key technology foundation for building virtualized RAN.
And it implements all the 3GPP and O-RAN compliant interfaces.
So at NVIDIA, our goal, therefore, is to deliver a full platform with cloud-native software that serves as a foundation for 5G, AI, and edge applications.
Now, I'm glad you brought up vRAN because it's been a big area of focus in the industry and somewhat of a holy grail in terms of being able to actually virtualize that radio access network.
Can you explain the motivation behind wanting to virtualize RAN and what is the status of disaggregation of RAN?
This is a great topic.
There is a lot of excitement around 5G, but the economics of deploying 5G have been challenging, particularly on the radio access network side of things.
So 5G is driving significantly higher RAN capex growth as compared
to the previous wireless generations, LTE and 3G. The number of cell sites for 5G is expected
to nearly double over the next five years. And consequently, the RAN capex as a share of overall
TCO is increasing from the 45 to 50% it used to be, up to 65%.
So it is also well known that traditionally the RAN is provisioned for peak capacity,
which leads to significant underutilization of precious compute resources.
The bursty and time-dependent traffic means many traditional RAN sites
are running at below 25% capacity utilization on average.
So as a result, it is really important to disaggregate the radio access network
and drive more centralization and better utilization
by pooling the RAN resources, the compute resources in the radio access network.
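The pooling argument can be made concrete with a toy calculation: size each site for its own peak versus size one shared pool for the peak of the combined load. All traffic numbers below are invented for illustration; they are not real RAN measurements.

```python
import random

random.seed(0)

def per_site_provisioning(traffic):
    # Traditional RAN: each site is sized for its own peak load.
    return sum(max(site) for site in traffic)

def pooled_provisioning(traffic):
    # Centralized/cloud RAN: one shared pool is sized for the peak of the
    # combined load, exploiting the fact that sites don't peak together.
    hourly_totals = [sum(hour) for hour in zip(*traffic)]
    return max(hourly_totals)

# Hypothetical hourly load (arbitrary capacity units) for 20 cell sites,
# each bursty with a time-shifted peak -- illustrative numbers only.
sites = []
for _ in range(20):
    peak_hour = random.randrange(24)
    sites.append([100 if h == peak_hour else random.randint(5, 25)
                  for h in range(24)])

dedicated = per_site_provisioning(sites)
pooled = pooled_provisioning(sites)
print(f"sum of per-site peaks: {dedicated}")
print(f"peak of pooled load:   {pooled}")
print(f"capacity saved by pooling: {1 - pooled / dedicated:.0%}")
```

Because the peaks are spread across the day, the pooled peak is far smaller than the sum of individual peaks, which is the utilization gain centralization is after.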
This is where the work that ORAN Alliance is doing,
the Open RAN Alliance initiative to disaggregate traditional
radio base stations into what are called the RRUs, the remote radio units, and the virtual DU and
virtual CU instances with well-defined interfaces between them. That's great progress. And it's
resulting in a larger RAN ecosystem with more vendor choices. So as our industry accelerates
the 5G deployments, scalable and flexible solutions are very much needed to realize the full business value of 5G.
So disaggregating RAN software from the hardware and making the software open and automated for deployment on private, public, or hybrid cloud infrastructure brings flexibility and redundancy, and it's optimally designed for mobile network evolution over the next few years,
including next-generation radio technologies, paving the path to 6G.
So to answer your question, RAN disaggregation, centralization, and cloudification are inevitable,
and we are seeing some good progress in that direction, but it could be faster,
and it needs to be faster. And by the way, like I said, NVIDIA's Aerial platform is fully 3GPP and O-RAN compliant
and it's a great solution for the virtualized RAN
and Cloud RAN deployments.
The solution is mature
and we are actually driving some field trials at the moment
and commercial deployments later this year.
Now you've mentioned Aerial a couple of times.
Can you tell me a little bit about how that was designed
and is this a cloud-native
architecture from NVIDIA? Yes. So, Aerial is a full platform for 5G virtualized RAN and cloud
RAN deployments. It utilizes NVIDIA converged accelerators with our BlueField DPU, the data
processing unit, and an A100-class GPU. It provides full RAN layer 1 inline acceleration and offload.
It's also software-defined and supports all configurations from 4T4R to massive MIMO 32T32R and 64T64R configurations.
So unlike other solutions in the market that hit one sweet spot, like 4T4R, but don't necessarily scale to 32T32R and 64T64R,
the NVIDIA Aerial solution is completely software-configurable.
So the same platform can be configured in many different ways,
depending on the use cases.
In addition to what I said about the Aerial platform,
the Aerial software stack is designed ground-up as cloud-native software.
So the Aerial architecture facilitates RAN functions to be realized as microservices in containers, orchestrated and managed by Kubernetes. Now this modular software supports
much better granularity and increased speed of software upgrades, releases, and patches,
independent lifecycle management following DevOps principles and CI/CD, independent scaling of different RAN microservice elements,
and application-level reliability, observability, and service assurance.
So for a true cloud-native RAN experience,
the cloud, the edge platform, and networking, they all need to evolve.
And in my view, there are some requirements that are critically important
for the cloud-native and containerized RAN software stack to be commercially deployable.
So things like time synchronization, CPU affinity and isolation, and topology management.
You always need a high-performance data plane with low latency, quality-of-service guarantees,
high throughput, zero-touch provisioning, and so on.
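The CPU affinity and isolation requirement can be illustrated with a minimal sketch. In production this is handled by platform tooling such as the Kubernetes CPU Manager rather than by hand, and the core IDs here are arbitrary example values (the calls are Linux-only).

```python
import os

def pin_to_cores(wanted):
    """Pin this process to the requested cores so the scheduler never
    migrates it, falling back to whatever cores the OS allows."""
    allowed = os.sched_getaffinity(0)           # cores we may run on (Linux-only)
    cores = (set(wanted) & allowed) or allowed  # never request a forbidden core
    os.sched_setaffinity(0, cores)              # hard-pin the process
    return sorted(os.sched_getaffinity(0))

# Core IDs 2 and 3 are made-up example values for "isolated" cores.
print("pinned to cores:", pin_to_cores({2, 3}))
```

Keeping latency-critical RAN threads on dedicated, isolated cores is what makes the tight data-plane deadlines achievable at all.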
So one other thing is, as you know, the Kubernetes framework allows for something
called operators that let you discover the acceleration capabilities and schedule
workloads on the right nodes.
Because if you do that, then that gives you better performance per watt and performance
per dollar.
So at NVIDIA, in our Aerial platform, we've developed two key Kubernetes operators, the NVIDIA GPU
operator and the NVIDIA network operator, for vRAN deployments.
So as you can see, the NVIDIA Aerial platform is built ground-up with a microservices, cloud-native architecture, and it provides a solid foundation for building and deploying the 5G RAN completely in the cloud.
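The operator idea, discovering node capabilities and placing workloads where performance per dollar is best, can be sketched in a few lines of plain Python. The node names, labels, and prices below are all made-up illustrative values, not anything the NVIDIA operators actually expose.

```python
# Hypothetical node inventory; the labels mimic capabilities that
# Kubernetes operators would discover and advertise on each node.
nodes = [
    {"name": "edge-01", "labels": {"gpu": True,  "ptp-sync": True},
     "cost_per_hour": 4.0, "rel_perf": 10.0},
    {"name": "edge-02", "labels": {"gpu": False, "ptp-sync": True},
     "cost_per_hour": 1.0, "rel_perf": 1.0},
    {"name": "edge-03", "labels": {"gpu": True,  "ptp-sync": False},
     "cost_per_hour": 3.5, "rel_perf": 9.0},
]

def schedule(workload, nodes):
    """Pick the feasible node with the best performance per dollar."""
    feasible = [n for n in nodes
                if all(n["labels"].get(k) for k in workload["requires"])]
    if not feasible:
        return None
    return max(feasible, key=lambda n: n["rel_perf"] / n["cost_per_hour"])

# A hypothetical vDU needs both GPU acceleration and PTP time sync.
vdu = {"name": "vdu-l1", "requires": ["gpu", "ptp-sync"]}
chosen = schedule(vdu, nodes)
print("scheduled on:", chosen["name"])
```

Only `edge-01` satisfies both requirements, so the vDU lands there; a GPU-only workload would instead go to whichever GPU node scores best per dollar.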
That's really cool. You mentioned edge, so I've got to bring it back to edge. And
we've been debating on Tech Arena, the definition of edge. And it seems funny because we've been
defining edge, I feel like, for years now, but I don't feel like we've coalesced around a definition.
So I'm very interested in what NVIDIA's vision for edge is and how you look at defining edge. What are the challenges of growing that edge footprint
and how NVIDIA plans on investing
and engaging in the edge?
Yeah, so let's unpack all of those questions.
Let's first start with the definition of edge, right?
So I think the need for edge computing
is fairly well-established at this point.
We need the simplicity, composability, and automation of the cloud-native architecture,
but we also need to support distributed processing.
What I mean by that is processing closer to where the application is and where the data resides.
So moving everything into a public cloud, that will just not work for today's latency-sensitive applications
that need faster decision-making based on AI and
machine learning algorithms. So the cloud fundamentally has to be distributed, and it
has to come closer to where the application is. And to me, that is the essence of what
edge computing is all about. So it's not so much about location, it's really the flexibility
and the scalability of applications, right? Distributed edge applications. Now, if you look at the computing and connectivity landscape, AI is becoming very pervasive and we
are seeing a tremendous growth in AI and machine learning in every application segment, including
many edge use cases and applications. But if you look at the compute performance, it hasn't really
kept pace and Moore's law has reached its limit. So what we really need
is an accelerated computing infrastructure that can keep pace with the needs of modern applications.
Now, similarly, if you look at wireless connectivity, right, going from 4G to 5G
and eventually to 6G, we need 100x or more generational improvement in increasing performance
and reducing latency.
It's difficult to deliver this in a standard CPU-based implementation.
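To see why those deadlines get so tight, it helps to look at 5G NR slot timing: each step up in numerology doubles the subcarrier spacing and halves the slot duration, shrinking the layer-1 processing budget accordingly. A quick sketch of the 3GPP numbers:

```python
# 3GPP NR numerologies mu = 0..4: subcarrier spacing is 15 kHz * 2**mu
# and the slot duration shrinks to 1 ms / 2**mu, so the receiver's
# layer-1 processing budget shrinks with it.
slot_ms = {mu: 1.0 / 2 ** mu for mu in range(5)}

for mu, slot in slot_ms.items():
    scs_khz = 15 * 2 ** mu
    print(f"mu={mu}: SCS={scs_khz:>3} kHz, slot={slot:.4f} ms")
```

At 30 kHz spacing (common for mid-band 5G) a slot is already only 0.5 ms, and millimeter-wave numerologies push it to 0.125 ms, which is why sustaining this on general-purpose CPUs alone is so hard.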
So what's happening?
So as a result, what we are seeing is some vendors are building fixed function acceleration in ASICs
to supplement the lack of CPU performance.
In my view, that's a completely wrong approach,
and it sets us behind many years to the old era of fixed function appliances.
This is the very problem that we've been trying to address with RAN virtualization on standard COTS platforms.
If you think about it, the whole premise of virtualized RAN is that hardware-software disaggregation:
how do we build radio access networks on standard server platforms? Now, this is where NVIDIA's GPUs come into the picture, because an
accelerated general-purpose compute platform with NVIDIA GPUs delivers where Moore's Law cannot,
and it can be a great solution for both AI and 5G applications. So that's the first changing
compute landscape and why we need the accelerated computing infrastructure.
Now, there are a few other challenges
that need to be addressed
for edge computing to become pervasive.
The biggest, in my opinion,
is how do we provide an easy button environment
for developing edge applications?
And this is another area
that we at NVIDIA have been working on.
So NVIDIA AI Enterprise
with our base command and fleet command software
enables the enterprises to run their AI applications
in the NVIDIA GPU cloud,
leveraging all the pre-built and hardened software
for various vertical segments.
I'll give you some examples.
NVIDIA Metropolis, for example,
for video analytics and IoT applications.
NVIDIA Merlin for recommender systems.
NVIDIA Isaac for robotics.
And NVIDIA NeMo for natural language processing, speech recognition,
and text-to-speech synthesis models. Think about how powerful it will be for the 5G connectivity
to be available as a containerized solution for enterprises to deploy in the cloud on the same
infrastructure that runs all these AI applications. That will be truly game-changing. That will
transform how the world thinks about wireless connectivity. It will truly make 5G a cloud-based service that can be deployed on demand. That is our vision, and that's the value when it's run fully in the cloud.
In this context, though, we see a lot of challenges with RAN due to timing, synchronization,
latency requirements. What's your perspective on how we solve that and how does Cloud RAN fit in?
I'm really passionate about this topic of Cloud RAN.
After I joined NVIDIA in September last year, I've been leading the Cloud RAN architecture at NVIDIA.
I have to say, recently I've been observing that there's been a lot of discussion in the industry
about Cloud RAN, and as the industry leader in accelerated computing platforms and cloud computing,
NVIDIA has been at the forefront of cloud RAN innovations.
However, my observation is that many industry leaders are using the term cloud RAN to simply
mean a cloud-native implementation of RAN. Now, while the use of cloud-native technologies for
building RAN solutions is table stakes and much needed, the real question is, does cloud RAN
just equate to using cloud-native technologies? And I contend that it does not. I truly believe that a cloud RAN has to have all
compute elements, the virtual DU, the virtual CU, and the distributed UPF, all completely deployed
in the cloud. So therefore, from an NVIDIA perspective, we are changing the nomenclature,
and we are encouraging the use of the term RAN in the cloud instead of Cloud RAN to describe 5G radio access network that is fully hosted as a service in a multi-tenant cloud infrastructure.
Now, you may ask, why is this distinction important, and what is the motivation for RAN in the cloud?
So, like we discussed earlier, RAN constitutes the biggest CapEx and OpEx spending for telecom operators.
And it's also the most underutilized resource with most radio base stations typically operating below 50% utilization.
Moving RAN compute completely into the cloud brings all the benefits of cloud computing, pooling and higher utilization in a shared cloud infrastructure, resulting in the biggest CapEx and OpEx reduction for telco operators.
Now, COTS platforms with GPUs can also accelerate not just 5G RAN, but also edge AI applications. And telco operators and enterprises today are already using NVIDIA GPU servers for
accelerating their AI applications. Also, it gives them an easy path to utilize the same GPUs for
accelerating the 5G
RAN connectivity in addition to their AI applications, which basically means it reduces
the TCO and provides the best path for setting up enterprise 5G networks. In addition to all this,
cloud software tools and technologies have also matured over the years and are now delivering
the benefits of at-scale automation, reduced energy consumption, elastic
computing, and auto-scaling on demand, in addition to better reliability, observability,
and service assurance.
So overall, the value proposition really is, how can we shift CapEx to OpEx and make the
RAN connectivity completely as a service offering in the cloud?
There are the OpEx benefits delivered with auto-scaling
and energy-management capabilities,
and the overall TCO benefit because of multi-tenancy
and using the GPU-based accelerated infrastructure,
not just for RAN, but also for running the AI applications.
One last thing, actually, like I said earlier,
some vRAN vendors in the market today
are designing ASIC-based fixed-function accelerator cards
for RAN layer 1 offloads. Now, RAN built on these ASIC-based accelerators is akin to a fixed-function
appliance in my mind. It can only do RAN processing. It is a wasted resource when it is not being used,
like in the nighttime or on weekends when the utilization is low. So NVIDIA's Aerial platform with
general-purpose GPU-accelerated servers delivers a truly multi-service and multi-tenant platform, which can be used for 5G RAN, enterprise AI, video services, and other applications deployed in the cloud with all the benefits that we talked about.
That's really interesting.
And I'm so excited that we are entering Mobile World Congress season
and we're going to see what the industry is doing
with these technologies.
You covered so many, Rajesh,
from Edge to RAN to Cloud RAN.
What are you most excited to see at MWC this year?
And obviously, NVIDIA is going to be there.
Are there any highlights
that you're looking forward to from NVIDIA?
Yeah, we live in interesting times.
Connectivity has become akin to oxygen today, right? It's impossible to even spend a few minutes
in today's world without connectivity. And as we discussed earlier, most modern applications will
require distributed processing at the edge to meet the latency and quality of service requirements
of these applications.
And this confluence of 5G, AI, and cloud is super interesting.
And it's, in my mind, game-changing.
I think it's really transforming the way we live, quite frankly.
So like I explained earlier, to some extent, 5G RAN is a weak link,
and it's not keeping pace with AI and cloud computing. And the approach that many VRAN vendors have taken with a fixed-function ASIC-like acceleration
to get around the limits of Moore's Law
sets us behind in our vision to drive
a fully programmable and software-defined infrastructure.
What we really need is a general-purpose acceleration platform
that can bend the performance curve where Moore's Law cannot.
And this is what we are trying to do at NVIDIA.
So bring the GPU-accelerated computing
and the virtues of standard high volume server platforms
to transform the RAN and the edge.
I also believe that RAN in the cloud is the future.
It's a natural evolution
and the next step for the wireless market.
A virtualized RAN built using cloud native technologies
is definitely a necessary first step.
However, to realize the cloud economics
for 5G RAN,
increase utilization of the RAN infrastructure,
and drive the co-innovation of 5G
with edge AI applications,
we must embrace the principles
of a full RAN in the cloud.
This is our focus for NVIDIA's Aerial platform,
which delivers a fully programmable,
scalable, and cloud-native software architecture
as a foundational technology for RAN in the cloud.
We are actually going to be demoing some of this at the Mobile World Congress in Barcelona next month.
You'll hear us talk a lot more about not just AI and cloud computing,
but also what we're doing with 5G and, in particular, RAN in the cloud
at NVIDIA's event, the GTC event that we have in March.
There's a lot in store over the next couple of months. Just in the broader context, you know me,
actually, I'm a dreamer, and I'm truly excited at what the future holds for all of us as we bring
5G, AI, and cloud computing together at the edge to build the applications of tomorrow. Rajesh, it's always lovely to talk to you.
And I learn so much every time we do.
Thank you for being on the program today.
I have one final question for you.
Where can folks reach out to learn more about the aerial platform and other things that
NVIDIA is delivering to service the network and edge?
And how can they engage with you?
You can always connect with me on LinkedIn.
And as far as more information about Aerial and what we're doing to transform the radio access network,
you can look at developer.nvidia.com/nvidia-aerial-sdk-early-access-program
or simply Google NVIDIA Aerial
and it should actually take you to this landing page.
Fantastic.
Thanks for being on today.
It's a pleasure.
Thank you, Allyson.
It was great talking to you
and look forward to connecting with you at MWC
and perhaps also at GTC.