Utilizing AI - 3x03: Platform Considerations For Deploying AI At Scale with Tony Paikeday of NVIDIA
Episode Date: September 21, 2021

Enterprises are working to simplify the process of deploying and managing systems to support AI applications. That's what NVIDIA's DGX architecture is designed to do, and what we'll talk about on this episode. Frederic Van Haren and Stephen Foskett are joined by Tony Paikeday, Senior Director, AI Systems at NVIDIA, to discuss the tools needed to operationalize AI at scale. Although many NVIDIA DGX systems have been purchased by data scientists or directly by lines of business, it is also a solution that CIOs have embraced. The system includes NVIDIA GPUs of course, but also CPU, storage, and connectivity, and all of this is held together with software that makes it easy to use as a unified solution. AI is a unique enterprise workload in that it requires high storage IOPS and low storage and network latency. Another issue is balancing these needs to scale performance in a linear manner as more GPUs are used, and this is why NVIDIA relies on NVLink and NVSwitch as well as DPUs and InfiniBand to connect the largest systems.

Three Questions

How big can ML models get? Will today's hundred-billion-parameter model look small tomorrow, or have we reached the limit?
Will we ever see a Hollywood-style "artificial mind" like Mr. Data or other characters?
Can you give an example where an AI algorithm went terribly wrong and gave a result that clearly wasn't correct?*

*Question asked by Mike O'Malley of SenecaGlobal.

Guests and Hosts

Tony Paikeday, Senior Director, AI Systems at NVIDIA. Connect with Tony on LinkedIn or on Twitter at @TonyPaikeday.
Frederic Van Haren, Founder at HighFens Inc., Consultancy & Services. Connect with Frederic on Highfens.com or on Twitter at @FredericVHaren.
Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen's writing at GestaltIT.com and on Twitter at @SFoskett.

Date: 9/21/2021
Tags: @TonyPaikeday, @nvidia, @SFoskett, @FredericVHaren
Transcript
I'm Stephen Foskett.
And I'm Frederic Van Haren.
And this is the Utilizing AI podcast.
Welcome to another episode of Utilizing AI,
the podcast about enterprise applications for machine learning,
deep learning, and other artificial intelligence topics.
This week, Frederik and I are talking about the many ways in which different platforms are challenged by AI applications, and the fact that AI requires a completely different set of
infrastructure and resources than conventional applications. Yeah, indeed. I mean, AI really is based on a bunch of software
frameworks that are heavily mathematically based. And the need for mathematical multiplications
and executions per second has grown so fast that the traditional concept of a CPU hasn't
kept up. And so there's a need for a much faster capability to process all those multiplications.
And the GPUs obviously are a perfect solution to solve those problems.
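To make that concrete, here is a minimal sketch of the matrix-multiply throughput gap Frederic is describing, assuming PyTorch and a CUDA-capable GPU are available; the matrix size and iteration count are arbitrary illustration choices, not a benchmark:

```python
# Illustrative only: rough matmul timing on CPU vs. GPU, assuming PyTorch.
import time
import torch

def time_matmul(device: str, n: int = 4096, iters: int = 10) -> float:
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)
    torch.matmul(a, b)  # warm-up so lazy initialization doesn't skew timing
    if device == "cuda":
        torch.cuda.synchronize()  # wait for asynchronous GPU work to finish
    start = time.perf_counter()
    for _ in range(iters):
        torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

print(f"CPU: {time_matmul('cpu'):.4f} s per 4096x4096 matmul")
if torch.cuda.is_available():
    print(f"GPU: {time_matmul('cuda'):.4f} s per 4096x4096 matmul")
```

On most hardware the GPU timing comes out dramatically lower for large matrices, which is the gap Frederic is pointing at.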
We've talked in previous episodes about the need for much more networking bandwidth and storage capacity, storage resources in terms of performance, memory, and of course, as you mentioned, GPUs.
And of course, any talk of GPUs leads to the monster of the GPU market, which is NVIDIA.
So we're very pleased to be joined today by somebody from NVIDIA. Tony Paikeday is somebody
that we've talked with previously about all the interesting aspects of, well,
the changes that are coming in platforms to support enterprise AI applications. Tony,
why don't you introduce yourself a little bit to the audience?
Hi, thanks for having me. I'm Tony Paikeday from NVIDIA. So I'm responsible for AI systems,
product marketing at NVIDIA. We have a portfolio of enterprise solutions called the DGX system.
So a lot of my team's charter is around helping enterprises around the world, you know, kind of democratize access to AI and AI infrastructure and build incredible applications to help power their business. It seems to me that the key to
understanding the DGX systems is not that it's some kind of, I don't know, special configuration.
It's that it's all about balance. It's about balancing the system resources to support the
GPU in AI and other GPU heavy workloads. And that to me is the key here, because I think a lot of people
think, well, if I just throw everything at the problem, then everything will work great. And
that might be true, but it strikes me that DGX really isn't about throwing everything at the
problem. It's making sure that the system is ready to keep the GPU busy. Is that right?
Yeah, that's certainly, excuse me, a very important part of this.
But I would actually say that the way we've looked at the problem in enterprise is around
how do you simplify how enterprises can deploy and manage infrastructure specifically for the
purpose of running AI workload. And so when you look at it
from that perspective, there's a lot of layers in the equation. Certainly the GPUs that are there,
all the things that surround it from an IO, bandwidth, memory, storage, network fabric,
all those things matter certainly. The architecture that we're talking about, this design balance is very important.
And I think there's a lot of organizations that oftentimes try to piece this stuff together. And
sometimes you have the expertise, sometimes you don't necessarily in terms of striking the right
design balance to ensure that GPUs are kept fed with data during a training run.
But even beyond that, what we found at NVIDIA is just as importantly or even more importantly
than the hardware is obviously the software.
We spent a lot of time within the DGX business unit
and NVIDIA at large optimizing a complete software stack.
And what we've realized is that everyone,
from data science developers and practitioners
to people who manage IT infrastructure,
essentially needs the right tools and platform
such that they can actually operationalize AI at scale. And what I mean by that is being able to
see more of their valuable intellectual property in terms of viable models and prototypes actually deployed in production.
This is a classical problem that's been solved in conventional enterprise apps,
but a lot of businesses are right now struggling with how to manage and scale workflow
that can allow data science developers to do incredible things on one end
and have it realized in production applications at the other. So we spend a lot of time thinking about the tooling and the software
that needs to enable that. And then in combination with that, making expertise available to our
customers such that when they have a question about a framework, about a model type, about a
use case, or about things like drivers, libraries,
and communication primitives,
that they have someone that they can talk to.
So really for us, the approach has been full stack
in that respect to help organizations scale AI.
Yeah, I totally agree with that.
I mean, not so long ago, you needed like an MBA
and a bunch of PhDs to set up an AI environment.
I mean, let alone the complexity of the hardware.
So I do really agree what you said, that there is a need for, first of all, a complete software stack, right?
So many more people do AI today or have to do AI to stay competitive.
And they need not only the tools but also the support in order to
make this happen. A lot of people talk about the democratization of AI where you provide the
hardware and I think there's one key component you brought up which is the software. The hardware by
itself is not the end solution, it's the complete stack and the ability to help end users.
And I also want to point out that NVIDIA is not necessarily known
for contributing a lot in the open source community,
but the reality is they do, right?
And it's just to enable not only the hardware,
but the ability for people to do AI that don't necessarily know all the different components.
Yeah, I'm so glad you brought that up, Frederic, because if you think about applications like NLP or recommender systems or any number of foundational AI use cases, many organizations, maybe most, lack the data science expertise or the
access to data sets or experience building and training models to be able to do that
stuff from scratch.
So we really do folks a disservice if we don't give them ready-made tools that allow them
to pick, for instance, pre-trained models off the shelf
and then do fine-tuning around the edges to optimize them for their unique vocabulary or
their unique set of problems, right? So I think increasingly the state of the art is one that
reflects what you just described, namely offering more and more ready, kind of plug-and-play type applications delivered in, again, pre-trained
models, scripts, and other content that lets developers exert a lot less effort doing the
foundational work and simply being able to now plug and play into their enterprise.
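For readers who want to see what that "fine-tuning around the edges" pattern looks like in practice, here's a hedged sketch assuming PyTorch and a recent torchvision; the ResNet-50 backbone and the 10-class head are placeholder choices, not anything NVIDIA-specific:

```python
import torch
import torch.nn as nn
from torchvision import models

# Grab a pre-trained backbone "off the shelf" (ResNet-50 is a placeholder).
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze the backbone so the foundational weights stay intact.
for param in model.parameters():
    param.requires_grad = False

# Swap in a new head sized for your own label set (10 classes is arbitrary).
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the new head's parameters get optimized during fine-tuning.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

The design point is the one Tony makes: the expensive foundational training has already been done, and the developer's effort shrinks to adapting the last layers to their unique vocabulary or problem.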
Right. I would even say that customers today are basically saying, I want a solution.
They don't want to detail all the different little items.
It's to the point where people understand that the expertise is with NVIDIA from a software and a hardware stack. And they're just trying to, you know, let's buy a solution that works for us.
Yeah. You know, you know, DGX has been around for five years. And if I look at our own trajectory, which I think reflects what we've seen in the broader market, so much of the early work that we saw with AI pioneers was born on the backs of like what you'd consider hyperscale type organizations who had deep pockets and incredible bench of expertise to build incredible infrastructure to solve really complex problems,
right? And you'd expect them to do because they have the capital and operating budget to do that.
And a lot of what we'd seen over that time you might even classify, if you're looking at enterprises,
as potentially shadow AI, where you'd have data science teams or business units building what
they needed to build outside of any kind
of IT governance and certainly outside of any kind of IT shared infrastructure.
And now we're seeing more and more CIOs and IT leaders wanting to define the infrastructure
strategy because they see, in some respects, a lot of costs running out of control as developers run up OPEX trying to do a DIY platform instead of having an IT
sanctioned environment that centralizes all of that. So this has now put IT in the lens and
forced us all to think about how do we make it simpler for IT teams to manage.
Yeah, so we're talking about the DGX. I mean, the DGX, I think GPUs. I mean, one other thing which is important is, as Stephen mentioned, there is not just the GPUs, which you could consider that compute, but there's also the network and the storage, right?
So from a network perspective, most people know that you have acquired Mellanox.
So you kind of have the two pieces from a hardware perspective. And then with storage, I assume you're partnering with industry leaders from a storage perspective.
And then the announcements about ARM were also very interesting, where you kind of tried to optimize your infrastructure.
So is that a trend you see from enterprises where you kind of feel the need to fill the gap, so to speak, like with Mellanox and other components?
Do you see yourself kind of selling not just the DGX, but a solution where you provide NVIDIA-stamped network and storage? Yeah, this is really driven by the problem statement:
how do you achieve the fastest time to solution on the most complex AI problems an enterprise might face?
And the challenge has been when the essential resources
are disaggregated from each other and treated in almost piecemeal,
you have a real challenge in terms of how do you parallelize
the problem,
the computational problem, in a way that you can effectively scale and shrink that time to solution,
right? And when things are not necessarily cohesive, or there is a lot of time and distance
between where the data lives and where the compute lives, and there's a lot of latency in the fabric connecting those
things, you stretch time to solution out and your ability to distribute the problem, let's say on a
training run across more and more nodes because you need to scale compute capacity, escalates
or exponentially increases out of control, whereby the language model, you know,
that you're trying to build might take weeks to months. But with the optimized architecture,
and if you are able to, again, shrink time and distance between all these resources and make
them appear as one computational unit, one really big processor with really big RAM and commensurate storage, then you
now are able to take that problem that was solvable in weeks or worse and now deliver
an answer on a training run in potentially minutes or hours.
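A back-of-envelope model makes this point about latency and scaling concrete. The overhead numbers below are invented purely for illustration; nothing here is an NVIDIA figure:

```python
# Toy scaling model with invented numbers: per-node communication overhead
# erodes linear speedup as the cluster grows.
def effective_speedup(nodes: int, comm_overhead_per_node: float) -> float:
    # Ideal speedup is `nodes`; each added node taxes every training step
    # with a little more synchronization/communication time.
    step_time = 1.0 + comm_overhead_per_node * (nodes - 1)
    return nodes / step_time

for n in (1, 8, 32, 140):
    loose = effective_speedup(n, 0.02)    # high-latency fabric (assumed)
    tight = effective_speedup(n, 0.001)   # low-latency fabric (assumed)
    print(f"{n:4d} nodes: loose fabric {loose:6.1f}x, tight fabric {tight:6.1f}x")
```

With these illustrative numbers, the "loose" fabric delivers only about 37x at 140 nodes, while the low-latency fabric stays near 123x — the linearity problem discussed later in the episode.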
And that's kind of what's forcing all of this is that we see where the state of the
art is going with the most important use cases
that enterprises are trying to tackle. And we're seeing the need for a purposeful approach in terms
of the right kind of network fabric with minimal latency and highest bandwidth, with having the
right kind of storage subsystem that's compute proximate and compute that's data proximate, all those things coming
together, forcing you into really this optimal architecture, kind of like we started out with
this design balance problem, right? Right. Yeah. And then I would like to talk a little bit about
scalability, right? The funny thing about an AI project is a successful AI project will actually
generate even more data, right? So your problem actually becomes worse and
worse over time if you don't have a scalability model. Is that something that you see happening
in the future as well where people keep on adding more and more data or do you see technologies like
transfer learning kind of helping you taking some shortcuts left and right in order to kind of not start from scratch and always having to add data, but to start from a baseline?
Yeah, I think a number of these techniques will help mitigate certainly the amount of data coming into the enterprise that's either fueling initial
model development and prototyping or simply operational data that comes in every day,
every minute, every second being processed by models at the edge of the enterprise.
I think what all of this is forcing is a rethinking of where we situate this infrastructure
relative to the data.
We think a lot about data gravity.
That's been top of mind for us and a lot of folks in our ecosystem
because we see what's happening and we see that a lot of organizations
are spending a lot of time and effort fighting data gravity.
And it's like fighting planetary gravity. You know,
you spend a lot of time and money escaping it or working against it when really what you want to
do is bring your resources to where the data is, bring your applications to where the data is. And
this is why we see a lot of organizations, for instance, repatriating workload in close proximity to where that
data is being created.
I heard this bumper sticker version of this years ago, train where your data lands.
And I've kind of assimilated it and made it my own because I definitely feel that that's
true.
And it's true for a lot of our customers as well,
in terms of their mentality around what they need to do as far as their infrastructure and resource strategy.
Yeah, I think data gravity is,
I mean, if you ask a lot of enterprises,
they will bring that up
as one of the issues they're having.
But I think also data gravity
has to do a lot with architecture, right?
So how it's designed, you know,
with AI, typically you can start small.
If you don't have an architecture that scales, nothing is going to help you with that.
Right. And that's why I was mentioning earlier, a solution where the architecture
is kind of built in will help enterprises be more efficient without having to think about it. Right.
You don't want to make the same mistakes over and over and over.
Yes. Yeah, absolutely. That's what I was kind of trying to get at earlier when I
was talking about the balance of the system, because it's not about delivering maximum GPU
throughput, because of course you can't do that unless you can deliver a system that can keep
those things fed. And I think that maybe that's one area that people
sell NVIDIA a little bit short on is because they look at the company as basically the GPU company,
and they don't think of it as the systems company. And I guess that must be tough for you,
because it's like, you're the systems guy, right? I mean, that's, that's your thing.
Yeah, definitely. So, you know, traditionally, or historically, people have not thought of us as an enterprise
systems company or a provider of IT infrastructure. And for the very reasons you described,
and rightfully so, if you look at kind of our heritage, where we came from,
that doesn't surprise me at all. I think gradually, we are starting to change that mindset, but, you know, enterprises
also have expectations in terms of, you know, we love hero stories around big science and a lot of
the incredible work done at the leading edge of research. But, you know, I think what we've worked
on intently over the last few years is with our customers, with industries
helping to showcase, I think, the less sexy, maybe more boring, but fundamental, pragmatic
use cases that are powering businesses today, especially in like challenging turbulent times,
things that are helping them to improve customer intimacy
and enhancing every customer interaction, streamlining operations and reducing costs
and delivering competitive agility.
These are things that almost every business we talk to cares about.
And coincidentally, they're now talking with us about how to, you know, deploy the right kind of AI infrastructure to do these exact kinds of things, which AI is obviously really great at.
Yeah, without making it too much about you personally, I think the fact that NVIDIA has people like you who come from more of an enterprise, you know, data center background instead of people who are just more, you know, GPU focused. I think
that that actually really helps the company. And frankly, that's one of the things, in my opinion,
that they got with Mellanox as well, is that you've got a group of people who are used to
selling not just into HPC, which of course they are, and cloud, but into enterprise environments
as well. Did you find that it was a challenge though? I mean, is this a continuing challenge in the core DNA of the AI compute architecture,
right, of these AI compute systems? And I think there's kind of two key modes of operation or
two kinds of IT essentially that we see flourishing now in enterprises that are really leaning into AI.
And on one end, you have infrastructure that customers need that is
purpose-built to do one thing and one thing only, and that is to take the most complex model
and shrink it down into, you know, the shortest amount of time possible to deliver an answer,
right? Shortest time per training run and the most complex AI problem that they've got.
And they come to us for that, right?
And they know, for instance, that a DGX system,
it's got one purpose in life, and that is to execute that training run
as fast as possible and iterate as fast as possible.
But on the flip side of it,
on the other side of IT,
when it comes to deploying a tuned model, one that's ready for inference, one that's ready to be used in production, enterprises have a lot of viable solutions that embed our technology, but in what we would consider approved servers or certified servers, we call them NVIDIA-Certified, but essentially servers that
incorporate the best of our technology, but offered by our ecosystem partners. And so we see this
duality coming together and most enterprises kind of need both. They need the development
infrastructure, they need the deployment infrastructure. We're kind of in all of it.
And, you know, this also brings our valued server partners into the equation as well, because
they enable this, you know, our customers have made a huge investment in a lot of household names
that are pervasive across their data center. And they want to be able to leverage that same
investment to be able to run these applications and deployment at scale. And so we want to enable
that. And that's important, of course, because as you
mentioned, there are companies that NVIDIA works closely with and partners with, and certainly
you're not going out there and fighting with any of these big ISVs. You're partnering with them
and enabling them to sell solutions as well. Yeah, absolutely. And that's, you know, also part of why we build
the platforms that we do. Oftentimes, what we see missing in the marketplace is a gap that we can
step into and build kind of the proof point or the blueprint, offer that blueprint to our ecosystem,
our partners who know how to do this stuff at scale can take that
and then build solutions using it. And that's kind of the mission of DGX. It's the mission of a lot
of the things that we build that essentially provide that blueprint to ISVs around the world
who do this really well at scale and let them be successful with it. So yeah, you're absolutely right. Yeah, you talked a little bit about corporate IT. I think one of the trends we have
seen in the past is that typically GPUs was just for the research crowd, right? The people that
were really technical, had an understanding of what was going on. I think nowadays, because AI is such a dominant factor, it's moving more towards the corporate IT world, where it's becoming more and more standard.
However, there is still a technology gap between the hardcore research people, if you want, and corporate IT.
How do you see enterprises have conversations with NVIDIA?
Do you feel that the conversations are more technical or more strategic?
It's really both.
It's evolved over time.
It started off, and it still is the case that, you know, developers are our best friend.
They grow up, you know, from the earliest stages and starting in school using our toolkits
and our software.
Many of them cut their teeth on CUDA
and learn how to work with our GPUs from there. And then they eventually land in enterprise and
building incredible things. So over the years, obviously, we've courted the developer community
because we want them to have an incredible experience using our software to do what they
need to do, their life's most important work.
On the flip side of it, obviously, increasingly, we're getting on the CIO's radar.
And we have regular roundtables with CIOs around the globe meeting with our executives.
And we have this ongoing dialogue with pretty much every enterprise focused AI business that's out there.
And a lot of times those conversations turn to, you know, how do I take this stuff that used to
be science projects and now operationalize it at scale? And those conversations very much are
around strategy. They're very much around things like MLOps and how to manage this kind of
infrastructure, and why do I need purpose-built infrastructure, and what does the ecosystem and the
whole offer look like beyond a really fast computational box? What does the complete end
architecture look like? So it's really kind of hitting both ends. And this is why you see, for instance, our partnership with companies
like VMware, right? And those in the storage community, we work with them because we know
that a lot of our partners are the trusted names with leadership. And by us working together,
we're simplifying a problem for our customers and enabling them to onboard this infrastructure much quicker.
They all need to do this.
And many of them are doing this because they need to scale.
They need to solve the shadow AI problem, you know, development silos spawning across their enterprise.
They want to consolidate people, process, and platform.
And so centralized, shared IT infrastructure enables them to do that. And
so we see ourselves sitting at the table with them trying to help them with that.
Right. I mean, it seems like you're helping customers with their AI roadmap, not necessarily
the NVIDIA component of it. It's just like a roadmap and how NVIDIA can help and point out which partners can help them be successful.
Yeah, very much so. I think the most valuable guidance we can give in the conversations we want to have with these customers,
you know, on the development side of it is to share with them the best of what we know, whether that's commercialized or not. Oftentimes it's not. It's oftentimes just simply a matter of connecting, you know,
a developer researcher with one of our own, you know, PhDs or scientists or researchers and
letting those conversations happen because we've made it a point to onboard some of the best talent
we can find out there from science and academia
and enterprise and make those resources available to our customers so that they can benefit
from the same things we're figuring out at the same time.
I mean, we run a fairly large R&D shop and we are spending a lot of effort trying to
move the ball forward in some really critical places for enterprises.
We want them to be a part of that exploration and share in what we figure out along the way.
I wonder if we can turn a little bit here, a little bit toward more of the specifics of
what AI requires from a system. So you've been intimately involved with developing these
specialized solutions for AI applications. And of course, this is utilizing AI.
So maybe, can you talk a little bit about the various aspects of the system and maybe anything
you've learned over the years that is essentially, you know, really a critical component to building
a system to support AI applications? Yeah, it's a great question, Stephen.
The reality is that AI is unique in how it consumes resources, unlike typical enterprise workloads. And achieving, as I was saying, the fastest time to solution on a model requires having enough
computational power in combination with ultra-high-performance storage, really high IOPS,
really low latency to feed those data sets during a training run with everything connected over a
very high speed, low latency network fabric, which increasingly obviously is InfiniBand when you're
talking about multi-system, large cluster implementations. This is typically what we've found is needed to tackle
the largest AI problems or models that need to be ultimately parallelized over multiple systems,
if you want to have a reasonable timeframe within which to train those models.
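That "keep the GPUs fed" pressure shows up even at single-node scale in how input pipelines get written. Here's a minimal sketch assuming PyTorch; the dataset is a stand-in, and the worker/prefetch values are illustrative knobs, not recommendations:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class StandInDataset(Dataset):
    """Placeholder dataset; in real training, __getitem__ hits storage,
    which is exactly where IOPS and latency matter."""
    def __len__(self):
        return 10_000

    def __getitem__(self, idx):
        return torch.randn(3, 224, 224)  # fake a decoded image tensor

if __name__ == "__main__":
    loader = DataLoader(
        StandInDataset(),
        batch_size=256,
        num_workers=8,          # parallel reader processes hide storage latency
        pin_memory=True,        # page-locked buffers speed host-to-GPU copies
        prefetch_factor=4,      # batches staged ahead per worker (illustrative)
        persistent_workers=True,
    )
    for batch in loader:
        if torch.cuda.is_available():
            batch = batch.cuda(non_blocking=True)  # overlap copy with compute
        # ... forward/backward pass would go here ...
        break  # one iteration is enough for the sketch
```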
We talked about design balance. One of the challenges and the things that we solve for is ensuring linearity of performance with system scale.
And this is at multiple levels. The assumption is, I've got a bunch of PCIe-connected GPUs and I should be able to distribute my problem and get faster and faster performance
as I string together more and more of these systems. But we know that that breaks down
because as your problem gets larger and you try to parallelize it over more and more systems,
you incur a lot more communications overhead to distribute the problem across all those systems. And so you get diminishing returns
as you scale, if you approach it in the classical way of just, you know, a lot of PCIe connected
GPUs with a pretty standard ethernet fabric and multiple systems. And that's why a lot of what
is in the core DNA of our systems, things like NVLink, which offers a high bandwidth inter-GPU bus, if you will,
to make eight GPUs seem as one, right? And NVSwitch is the other part of that technology
that creates that inter-GPU fabric. Combined with the InfiniBand fabric connecting multiple systems, when you go from one to two to four to
140, which is, you know, the ultimate in scale that I think we've, you know, been able to
demonstrate with what we call a DGX SuperPOD. This, you know, having this approach at the core
system level, and then at the scale out level and being prescriptive around what the
network topology looks like across all those systems has enabled us to demonstrate that
linearity of performance such that there is no or minimal drop-off as you get to your 140th system. And it used to be, I would be floored by the idea of anybody stringing together 140
systems over a single network fabric. You know, a couple of years later, I'm actually not that
surprised because there are so many organizations who are doing exactly that, especially in areas
like NLP and large recommenders and autonomous vehicle system
development. This is the kind of scale that they're operating at, such that they can iterate fast
and get an answer to a training run in, you know, hours and days instead of weeks
and months.
Yeah, I agree. A couple of years ago, when two GPUs in a server was considered
the top and I said we wanted more, they called us crazy. And then, not that long after,
you know, there were more GPUs. So it seems like, NVIDIA, I mean, first of all, AI is all about
bottlenecks, right? You always have bottlenecks. You try to solve the bottlenecks with NVLink and so on. So you kind of keep on solving the bottlenecks you see on the road. Is there
any particular bottleneck you see today that NVIDIA really focuses on and believes that it will
accelerate and remove a lot of roadblocks? Yeah, if you look at, for instance,
GPUDirect Storage and Magnum IO,
we're solving for the problem of the inherent latency
incurred when the data path has to move through
this host CPU before it gets to the GPU.
And essentially short-circuiting that
and offering a streamlined path from the data store
in, like, external storage through a NIC in the server that's obviously optimized with what we call a DPU, a data processing unit, direct to the GPU.
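For a sense of what that shortened data path looks like from application code, here's a hedged sketch using the RAPIDS KvikIO library, a Python wrapper around the cuFile/GPUDirect Storage API (it falls back to a compatibility mode when GDS isn't available); the file path and buffer size are placeholders:

```python
import cupy
import kvikio

# Destination buffer lives in GPU memory from the start.
buf = cupy.empty(1_000_000, dtype="u1")

# Read file contents straight into the GPU buffer; with GDS enabled this
# is a DMA transfer that bypasses the host-CPU bounce buffer.
f = kvikio.CuFile("/data/shard.bin", "r")  # placeholder path
nbytes = f.read(buf)
f.close()

print(f"read {nbytes} bytes directly into GPU memory")
```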
It speaks to the very thing you raised, Frederic, namely this idea of eliminating every bottleneck as we see it that's inhibiting larger and larger levels of scale.
So GPUDirect Storage is probably the latest example I have, you know, in combination with the DPU,
that's allowing us to now ensure that there is minimal latency, minimal speed bumps between where the data lives and the computational power that needs to act upon
it, right? So that's very much the way you described it is perfect. Namely, it is really about
eliminating those bottlenecks, especially when you talk about distributing a problem over multiple
systems. Yeah. So you're talking about reducing the impact of the host? How about eliminating the host completely?
I mean, isn't that what the ARM idea was,
is to kind of provide a bootable GPU
that didn't require a host?
I actually don't know that we would,
I wouldn't look at it necessarily as eliminating the host.
I see it as an adjunct
and there is a natural bifurcation
of what functions exist on one
kind of processor versus the others. Because essentially, you know, we're always going to have
mixed workloads, except in the realm of, you know, training, where I think if you're trying to
implement infrastructure to train very challenging models, very complex problems,
you're going to be very purposeful
and very singular in what kinds of workloads run on that infrastructure. But increasingly,
there is kind of this deployment infrastructure that needs to handle a much wider palette
of mainstream acceleratable applications. And in those environments, you need to have a way to still support applications that depend on
traditional CPU, but also can offload as much of what doesn't need to be done on a CPU onto devices
like a DPU, as an example. So we see in kind of those heterogeneous environments, you're still
going to have kind of this multiplicity or duality of processor types. So it sounds like really NVIDIA
is not only the GPU company that everybody thinks, but also a major player in enterprise AI
applications. And I think that that may not come as a surprise to a lot of our listeners, but
maybe some of them it might, because many of the people who are just starting to look at deploying
AI applications are starting to ask themselves, what kind of system am I going to need in order
to support this? And quite frankly, the answer is that, you know, NVIDIA has already answered
that question for you with the DGX systems and in partnership with many of the familiar names
that you're probably already working with today. So you can find yourself a balanced system that not only supports small AI and ML applications,
but can scale up to really massive proportions here.
And they got that covered for you, especially now that they're rolling out new products
and technologies.
So the time has come, Tony, to move
on to the fun part of the podcast, where we talk about some things that are a little unexpected.
And that leads us to our famous three questions. This tradition started back in season two,
and we're now carrying it through to season three. But, you know, we're adding a little twist here.
So our guest has not been prepared for these questions ahead of time and we're going to get their answers off the cuff right as we speak.
The difference this season is that I'm going to ask a question and Fred's going to ask a question, but the third question actually comes from a previous guest on the podcast.
And of course if Tony has one he can pay it forward here and ask a question of a future guest on our podcast.
So let's kick things off. Fred, do you want to take the first question?
Sure. So how big can ML models get? I mean, today there's hundreds of billions of parameters for a model, which might look small tomorrow.
You know, is there a limit, you know,
can it keep on growing?
Yeah, it's a, it's a great question.
I am always careful not to try and define an upper bound because when I thought 8 billion parameters on a language model was a big deal,
lo and behold, you know, a few years later, here we are with GPT-3, right? So I, you know, all of this
will evolve and continue to scale in response to the infrastructure and the tools ability
to enable models of that size. I can definitely see that the use cases and applications will only
continue to drive us to larger and larger models.
So I'd really say that there isn't a conceivable upper bound if we're kind of keeping our imaginations open to the art of the possible or the art of what could be.
Excellent. Now for something a little bit more fun.
So in Hollywood, they love to show us artificial
intelligence that's basically an artificial person, like Mr. Data or somebody like that.
Do you think we'll ever get to that point where we'll have just sort of a general artificial mind,
somebody that we interact with walking around that's AI? You know, I think about this in two
ways. One is, if I really, you know, put on the science fiction
hat of things, the idea of a sentient being that is aware of itself and you can interact with it
like in a truly human way, that part of it, you know, kind of freaks me out to be perfectly honest.
I mean, I don't know that any of us are really prepared for it, but maybe that's an eventuality that happens sometime way off in the distant future.
If you look at the trajectory of things, you kind of wonder, could we eventually get there?
And that's a really hard one to wrap one's mind around. But what increasingly is apparent to me is with the advent of these incredibly large,
as we say, big NLP type models, they are increasingly presenting themselves in a way
that you almost think behind the covers, there's someone incredibly smart. I recently saw a video
pitting three different generations
of GPT models against each other doing trivia questions and such. And it floored me how
quickly they could deliver answers to some of the most, you know, arcane type questions and details.
And I think that that level of intelligence backed by essentially an algorithm that knows how to connect the dots from oceans and oceans of data in milliseconds.
I think that's something very real, very real and very possible.
And we're already kind of seeing that hit the doorstep of enterprise, just to be quite honest.
Our third question comes from a guest on season three, episode two. Take it away, Mike. This is Mike O'Malley, SVP of marketing and sales
for SenecaGlobal. And my question is, can you give an example where an AI algorithm went terribly
wrong and gave a result that clearly wasn't correct? I'd love to hear that. Yeah, you know, one example that's probably a lesson for all of us is I've seen where
we've had NLP-based chatbots in, you know, engaging the Twitterverse and basically over
time, you know, evolving to give answers that were really off color, really inappropriate,
really bad for general consumption, but the algorithm was simply doing what it was programmed to do in response to
the input received. And I think it's also a lesson in how, while this technology is incredibly
powerful, there needs to be careful governance and thought around the data fueling these things and
looking for things like bias and looking for explainability and understanding how the answer
to the question is derived, such that we don't have AI that essentially goes completely off the
rails and says or does a bunch of things that could really embarrass us or worse.
Thank you so much, Mike. And Tony, thank you very much for joining us today. We look forward to
hearing what your question might be for a future guest. And if you, the listeners, want to be part
of this, you can. Just send an email to host at utilizing-ai.com and let us know you want to be part of our
three questions segment.
Tony, thank you again for joining us today.
Where can people connect with you and follow your thoughts on enterprise AI and other topics?
Well, first of all, thank you for having me, Stephen and Frederick.
I had a great time.
You guys can find me at @TonyPaikeday on Twitter, and I'm on LinkedIn as well.
Great. We'll include that in the show notes. Fred, how are things going with you?
Doing well. So it's funny, we're having this conversation with NVIDIA because I'm working
on a project with a SuperPOD. So I'm learning all the internals and the outsides of the SuperPOD. So I'm looking forward to doing that.
Excellent.
And as for me, I've been really enjoying following some of the announcements coming out
of all the various exciting AI products.
And we've been covering a lot of that
on the Gestalt IT Rundown on Wednesdays
at gestaltit.com.
So thank you everyone for joining us here
for the Utilizing AI podcast.
If you enjoyed this discussion,
please do shoot us a
rating and review on iTunes because that sure does help. And also please share the show with your
friends. This podcast is brought to you by gestaltit.com, your home for IT coverage from
across the enterprise. For show notes and more episodes, go to utilizing-ai.com or find us on
Twitter at utilizing underscore AI. Thanks for joining and we'll see you next week.