Utilizing Tech - Season 7: AI Data Infrastructure Presented by Solidigm - 2x13: AI Needs Non-Traditional Storage Solutions with James Coomer of DDN
Episode Date: March 30, 2021

AI applications have large data volumes with lots of clients, and conventional storage systems aren't a good fit. In this episode, James Coomer from DDN talks about the lessons they have learned building storage systems to support AI applications. Inferencing requires terabytes or petabytes of data, often large files and streaming data. For example, autonomous driving applications generate hundreds of terabytes of data per vehicle drive, resulting in petabytes of data to ingest and process. DDN's parallel filesystem goes a step further than NFS with an intelligent client that directs I/O to leverage all network links and storage endpoints available. Deep learning loves data, and a smart client can make the whole application faster. Because data is the biggest AI challenge today, an advanced storage solution can really help deliver AI solutions in the enterprise. Although most companies realize that finding expertise (data scientists, etc.) is a major challenge, building infrastructure to support them is just as critical.

Guests and Hosts

James Coomer is Senior Vice President for Products at DDN. Connect with James on LinkedIn or learn more on Twitter at @DDN_Limitless.

Andy Thurai, technology influencer and thought leader. Find Andy's content at theFieldCTO.com and on Twitter at @AndyThurai.

Stephen Foskett, Publisher of Gestalt IT and Organizer of Tech Field Day. Find Stephen's writing at GestaltIT.com and on Twitter at @SFoskett.

Date: 3/30/2021
Tags: @SFoskett, @AndyThurai, @DDN_Limitless
Transcript
Welcome to Utilizing AI, the podcast for machine learning, deep learning, and other artificial
intelligence topics. Each episode brings experts in enterprise infrastructure together to discuss
applications of AI in today's data center. Today, we're discussing the challenges of
building scalable storage systems to support AI applications. First, let's meet our guest,
James Coomer from DDN. Hello. Yes, I'm James Coomer. I'm Senior Vice President of Products at
DDN. So go to ddn.com and you'll find the website for our company. We've been around for over 20
years building storage and data management solutions for really tough data challenges.
So when the capacities get big, when the performance challenges are very large,
then DDN is the company which our customers come to. I am Andy Thurai, founder and principal at
thefieldcto.com. You can find me on Twitter at Andy Thurai and on LinkedIn. You can also check
us out at thefieldcto.com, where we do a lot of
emerging tech consulting work, AI, ML, and the cloud. And I'm Stephen Foskett, organizer of Tech
Field Day and publisher of Gestalt IT. You can find me right here every week on Utilizing AI.
I also am organizing the AI Field Day event, which is coming up soon. And you can connect with me on Twitter at S Foskett.
So James, I'm very familiar with DDN
having been in the enterprise storage space myself
for quite a while.
And I know that DDN has tackled some really challenging
high volume and high capacity, high performance storage
for the media and entertainment space in the past.
I was not at all surprised to see as well
that y'all are working to support AI applications.
So maybe we can just start there
with just a little bit of a foundation of, you know,
what are the challenges of supporting
these kinds of applications that require, you know,
tremendous amounts of data,
tremendous amounts of throughput,
lots and lots of clients
with simultaneous access. These are not conventional storage applications.
Yeah, that's right. So we have indeed been exposed to the real full breadth of tough data
challenges, whether that's in high-performance computing or media, streaming videos, etc.
But AI is a bit different. I mean, firstly, there's not just one
thing that represents AI data workflow. It's, of course, a big pipeline, and it all starts with
data coming in in the first place. And that data might be coming in from vehicles, recording with
cameras, LiDAR, radar; might be coming in from satellites; might be coming in from life sciences instruments, sequencing machines, etc. So that ingest phase is part one. Then there's a data labeling phase.
Then there's, of course, the all-important deep learning phase, where we're teaching our models and training our models based on these big data sets. Then when we move into production, we're performing inference, a whole different class of I/O challenges right there.
And then that data moves on, typically not really to be archived, but to be reused and go back into these model retraining events.
And so we have this sort of endpoint, which isn't a traditional archive, but is a very, very active archive.
So data volumes get very large, but we still want to access them very, very regularly.
So that shape is genuinely
different from anything else we've seen. And DDN has been really working on our technologies and
our services in order to support customers with these large data volumes from which they want to
extract value at big scales. Okay. So you talked about a few areas in there where the AI has a data problem.
We're producing a lot of data, but also getting the data ready for AI is a problem, from the data collection onward. Some of the largest autonomous car vendors are using you, without naming names.
But people don't realize when you have a live car going in, not just the instrument data, but the camera and the whole video and everything that comes in, in a matter of, you know,
hours, when you do it for days, and when you store it and try to make meaning out of it,
and trying to inference from that, you're talking about terabytes of data. So scaling enterprise
production AIs to scale to terabytes of data, or even petabytes of data, is not easy. Is that a common problem that you see with that?
And how are some of the guys solving this?
Yeah, you're right.
This is really problem number one, the most basic problem we try to solve.
So as you say, in fact, in this autonomous driving world,
these vehicles come back from a drive around the city
with between 100 and 200
terabytes of data each. And there's many vehicles and there's many cities. So it easily turns into
hundreds of petabytes. And that's a challenge because particularly in AI, you don't want to
siloize your data. You don't want to end up with 100 silos of 50 terabytes each or 50 silos of a petabyte each.
Really, you want to have a very large, scalable, centralized data storage, which is robust, etc.
And that's really where DDN comes in. So apart from the performance and all the other features, managing big data volumes is what we've architected the systems for. And the way we do it is by scaling the storage infrastructure
in a very different way from any other storage system.
And this is a general principle, really.
It's called parallel file systems.
And, you know, in general, we can look at traditional NFS file systems,
which companies, organizations might use for NAS
and maybe even to serve virtualization environments.
And they have a scaling property, which really doesn't help them stretch to tens of petabytes and beyond. So the way
we get around that is by adding a bit of intelligence into the network and into the
compute systems. So to scale big, we need to scale the problem to include all those elements which
are also scaling, the
compute scaling, the network scaling and bigger. And by having an intelligent client on the other
end of the network, we can really help systems scale really limitlessly. And we do support some
of the biggest supercomputers, AI supercomputers in the world. So just to dig into that a little
bit, the reason that helps is now this intelligent client is sharing the problem of scale.
It's able to direct IO, direct the data and read the data from where it lies and compare that to traditional NFS.
There's no intelligence there. It has to go to something and then something else has to proxy for that. So in traditional NFS, you get this sort of big backend data movement,
which is ultimately not scalable
because you're trying to squeeze all that scaling problem
into the backend infrastructure.
By having an intelligent client,
we're kind of sharing the problem out a little bit.
And it allows us to scale by compute,
scale by network and scale by storage.
And that's what really helps us have the systems
which are 100 petabytes plus.
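To make that concrete, here is a toy sketch, not DDN's actual client, of the difference: a parallel-filesystem-style client fetches each stripe of a file directly from the endpoint that owns it, instead of funneling everything through one NFS server. Endpoint names and the stripe size below are invented for illustration.

```python
# Illustrative only: a "smart" client striping reads across many storage
# endpoints in parallel. Endpoint names and stripe size are hypothetical.
from concurrent.futures import ThreadPoolExecutor

ENDPOINTS = ["oss-0", "oss-1", "oss-2", "oss-3"]  # object storage servers
STRIPE_SIZE = 4 * 1024 * 1024                     # 4 MiB per stripe

def read_stripe(endpoint: str, offset: int, length: int) -> bytes:
    # Stand-in for a network read from one storage endpoint.
    return b"\x00" * length

def parallel_read(size: int) -> bytes:
    """Fetch every stripe of a file from the endpoint that owns it."""
    count = (size + STRIPE_SIZE - 1) // STRIPE_SIZE
    stripes = [(ENDPOINTS[i % len(ENDPOINTS)],
                i * STRIPE_SIZE,
                min(STRIPE_SIZE, size - i * STRIPE_SIZE))
               for i in range(count)]
    # All endpoints serve their stripes concurrently; an NFS client would
    # instead issue every read to a single server.
    with ThreadPoolExecutor(max_workers=len(ENDPOINTS)) as pool:
        return b"".join(pool.map(lambda s: read_stripe(*s), stripes))

print(len(parallel_read(10 * 1024 * 1024)))  # 10485760
```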
First of all, there are 20 areas in there I want to double click, but let's start out with the obvious one first. You talked about, you know, collecting 20, 30
terabytes of data per car, per city. And so some of them are already mapped out, meaning that,
you know, that the cars or the autonomous vehicles are able to inference based on what's being fed to them.
But the actual problem will come because, you know, video is a thick load, not only a thick load, but also it's unstructured load.
People don't realize that, right?
So it's hard to make meaning out of that.
When you're getting terabytes of data, which is unmapped, for example, if the car is driving in a territory or terrain, which is not already in that, how do you inference that? But more importantly, how do you
take that volume of data and update your model that you feed it back into all the other autonomous
vehicles that are coming into the area or in general overall map? That's a major problem, isn't it?
It is. And it's a very complex process that our customers have been developing over the past few years. So you're right, 100 terabytes of video and LiDAR and radar data can come out of a vehicle after a trip around a city.
And they do keep going around the same city many times because you need to see the city in different weather environments, you need to see these rare events, you want them to happen because you want to capture those rare events. And what happens, of course, for some of these scenarios, there's somebody in the car
and they're labeling things live. So they'll be going around in the vehicle. They'll be labeling
stuff saying, I've seen a cat. There's a signpost which says stop. And they'll be adding some labels
as they go around to add some supervision to this process. So the data comes in pre-labeled,
but then has to go through a big process
to add more labels.
The critical thing in autonomous driving,
in fact, in all of AI,
is having trust in this AI.
And what that really translates to
is knowing the complete set of data,
of objects that you've put
through your deep learning program. So if you've seen a stop sign with snow on the top once, you want to be able to label that, I've seen a stop sign with snow on the top, and get that into your model. So this huge challenge is complex, and the data feeds in, yes, every day from each vehicle, but it goes around in
a big cycle, as they have re-simulation and they have virtual simulation with virtual environments, and they have these virtual cars driving around virtual environments, adding more and more data. So in fact, you know, while it seems a big data problem, by its nature deep learning loves data, it just wants more and more. And so for our customers it's more of a challenge of working out where the unique objects are, to make sure that the mass of objects, even the unique rare events, is pushed back into that model.
It is a big challenge.
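As a rough sketch of the labeling flow James describes, a labeled event record might look something like the following; the schema, field names, and values are assumptions for illustration, not any vendor's actual format.

```python
# Hypothetical label records for drive captures; rare events are flagged
# so they can be pushed back into the next model-training cycle.
from dataclasses import dataclass, field

@dataclass
class LabeledEvent:
    drive_id: str
    timestamp_s: float
    object_class: str                               # e.g. "stop_sign"
    attributes: list = field(default_factory=list)  # e.g. ["snow_covered"]
    rare: bool = False

events = [
    LabeledEvent("drive-042", 1317.2, "stop_sign", ["snow_covered"], rare=True),
    LabeledEvent("drive-042", 1320.9, "cat"),
]

# Rare events are exactly what you want back in the training set.
retrain_queue = [e for e in events if e.rare]
print(retrain_queue)
```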
So this has been a perennial challenge in the storage space.
And I think that many people outside of the storage industry might not understand that storage clients, storage endpoints are really not intelligent at all.
And that always has held back the development of storage. So whether it's a conventional file
server or SMB or NFS or even cloud storage systems, for the most part, the client has a very limited dialect,
a very limited range of interaction, and basically just treats the storage system as a, you know,
something else, you know, hey, something else, store this data for me, rather than being actively
involved in data placement and data labeling and data classification
and data movement and all that. So what you were just describing, James, makes it sound as though
not only is your client actively participating in sort of where to put the data and how to get
the data there, but also actively participating in sort of categorizing data. Did I hear that wrong?
Well, we're assisting that process,
basically by relieving the infrastructure of undue load.
But you mentioned a bunch of things,
and we talked a bit about how an intelligent client with a bit more intelligence than an NFS client can help scaling.
But also, of course, what it can do is it can
protect data from corruption over the network. We've got the data from the application. We can make
sure it's arriving at the storage safe just as you sent it. We can help performance because we're not
handing off the problem to an NFS protocol. We're actively engaged in moving data over the network.
And in fact, our intelligent client, and in general,
these intelligent clients can have a virtual networking layer to optimize the data movement
through the network as well. So the fun thing is by having this intelligent client, we're not
just giving you fast storage. We're helping you make the most of the network layer,
because we're really using RDMA protocols or FAS protocols across that network.
We're helping you use your compute to apply labeling and compute-intensive workloads, because our intelligent client is offloading the work, the network load, from the compute system. So we
basically, not only do we help storage with our intelligent client, we really help the whole
infrastructure. And in fact, the biggest benefit for customers is that these intelligent clients make the applications
go faster, which is something people often don't really measure. They'll think about a
storage system and think, well, I want this storage system because it says it's going to do 10 gigabytes
a second or a million IOPS or something like this. And of course, as we all know, the critical thing is how much faster is your
application going to go? And often the application isn't actually held up by the backend storage
system itself. The storage system is wonderfully powerful, but it's everything in between the
storage and the application that is not allowing that potential to get through. So by spreading this storage loveliness right to the application,
we can track the data, we can accelerate the data all the way
into the application.
So where we really think the intelligent clients are important
is accelerating workflows.
And we really mean that because we're interfacing directly
with the application.
We're not handing it off to NFS and handing it off to a network protocol; we're interfacing directly, so we can push the data right into where it's needed. And with AI, like we're talking about here,
the optimizations we've put in to cope with TensorFlow, to cope with PyTorch, to cope with the sort of behaviors they have, we can put in those fixes, those improvements,
not just the storage layer, but the network layer and the client layer as well. So we can kind of
fix the whole data path end-to-end into the application and out again.
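To picture what that data path looks like from the application side, here is a minimal PyTorch sketch, assuming PyTorch is installed: a DataLoader with several worker processes, each of which would be reading from the same shared filesystem mount in a real training job, which is exactly the many-concurrent-clients pattern being discussed. The dataset below returns dummy tensors.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class DriveFrames(Dataset):
    """Stand-in for a directory of training samples on a shared mount."""
    def __len__(self):
        return 1024

    def __getitem__(self, idx):
        # A real pipeline would read and decode a frame from storage here.
        return torch.zeros(3, 224, 224), 0

if __name__ == "__main__":
    # Each worker process opens and reads files independently, so the
    # storage system sees many concurrent clients, not one.
    loader = DataLoader(DriveFrames(), batch_size=64, num_workers=8,
                        pin_memory=True)
    for frames, labels in loader:
        print(frames.shape)  # torch.Size([64, 3, 224, 224])
        break
```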
So there are a couple of areas in there I want to double-click. But first of all, I love the concept of, how should I put that?
Feeding dumb data to intelligent processes, right?
Because most of the intelligent applications
and processes are held back by the data. And as I said, it's not that the data is dumb. Well, data is data; there's no smarter or dumber data. But you know, how to make meaning out of it
and get that right data in place, right?
So that's the problem.
So did I hear you say that, you know, regardless of the framework, whether it's PyTorch, TensorFlow, doesn't matter
what it is, on the fly, you'd be able to figure out what data is needed when, and then you're able
to assemble the data on the fly using AI and then provide them? Yeah, so you open up a couple more avenues to discuss here. One is a data platform
in general really needs to be flexible. You know, these companies are investing a lot of money in
the data itself, in the data scientists, in the AI processes. And the last thing they want to happen
is for them suddenly to find a roadblock in their storage system. So flexibility, protocol support is important. And, you know, we put in optimizations for individual AI frameworks. So you're right, TensorFlow, PyTorch, we run these in our labs
and make sure we can cope with the sorts of I/O they generate. So just an example, some of these
applications like to use MMAP as their primary POSIX call. That can be particularly troublesome
to some storage infrastructures.
So we've optimized the MMAP call, even though it sounds like a minute detail, it can have a huge impact on the performance of these AI frameworks.
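For readers unfamiliar with the call in question, here is a small, self-contained illustration of the mmap access pattern: mapping a file into memory instead of issuing explicit read() calls, which is roughly how the frameworks mentioned above consume some of their input files. The file here is synthetic.

```python
# Map a file into memory and access it by offset, the POSIX mmap pattern
# some AI frameworks use instead of read() calls.
import mmap, os, tempfile

path = os.path.join(tempfile.gettempdir(), "sample.bin")
with open(path, "wb") as f:
    f.write(os.urandom(1 << 20))  # 1 MiB of synthetic data

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        header = mm[:16]   # pages fault in on demand
        tail = mm[-16:]    # random access without seek()/read() pairs
print(len(header), len(tail))  # 16 16
```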
And then you mentioned about finding the right data. Typically, the user decides the data set they're going to train on. So just like the autonomous driving or in
life science or in finance, you want to basically choose the data set so your model is holistically
trained across the correct span of elements that you want to train upon. What we do, so they already
know that, we don't have to help that problem, but what we can do is we can look in this very
large scalable storage system in the background
with kind of intelligent algorithms. You can find the data that tends to be hot, you can find some tendencies, and optimize that data in the right place for quick access. So that means, you know, you might have flash, you might have HDDs. And of course, when we're talking about tens, hundreds of petabytes, HDD is still, you
know, king when it comes to price per terabyte. But we can sit there in the background, we can
juggle this data around and optimize the hot data into flash at very, very large scales,
the sort of thing that's easy to do in a tiny system for enterprise, but very, very difficult
to do when the system's containing massive amounts of unstructured data.
So, yeah, so we do all that stuff. So, A, we do have to have flexibility and support SMB,
NFS, S3, as well as our intelligent client. And then in the background, we also do kind of clever
things to try and optimize the data placement, so it's ready for some reads or whatever.
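That background placement can be pictured with a toy sketch like the one below: count accesses and promote the most frequently touched files to a fast tier. Real systems do this transparently at vastly larger scales; the file names and "flash slots" are invented for illustration.

```python
# Toy hot-data tiering: promote the most frequently accessed files to a
# limited number of "flash" slots; everything else stays on HDD.
from collections import Counter

access_counts = Counter()

def record_access(path: str) -> None:
    access_counts[path] += 1

for p in ["a.tfrecord", "b.tfrecord", "a.tfrecord", "c.bin", "a.tfrecord"]:
    record_access(p)

FLASH_SLOTS = 2
hot_set = {path for path, _ in access_counts.most_common(FLASH_SLOTS)}
print("promote to flash:", hot_set)  # a.tfrecord plus one of the ties
```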
So then let me ask you a follow-up question on that. The problem domain that you described, that's actually a major problem with most of these enterprises, particularly with the model creation phase, right? Because they have this probably unlabeled, undocumented, unstructured dark data, tons of it, spread all over the place, because not necessarily all of them are on centralized storage. It's a massive issue. But isn't the, particularly
when it comes to HPC, right, because they need to assemble all of it to, you know, things like
medical imaging, autonomous driving, combination thereof. But isn't that the same problem that some
of the data lakes are trying to solve? How are you unique and what is your differentiation?
So the intelligent client is part of that.
And the other part is we're POSIX.
So what we've found is data lakes, which have traditionally, well, for the past 10 years, meant big data, which has also implied Hadoop.
The challenge is there because it's quite a particular storage infrastructure. It was built
for batch workloads. And, you know, it's not POSIX. It's its own special thing. And what we're finding
is, well, firstly, AI frameworks go massively faster when using POSIX. It's an extremely, I mean, people think of it as a complex protocol, but actually it's super fast.
So the fastest way of accessing your data is through POSIX, especially with an intelligent client such as the one that DDN provides.
So Hadoop's kind of out the window now.
It's Spark, it's TensorFlow, it's PyTorch.
Everything's maximally in memory, and they're all using POSIX for the fastest access. But not exclusively, so other protocols are still important. So we can still interact with the
legacies of batch workloads because we're POSIX, absolutely standard, and we can export and allow
people to access our data through NFS, S3 and SMB. So flexibility is key, flexibility at decent speed,
but the best performance is no question through POSIX, accelerated by these intelligent clients.
And I'll just jump in here as the resident storage nerd.
So when he says POSIX, what he's referring to is a set of IEEE standards that define how applications should interact with the operating system. And so essentially, when you hear that a system is POSIX compliant,
what they're saying is that the system uses standard accesses for various components,
including storage. And so it's not some weird proprietary thing. They're engaging with the
operating system,
with standard operating systems like Unix and Linux
in a standard way.
Yes, thanks.
So the stuff on your laptop is a POSIX file system.
The stuff in our global parallel file systems
is a POSIX file system, the same standards,
which basically mean the data is correct when you read it
and correct when you write it,
and there's a certain format that's all built in.
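In code terms, the POSIX file API being discussed is the familiar open/read/write family. A minimal example using Python's os-level wrappers on a Unix-like system:

```python
# The same POSIX calls work whether the mount is a laptop disk or a
# parallel filesystem; that portability is what "POSIX compliant" buys.
import os, tempfile

path = os.path.join(tempfile.gettempdir(), "posix_demo.txt")

fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
os.write(fd, b"correct on write, correct on read\n")
os.close(fd)

fd = os.open(path, os.O_RDONLY)
print(os.pread(fd, 10, 0))  # positional read: 10 bytes at offset 0 (Unix only)
os.close(fd)
```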
So let me pull you back. You guys are going too deep for me.
You know, all technical details. So let me pull you back a little bit, level up, and ask you a question. Look, at the end of the day, the major problem AI has is, you know, a data problem, whether it's data collection, data storage, data labeling, providing the right data, all kinds of issues.
But so if they were to use a solution similar to yours, I mean, there are a ton of areas that you touched on.
You almost seem to cover every problematic angle of AI, data problem of AI, that is.
How does it help companies? When I'm creating a model, instead of spending three days, will I do it in half an hour?
Or what's my advantage?
Why do I care?
Why do I use you?
Yeah, very, very good question.
So the first one, of course, is time to market. So when companies have invested so much,
they want to build a system that's going to scale and start quickly and then scale pretty easily.
The starting quickly thing is resolved by companies such as ours who work very closely with
the vendors who are creating the GPU infrastructure. So we've been working closely with
NVIDIA, very tight integrations. We've got multiple reference architecture publications. Our systems are plugged into NVIDIA's largest supercomputer. So customers can really pick a solution off the
shelf, starting small, starting literally this large, and scaling to the largest supercomputers
in the world. They can pick that and see the performance they can expect to get from that
from a white paper. Not because we've made it all up, because we've really tested it in these
huge labs. So that's advantage number one. It's time to market with a solution that's going to ultimately scale. Now, you mentioned,
so why else would customers sort of come to DDN or come to a company like DDN? I guess
the other thing is we do specialize just in storage and data management. So that
specialization is always important to our customers because whenever they're deploying,
let's say there's so much risk involved around the whole space of developing a new AI strategy
and issues always happen. There are always funny things happening in the network. Applications behave
strangely for whatever reason. There's bugs.
Having expertise, which really knows networks, really knows compute, really knows storage, that's important.
And one of the reasons DDN is quite good at this is because we do span the network.
We work in the compute.
We work in network.
So the problem doesn't stop at our storage.
The problem really encompasses the whole environment. So any organization which will partner with these customers,
with these big strategies, with high risks,
would do well to find a storage partner who's got the expertise,
not only just in storage, but in network and compute,
because it's all super connected.
And optimizing really means optimizing everything,
not just optimizing storage.
And finally, you know, the shift is moving.
So these various questionnaires by various analysts basically ask CIOs where the biggest challenges are in AI.
And three, four years ago, finding data scientists was actually number one, number two, and number three. It's really shifting now as people are starting to implement; they're realizing infrastructure is actually
a big issue at scale. And it was always there, but now it's really come to the fore. It's
security, it's scale, it's performance. And there's two pieces there. One is people are
starting to understand that in order to develop an AI strategy, not only do you want to cover everything in your plan, but you want to have overhead.
You want to be able to build a space where your data scientists can really innovate.
Because it might be those people who actually, you know, find the golden egg from the golden goose and make your company really be amazingly competitive. So you need to have a system that's not only capable but more than capable. You want to be able to give these data scientists a lot of freedom to innovate, and to de-risk the whole environment by having expertise that's used to dealing with these very, very tough data challenges. So all that comes together. And I think, you know, companies are in a difficult position, given the biggest risks and the huge
potential rewards. And so they need to think, is traditional storage going to work today and
in the future? And what kind of services and surrounding portfolio and people do I need around that
solution to help them succeed much, much longer term? Because one thing they've all got in common,
these customers, is they don't quite know what's going to be truly successful on day one. There's
always going to be surprises on that route, so you need to keep flexible.
I think another thing that a lot of companies are faced with is a lot of conflicting information
and kind of almost FUD in the marketplace about what do you really need in terms of
storage and so on to support AI applications.
And certainly we've heard people say, oh, AI applications, they need to be all flash,
for example.
There's no way you can build
AI applications using disk. It has to be all flash in order to support the performance requirements.
Or we'll hear somebody say, oh, well, you know, it has to be a distributed solution where the
storage lives on the clients instead of in a centralized server. Or it has to be object,
or it has to be NFS, or it has to have some proprietary interface.
There is a lot of confusion over there and a lot of companies out there selling
basically what they've got on the truck instead of what the true answer is. So, I mean, how do
you answer that considering that essentially you're selling a pre-existing system, not one that was specifically
designed for AI? What makes this system better than others? Well, it's a very good question.
And the answer is that almost everything you mentioned, we actually do. So we can provide storage that resides in compute. We provide S3 object stores. We provide HDD and all-flash. So we've got relatively little bias when it comes to this area, and when we're talking to our customers, we've really got a very broad solution portfolio that does everything at scale. So, given that, it's never a totally objective but a reasonably objective starting point: we've got a broad range of solutions.
What do we say?
Well, we do say, of course, flash is best, but the economics come into play. And when our customers have 100 petabytes of storage, there's really, you know, quite a way to go before even the lowest-cost flash gets into dollars-per-capacity competitiveness against today's HDD. Which is why we built these hybrid solutions that really scale. So of course we love to sell flash to customers, all-NVMe is great, but when there's just literally an economic boundary there, then we can at least give them the best HDD for their money. And we can also handle that so they mainly see flash performance but mainly pay the cost of active capacity, which is of course, you know, the ideal place to be. In terms of objects, you know, we can talk to our customers about the benefits of S3 object protocols for a long time. We've been
using that for over 15 years. Where it comes into play in AI is really for ingest. You've got these
dumb systems out there, maybe they're satellites, maybe they're CCD cameras or whatever. They want
to push things into your network through S3. It's a very handy protocol when your devices are scattered around and they're relatively dumb. So bring stuff in through S3, that's great, but there's no way you
can perform with S3. You can't do your inference with S3. The protocol is inherently slow compared
to POSIX. So that's, I mean, that's the argument.
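As a rough sketch of that ingest pattern, assuming boto3 and entirely hypothetical endpoint, bucket, and paths, an edge device might push a capture in over S3 like this, with the heavy training I/O later happening over POSIX instead:

```python
# Hypothetical S3 ingest from a "dumb" edge device; endpoint, bucket, and
# file paths are invented for illustration.
import boto3

s3 = boto3.client("s3", endpoint_url="https://storage.example.com")
s3.upload_file(
    Filename="/captures/drive-042/cam0.mp4",  # local capture from the vehicle
    Bucket="ingest",
    Key="drive-042/cam0.mp4",
)
```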
The argument about putting all your data inside the compute: it's always got to come out. Unfortunately,
it's always got to come out at some point. And so you really just add complexity more and more by really mainlining on that route. It's all right to have an element of caching, if you like,
in the compute, but by mainlining on, I'm going to put my storage in my compute is, you know,
an endless disaster because the data's got to come
out at some point. And the manageability of data on storage devices inside computers is not very
good compared to managing data in proper storage systems. So we do cover all these elements. We
have thought about them a lot while we were architecting our systems for AI, which is really
why we've come around to the system we ship today, which we think is the best of all those worlds.
Well, thank you so much for that. And unfortunately, we do kind of have to wrap up here,
but before we do that I'd like to move on to the lightning round of bonus questions here at the end,
which is always a lot of fun. So a warning to the audience, James has not been given a heads up
about which questions I'm going to ask him and we'll see what he comes up with here as an answer for some of these. And of course,
it's all in fun. So here you go. Let's kick things off. Number one, would you say that
machine learning, deep learning, and artificial intelligence are synonymous, or do those terms
mean very different things? They mean different things. AI is the big wide picture.
Deep learning is the particular sort of machine learning which benefits from large data volumes.
All right, next question. DDN has a long history in the video space, storing video
and processing video. When will we have ML that's video focused and that operates the same way that
Siri or Alexa works with audio? Well, kind of now, I suppose, a lot of our customers are
doing real-time video inferencing, some really fun ones, actually. So in fact, there was a recent announcement in London, I live in the UK, about a new store which changes the shopping experience. And we've had three customers do this over the past three years. Very interesting
companies. So they have the cameras in the shops in the supermarket. And they don't look at you,
they're not looking at your face and recognizing your face. That would be bad. They're looking at what you put in your bag. So they're
seeing the cucumber, they're seeing the tin of beans, and they're tracking it all. And when you
walk out the shop, it says, hey, your phone goes ping, and it says, would you like to pay for this
now? And you go, yes, you pay for your cucumber and your tin of beans without having to go through
a checkout. So lovely example of real-time video streaming, which hopefully means that all those people who work in supermarkets can spend their time helping you find the cucumbers and beans rather than stalling you at your checkout experience.
So that's one of many. And there's lots of these rather nice video streaming inference examples often used in film work when you might be videoing outside and you want to
in real time blur out some advertising which you didn't want to display in your live stream and
broadcast to millions of TV viewers. And again, we've got these inference capabilities which
are going to blur out these adverts as you pan your camera around the city center. So you don't accidentally advertise for somebody else. Lots of great
examples. In fact, the most fun world is probably the video inference world.
I knew that you'd have something to say about that. Again, this is a, you know, you guys are
a big gorilla in that market. So cool. And then following on to that, you mentioned supermarket workers.
Are there any jobs, maybe not supermarket workers, but are there any jobs generally
that are going to be completely eliminated by AI, jobs that will no longer exist in five
years?
Five years.
Oh, damn it.
Are you going to give me 10? So I think the most likely candidates, well, firstly, are some elements of factory work, you know, Industry 4.0. There's some things people are doing today which really don't make sense on these manufacturing lines.
So that's a big area for improvement and optimization.
And the whole COVID thing has only accelerated that.
The need not to have people unnecessarily sitting together in factory environments is
an important area for improvement for manufacturing.
And then ultimately, self-driving cars, they're a big one.
It's not really five years, to be honest.
But we'll start to see point solutions.
For example, when those lorries come into the huge freight logistics park to pick up their parcels, as soon as they come through the gates, then the driver can go hands-off and the truck can drive itself into the right place, park itself in the right place, according to a computer algorithm, according to the logistics system, without the drivers messing things up. So there'll be point solutions in vehicles outside of the
commercial, the open road, where we can put AI in charge to improve the process.
You almost stole a fourth one of our questions, because we do have a question about autonomous
driving that we ask a lot as well, and your answer was definitely in line with what we've heard from others on that question. So in fact, I would say that overall we've heard pretty much consensus on a lot of these, and some differing opinions on some others. But thank you so much for joining us, James. It's been great to talk with you and learn a little bit more about how you and DDN are working in this new world of AI and
ML. Can you let us know where can people connect with you if they'd like to continue this discussion
and follow your thoughts? Yeah, so go to DDN.com. Take a look at the blog there. Myself and my
colleagues often put out our blogs there. You can email me at jcoomer at ddn.com,
J-C-O-O-M-E-R at ddn.com.
And if you want to hear from me
and three of my colleagues,
we'll all be presenting at GTC.
That's the NVIDIA conference on AI
starting 12th of April.
And we have four presentations going on there.
How about you, Andy?
What have you been up to lately?
Doing a bunch of work in the AI stuff,
as you might have seen.
Just published an observability report,
which is an external AI apps kind of thing
for GigaOm
and doing a lot of work in that space.
So you can check out most of my work
at the fieldcto.com.
As always, you can follow me on Twitter
at Andy Thurai
or connect with me on LinkedIn or find most of my work at thefieldcto.com.
And as for me, you can find me on Twitter at S Foskett.
You can also find me every week on Wednesdays for the Gestalt IT News Rundown posted to gestaltit.com.
And of course, every Tuesday here on Utilizing AI.
So thank you for listening to the Utilizing AI podcast.
If you enjoyed this discussion,
please do subscribe, rate, and review
since that helps our visibility.
And please do share this show with your friends
and share it on social media.
This podcast is brought to you by gestaltit.com,
your home for IT coverage across the enterprise, and thefieldcto.com. For show notes and more episodes, go to utilizing-ai.com or find us on Twitter at utilizing underscore AI. Thanks for listening and we'll see you next week.