Podcast Archive - StorageReview.com - Podcast #122: Navigating The AI Landscape: Real-World Insights And Challenges
Episode Date: September 14, 2023. This week Brian sits down with our own AI expert, Jordan Ranous, to discuss… The post Podcast #122: Navigating The AI Landscape: Real-World Insights And Challenges appeared first on StorageReview.com.
Transcript
Hey everyone, welcome to the podcast. We've got a great conversation today with our own AI expert, Jordan. You can tell by the beard he knows what's up.
We're going to talk about some of the hottest topics in AI, some of the things that we're facing as we explore AI in the lab and in the solutions that we're working with hands-on.
We've got a lot of new things to talk about there. And this is the first podcast that we're actually integrating live with our Discord.
So we're bringing in our Discord community to be able to interact,
ask questions while we're doing the live.
We'll try to get to those questions as we go.
But for now, Jordan, thanks for doing the pod.
Appreciate it.
Yeah, happy to be here and glad to be back on the podcast.
Yeah, let's start with FMS because
that's where you and I were last together in person, where the sloths were the talk of Santa
Clara. We were running a demo out there showing AI vision. And this is kind of one of the topics
I want to get into as we go, some of the undertones,
is that what AI means to one person is not a universal definition. We've got generative AI,
which is the hot thing with ChatGPT and DALL-E and some of these other things. But there's so much
more, so many things that used to be called business intelligence that are now rebranded as
AI. But the vision AI bit really had people talking
on the expo floor.
Talk about what we were doing there and what that means in your vision of what AI is.
Yeah, so that's actually, you started off with a really good point there.
There are a lot of misconceptions you see floating around on the internet about AI.
In their heads, what most people are actually thinking
of is what's called AGI, or artificial general
intelligence; they think the computer is actually thinking
and making rational decisions. But realistically, what we've
got going on today with the generative AI side is basically
fancy autocomplete, neural networks for generation. And
then what we were working on at FMS was
computer vision, which is a subset of the broader AI, right. And we had a public
model running on our server doing some object recognition. So the way that those work is you
train your models on massive image data sets.
The name escapes me right now, but there's a couple standardized ones that have been out for some time.
You can do further fine tuning.
So like if you work in a manufacturing industry, let's say you work at an automotive plant and you make water pumps for your cars and you want to have some sort of AI quality control.
We've been doing camera style quality control for quite a long time.
The interesting thing that we're seeing now, as the industry starts to adopt this stuff
more and more for production, is that we're able to almost preemptively do the quality
control as the assembly lines go.
So you can have multiple things tagged in different ways using neural networks instead of kind of some of the more legacy logic.
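For readers who want to see what that kind of object detection looks like in practice, here's a minimal sketch using a torchvision model pretrained on the COCO image set (likely one of the standardized data sets Jordan alludes to). The image path and confidence threshold are placeholders, and this is not the model from the FMS demo.

```python
# Minimal object-detection sketch: a pretrained Faster R-CNN (COCO classes)
# run against a single frame. Not the FMS demo model; just an illustration.
import torch
from torchvision.io import read_image
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights,
)

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT   # trained on COCO
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
preprocess = weights.transforms()

img = read_image("expo_frame.jpg")                  # placeholder image path
with torch.no_grad():
    pred = model([preprocess(img)])[0]

# Print anything the model is reasonably confident about (bottle, backpack, etc.)
for label, score in zip(pred["labels"], pred["scores"]):
    if score > 0.6:
        print(weights.meta["categories"][label], f"{score:.2f}")
```

Fine-tuning for a specific factory-floor or retail use case would start from weights like these and continue training on your own labeled images.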
And then another thing you touched on there.
Before you leave that real quick,
I mean, we've all seen the factory lines of like making canned peas or something.
And if the can's dented, the thing just like flicks it off, it gets rejected and goes into
the waste bin or repurposed or whatever. I mean, that I guess was an early primitive form of this.
Is this can properly shaped? Yes, no. And then kick it out. But when you talk about what we were doing with
the Vision AI bit at FMS, it was doing object detection. It was funny. I thought the funniest
thing was as we progressed through the day, it started picking up all the beer bottles at five
o'clock as soon as the bars opened up in the expo floor. So the model you were using was really general.
Is this a bottle?
Is this a backpack?
Is this a purse or whatever?
Eyeglasses?
We saw dozens of different things pop up on there.
But could you tune it even further if you wanted to and had enough camera resolution to say,
that can is a Coors Light.
That can's a Lagunitas if it was me.
Can you get that level of detail to really tune this for, you know, I want to know as
Anheuser-Busch at an event, what are people actually drinking? Absolutely, and that's kind
of where I was going with the business intelligence side of it, right? Where we're seeing, you know, we've evolved now from, you know,
my can is dented on the factory line to being able to classify multiple things
in an image and not just detect just if something is incorrect or not,
but also be able to detect and categorize different things.
So, you know, our model was more general.
So we had, you know, teddy bear, and we had can and bottle, of course. But with proper tuning, like you said, and proper resolution, we're actually starting to see a lot of retail-based companies, for loss prevention, start tracking the things that come across the self-checkout and tying that in
with the scale weight data and tying all those things together. But they're investing
massive amounts of time in creating the training data sets for those models, right? So, in kind of
an optimistic way, you could say all things are possible through AI as long as you have
a good enough, big enough data set and a big enough
GPU to run it on.
Which kind of
brings me to one of my next things that I actually
want to talk about, which is the actual creation
of training data and how I'm not seeing
a ton of stuff in the market right now that's
focusing on that. Everybody's talking
more about, you know, here's all the cool
stuff you can do with the model. I'm more interested in how we get there and what that roadmap looks like. And I
think that's why we're seeing the off-the-shelf chat models, like ChatGPT Enterprise,
for instance, which was recently announced, getting a lot of traction, because it kind of
helps bypass and sidestep a lot of that initial training, data creation, and normalization that you need in order to be
able to work on these.
Well, that's, I mean, you're getting at one of my pet peeves now is that the AI world,
and I said it at the beginning, is like different things to different people.
When you talk about creating this data or creating the models or whatever, there's a giant chasm between the HPC
installs, the top Fortune 100 and what they're doing and what some small business, some small
retailer, some small manufacturer is doing. And there's quite a bit
of gatekeeping that I find moderately offensive. I understand it. I get it. But I can't tell you
the number. You've seen it on our social posts. We're talking about a workstation with a couple
A6000s in it. And we'll get some people saying, oh, that's not AI. That's whatever. It's like,
well, a lot of AI starts out on notebooks because most organizations can't dump a million dollars into it.
Dell's got these beautiful XE9680 servers, eight way H100s, which, you know, you love that system.
We all love that freaking system, but that system is unobtainable for the gross majority of the enterprise. And so I think there's a conversation
to be had here around both what you're talking about, tools to democratize the on-ramp to AI,
but also how do we leverage the right tools for the job? And we're going to get into some of that
too around cloud GPU instances, around workstation, around actual dedicated GPUs. Like I said, the XE9680 or even
the two 4x systems from Dell that we just saw. We've got a nice 30-minute plus video coming
on YouTube for that. But I don't even know if I really have a question there. I just have a
frustration that I'm planting. And I think you're seeing the same thing as an AI practitioner. I mean, you're actually
in there doing this stuff. But is anything I'm saying sound ridiculous to you?
No. And I think those are challenges a lot of industries are facing when they're looking
at this. Everybody sees it as, oh, new, exciting, shiny toy, right? Let's all get into it and let's roll it out and be the first one.
And, you know, I think Morgan Stanley just put out a press release.
I don't know if they've actually implemented it yet,
or if they were just saying they're starting to implement
a chatbot for interacting with, you know, your portfolio.
Anyway, I'm really interested to see what they do with it. I
think that'll be really helpful, especially for getting speedy
answers. Coming from a background in finance and
customer service, getting the answers to
the customer quicker is the name of the game. Even if it's just
getting them to the right person to be able to help them with their inquiry,
either on the phone or through a chat portal,
getting them to the right person as quick as possible
is paramount to improving customer experience.
So in those types of scenarios, yeah, these chatbots are great.
And if you can have natural language conversation
backed by routing logic, you know, you can get increased deflection rates in your contact centers, better handling times because folks get to the right person at the right time.
You know, you call the bank and you say, I got a problem with my debit card.
The machine hears, oh, and they send you to the credit card department.
Well, they can't help you with your debit card.
They had to transfer you again. Now you've just wasted a bunch of money.
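As a rough illustration of the routing idea being described here, this small sketch uses an LLM to classify a caller's free-text request into a queue before handing off. The department list, model name, and fallback behavior are assumptions, not any specific contact-center product.

```python
# Sketch: classify a caller's free-text request into a routing queue.
# Department names and the fallback rule are hypothetical.
from openai import OpenAI

DEPARTMENTS = ["debit_card", "credit_card", "mortgage", "fraud", "agent"]
client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify_intent(utterance: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Classify the caller's request into exactly one of: "
                        + ", ".join(DEPARTMENTS) + ". Reply with only that word."},
            {"role": "user", "content": utterance},
        ],
        temperature=0,
    )
    label = resp.choices[0].message.content.strip().lower()
    return label if label in DEPARTMENTS else "agent"  # fall back to a human

print(classify_intent("I've got a problem with my debit card"))  # -> debit_card
```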
That's everyone's biggest frustration, right? How many times, in the old days,
you would pick up the phone and mash zero until you hoped to get to a real person that could help
with that decision tree. With a chatbot, the mashing zero doesn't necessarily work because
you can type representative or human or whatever a thousand times and the chatbot's designed to not accept that, right? It wants to try to resolve it.
And we've all had bad chatbots. And I guess, have we had good ones yet? Have you been impressed
with anyone's customer service bot to this point? I had some hands-on experience with AWS's contact center plug-in that utilizes their chatbot.
And it had some pretty powerful connectors in there.
You could, of course, set up logic so that the first couple of times people say, talk to a human, right away it'll kind of press them into trying: hey, if you can just tell me what you're calling about, like what you need, I can try.
Give me something.
Yeah, give me something.
But yeah, I think there's always going to be that subset.
And I'm probably included in that subset of the population. I don't think any of them have been great up to this date, and I haven't had a good one to interact with myself
outside of some sort of, you know,
open-source kind of proof-of-concept-y stuff
or things like ChatGPT.
I think you and I were playing around with it
when we were on the plane, actually,
and we got some AI running on the laptop on that plane
on the last flight we had together.
See, it can be done, you know, senior AI gatekeeper.
You can do this stuff on a notebook, but carry on.
No, absolutely.
But I mean, we saw, even just using the GPT-4 API and having some
sort of instruction following and some directive in there, we were able to get some pretty
impressive results by giving it a goal and then allowing it to kind
of go out and accomplish that, in the same vein as, I think the name is AutoGPT, one of the
bigger ones out there. I know there's a handful more. I saw a guy the other day that caught
his, I can't remember the exact name of it, but he had his own instance running and using the thing.
And he interacts with it by texting with the AI.
And I guess he had caught it looking at some adult websites
or something trying to do research.
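The goal-directed pattern described here, give the model a directive plus a goal and let it work toward it AutoGPT-style, boils down to a loop like the sketch below. The goal text, step format, and stubbed "tool" are deliberately toy assumptions, not AutoGPT's actual code.

```python
# Toy agent loop: system directive + goal, the model proposes one step per turn,
# we "execute" it (stubbed) and feed the result back. Not AutoGPT itself.
from openai import OpenAI

client = OpenAI()
GOAL = "Summarize this week's top storage news into three bullet points."

messages = [
    {"role": "system",
     "content": "You are an assistant working toward a goal. Each turn, state ONE "
                "next step, or 'DONE: <result>' when the goal is complete."},
    {"role": "user", "content": f"Goal: {GOAL}"},
]

for _ in range(5):  # hard cap so it can't run away
    resp = client.chat.completions.create(model="gpt-4", messages=messages)
    step = resp.choices[0].message.content
    print("model:", step)
    if step.startswith("DONE:"):
        break
    # In a real agent this is where a tool (search, browser, script) would run.
    observation = "Tool output placeholder for: " + step
    messages.append({"role": "assistant", "content": step})
    messages.append({"role": "user", "content": observation})
```

The hard cap on iterations and the narrow step format are exactly the kind of rails the conversation turns to next.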
And so, you know, these things,
that brings us kind of full circle back to what we were talking about,
about, you know, enterprises and their reluctance to deploy these things and some of the larger
ones and getting them to stay on the rails and getting
them to do what they're supposed to do. You don't want
your credit card company chatbot
going off talking about the weather for 45 minutes or so.
No, I mean, this brings up such a big question that I'm reluctant to even
pose it, but I've been seeing more of it. I was just at an event a couple of weeks ago
looking at, or I was watching an AI panel and there was a professor from NYU that was speaking
about the inherent biases in technology. And as you would expect,
I mean, the typical things of, is technology racist? Is it bigoted? Is it whatever? Does it
favor one culture or society over another? And you and I were talking about that. And I think
it's kind of what you're talking about here is if you say to your AI model, go explore the internet and get smart and then come back and be aware
of these things, it's going to touch everything it can and pick up some flavor of culture
that could be hyperextended in any direction, right?
I mean, if there's no barrier to where it goes and the information it consumes,
then it could look at ESPN and be a sports jock kind of AI persona
or some other media.
It doesn't make any difference.
But what you're getting at is there is a need to make sure that the data going in is
quality.
And I think that's part of the concerns that enterprises have is I want my chat
bot to be good. I want to expose it to relevant information,
but not too much information where now I missed some sort of security
check and it's sharing Jordan's account information with mine because we have
similar interests and that's not good either. Right. That's, I mean, security threats. I've,
you know, I've worked in that field and seen, done a couple analyses of how that plays out
through long context conversations. So there's of course that. But touching on what you were saying before with inherent biases in AI, that's a hot topic right now. And getting the training data properly together and properly put in there is, you know, paramount. You need a very diverse group working on doing that. And, you know, I'd say this being fully self-aware of
we work on the internet and our platform is the internet.
The internet is a garbage cesspool of nonsense
most of the places out there.
Yeah, I've been on Reddit everywhere.
Yeah.
And so if you don't take care in your selection,
which is an absolutely mammoth undertaking for something like an LLM,
if you don't take care in your selection,
getting the proper training data
or even that proper fine-tuning data is a real challenge right now.
You know, being the one man band here working out of the lab
with just our silly projects, I mean,
we've seen some pretty,
some things we probably shouldn't talk about,
but we've seen some pretty interesting results come out
of just finding random data sets or hitting a Reddit API
that yanked down a bunch of information.
We've seen some pretty interesting things happen.
Well, no, let's talk about it.
Part of that FMS demo was running an API Doom server
with a ninth system on a laptop logged in.
So this was one system that was doing all of these things.
And I don't know exactly, you tell me,
what you told it to see
for Doom video game players, but it wasn't long before I'm sitting there at the booth, I'm looking
at the chat on the laptop that's observing the game, and the eight AIs are talking
so much trash to each other we had to turn it off for a little bit because it was a little uncomfortable,
to say the least, with what these...
In my defense, it was really good.
They're invisible guys!
They're invisible AI gamers fighting each other and making anatomy jokes.
But yes, in your defense, what?
In my defense, it was really well implemented Doom online chatter.
And for those in our audience who played that.
I'm not going to disagree that it executed the mission.
It understood the assignment as it were.
Okay.
So that's part of the thing too is making sure that the assignment is right.
Yeah.
And we may have been a little too fast and loose, and that kind of comes up to the conversation of ethics and AI, right? You know, you play a little too fast and loose with
your assignment, with your rules, and you end up with stuff like our AIs, in a booth in front of thousands
of people, telling each other to plank and blank themselves. I was both proud of our deployment and somewhat embarrassed at the
same time. I mean, it's like we said we were going to set up a Doom server, which we did,
and it executed it quite flawlessly. But that does highlight the point of,
you know, it's not as simple as garbage in, garbage out, but there is definitely an underlying tone of what are the limits and where do I want this thing to go?
And I think that's part of why, as I was saying before, most of our chatbot experience has been, you know, somewhere between rough and terrible.
I think there's a general reluctance by the enterprise to give these things more information because of the unknown. The fear
of the unknown is going to hold back public interfacing AI in the enterprise, I think,
for quite some time. I've had conversations with other enterprises about, you know, internal
use chatbots, you know, a chatbot that knows all the policies and procedures,
and, if you're a marketing guy, you can ask it a question and find out if something goes for or against your company policies. You know, that's the dream, right? To be able to
have an assistant to sit there and help kind of expedite, especially as organizations get
massive, like so many are, those things and those folks can get hard to track down.
Who do I go speak to about this particular thing?
And having all the data sets, you know, containerized into an AI model or a model that has access
to be able to search those.
And where the conversation ultimately leads is, yeah, but there's some stuff that the SVP of marketing can have access
to and know about, but there are things that a remote support phone agent should not be
able to ask questions about, you know, finance protocols for expensing
private flights, or just something completely arbitrary, right?
But that's ultimately where the question always leads. But role-based access, yeah.
Right.
We're going to have to have a whole new category in Active Directory
for your level of AI access within an organization.
Seriously.
And that's exactly where I was heading with that,
is it's not just as simple as Active Directory role-based right now
and getting to somewhere like that would be great.
I mean, I think the path of least resistance on that, at least to explore, would be having the initial interaction give the AI, as part of the initial prompt, your levels of access based on some Active Directory stuff, and have it work that way.
But even then, you start talking about things like prompt engineering.
So we start looking at stuff like NeMo Guardrails, you know, getting that into a spot where
it can handle permission-based stuff would be a really
cool thing to see.
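A bare-bones version of that "path of least resistance", passing the user's directory role into the system prompt, might look like the sketch below. The role lookup, role names, and policy wording are placeholders, and a prompt alone is weak protection; a guardrails layer such as NeMo Guardrails would enforce this far more robustly.

```python
# Sketch: inject the caller's role (e.g., from an Active Directory / LDAP lookup)
# into the system prompt so the model knows what it may discuss.
# lookup_role() and the policy wording are placeholders.
from openai import OpenAI

client = OpenAI()

def lookup_role(username: str) -> str:
    # Placeholder for a real directory query (LDAP, Microsoft Graph, etc.)
    return {"brian": "svp_marketing", "jordan": "support_agent"}.get(username, "guest")

def ask(username: str, question: str) -> str:
    role = lookup_role(username)
    system = (
        f"You are an internal policy assistant. The user's role is '{role}'. "
        "Only answer questions permitted for that role; otherwise reply "
        "'Please contact HR or your manager.' Never reveal finance protocols "
        "to non-finance roles."
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

print(ask("jordan", "What's the protocol for expensing private flights?"))
```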
Even on the prompting bit, I didn't know this until just recently, in organizations that
have aggressively invested in AI, that's a job.
Just writing the prompts is a full-time job that never existed before.
I don't know how common it is, but there are certainly people out there that are paid full-time
to help their organizations or help their people in the organizations write well-constructed prompts.
And it's not just like, we've all interacted with ChatGPT by now, I think
most of our audience has. And you can say, write me a sonnet about this person, this person that
talks about flowers a lot, and it'll go do that. But if you're trying to get at actionable business
insights and having it create something for you, you have to, I don't know if it's talk to it like
a six-year-old, but there's some sort of level there where sometimes you even have to be repetitive or ask it questions to make sure it's giving you back what you think you asked it for.
It gets to be a lot more complicated than just write me a poem.
Yeah, I think your chat GPT thing is really actually a good metaphor.
It's representative of kind of where things
are heading, especially for less complex models, right? So yeah, I spend a lot of time with it for
help with everything from, hey, how do I make this email not sound like I'm a jerk, to,
I need some help with this code because I'm getting this error and I don't know where it's
happening. And a lot depends on
exactly how you ask it a question or exactly how you prompt it, right?
So if you think of it as a task-completing fancy autocorrect,
and think of it less as, like I touched on earlier,
there's so much confusion over, oh, ChatGPT is AI.
People think it's an AGI, like Data from Star Trek
or the computer from Star Trek, and it's really not.
So if you fundamentally understand something like that,
it becomes a lot easier to work with.
And if you have context about your company's own model,
whether it's a chatbot or some sort of LLM
or some sort of generative AI,
and you're the prompt engineer,
the reason why you're seeing these jobs getting posted
for obscene amounts of money
is because it does take a long time
to understand how those work under the hood
to be able to get them to do what you need to do.
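To make the prompt-engineering point concrete, here's a hedged example of the difference between a throwaway prompt and a structured one. The field names and wording are just one plausible convention, not a standard, and the data placeholder is illustrative.

```python
# A casual prompt vs. a structured one. The structure (role, context,
# constraints, output format) is one common convention, not an official spec.
casual_prompt = "Summarize our Q3 sales."

structured_prompt = """\
Role: You are an analyst preparing a briefing for a sales VP.
Context: The data below is a CSV export of Q3 opportunities (stage, amount, close date).
Task: Summarize pipeline health and call out the three largest at-risk deals.
Constraints: Use only the data provided; if something is missing, say so.
Output format: A five-bullet summary followed by a table of the at-risk deals.

Data:
{csv_data}
"""

print(structured_prompt.format(csv_data="stage,amount,close_date\n..."))
```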
Yeah, yeah.
No, it makes sense.
And the world of challenges that AI is opening up in organizations, again, which is why I
go back to some of my frustration that you talked about some of NVIDIA's tools.
They did a release last week on some new LLM support with software.
They're open sourcing, and we can get into that a little bit if you want.
I don't think you've had time to play with it hands-on yet.
But still, I think as an industry, we could do better to help democratize these tools.
I think a lot about ChatGPT, again, because it's such the obvious one,
that a lot of organizations are reluctant to use it because there's no privacy there.
If you use the public version and you pump corporate data into it, there's no guarantee of anonymity or privacy with that data.
Right. So it's hard for an organization to go hard on that.
Plus, its data set stops at, what, 2021 or something.
So it's a little bit older and it's trained to do what it's been trained to do.
Not necessarily what you want it to do. So what is the
on-ramp for some small business,
some small enterprise that wants ChatGPT-like functionality across
its internal assets, with maybe some public websites or whatever that
it identifies as relevant. What is the on-ramp for an organization that wants to do that? And
do they have to go staff up a couple million dollars in an AI department of people, which are
hard to find, to go run something like that? So I think you actually hit the nail on the head there.
The on-ramp looks like getting something set up internally
and interacting with something like GPT-4,
having a few really good initial prompts set up for it,
for how it's supposed to help you
and what it's supposed to do.
You know, a well-skilled veteran engineer
could, you know,
get an Azure instance of GPT-4 and have it plugged into a Slack bot,
where anytime you mention the bot in a channel,
it's able to respond and answer questions and help you do stuff.
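A rough sketch of that Slack-bot wiring follows, assuming Slack's Bolt SDK in Socket Mode and an Azure OpenAI deployment. The deployment name, API version, and environment variable names are placeholders for whatever your own setup uses.

```python
# Sketch: a Slack bot that answers whenever it's @-mentioned, backed by an
# Azure OpenAI GPT-4 deployment. Deployment name / API version are placeholders.
import os
from openai import AzureOpenAI
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

llm = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-02-01",            # example version string
)
app = App(token=os.environ["SLACK_BOT_TOKEN"])

@app.event("app_mention")
def handle_mention(event, say):
    resp = llm.chat.completions.create(
        model="gpt-4-internal",          # your Azure deployment name
        messages=[
            {"role": "system", "content": "You are a helpful internal assistant."},
            {"role": "user", "content": event["text"]},
        ],
    )
    say(resp.choices[0].message.content, thread_ts=event["ts"])

if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()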
Even if it's just as simple as, help me reformat this Excel table, or help me clean up this company-wide announcement. Easing into it and making sure everybody has a full understanding of what they're getting into, with something as simple as a Slack bot, is far more valuable than trying to dive in head first, right? Because you quickly start to understand the limitations and the real
practical uses of these things, other than, yeah, write me a poem about Brian's
wonderful salt-and-pepper hair.
Well, so what's the practical application of that? Think about it this way. If I'm using Salesforce.com, most organizations use that for their CRM
and sales funnel. If I want an AI model that analyzes that data and comes back with a... Because
we know salespeople are notoriously awful at making their own sales forecasts, so a lot of it
is lick your finger, stick it in the air, and hope for the
best. But an AI model should be, or could be, much more rigorous in that process,
assigning probabilities and coming up with, hey, salesperson one, your predicted sales target
this month is $28,325. And we can give it that rigor. Is it on the business to be able to come up with that model on
their own, to be able to take advantage of that and to make their sales process smarter? Or is
that something that we should expect salesforce.com to create and say, give us your parameters. We'll
certainly not value add. We'll charge you $10 a head a month for this AI tool that all it does
is help you understand and better predictively analyze your sales funnel. Where do you think,
based on what you're seeing, where's the pressure or where's the innovation going to come from?
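As an aside, the kind of rigor described here, scoring each open deal with a win probability instead of a gut feel, can be sketched with very ordinary tooling. The CRM fields and the toy numbers below are entirely made up for illustration.

```python
# Toy sketch: score open deals with a win probability and sum an expected
# forecast, instead of a salesperson's gut feel. CRM fields/data are made up.
from sklearn.linear_model import LogisticRegression

# Historical deals: [amount_usd, days_open, stage(0-3)] -> won (1) / lost (0)
X_hist = [[5000, 10, 3], [20000, 45, 1], [8000, 20, 2],
          [15000, 60, 0], [3000, 5, 3], [25000, 30, 2]]
y_hist = [1, 0, 1, 0, 1, 1]

model = LogisticRegression().fit(X_hist, y_hist)

# Open pipeline for one rep this month.
open_deals = [[12000, 15, 2], [7000, 40, 1], [30000, 8, 3]]
win_prob = model.predict_proba(open_deals)[:, 1]

forecast = sum(p * deal[0] for p, deal in zip(win_prob, open_deals))
print(f"Expected sales this month: ${forecast:,.2f}")
```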
So I think that's kind of a complex question, right?
So you've got to look at your own particular use case as a business.
If I'm a, you know, 50-person shop selling repair services across, like, the Cincinnati area or something like that?
Yeah, use the Salesforce built-in one, use Google Sheets'
built-in stuff, right? Get your feet wet, get in there. I don't think everybody
needs a custom model built out right away. But if you're somebody like a
financial institution, I don't think you want to wake up one day and see a headline of Chase
accidentally bankrupting somebody, you know, because they turned something on. That's the fear, right? We talked about that.
Yeah.
Yeah. It'll be interesting. I don't know why I'm so stuck on this, but as a practitioner yourself,
what do you think small businesses can do? Where do they start? If they've got some IT generalists,
they're not ready to go all out on AI,
but they want to understand better
how it can impact their business.
What's step one?
Is it certifications?
Is it training?
Is it downloading Llama and hoping for the best?
Like where do you even start to peel back this onion?
I think a really good place to start,
and I'm kind of harkening over to the chat here. One of our readers has said that, you know, they started using ChatGPT for their work. And I don't know if they're paying for it out of their own pocket or their company is, but I mean, I would encourage a lot of companies to give your IT guy or give your dev a budget for OpenAI,
the APIs, right?
So every time you make a request, it costs money.
Luckily, you don't pay too close attention to the Amex bill,
so we haven't had that problem yet.
But yeah, you get even the free tiers.
Before you go too deep on that, I want to talk about the levels of access to ChatGPT because I think you're right.
Maybe the easiest on-ramp is to start by consuming the tools that are publicly available in a safe way so that your data is not exposed.
So everyone knows that there's the OpenAI ChatGPT that's what, up to 3.5, that's publicly available.
You log in, you can use it.
It doesn't cost you anything.
Still has certain limitations.
What are the paid versions of ChatGPT?
Yeah, so you can pay, I think it's about 20 bucks a month and you get access to GPT-4,
which is far, far, I mean,
it's almost like the difference between talking to a toddler and at least a 10 year old, right? It's, it's obviously smarter than that,
but that's kind of how I, you know,
how I would describe it is it's that next step of being able to follow
instructions, being able to do things. Now, with that being said,
there's still stuff like I've got access to, you know, the API
side and I can call, there's different models.
When you start getting into the API side, you add a credit card, so monitor your billing.
That's the first thing I'd say, because I've definitely done some things.
Yeah.
Yeah.
And it's exactly that.
You're paying per token on there and it can add up really quick on some larger
projects, especially if you're automating the interaction with it.
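Since API usage is billed per token, one practical habit is to log the usage numbers that come back with every response. A small sketch follows; the per-token prices are placeholders, since they vary by model and change over time.

```python
# Sketch: log token usage on every call so API spend doesn't surprise you.
# The prices below are placeholders; check the provider's current pricing.
from openai import OpenAI

client = OpenAI()
PRICE_PER_1K_PROMPT = 0.03       # placeholder $/1K prompt tokens
PRICE_PER_1K_COMPLETION = 0.06   # placeholder $/1K completion tokens

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Draft a two-line product update."}],
)

usage = resp.usage
cost = (usage.prompt_tokens / 1000) * PRICE_PER_1K_PROMPT \
     + (usage.completion_tokens / 1000) * PRICE_PER_1K_COMPLETION
print(f"{usage.total_tokens} tokens, roughly ${cost:.4f} for this call")
```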
When you get onto the API side of things, you get access to a handful of models from
OpenAI, which are pretty decent.
They're good at different things.
There are some that are better at writing code, some that are better at being creative, and some that are better overall, you know, like GPT-4.
There's an API version of it, which I found to be far less restrictive
and more compliant with my requests, if that makes sense.
Let me clarify that.
When I say more compliant with the requests,
if I ask the chat version of GPT-4 through the web browser, hey, I need to write some code to print out a list of, you know, 35 grocery items, sometimes it'll give you just a start, like, oh, here's how you start that in Python, and then it'll put an ellipsis and a comment that says, and keep going here.
Yeah, I know. It's like, what the heck am I asking an AI for?
Like, the first time I ever saw that and kind of ran into that brick wall, it was really frustrating,
because it was like, what am I? What am I doing here? But when you move over to the API side,
you start getting into a lot more freedom with it, and then you can start using vector memory databases. I use Pinecone a lot in our lab to help with the long-term memory side.
Can you give it more data sources? Because, like we said, the native ChatGPT data set stops at a point in 2021.
Yep. So then your next step would be going and getting an Azure GPT instance.
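The vector-memory pattern Jordan is describing, embed your own documents, stash them in something like Pinecone, and pull the relevant ones back into the prompt, looks roughly like this sketch. The index name, embedding model, and document text are assumptions, and the index is presumed to already exist.

```python
# Sketch of retrieval-augmented generation with Pinecone as the vector store.
# Index name, embedding model, and documents are placeholders.
import os
from openai import OpenAI
from pinecone import Pinecone

oai = OpenAI()
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("lab-notes")           # assumes the index already exists

def embed(text: str) -> list[float]:
    return oai.embeddings.create(model="text-embedding-3-small",
                                 input=text).data[0].embedding

# 1) Load your own, newer-than-2021 content into the index.
doc = "FMS 2023 recap: vision demo, E1S storage, Doom bots."
index.upsert(vectors=[{"id": "doc-1", "values": embed(doc),
                       "metadata": {"text": doc}}])

# 2) At question time, retrieve the closest chunks and stuff them into the prompt.
question = "What did we show at FMS this year?"
hits = index.query(vector=embed(question), top_k=3, include_metadata=True)
context = "\n".join(m.metadata["text"] for m in hits.matches)

answer = oai.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "system", "content": "Answer using only the provided context."},
              {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}],
)
print(answer.choices[0].message.content)
```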
So I'm just thinking about, like, in a small business, repetitive tasks like marketing emails, or even with our own stuff, we do a weekly
newsletter, right?
So right now we go in and we manually get the headline
and a summary and a link and whatever.
It's repetitive.
It's one of those tasks that's really ripe for automation.
If I can't go to ChatGPT public free
and say, make my newsletter for the week
with a sassy attitude
because it doesn't know that our content exists and I can't
tell it to go crawl the site. What level of integration do I need with OpenAI to enable
that kind of activity if I really wanted it to go make me the weekly newsletter in a sassy tone?
So I think you got to look at that as a couple different things, right? So the content generation side of it,
that's where you would go to use the open AI,
the AI or either your private instance on Azure.
This is something we see a lot too,
is the over-implementation, and overuse of AI.
You know, the only tool you have is a hammer.
Everything starts to look like a nail, right?
Sure.
But where that starts to get interesting
is if you look at it from kind of a full end-to-end perspective.
Now we've got a weekly batch process
that runs on our server
that goes and pulls all of the links and everything,
and organizes it into a preset prompt of, here's all the links, and it pastes those in. So you're not doing all of that
on AI, but you can start automating your process: collect all the links programmatically
with some sort of bot or auto-assist type job, send that through the API with the request,
and then you get back your text,
and that gets delivered to your inbox for approval
and then blasting.
So that's a really good use case,
something I hadn't fully thought of.
That would be how I would do it.
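A sketch of the batch process Jordan outlines: collect the links programmatically, fold them into a preset prompt, and get a draft back for human approval. The feed URL, model, and tone instructions are placeholders for whatever your own site and style would be.

```python
# Sketch: weekly newsletter draft. Pull links programmatically, paste them
# into a preset prompt, and review the draft before sending.
# Feed URL, model, and tone are placeholders.
import feedparser
from openai import OpenAI

client = OpenAI()
feed = feedparser.parse("https://example.com/feed/")   # your site's RSS feed

links = "\n".join(f"- {e.title}: {e.link}" for e in feed.entries[:10])

prompt = (
    "Write a short weekly newsletter with a sassy tone. For each story below, "
    "give a one-sentence summary and keep the link.\n\n" + links
)

draft = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
).choices[0].message.content

print(draft)   # in practice: email this to the editor for approval, then send
```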
Well, that's the other thing, too,
is there are many ways to go after these things.
So we've talked, gosh, a lot about all sorts of topics today. One that I don't want to neglect
though is hardware. And we're obviously hands-on with a lot of this stuff in the lab. We've got
a project right now, we're working on workstations that have GPUs inside and need to access the data.
And that's one of the big things, right?
And I'm not even talking about GPUDirect Storage.
I mean, that's cool and all, but that's a step even further down the road of integration from an infrastructure standpoint.
When you think about what workstations are doing, where some of these data scientists are doing that initial workload, the systems themselves don't have a ton of storage, despite the one that we
just built with 200 terabytes or 300 terabytes. That's very rare. Getting access to more data
though helps train the models faster. Yeah. What are the challenges you're seeing
there as we're exploring this
in real time? Yeah. So we've got a piece coming
out about this, and there'll be a lot of detail in there. We're doing kind of a little free-tier version
of that, right? Where we're using single-GPU workstations and
doing some research-level model training and model inferencing for testing and validation. So when you start talking about being able to keep your GPUs fed, keep your ROI going, and having either shared or leased time through your developer team to some really powerful GPU workstations or GPU servers,
keeping that stuff fed is one of the most important things.
You don't want your GPUs sitting idle,
especially with the cost of something like an H100.
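On the software side, "keeping the GPUs fed" usually comes down to making sure the input pipeline can read from that shared storage faster than the GPU consumes it. A minimal PyTorch sketch follows; the mount path and dataset layout are entirely assumed.

```python
# Sketch: streaming training data from a shared NVMe-backed mount fast enough
# that the GPU never starves. Paths and dataset are placeholders.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_set = datasets.ImageFolder(
    "/mnt/shared-nvme/train",                    # shared storage mount (placeholder)
    transform=transforms.Compose([transforms.Resize(256),
                                  transforms.CenterCrop(224),
                                  transforms.ToTensor()]),
)

loader = DataLoader(
    train_set,
    batch_size=256,
    shuffle=True,
    num_workers=8,            # parallel readers hide storage/network latency
    pin_memory=True,          # faster host-to-GPU copies
    prefetch_factor=4,
    persistent_workers=True,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
for images, labels in loader:
    images = images.to(device, non_blocking=True)
    # ... forward/backward pass goes here ...
    break
```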
Right, right.
So when we start looking at something like,
I think our box is something like 80 terabytes
of Gen 5 PCIe NVMe in there, and we have that
shared at line speed. We're at what would be the equivalent of a Gen 5 drive in each
of those workstations, at 80 terabytes, so you have massive, instantaneous access to all of these, you know,
all of this either training data or validation data or data that you want to go and do an
inference on, to do the validation of your models. And having that not only centralized for
things like version control, or making sure your team's all working on the same things.
But also being able to parallelize, you know, I'm sitting here working on a model that's
slightly different than a model that you're working on.
But we're working off the same data set to see who can come up with something better,
that iteration and all of that stuff.
And especially like the really compelling thing about the E1S is, like,
that's one U in the rack.
Each GPU server, when you start sticking in, you know,
like four, eight, whatever GPUs, you're, I mean,
minimum two U, right, for the Dell liquid cooled.
If you're not doing liquid cooling, you're at four U
in a lot of scenarios to get more than one GPU in a rack.
So each instance, you're sucking up a lot of space to get these parts in there.
And I think the XE9680, is that a 6U? I can't recall exactly.
It's a 4U for the GPU server and a 2U essentially sitting on top of it.
So now we're talking about, you know, we're sucking up so much rack space just to get the compute going.
What are we storing on it?
It would be extremely expensive to outfit all those servers with, you know, 80 or 100 terabytes worth of storage space.
And then you're still talking about...
They don't even have the slots, right?
Because that's the other thing is the GPU heavy servers often give up storage.
Or you can look at a general purpose server with 24 bays,
but then you're restricted on two add-in cards because you've got no power envelope.
I mean, the hardware right now is a series of, I would say,
intelligent trade-offs to understand what you need,
how much power your rack can even
support. You mentioned liquid cooling on the 9640. And again, I'll plug the upcoming video. We have a
monster video diving into Dell's latest GPU servers. Kevin and I were down in Texas last week
looking at those guys. But ultimately, you're right. Having fast shared storage so that you can have direct links to your GPU
servers, to your workstations, to whatever,
I think it's going to be a pretty compelling story. And the, uh,
the server guys, the software and the SSD guys as well,
and NVIDIA with the NICs too,
are really all trying to figure out how to bundle that and
communicate that to the market right now. And just what you're seeing on the E1S drives, they're actually
Gen 4, but they're still extremely fast and wickedly dense. And the servers are so powerful
now that we can keep inferencing cards in there. We're using an A2, but you could put two of them
in there, I think, a couple L4s if you could even find them, which is the next challenge with
any of the NVIDIA cards. But yeah, now we can inference on that data without moving it again,
which is pretty conceivably powerful stuff. Yep. And not to mention,
we talked a lot early on about how creating the training data is a big challenge.
Having that kind of stuff unified
in one spot and a server that has
power to do the normalization
and standardization of that.
You can't just
hold up a PDF and say, here, AI
process. There's stuff you've got to do to that.
So being able to do that on the file system side
is getting really interesting. I think Vast actually
was talking about some of that in one of their recent releases, and
that's going to be hugely important.
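To Jordan's point that you can't just hand a model a PDF, the unglamorous part is extracting, normalizing, and chunking the text first. A small sketch using pypdf follows; the file name, chunk size, and overlap are arbitrary choices.

```python
# Sketch: turn a PDF into clean, fixed-size text chunks ready for embedding
# or fine-tuning. File name and chunk size are arbitrary choices.
import re
from pypdf import PdfReader

reader = PdfReader("policy_manual.pdf")          # placeholder document
raw = "\n".join(page.extract_text() or "" for page in reader.pages)

# Normalize whitespace so layout artifacts don't pollute the training data.
text = re.sub(r"\s+", " ", raw).strip()

# Split into ~1,000-character chunks with a little overlap for context.
chunk_size, overlap = 1000, 100
chunks = [text[i:i + chunk_size]
          for i in range(0, len(text), chunk_size - overlap)]

print(f"{len(chunks)} chunks ready for embedding")
```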
They've definitely set their sights on solving the holistic problem of AI data, right? And
they're going well outside of the scope of storage at this point, I would suggest, and trying
to do much, much more. And you can go on to your next topic. But I do think before the day is out, we need a fresh meme for this with Denzel: instead of Training Day, Training Data. See what you
can work up on that. I'm sure our Discord will get right on it.
Yeah. Actually, I know I just said you could go again, but just a reminder for anyone listening
in or watching the video on YouTube, we're also streaming this live right now to our Discord audience.
So if you want to participate in the conversation, if you want to help tune my conversation with our interview guests, absolutely.
You can submit questions, interact in real time, and we're doing another podcast pretty much right after this one that we'll be doing that.
So keep up the conversation.
Jordan's already peppered in a couple questions.
It really does give us a new flavor for these podcasts.
We're excited about that.
But carry on, Jordan.
Yeah, I'm loving getting the real-time feedback and getting to talk to everybody.
Now you're totally distracting me.
I need a notepad when we do these from now on.
We did this last time, too.
We're talking a lot about hardware.
Oh, yeah, no, the density, right, and the shared storage.
So when we look at our specific configuration,
this will make a lot more sense to everybody when they read the article
about this Kioxia
storage and our GPU workstations
being moved into
the data center.
And you start talking
about the massive amounts of
checkpoints and different model files
and different iterations
that you can work through.
Getting that ultra high-speed performance
to be able to save that and then share it out to the team,
that's the other thing that's extremely valuable.
We mostly work in a vacuum, so to speak, right?
So it's me and Kevin in the lab,
banging our heads against servers.
We've got a new intern.
Sometimes code.
We've got a new intern today, so you've got that guy to help build our next thing.
And every once in a while, some usable code comes out of that process.
So when you start looking at this and thinking about it from an enterprise scale and version control,
and, hey, you know what, that model checkpoint that you made three days ago,
whatever you did there was way better than the crap we're making today.
Let's go back to that and work on that.
And just having that shared across in speeds of that caliber is awesome.
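The checkpoint workflow described here is mostly a matter of writing versioned checkpoints to the shared mount so anyone on the team can roll back. A PyTorch-flavored sketch follows, with the path and naming convention as assumptions.

```python
# Sketch: save versioned training checkpoints to shared storage so the team
# can roll back to "the one from three days ago." Path/naming are assumptions.
import time
from pathlib import Path
import torch

CKPT_DIR = Path("/mnt/shared-nvme/checkpoints/vision-qc")   # shared mount (placeholder)
CKPT_DIR.mkdir(parents=True, exist_ok=True)

def save_checkpoint(model, optimizer, step: int) -> Path:
    path = CKPT_DIR / f"step{step:08d}_{int(time.time())}.pt"
    torch.save({"step": step,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)
    return path

def load_latest(model, optimizer) -> int:
    ckpts = sorted(CKPT_DIR.glob("step*.pt"))
    if not ckpts:
        return 0
    state = torch.load(ckpts[-1], map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"]
```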
And I kind of touched on this at the end of that article, too.
Because of the speed that the ConnectX cards provide
back to the storage server, it's effectively like retrofitting PCIe Gen 5 SSD speeds into
older servers or workstations that, you know, they've got perfectly fine GPUs.
RTX 8000s are, what, three, four years old now, and they're still plenty relevant for
training and for AI.
They've got boatloads of memory. But those platforms, maybe they don't even have NVMe
bays on the front, or, you know, I think our Lenovos don't have NVMe, but we basically
retrofitted that in there. So that's another really cool benefit to that project.
I think that's going to be a huge area of focus. I don't want
to get too speculative here, because we'll end up talking about tape in about five minutes, but
tape libraries for AI? I'm actually headed out to Denver in a couple weeks to visit with Quantum.
So I'm sure they would be over the moon if we started talking about tape fueling AI innovation.
Well, tell them I'll dust off my i500 as soon as they send some updated drives.
Yeah, I mean, this goes back to where you started, with the whole BI, business intelligence, analytics, big data stuff, right? AI, as we call it,
is so all-encompassing of all of those disciplines and all of those fields,
especially when you start bringing in the HPC requirements.
It's really going to be a big unifier of technologies
over the coming years,
and it's going to be more and more interesting.
The Grace Hopper superchip is one
that I'm excited to get hands-on with,
hopefully in the near future here.
And getting this stuff all pulled together
and into something really crazy,
it's really cool to watch.
Well, we just put out the article last night
on the MLPerf scores from Grace Hopper 200.
And yeah, it's pretty wild what's available there.
But we're getting a little long here, but I do want to get one more comment from you on this.
And you know what, if you guys love this AI chat, we'll do this again and talk about the rawness of what we're experiencing in real time as we try to solve these problems.
And then also what the vendors are telling us in terms of their AI enablement in their solutions or for their customers.
But we did a piece with OVH Cloud US on their GPU instances.
And actually, that's the next podcast that we'll be recording is with OVH Cloud.
They had some V100s they exposed to us.
And I think this is really an interesting dynamic. And you don't have to go deep here, but just give me your 30 seconds on your take on where cloud can be beneficial for
AI. Getting access to some of this gear is really hard or really expensive or potentially both. The
cloud can solve some of those problems for us, not without cost, but it can solve it with
immediacy if nothing else. What's your high-level take on what we did
with OVH and any findings there that are worth highlighting from
that review?
I see OVH fitting in really well for a rapid push to market.
I've got this model, and I need to get it up live on the internet
doing some inferencing.
And it's so affordable.
If you can fit into the memory limitations of the V100
on whatever work someone may be doing,
then I think it's great.
And if you are just someone studying and trying to learn how to get into this,
how to interact with CUDA, how to work in the Linux Ubuntu server environment
and working with different driver versions,
the ability to just turn on and turn off and delete and spin up and spin down all of those,
you know, those instances is extremely valuable. I think I recall it's 88 cents per hour. So,
you know, if I'm studying this, or I'm trying to learn this as a developer, or someone who's going
through school, and maybe all I've got is got is you know a lower power laptop or desktop at
home that can't really handle these tech 88 cents an hour i don't yeah they don't require i don't
think they don't require any sort of like big upfront payment you can do hourly billing um if
you watch it and you kind of get in there do your work turn it back off you can get access to some
of the cutting edge you know tool, toolkits, libraries,
SDKs, and all that for very cheap to help supplement.
As long as you go in with a plan, I think it's super affordable and you don't just,
you know, turn it on and leave it on and forget about it.
With a $2,000 cloud bill. Don't forget about it.
Your credit card company will alert you to that.
But I mean, like, yeah.
Yeah.
Yeah.
We don't have to go real deep there.
I just wanted to highlight that we do have a review on the GPU instances
where Jordan walks through how it works, some of these things,
and some of the hands-on testing he did.
And that is the next podcast.
So by the time you hear this one, that one will be in the can and will be next up.
So if you're interested in some of these concepts around AI in the cloud,
that should be hopefully a very good conversation.
I encourage you to check that one out.
Jordan, I've got to cut you off on this one,
but this has been a great conversation.
And like I said, we're pumping these live into Discord now,
so join our Discord if you need that link.
It'll be in the description of the show
or it's linked in the top
right corner of our website, storagereview.com. Check it out and join the conversation. We want
to hear from you. Until then, Jordan, thanks for doing this again, buddy. Yep. Good to talk to you,
Brian. All right. Thank you.