Podcast Archive - StorageReview.com - Podcast #122: Navigating The AI Landscape: Real-World Insights And Challenges

Episode Date: September 14, 2023

This week Brian sits down with our own AI expert, Jordan Ranous, to discuss…

Transcript
Starting point is 00:00:00 Hey everyone, welcome to the podcast. We've got a great conversation today with our own AI expert, Jordan. You can tell by the beard he knows what's up. We're going to talk about some of the hottest topics in AI, some of the things that we're facing as we explore AI in the lab and in the solutions that we're working with hands-on. We've got a lot of new things to talk about there. And this is the first podcast that we're actually integrating live with our Discord. So we're bringing in our Discord community to be able to interact, ask questions while we're doing the live. We'll try to get to those questions as we go. But for now, Jordan, thanks for doing the pod. Appreciate it.
Starting point is 00:00:41 Yeah, happy to be here and glad to be back on the podcast. Yeah, let's start with FMS because that's where you and I were last together in person, where the sloths were the talk of Santa Clara. We were running a demo out there showing AI vision. And this is kind of one of the topics I want to get into as we go, some of the undertones, is that what AI means to one person is not a universal definition. We've got generative AI, which is the hot thing with ChatGPT and DALL-E and some of these other things. But there's so much more, so many things that used to be called business intelligence that are now rebranded as
Starting point is 00:01:21 AI. But the vision AI bit really had people talking on the expo floor. Talk about what we were doing there and what that means in your vision of what AI is. Yeah, so that's actually, you started off with a really good point there. There's a lot of misnomers that you see going around, floating around on the internet of people thinking AI. And in their head, what they're actually thinking of most of the time is what's called AGI or artificial general
Starting point is 00:01:49 intelligence, they think the computer is actually thinking and making rational decisions. But realistically, what we've got going on today, with the generative AI side is basically fancy auto completes or neural networks for generation. And then what we were working on at FMS was computer vision, which is a subset of the broader AI, right. And we had a model running a public model running on our server doing some object recognition. So the way that those work is you train your models on massive image data sets.
Starting point is 00:02:26 The name escapes me right now, but there's a couple standardized ones that have been out for some time. You can do further fine tuning. So like if you work in a manufacturing industry, let's say you work at an automotive plant and you make water pumps for your cars and you want to have some sort of AI quality control. We've been doing camera style quality control for quite a long time. The interesting thing that we're seeing now is the industry starts to adopt this stuff more and more for production is we're able to kind of almost preemptively do the quality control as the assembly lines go. So you can have multiple things tagged in different ways using neural networks instead of kind of some of the more legacy logic.
Starting point is 00:03:16 And then another thing you touched on there. Before you leave that real quick, I mean, we've all seen the factory lines of like making canned peas or something. And if the can's dented, the thing just like flicks it off, it gets rejected and goes into the waste bin or repurposed or whatever. I mean, that I guess was an early primitive form of this. Is this can properly shaped? Yes, no. And then kick it out. But when you talk about what we were doing with the Vision AI bit at FMS, it was doing object detection. It was funny. I thought the funniest thing was as we progressed through the day, it started picking up all the beer bottles at five
Starting point is 00:03:58 o'clock as soon as the bars opened up in the expo floor. So the model you were using was really general. Is this a bottle? Is this a backpack? Is this a purse or whatever? Eyeglasses? We saw dozens of different things pop up on there. But could you tune it even further if you wanted to and had enough camera resolution to say, that can is a Coors Light.
Starting point is 00:04:24 That can's a Lagunitas if it was me. Can you get that level of detail to really tune this for, you know, I want to know as Anheuser-Busch at an event, what are people actually drinking? Absolutely, and that's kind of where I was going with the business intelligence side of it, right? Where we're seeing, you know, we've evolved now from, you know, my can is dented on the factory line to being able to classify multiple things in an image and not just detect just if something is incorrect or not, but also be able to detect and categorize different things. So, you know, our model was more general.
Starting point is 00:05:03 So we had, you know, teddy bear, and we had can in bottle, of course, but with with proper tuning, like you said, proper resolution stuff, we're actually starting to see a lot of retail based companies for loss prevention, start tracking the things that come across like the self-checkout and tying that in with the scale weight data and tying all those things together but they're they're investing massive amounts of time in creating the training data sets for those models right so in in kind of a optimistic kind of way you know saying something, all things are possible through AI as long as you have a good enough, big enough data set and a big enough GPU to run it on. Which kind of brings me to one of my next things that I actually
Starting point is 00:05:53 want to talk about, which is the actual creation of training data and how I'm not seeing a ton of stuff in the market right now that's focusing on that. Everybody's talking more about, you know, here's all the cool stuff you can do with the model. I'm more interested in how do we get there? What's that roadmap looks like? And I think that's kind of what we're seeing the off the shelf, chat models, like chat GPT enterprise, for instance, was recently announced is getting, you know, a lot of traction, because it kind of
Starting point is 00:06:21 helps bypass and sidestep a lot of that initial training and other data creation and normalization that you need in order to be able to work on these. Well, that's, I mean, you're getting at one of my pet peeves now is that the AI world, and I said it at the beginning, is like different things to different people. When you talk about creating this data or creating the models or whatever, there's a giant chasm between the HPC installs, the top Fortune 100 and what they're doing and what some small business, some small retailer, some small manufacturer is doing. And there's quite a bit of gatekeeping that I find moderately offensive. I understand it. I get it. But I can't tell you
Starting point is 00:07:13 the number. You've seen it on our social posts. We're talking about a workstation with a couple A6000s in it. And we'll get some people saying, oh, that's not AI. That's whatever. It's like, well, a lot of AI starts out on notebooks because most organizations can't dump a million dollars into. Dell's got these beautiful XE9680 servers, eight way H100s, which, you know, you love that system. We all love that freaking system, but that system is unobtainable for the gross majority of the enterprise. And so I think there's a conversation to be had here around both what you're talking about, tools to democratize the on-ramp to AI, but also how do we leverage the right tools for the job? And we're going to get into some of that too around cloud GPU instances, around workstation, around actual dedicated GPUs. Like I said, the XE9680 or even
Starting point is 00:08:09 the two 4x systems from Dell that we just saw. We've got a nice 30-minute plus video coming on YouTube for that. But I don't even know if I really have a question there. I just have a frustration that I'm planting. And I think you're seeing the same thing as an AI practitioner. I mean, you're actually in there doing this stuff. But is anything I'm saying sound ridiculous to you? No. And I think that's a lot of challenges that a lot of industries are facing when they're looking at this. And everybody sees it as, oh, new, exciting, shiny toy, right? Let's all get into it and let's roll it out and be the first one. And, you know, I think Morgan Stanley is trying to just put out a press release.
Starting point is 00:08:53 I don't know if they actually implemented it or they were saying they were starting to implement a chatbot for interacting with, you know, your portfolio, which, I mean, I'm... That's kind of a day, you know, what, I'm really interested to see what they do with it. I think that'll be more helpful, especially getting speedy questions, coming from a background in finance and
Starting point is 00:09:16 customer service. You know, getting getting the answers to the customer quicker is the name of the game. Even if it's just getting them to the right person to be able to help them with their inquiry, either on the phone or through a chat portal, getting them to the right person as quick as possible is paramount to improving customer experience. So in those types of scenarios, yeah, these chatbots are great. And if you can have natural language conversation
Starting point is 00:09:44 backed by routing logic, you know, you can get increased deflection rates in your contact centers, better handling times because folks get to the right person at the right time. You know, you call the bank and you say, I got a problem with my debit card. The machine hears, oh, and they send you to the credit card department. Well, they can't help you with your debit card. They had to transfer you again. Now you've just wasted a bunch of money. That's everyone's biggest frustration, right? Is how many times you in the old days, you would pick up the phone and mash zero until you hope to get to a real person that can help with that decision tree. With a chatbot, the mashing zero doesn't necessarily work because
Starting point is 00:10:21 you can type representative or human or whatever a thousand times and the chatbot's designed to not accept that, right? It wants to try to resolve it. And we've all had bad chatbots. And I guess, have we had good ones yet? Have you been impressed with anyone's customer service bot to this point? I had some hands-on experience with AWS as a contact center plug-in for theirs that utilizes their chatbot. And it had some pretty powerful connectors in there. You could, of course, set up logic that the first couple of times people say, talk to a human right away, it'll kind of press them into trying to, hey, if you can just tell me what you're called, like what you need, I can try. Give me something. Yeah, give me something. But yeah, I think there's always going to be that subset.
Starting point is 00:11:14 And I'm probably included in that subset of the population that I don't think any of them have ever been in up to the date. And I haven't had one good to interact with myself outside of some sort of, you know, open source kind of proof of concept-y stuff or things like chat GPT. I think you and I were playing around with it when we were on the plane, actually, and we got some AI running on the laptop on that plane on the last flight we had together.
Starting point is 00:11:41 See, it can be done, you know, senior AI gatekeeper. You can do this stuff on a notebook, but carry on. No, absolutely. But I mean, we saw, you know, getting just even just using the GPT-4 API and having some sort of instruction following and some directive in there, we were able to get some pretty impressive results by giving it a goal and then allowing it to kind of go out and accomplish that in the same vein that i think the name is auto gpt is one of the bigger ones uh out there i know there's a handful more i saw a guy is the other day that he uh he
Starting point is 00:12:19 caught his uh auto automatic i can't remember the exact name of it, but he had his own instance running and using the thing. And he interacts with it by texting with the AI. And I guess he had caught it looking at some adult websites or something trying to do research. And so, you know, these things, that brings us kind of full circle back to what we were talking about, about, you know, enterprises and their reluctancy to deploy these things and some of the larger ones and getting them to stay on the rails and getting
Starting point is 00:12:52 them to do what they're supposed to do. You don't want your credit card company chatbot going off talking about the weather for 45 minutes or so. No, I mean, this brings up such a big question that I'm reluctant to even pause it, but I've been seeing more. I was just at an event a couple of weeks ago looking at, or I was watching an AI panel and there was a professor from NYU that was speaking about the inherent biases in technology. And as you would expect, I mean, the typical things of, is technology racist? Is it bigoted? Is it whatever? Does it
Starting point is 00:13:35 favor one culture or society over another? And you and I were talking about that. And I think it's kind of what you're talking about here is if you say to your AI model, go explore the internet and get smart and then come back and be aware of these things, it's going to touch everything it can and pick up some flavor of culture that could be hyperextended in any direction, right? I mean, if there's no barrier to where it goes and the information it consumes, then it could look at ESPN and be a sports jock kind of AI persona or some other media. It doesn't make any difference.
Starting point is 00:14:21 But what you're getting at is there is a need to make sure that the data going in is quality. And I think that's part of the concerns that enterprises have is I want my chat bot to be good. I want to expose it to relevant information, but not too much information where now I missed some sort of security check and it's sharing Jordan's account information with mine because we have similar interests and that's not good either. Right. That's, I mean, security threats. I've, you know, I've worked in that field and seen, done a couple analyses of how that plays out
Starting point is 00:14:56 through long context conversations. So there's of course that. But touching on what you were saying before with inherent biases in AI, that's a hot topic right now. And getting the training data properly together and properly put in there is, you know, paramount. You need a very diverse group working on doing that. And, you know, I'd say this being fully self-aware of we work on the internet and our platform is the internet. The internet is a garbage cesspool of nonsense most of the places out there. Yeah, I've been on Reddit everywhere. Yeah. And so if you don't take care in your selection, which is an absolutely mammoth undertaking for something like an LLM,
Starting point is 00:15:51 if you don't take care in your selection, getting the proper training data or even that proper fine-tuning data is a real challenge right now. You know, being the one man band here working out of the lab with just our silly projects, I mean, we've seen some pretty, some things we probably shouldn't talk about, but we've seen some pretty interesting results come out
Starting point is 00:16:19 of just finding random data sets or hitting a Reddit API that yanked down a bunch of information. We've seen some pretty interesting things happen. Well, no, let's talk about it. Part of that FMS demo was running an API Doom server with a ninth system on a laptop logged in. So this was one system that was doing all of these things. And I don't know exactly, you tell me,
Starting point is 00:16:44 what you told it to see for doom video game players but it wasn't long before i'm sitting there at the booth i'm looking at the chat on the uh on the laptop that's observing the game and the eight ais are talking so much trash to each other we had to we had to turn it off for a little bit because it was a little uncomfortable, to say the least, with what these... In my defense, it was really good. They're invisible guys! They're invisible AI gamers fighting each other and making anatomy jokes.
Starting point is 00:17:17 But yes, in your defense, what? In my defense, it was really well implemented Doom online chatter. And for those in our audience who played that. I'm not going to disagree that it executed the mission. It understood the assignment as it were. Okay. So that's part of the thing too is making sure that the assignment is right. Yeah.
Starting point is 00:17:47 And we may have been a little too fast and loose and that kind of comes up to the conversation of ethics and ai right you know you play a little too fast and loose with your assignment with your rules and you end up with stuff like our ais and booths from thousands of people telling each other to plank and blank themselves i was i was. I was both proud of our deployment and somewhat embarrassed at the same time. I mean, it's like we said we were going to set up a doom server, which we did, and it executed it quite flawlessly. But that does highlight the point of, you know, it's not as simple as garbage in, garbage out, but there is definitely an underlying tone of what are the limits and where do I want this thing to go? And I think that's part of why most, as I was saying before, most of our chatbot experience has been either rough, you know, somewhere between rough to terrible. I think there's a general reluctance by the enterprise to give these things more information because of the unknown. The fear
Starting point is 00:18:46 of the unknown is going to hold back public interfacing AI in the enterprise, I think, for quite some time. I've had conversations with other enterprises about, you know, internal use chatbots, you know, a chatbot that knows all the policies and procedures, and you can ask it a quite, if you're a marketing guy, you can ask it a question and find out if it goes for or against your company policies. You know, that's the dream, right? To be able to have an assistant to sit there and help kind of expedite, especially as organizations get massive, like so many are, those things and those folks can get hard to track down. Who do I go speak to about this particular thing? And having all the data sets, you know, containerized into an AI model or a model that has access
Starting point is 00:19:35 to be able to search those. And where the conversation ultimately leads is, yeah, but there's some stuff that, you know, SVP of marketing can have access to and know about, but there's things that, you know, a remote support phone agent should not be, you know, able to ask questions about, you know, finance protocols for, you know, expensing private flights, or just something completely arbitrary, right? But that's ultimately where the question always leads. But role-based access, yeah. Right. We're going to have to have a whole new category in Active Directory
Starting point is 00:20:11 for your level of AI access within an organization. Seriously. And that's exactly where I was heading with that, is it's not just as simple as Active Directory role-based right now and getting to somewhere like that would be great. I mean, having I think the path on that would be,
Starting point is 00:20:33 path of least resistance at least on that to explore would be having the initial interaction give the AI as part of the initial prompt your levels of access based on some Active Directory stuff and have it work that way. But even then, you start talking about things like prompt engineering.
Starting point is 00:20:54 So we start looking at stuff like Nemo guardrails, you know, getting that to be into a spot where it can handle permission-based stuff would be a really cool thing to see. Even on the prompting bit, I didn't know this until just recently, in organizations that have aggressively invested in AI, that's a job. Just writing the prompts is a full-time job that never existed before. I don't know how common it is, but there are certainly people out there that are paid full-time to help their organizations or help their people in the organizations write well-constructed prompts.
Starting point is 00:21:40 And it's not just like, we've all interacted with ChatGPT by now, I think most of our audience has. And you can say, write me a sonnet about this person, this person that talks about flowers a lot, and it'll go do that. But if you're trying to get at actionable business insights and having it create something for you, you have to, I don't know if it's talk to it like a six-year-old, but there's some sort of level there where sometimes you even have to be repetitive or ask it questions to make sure it is giving you back what you think you're giving to it. It gets to be a lot more complicated than just write me a poem. Yeah, I think your chat GPT thing is really actually a good metaphor. It's representative of kind of where things
Starting point is 00:22:25 are heading, especially less complex models, right? So yeah, I spend a lot of time with it for help with everything from, hey, you know, how do I make this email make me not sound like a jerk to I need some help with this code, because I'm getting this air and I don't know where it's happening. And, you know, depending on exactly how you ask it a question or exactly how you prompt a question, right? So if you don't, if you think of it, um, as a, as a task completing fancy auto correct, and you start and think of it less as a, like I touched on early,
Starting point is 00:22:59 there's so much confusion over, Oh, chat GPT is AI. And people think it's an AGI, like data from Star Trek or the computer from Star Trek, and it's really not. So if you fundamentally understand something like that, it becomes a lot easier to work with. And if you have context about your company's own model, whether it's a chatbot or some sort of LLM or some sort of generative AI,
Starting point is 00:23:28 and you're the prompt engineer, the reason why you're seeing these jobs getting posted for obscene amounts of money is because it does take a long time to understand how those work under the hoods to be able to get them to do what you need to do. Yeah, yeah. No, it makes sense.
Starting point is 00:23:46 And the world of challenges that AI is opening up in organizations, again, which is why I go back to some of my frustration that you talked about some of NVIDIA's tools. They did a release last week on some new LLM support with software. They're open sourcing, and we can get into that a little bit if you want. I don't think you've had time to play with it hands-on yet. But still, I think as an industry, we could do better to help democratize these tools. I think a lot about ChatGBT, again, because it's such the obvious one, that a lot of organizations are reluctant to use it because there's no privacy there.
Starting point is 00:24:28 If you use the public version and you pump corporate data into it, there's no guarantee of anonymity or privacy with that data. Right. So it's hard for an organization to go hard on that. Plus, it stops its data set is what 2021 or something. So it's a little bit older and it's trained to do what it's been trained to do. Not necessarily what you want it to do. So how, what is the, the on-ramp for war for some small business, small enterprise that wants chat GPT like functionality across its internal assets with maybe some public websites or whatever that
Starting point is 00:25:07 it identifies as relevant. What is the on-ramp for an organization that wants to do that? And do they have to go staff up a couple million dollars in an AI department of people, which are hard to find, to go run something like that? So I think you actually hit the nail on the head there. The on-ramp looks like getting something set up internally and interacting with something like chat GPT-4, having a few really good initial prompts set up for it, for how it's supposed to help you and what it's supposed to do.
Starting point is 00:25:39 You know, a well-skilled veteran engineer could, you know, get an Azure instance in GPT-4 and have it plugged into a Slack bot where anytime you mention a Slack bot in a channel, it's able to respond and answer questions and help you do stuff. Even if it's just as simple as, you know, getting, you know, help me reformat this Excel table or help me clean up this company wide announcement, getting, easing into it and having everybody make sure they have a full understanding of what they're getting into. I think with something as simple as that as a Slack bot is far more valuable than trying to dump in head first, right? Because you quickly start to understand the limitations and the real practical uses of these things other than yeah write me a poem about Brian's
Starting point is 00:26:31 wonderful salt-and-pepper hair well out there you so what what's the practice well so think about it this way if I'm using Salesforce.com, most organizations use that for their CRM and sales funnel. If I want an AI model that analyzes that data and comes back with a... Because we know salespeople are notoriously awful at making their own sales forecasts, so that's why it's a lot of lick finger and stick it in there and hope for the best. But an AI model should be much more rigorous or could be much more rigorous in that process, assigning probabilities and coming up with a salesperson one, your predicted sales target this month is $28,325. And we can give it that rigor. Is it on the business to be able to come up with that model on
Starting point is 00:27:27 their own, to be able to take advantage of that and to make their sales process smarter? Or is that something that we should expect salesforce.com to create and say, give us your parameters. We'll certainly not value add. We'll charge you $10 a head a month for this AI tool that all it does is help you understand and better predictively analyze your sales funnel. Where do you think, based on what you're seeing, where's the pressure or where's the innovation going to come from? So I think that's kind of a complex question, right? So you've got to look at your own particular use case as a business. If I'm a, you know, 50 person shop selling repair services across, you know, like a Cincinnati area or something like that.
Starting point is 00:28:21 Yeah. Use the Salesforce built in one, use Google's, uh, Google sheets built in stuff, right? Get your feet wet, get in there. I don't think everybody needs a custom, you know, custom model built out right away. But if you're somebody like a financial institution, I don't think you want to wake up one day and see a headline of chase accidentally bankrupted, you know, because they turned on. That's the fear, right? We talked about that. Yeah. Yeah. It'll be interesting. I don't know why I'm so stuck on this, but as a practitioner yourself, what do you think small businesses can do? Where do they start? If they've got some IT generalists,
Starting point is 00:29:04 they're not ready to go all out on AI, but they want to understand better how it can impact their business. What's step one? Is it certifications? Is it training? Is it downloading Lama and hoping for the best? Like where do you even start to peel back this onion?
Starting point is 00:29:21 I think a really good place to start, and I'm kind of harkening over to the chat here. One of our readers has said that, you know, they started using chat GTT for their work. And I don't know if they're paying for it out of their own pocket or their company is but I mean, I would encourage a lot of companies to give your IT guy or give your dev a budget for open AI, the APIs, right? So every time you make a request, it costs money. Luckily, you don't pay too close attention to the MX bill, so we haven't had that problem yet. But yeah, you get even the free tiers. Before you go too deep on that, I want to talk about the levels of access to ChatGPT because I think you're right.
Starting point is 00:30:11 Maybe the easiest on-ramp is to start by consuming the tools that are publicly available in a safe way so that your data is not exposed. So everyone knows that there's the OpenAI ChatGPT that's what, up to 3.5, that's publicly available. You log in, you can use it. It doesn't cost you anything. Still has certain limitations. What are the paid versions of ChatGPT? Yeah, so you can pay, I think it's about 20 bucks a month and you get access to GPT-4, which is far, far, I mean,
Starting point is 00:30:52 it's almost like the difference between talking to a toddler and at least a 10 year old, right? It's, it's obviously smarter than that, but that's kind of how I, you know, how I would describe it is it's that next step of being able to follow instructions, being able to do things. Now, with that being said, there's still stuff like I've got access to, you know, the API side and I can call, there's different models. When you start getting into the API side, you had a credit card, monitor your billing. That's the first thing I'd say, because I definitely do some things.
Starting point is 00:31:18 Yeah. Yeah. And it's exactly that. You're paying per, you're paying per token on there and it can add up really quick on some larger projects, especially if you're automating the interaction with it. When you get onto the API side of things, you get access to a handful of models from OpenAI, which are pretty decent. They're good at different things.
Starting point is 00:31:40 There's some that are better at writing code. There's some that are better at writing code. There's some that are better at being creative. There's some that are better, you know, like GPT-4. You have a API version of it, which I found to be far less restrictive and more compliant with my requests, if that makes sense. Let me clarify that. When I say more compliant with the requests, if I ask the chat version of GPpt4 through the web browser hey um can you i need to write some code to um you know print out a list of uh you know 35 grocery items it'll
Starting point is 00:32:16 kind of give me sometimes it'll give you like a start of oh here's how you start that in python and then it'll put like an ellipsis and a comment that says, and keep going here. I found like the Yeah, I know. It's like, what the heck am I asking an AI for? Like, the first time I ever saw that and kind of ran into that brick wall, it was really frustrating, because it was like, what am I? What am I doing here? But when you move over to the API side, you start getting into a lot more freedom with it and then you can start using vector memory databases i use pine cone a lot in our lab um to help kind of with the long-term memory can you give it more data sources because like the native as we said the native chat gpt databases stops at a time in 2021. Yep. So then, then you're kind of your next step would be going and getting
Starting point is 00:33:09 an Azure GPT for instance. So that's, so that's, I'm just thinking about like in a small business, repetitive tasks like marketing emails or, or even in our own, we do a weekly newsletter, right? So right now we go in and we manually get the headline and a summary and a link and whatever. It's repetitive. It's one of those tasks that's really ripe for automation.
Starting point is 00:33:35 If I can't go to ChatGPT public free and say, make my newsletter for the week with a sassy attitude because it doesn't know that our content exists and I can't tell it to go crawl the site. What level of integration do I need with OpenAI to enable that kind of activity if I really wanted it to go make me the weekly newsletter in a sassy tone? So I think you got to look at that as a couple different things, right? So the content generation side of it, that's where you would go to use the open AI,
Starting point is 00:34:11 the AI or either your private instance on Azure. This is something we see a lot too, is the over-implementation, and overuse of AI. You know, the only tool you have is a hammer. Everything starts to look like a nail, right? Sure. But where that starts to get interesting is if you look at it from kind of a full-end perspective,
Starting point is 00:34:38 now we've got a weekly batch process that runs on our server that goes and pulls all of the links and everything, organizes it into a preset prompt of goes and pulls all the links and everything organizes it into a preset prompt of here's all the links and it pastes those so you're not doing all of that on ai but you can start automating your process of you know collect all the links programmatically with some sort of bot or auto assist job type job type thing, send that through the API with the request, and then you get back your text,
Starting point is 00:35:09 and that gets delivered to your inbox for approval and then blasting. So that's a really good use case, something I hadn't fully thought of. That would be how I would do it. Well, that's the other thing, too, is there are many ways to go after these things. So we've talked, gosh, a lot about all sorts of topics today. One that I don't want to neglect
Starting point is 00:35:31 though is hardware. And we're obviously hands-on with a lot of this stuff in the lab. We've got a project right now, we're working on workstations that have GPUs inside and need to access the data. And that's one of the big things, right? And not even just not even GPU direct storage. I mean, that's cool and all, but that's a step even further down the road of integration from a infrastructure standpoint. When you think about what workstations are doing, where some of these data scientists are doing that initial workload, the systems themselves don't have a ton of storage, despite the one that we just built with 200 terabytes or 300 terabytes. That's very rare. Getting access to more data though helps train the models more faster. Yeah. What are the challenges you're seeing
Starting point is 00:36:24 there as we're exploring this in real time? Yeah. So we're doing like, you know, we've got that, we've got a piece coming out about this. And so there'll be a lot of detail in there. We're doing kind of a free, you know, a little free tier of that, right? Where we're using single GPU workstations and doing kind of some research level model training and model inferencing for testing and validation. So when you start talking about being able to keep your GPUs fed and keep those, keep your ROI going, and having either shared or leased terms through your developer team to some really powerful GPU workstations or GPU servers, keeping that stuff fed is one of the most important things. You don't want your GPUs sitting idle, especially with the cost of something like an H100.
Starting point is 00:37:15 Right, right. So when we start looking at something like, I think our box is something like 80 terabytes of Gen 5 PCIe and vme in there uh and we have that shared at line speed uh we're at you know what would be the equivalent of a gen 5 drive in each of those workstations um at 80 terabytes so you have massive access, instantaneous access to all of these, you know, all of this either training data or validation data or data that you want to go and do an inference on to do to do the validation of your models. And having that not only centralized for
Starting point is 00:37:59 things like version control, or making sure your team's all working on the same things. But also being able to parallelize, you know, I'm sitting here working on a model that's slightly different than a model that you're working on. But we're working off the same data set to see who can come up with the same thing that iteration is getting in all of that stuff. And especially like the really compelling thing about the E1S is, like, that's one U in the rack. Each GPU server, when you start sticking in, you know,
Starting point is 00:38:33 like four, eight, whatever GPUs, you're, I mean, minimum two U, right, for the Dell liquid cooled. If you're not doing liquid cooling, you're at to four U in a lot of scenarios to get more than one GPU in a rack. So each instance, you're sucking up a lot of space to get these parts in there. And I think the XE9680, is that a 6U? I can't recall exactly. It's a 4U for the GPU server and a 2U essentially sitting on top of it. So now we're talking about, you know, we're sucking up so much rack space just to get the compute going.
Starting point is 00:39:12 What are we storing on it? It would be extremely expensive to outfit all those servers with, you know, 80 or 100 terabytes worth of storage space. And then you're still talking about... They don't even have the slots, right? Because that's the other thing is the GPU heavy servers often give up storage. Or you can look at a general purpose server with 24 bays, but then you're restricted on two add-in cards because you've got no power envelope. I mean, the hardware right now is a series of, I would say,
Starting point is 00:39:40 intelligent trade-offs to understand what you need, how much power your rack can even support. You mentioned Liquid on the 9640. And again, I'll plug the upcoming video. We have a monster video diving into Dell's latest GPU servers. Kevin and I were down in Texas last week looking at those guys. But ultimately, you're right. Having fast shared storage so that you can have direct links to your GPU servers, to your workstations, to whatever, I think it's going to be a pretty compelling story. And the, uh, the server guys, the software and the SSD guys are well,
Starting point is 00:40:19 and then video with the Knicks too, are really all trying to figure out how to, how to bundle that and communicate that to the market right now. And just what you're seeing on the E1S drives are actually Gen 4, but they're still extremely fast and wickedly dense. And the servers are so powerful now that we can keep inferencing cards in there. We're using an A2, but you could put two of them in there, I think, a couple L4s if you could even find them, which is the next challenge with any of the NVIDIA cards. But yeah, now we can inference on that data without moving it again,
Starting point is 00:40:58 which is pretty conceivably powerful stuff. Yep. And not to mention, we had talked a lot about creating the training data early on is a big challenge. Having that kind of stuff unified in one spot and a server that has power to do the normalization and standardization of that.
Starting point is 00:41:21 You can't just hold up a PDF and say, here, AI process. There's stuff you got to do to that. You can't just hold up a PDF and say, here, AI process. There's stuff you've got to do to that. So being able to do that on the file system side is getting really interesting. I think Vast actually was talking about some of that in one of their recent releases, and that's going to be hugely important. They've definitely set their sights on solving the holistic problem of AI data, right? And
Starting point is 00:41:52 they're going well outside of the scope of storage, I would suggest at this point and trying to do much, much more. And you can go on to your next topic. But I do think before the day is out, we need a fresh meme for this with Denzel instead of training day, training data and see what you can work up on that. I'm sure a Discord will get right on it. Yeah. Actually, I know I just said you could go again, but just a reminder for anyone listening in or watching the video on YouTube, we're also streaming this live right now to our Discord audience. So if you want to participate in the conversation, if you want to help tune my conversation with our interview guests, absolutely. You can submit questions, interact in real time, and we're doing another podcast pretty much right after this one that we'll be doing that. So keep up the conversation.
Starting point is 00:42:51 Jordan's already peppered in a couple questions. It really does give us a new flavor for these podcasts. We're excited about that. But carry on, Jordan. Yeah, I'm loving getting the real-time feedback and getting to talk to everybody. Where are you totally distracting me? I need a notepad when we do these from now on. We did this last time, too.
Starting point is 00:43:13 We're talking a lot about hardware. Oh, yeah, no, the density, right, and the shared storage. So when we look at our specific configuration, this will make a lot more sense to everybody when they read the article about this Keoxia storage and our GPU workstations being moved into the data center.
Starting point is 00:43:34 And you start talking about the massive amounts of checkpoints and different model files and different iterations that you can work through. Getting that ultra high-speed performance to be able to save that and then share it out to the team, that's the other thing that's extremely valuable.
Starting point is 00:43:52 We mostly work in a vacuum, so to speak, right? So it's me and Kevin in the lab, banging our heads against servers. We've got a new intern. Sometimes code. We've got a new intern today, so you got that guy to build our next. And every once in a while, some usable code comes out of that process. So when you start looking at this and thinking about it from an enterprise scale and version control,
Starting point is 00:44:19 and, hey, you know what, that model checkpoint that you made three days ago, whatever you did there was way better than the crap we're making today. Let's go back to that and work on that. And just having that shared across in speeds of that caliber is awesome. And I kind of touched on this at the end of that article, too. Because of the speed that the ConnectX cards provide back to the storage server, it's effectively like retrofitting PCIe Gen 5 SSD speeds into older servers or workstations that, you know, they've got perfectly fine GPUs.
Starting point is 00:44:58 RTX 8000s are, what, three, four years old now, and there's still plenty relevant for training and for AI. They've got boatloads of HPMM. But those platforms don't have, you know, maybe they don't even have NVMe days on the front, or, you know, like, I think our Lenovo's don't have NVMe, but we basically retrofitted that in there. So that another really cool benefit um to that to that project i think that's going to be a huge area of focus is you know i think yeah i don't want to get too speculative here because we end up talking about tape here in about five minutes but it's uh it's tape live tape libraries for ai i'm actually headed out to Denver in a couple weeks to visit with Quantum.
Starting point is 00:45:49 So I'm sure they would be over the moon if we started talking about tape fueling AI innovation. Well, tell them I'll dust off my i500 as soon as they send some updated drives. We've got, yeah, I mean, this is so back to kind of again once once you started this with the whole bi uh business intelligence analytics big data stuff right ai as we call ai is so all-encompassing of all of those disciplines and all of those fields, especially when you start bringing in the HPC requirements. It's really going to be a big unifier of technologies over the coming years,
Starting point is 00:46:34 and it's going to be more and more interesting. The Grace Hopper superchip is one that I'm excited to get hands-on with, hopefully in the near future here. And getting this stuff all pulled together and into something really crazy, it's really cool to watch. Well, we just put out the article last night
Starting point is 00:46:54 on the MLPerf scores from Grace Hopper 200. And yeah, it's pretty wild what's available there. But we're getting a little long here, but I do want to get one more comment from you on this. And you know what, if you guys love this AI chat, we'll do this again and talk about the rawness of what we're experiencing in real time as we try to solve these problems. And then also what the vendors are telling us in terms of their AI enablement in their solutions or for their customers. But we did a piece with OVH Cloud US on their GPU instances. And actually, that's the next podcast that we'll be recording is with OVH Cloud. They had some V100s they exposed to us.
Starting point is 00:47:41 And I think this is really an interesting dynamic. And you don't have to go deep here, but just give me your 30 seconds on your take on where cloud can be beneficial for AI. Getting access to some of this gear is really hard or really expensive or potentially both. The cloud can solve some of those problems for us, without cost but can solve it with immediacy if nothing else what what's your high-level take on on what we did with OVH and any findings there that are that are worth highlighting from from that review if you've got you know I see OVH fitting in in a really really good rapid push to market. I've got this model, and I need to get it up live on the internet
Starting point is 00:48:28 doing some inferencing. And it's so affordable. If you can fit into the memory limitations of the V100 on whatever work someone may be doing, then I think it's great. And if you are just someone studying and trying to learn how to get into this, how to interact with CUDA, how to work in the Linux Ubuntu server environment and working with different driver versions,
Starting point is 00:49:01 the ability to just turn on and turn off and delete and spin up and spin down all of those, you know, those instances is extremely valuable. I think I recall it's 88 cents per hour. So, you know, if I'm studying this, or I'm trying to learn this as a developer, or someone who's going through school, and maybe all I've got is got is you know a lower power laptop or desktop at home that can't really handle these tech 88 cents an hour i don't yeah they don't require i don't think they don't require any sort of like big upfront payment you can do hourly billing um if you watch it and you kind of get in there do your work turn it back off you can get access to some of the cutting edge you know tool, toolkits, libraries,
Starting point is 00:49:48 SDKs, and all that for very cheap to help supplement. As long as you go in with a plan, I think it's super affordable and you don't just, you know, turn it on and leave it on and forget about it. With a $2,000 cloud. Don't forget about it. Your credit card company will alert you to that. But I mean, like, yeah. Yeah. Yeah.
Starting point is 00:50:02 We don't have to go real deep there. I just wanted to highlight that we do have a review on the GPU instances where Jordan walks through how it works, some of these things, and some of the hands-on testing he did. And that is the next podcast. So by the time you hear this one, that one will be in the can and will be next up. So if you're interested in some of these concepts around AI in the cloud, that should be hopefully a very good conversation.
Starting point is 00:50:27 I encourage you to check that one out. Jordan, I've got to cut you off on this one, but this has been a great conversation. And like I said, we're pumping these live into Discord now, so join our Discord if you need that link. It'll be in the description of the show or it's linked in the top right corner of our website, storageview.com. Check it out and join the conversation. We want
Starting point is 00:50:51 to hear from you. Until then, Jordan, thanks for doing this again, buddy. Yep. Good to talk to you, Brian. All right. Thank you.
