Grey Beards on Systems - 169: GreyBeards talk Agentic AI with Luke Norris, CEO & Co-founder, Kamiwaza AI
Episode Date: February 14, 2025
GreyBeards talk Agentic AI with Kamiwaza's CEO, Luke Norris. The world of large enterprise IT is about to undergo another revolution and Kamiwaza AI seems to be leading the charge...
Transcript
Hey everybody, Ray Lucchesi here.
Jason Collier here.
Welcome to another sponsored episode of the GreyBeards on Storage podcast, a show where we get GreyBeards bloggers together with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. With us here today is Luke Norris, CEO and co-founder of Kamiwaza.
So Luke, why don't you tell us a little bit about yourself and what's new with Kamiwaza?
Yeah, thanks for having me.
So I'm the cliché I guess, serial entrepreneur.
This is my third venture-backed company, probably my sixth
company that I've built to date.
At Kamiwaza, really, we took about a year of just thinking: what impact was generative AI going to have on the enterprise? And what were the major hurdles the enterprise was going to have to clear to adopt generative AI?
And in interviewing a lot of my CTO friends at Fortune 500,
et cetera, common themes kept coming up around data gravity,
the fact that they have data in these enterprises
across their cloud presence, their on premises,
co-location sites and edge locations.
They have lots of data, petabytes and exabytes of data.
And how are they going to get AI, which at that time they would see only running in, you know, SaaS services or with OpenAI, et cetera?
How are they going to get that access to the data
so they can start to have this transformative moment?
That transformative moment at Kamiwaza,
we call that the fifth industrial revolution,
where an enterprise can get 20% to 30% automated through agentic AI.
These are autonomous agents that are running via AI,
actually accomplishing workflows.
And we really set off to solve those pain points of,
how do you get AI to access private data,
and how do you get AI to manage massive amounts of data
so you can really automate these workflows
across the enterprise.
Yeah, so we were just together at AI Field Day 6 in San Jose, and there was a lot of talk at the session there
about your data catalog and things of that nature.
You wanna talk a little bit about
what the data catalog looks like?
Or what are the labels, maybe? I don't know.
Yeah, no.
Either of those two.
So there are two key technologies in that. First, at the onset, to solve getting AI access to the data, we realized you had to bring AI to the data, because with data gravity, just the size of the data, and security, bringing massive amounts of data to AI wasn't going to be practical.
In doing that, we built this concept of an inference mesh where you can
actually install our orchestration,
our engine right next to the data via Docker containers,
right onto GPU enabled servers.
Now you can install that in your Cloud instances,
in your data center or at the edge. Second was how do you actually process and manage this data
when it's spread out there. So right now when our stack does the initial prepping
of data for AI ingestion, that means you have to sort of scan all the data, create embeddings of that data, and then store those embeddings in, say, a vector database or a graph database.
We also add a secondary feature where
we start to catalog all the metadata that we're
building via those embeddings.
That metadata gets put into a local data catalog.
That local data catalog ties the affinity of that data
to the local engine that actually processed it.
And secondarily, that data also then gets put
into a global catalog that each one of our stacks
at each location can reference.
That then allows say a request that comes in
at a data center A, but the data that it needs
is partly in data center A and partly in data center B.
It will first process and run the RAG process in data center A,
while simultaneously kicking off an inference request to data center B, where it also runs
the RAG process locally, gets the result, sends only the resulting tokens back to data center A
stack, where it's concatenated and you get a result. In doing this, we didn't have to move data,
we didn't have to bring the data from Data Center B to Data Center A,
we're able to process the vast amounts of that data locally,
and then only combine the results.
That's really opened up the enterprise capability for inferencing,
for running agents, and getting massive outcomes.
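To make the catalog idea concrete, here is a minimal sketch of the local/global catalog concept Luke describes. The class and field names are invented for illustration and are not Kamiwaza's actual data model; the point is that metadata about embeddings carries an affinity to the site that processed the data, and a global view lets any site discover where relevant data lives.

```python
# Hypothetical sketch of the local/global catalog idea described above.
# None of these class or field names come from the Kamiwaza product; they
# just illustrate how metadata with site affinity could be looked up.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    dataset: str        # e.g. "product-datasheets"
    site: str           # which stack processed/holds the data ("dc-a", "dc-b")
    vector_db: str      # where the embeddings for that slice live

@dataclass
class GlobalCatalog:
    entries: list[CatalogEntry] = field(default_factory=list)

    def register(self, entry: CatalogEntry) -> None:
        # Each local stack pushes its metadata up so every site can see it.
        self.entries.append(entry)

    def sites_for(self, dataset: str) -> set[str]:
        # A prompt arriving at any site asks: where does this data live?
        return {e.site for e in self.entries if e.dataset == dataset}

catalog = GlobalCatalog()
catalog.register(CatalogEntry("product-datasheets", "dc-a", "qdrant://dc-a/products"))
catalog.register(CatalogEntry("product-datasheets", "dc-b", "qdrant://dc-b/products"))
print(catalog.sites_for("product-datasheets"))   # {'dc-a', 'dc-b'} -> fan out to both
```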
Let me try to unpack all that.
Yeah.
So an inference comes along that says, you know, I want to look at, tell me
about, you know, corporations, I don't know, your product X, Y, or Z, and that information
happens to span multiple data centers.
So at that point, you spin up a process across your inference mesh that effectively does
a RAG activity on data center A and interprets the data, creates a tokenized embedded stream
and loads that in the vector database on data center A. Then we have a portion of this,
I'll call it the part data set, data sheet or something
like that, sitting on data center B. And you fire up a similar application activity, I
should say, on data center B to do the RAG, to go look at the data set, the data sheet,
and embed and tokenize that data and put it in a vector database on data center B. Now you've got these tokens, this tokenized data, for both data centers' data.
And you push, and you send that all back to data center A
to answer the request, the prompt?
Is that how it works?
No, so what you described there is the prepping of the data.
Yeah, yeah.
You typically have to have a model first scan all the data and tokenize it, and put those tokens, which are now in the form of embeddings, into the vector database.
So yeah.
So those vector databases are effectively local to whatever the data affinity is, right?
That's correct,
there'd be a vector database for site A and for site B.
And that vector database is going
to hold all of those embeddings.
Now, when a new inference request comes in to, say,
data center A, the model looks up
what data it needs to resolve that inference by querying
the vector database.
And once it knows, say, the 10 points of data it needs,
it then actually has to go grab that data itself
in its raw form from data center A.
And it then processes, it reads all that,
it finds the information from within there,
and then it comes up with a result for that inference. But there are many
inferences that, as you said, when you want product information and there's data that's required from
both data center A and data center B, it does everything I talked about there in data center A.
It looks up the vector database. It realizes it needs data that's in data center A. First,
I'm sorry, it looks up the global catalog service. The global catalog service lets it know that there's data in data center A and there's
data in data center B.
For this prompt.
Yeah, yeah.
For the prompt.
That's correct.
And it then grabs the vectors from data center A, it processes those, realizes it needs,
let's call it 10 points of data, and it goes and grabs those 10 raw points of data
in data center A, and it pulls it into the model
and begins processing it.
At the same time, it sends that prompt request
over to data center B, does that whole thing
in data center B, pulls the data that it needs,
once again, raw from data center B, processes it.
It gets the resulting tokens of all of that process
and sends only the resulting tokens back to data center A.
Where data center A finishes its data processing, combines it with the results from data center B, and gives one overall
response back to the prompt.
Now, as I understand it, in those sorts of things the RAG is providing context to further elaborate the prompt in a typical Gen AI LLM type of call.
So, I mean, this sort of stuff that we're talking about that happens to reside in data center A and data center B
would be part of that context?
Is that how I would read that?
Or is the inferencing being done in both places?
The inferencing is actually being done in both places, with a third inference, in simplistic terms, that combines the two results.
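A hedged sketch of that fan-out-and-combine flow, assuming each site exposes some HTTP endpoint that runs RAG plus inference locally and returns only text. The URLs and payload shape are hypothetical, not Kamiwaza's API.

```python
# A hedged sketch of the fan-out-and-combine flow described above. The
# endpoint URLs and payload shape are hypothetical, not Kamiwaza's API;
# the point is that each site runs RAG + inference locally and only the
# resulting text comes back to be combined by a final inference.
import concurrent.futures
import requests

SITES = {"dc-a": "https://dc-a.example.internal/v1/rag-answer",
         "dc-b": "https://dc-b.example.internal/v1/rag-answer"}

def answer_at_site(url: str, prompt: str) -> str:
    # Each site does its own vector lookup, pulls raw data locally,
    # runs the model, and returns only tokens (text), never the raw data.
    resp = requests.post(url, json={"prompt": prompt}, timeout=300)
    resp.raise_for_status()
    return resp.json()["answer"]

def distributed_answer(prompt: str) -> str:
    with concurrent.futures.ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda u: answer_at_site(u, prompt), SITES.values()))
    # A third inference (here just another call to the originating site)
    # concatenates/synthesizes the partial answers into one response.
    combine_prompt = "Combine these partial answers into one:\n\n" + "\n---\n".join(partials)
    return answer_at_site(SITES["dc-a"], combine_prompt)
```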
I think there's a light going off in my head.
Now, once again, that's the simplistic version; we're talking 10,000 inferences that might actually run. But there's that overall concatenation that happens at the originating inference point.
And so the user experience on
this is like, let's say, you know, as Ray was saying, you're looking for product specifications
for this. So I'm assuming an enterprise customer goes to some type of internal portal, types in what they're looking for, and then this kicks off this entire process where you've got basically all of these connectors into these various data sources, right? At the data centers, where you've got, say, anything from unstructured data to being able to pull into databases and CRM systems and things like that, where you can actually extract that data. Is that how it kind of ties into that back-end piece?
Yeah, I'd say at the most parochial standpoint, you're correct. It would be a user talking to a sort of chat interface. Typically at this scale in the enterprise, though, it's literally an autonomous agent that is running in some sort of application, or is called via some sort of pipeline that kicks it off and then starts the whole process across the multiple sites and the multiple datasets. But yes, you can absolutely tune it, test it, and play with it through a chat interface if you need.
Yeah, so there are a couple aspects of this that are magic sauce here. Obviously, having the inference understand where the data might reside is something I have never seen in any of these discussions that I've been involved in. But this is part of this distributed data aspect that you guys kind of have a lock on, I guess?
That's our focus for sure. And granted, we filed patents around this, et cetera, but it's more because it's just our focus. Now, if you think about it, I think people aren't dealing with this because they haven't moved from POC to production. And that's everything we do. We don't even do POCs; we literally call them POC-to-production. We even have our customers prepay if they want to test something out, and they can apply that to the production cost. Because at production scale, this is almost the first problem you have. The second problem is security, and we can do a dive on that.
But if you think about it, you know, I love this example that came to us early on. Imagine you
wanted to understand a very unique genomic sequence across tens of thousands or hundreds of thousands
of people that you've sequenced genes from. And those sequences reside in multiple geographies that are under the control of HIPAA and GDPR and, say, APAC regulations. You can't combine all of that data because of those regulations.
So large companies in biopharma
typically have to run very isolated tests
and then hopefully get somebody to help them combine that data.
But now imagine you can unleash the power of AI,
say out of all of the genomic sequence,
we're looking for this particular pattern match.
You can then
push that to the local Kamiwaza installation running in each one of those regions, process
petabytes or exabytes of data, get the ones that actually match, get their unique characteristics, have, say, the PHI data munged a little bit, and have all of that data come back
to one AI agent that then recombines all of that and gives you a summary output and other deep analysis off of what each one
of those found.
It really is a game changer at doing AI at scale.
Go ahead.
It is.
And you mentioned that there's a preparation-of-this-data aspect that is also facilitated by Kamiwaza.
Yeah, it's a whole RAG process, right? I mean, that tokenizes and embeds all this data into vector databases, and these vector databases are sitting all over the place as well, right?
Correct. We're really the
middleware that stitches all that together. You can build your own
embeddings and tokenization process to prep that data to be put into the vector
and graph databases or you can use our pre-canned ones that are already just
built. All the notebooks and all the SDK that we have really can make this so
it's fairly low code to actually accomplish those. But if the enterprise
already has their own vector database solution they want to use, they want to
fine-tune just for them the way that they're building embeddings,
dense embeddings and sparse embeddings.
They absolutely can.
They just have to do it through our SDK and API so that
our system knows to capture the metadata and put those into
the various global catalogs and local catalogs so that we can
automate this capability for them in the future.
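As a rough illustration of that "bring your own embeddings, but go through our SDK" idea, the pseudo-SDK below is invented for this writeup; none of the names are Kamiwaza's. The shape of the idea is that the ingest path stores the vectors wherever the enterprise wants while also landing the metadata in a local and a global catalog.

```python
# Purely illustrative pseudo-SDK, not Kamiwaza's actual SDK: ingest through
# one entry point so metadata lands in the local and global catalogs,
# whatever the real method names are.
from dataclasses import dataclass

@dataclass
class IngestRecord:
    doc_id: str
    dense_vector: list[float]        # customer-built dense embedding
    sparse_vector: dict[int, float]  # optional sparse embedding
    metadata: dict                   # source path, site, schema hints, ...

def ingest(record: IngestRecord, vector_db, local_catalog: list, global_catalog: list) -> None:
    # 1. Store the embeddings wherever the enterprise already keeps them
    #    (vector_db is any client object with an upsert-style method).
    vector_db.upsert(record.doc_id, record.dense_vector, record.sparse_vector)
    # 2. Capture the metadata so the platform can route future requests.
    entry = {"doc_id": record.doc_id, **record.metadata}
    local_catalog.append(entry)    # affinity to the engine that processed it
    global_catalog.append(entry)   # visible to every other site's stack
```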
You called your stack an opinionated stack.
There's more to it obviously than RAG, vector databases, the RAG aspects. It's this whole agentic stack, almost, I guess, as I'd call it. I'm not sure what it's called, really.
Yeah. We're not either.
So we've settled on this idea of an orchestration engine.
It's about 155 packages.
We're actually going to expand that in the next release
to about 176 packages.
And it's really the middleware that ties all of those together
so that you have a simple API and SDK
that you're interfacing with versus working with each one of those packages uniquely.
That's also what allows us to automate
and tie all this together.
And then because there's a single API that's presented out,
you can now write agents,
you can write third party applications right into our stack
via that API, via that SDK.
You can integrate current applications
and agents right into it.
And it now has the power of the data catalog and the inference mesh to span the entire
enterprise.
And that really does make it so that you can move from POC to production extremely quick.
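For a feel of what writing against a single endpoint might look like, here is a minimal agent step. It assumes the stack exposes an OpenAI-compatible chat API, which is a common convention for local model servers but an assumption here, not a documented Kamiwaza interface; the model name and URL are placeholders.

```python
# A minimal agent-ish step against a single LLM endpoint. Assumes an
# OpenAI-compatible API served locally; base_url and model are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

def run_step(task: str, context: str) -> str:
    resp = client.chat.completions.create(
        model="llama-3.1-70b-instruct",   # whichever open-weight model is loaded
        messages=[
            {"role": "system", "content": "You are an enterprise workflow agent."},
            {"role": "user", "content": f"Task: {task}\n\nContext:\n{context}"},
        ],
    )
    return resp.choices[0].message.content

print(run_step("Summarize open purchase orders", "PO-1001: ..., PO-1002: ..."))
```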
Huh.
So what's the Gen AI that you're using here? It could be anything. I mean, obviously, Llama 3.1 and all that stuff. But I mean, some customers might be more Gemini focused. Some might be more OpenAI focused. I mean, can you use any Gen AI as the, I don't know, the bottom-level processing engine for this?
Well, yes and kind of.
So any open weight model,
you can load right into our stack.
We have Hugging Face connectors that make it very simple. You can literally just start typing the model, like Qwen 2.5,
and you'll see 60 or so permutations of it pop right up, and you just click on it and
download all the files right from the interface. It goes right into there, and you literally click
run, and it kicks it off, and you start having that model run on that particular stack. You can
then push it east-west to the other engines in the other locations as well. Therefore, you're assured
you're not downloading
anything different or any versioning differences
across them.
Now, that means you can run literally any model
that you have the open weights on.
So any open model or model that's licensed
and put into Hugging Face or any of the other repositories.
That also means you can go to third parties
and buy custom models that you get the weights from.
You can fine tune a model off the internet.
So literally almost anything you can
run in our infrastructure on our engines.
On top of that, our inference mesh
does have an inference router.
That's where the logic gets built in, and that's how it distributes those inferences.
And one of the things you can do is also load an API key
from OpenAI, from Gemini, Anthropic,
so that you can actually mix and match locally processing data
plus actually using third-party SaaS-hosted models.
Also, you could even get into hosting another model,
say, at Fireworks or one of the other providers.
And maybe they will have a Llama 3.1 70B model. And maybe you're also using the same Llama 3.1 70B model,
but you're able to offload non-data processing, just
big inference requests, to the larger third-party host,
because maybe it's cheaper or maybe you have overflow.
So this is a really fungible developer environment to mix and match the best of what the enterprise wants.
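A toy version of the routing decision described above, purely to illustrate the idea of mixing local open-weight serving with a third-party host of the same model. The thresholds and rules are invented, not Kamiwaza's router logic.

```python
# A hedged sketch of what an inference-routing policy might look like:
# keep anything that touches private data on the local engines, and send
# plain "big" requests to a cheaper third-party host running the same
# open-weight model. The routing rules here are invented for illustration.
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    touches_private_data: bool
    est_tokens: int

LOCAL = "http://localhost:8000/v1"                       # local Llama 3.1 70B
THIRD_PARTY = "https://api.fireworks.ai/inference/v1"    # same model, hosted

def route(req: Request, local_queue_depth: int) -> str:
    if req.touches_private_data:
        return LOCAL                        # data never leaves the site
    if local_queue_depth > 32 or req.est_tokens > 8000:
        return THIRD_PARTY                  # overflow / cost-based offload
    return LOCAL

print(route(Request("Summarize this contract...", True, 2000), local_queue_depth=5))
```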
We haven't even mentioned the cloud. I mean, you could run these things in the cloud as well. Obviously the stack could be deployed in a cloud solution as well.
So oddly enough, our four publicly referenceable customers have all either tested or are currently running production in Azure.
So you can actually go to the Azure Marketplace and buy the Kamiwaza Stack Enterprise Edition, and it installs within about five minutes because it's prepackaged, right from the marketplace into your VPC. We also have the free version up there, the Community Edition, that will deploy right into your VPC on a pre-canned server of your choice.
So yeah, I mean, we're talking incredibly quick.
And then of course, in Amazon and GCP,
it's just installing the Docker containers that you get from us when you purchase the software
and you're off to the races.
And we'll be in those marketplaces as well.
You mentioned the Community Edition here, and maybe we can talk a little bit about that.
That's effectively a free download that anybody can use and install. It's really a desktop solution, right?
Yeah. Our Community Edition is
really more of the focus that we want for the developer ecosystem.
And it is the same as our enterprise edition,
actually, minus the ability to cluster via the underlying Ray components, and the ability to join the security of the inference mesh, our OAuth and SAML capabilities.
And other than that, we wanted it to have the same API and SDK so that you could literally develop locally on any Mac laptop, let's say an M1 and above, or any desktop with an RTX card.
The limitation is really going to be
the model size that you're able to download.
But you can then develop all the apps, you can develop against the API and the SDK locally, and just push that into production. And that is something that we make free.
Yeah, I was wondering, so on the Mac, it would actually use the Metal GPU and all that stuff that's internal to the M1-and-above chips, right?
Yeah, it recognizes it's on an M1. It downloads llama.cpp. It executes llama.cpp using the Mac Metal shared memory.
So all of our developers are lucky.
They get their M4s with 192 gigs of RAM.
And they're running Qwen 2.5 models for all of their local development code work. And it's a very powerful solution.
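For anyone curious what that Mac-local path roughly looks like, here is a sketch using llama-cpp-python, the Python bindings for llama.cpp. The GGUF file path is a placeholder, and this is generic llama.cpp usage rather than anything specific to the Community Edition.

```python
# Rough idea of running a local model on Apple silicon with llama-cpp-python.
# The model path is a placeholder; Metal offload happens when the wheel is
# built with Metal support and layers are pushed to the GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen2.5-14b-instruct-q4_k_m.gguf",  # placeholder GGUF file
    n_gpu_layers=-1,   # offload all layers to the M-series GPU via Metal
    n_ctx=8192,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a unit test for a FIFO queue."}]
)
print(out["choices"][0]["message"]["content"])
```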
And as you mentioned, you can basically run any model that is on Hugging Face. Like if you want to do Llama, Phi, Gemini, DeepSeek, of course, you know, you can't talk about AI without talking about DeepSeek. And you've got the capability of running those.
Can you also actually mix models within an environment?
Yeah. So the Kamiwaza engine will actually even auto-spin-up models if you preload them, all based on RAM availability and people hitting the actual endpoint.
You can spin up multiple versions of the same model,
multiple unique models,
and then you can get into multiple agents
running on multiple unique models,
even talking to each other,
all from the same stack or the same cluster.
It really just comes down to the amount of memory
you have available,
which is insane these days. If you think about it,
we do all of our own large-scale enterprise testing on AMD MI300s. You load eight of those bad boys in a box and you have nearly a terabyte and a half of memory. You could load DeepSeek V3. You could load several Llama 3.1 models. You could have several Qwen 2.5 models, all running on that box, all together able to achieve about 15,000 tokens a second.
To give you guys an idea, a human reading extremely fast is about 20 tokens a second.
So you have hundreds, if not thousands of PhD grads
just on that one server interacting.
And then imagine you have three of those servers in our base enterprise cluster.
So it's got high availability, et cetera.
You're at 45,000 tokens a second of all of those models being able to interact.
The outcomes you can drive within an enterprise are literally limitless.
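The arithmetic behind those numbers, spelled out:

```python
# Back-of-the-envelope math for the throughput figures quoted above.
TOKENS_PER_SERVER = 15_000   # aggregate tokens/sec across models on one MI300 box
SERVERS = 3                  # base enterprise cluster
HUMAN_READ_RATE = 20         # tokens/sec, a very fast human reader

cluster_rate = TOKENS_PER_SERVER * SERVERS
print(cluster_rate)                        # 45,000 tokens/sec
print(cluster_rate // HUMAN_READ_RATE)     # ~2,250 fast human readers' worth of output
```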
I know that stack well.
I'm going to have to talk to my friend Jason about getting one of those puppies in the basement.
It will heat your house.
I was gonna say, you know, like 10,000 watts of power. Maybe not for the house.
All right, all right. So, gosh, I mean, talk to me a little bit about the agentic aspect of this.
I mean, you mentioned that it could be an agent talking to multiple
Gen AI kinds of configurations, doing different things and all that stuff. I mean, the whole agentics discussion is kind of the newest thing coming out of AI, besides DeepSeek, of course.
So publicly, I love talking about this one use case with one of our customers, because
they talk about it publicly.
And for DHS CISA, which is the Cybersecurity and Infrastructure Security Agency within the Department of Homeland Security, their chief meteorologist is tasked with helping predict and understand the impact to America's critical infrastructure from weather events that are coming.
And we were able to work with somebody that has an extremely limited Python skill set. She's very smart, she's a climatologist, but Sunny over there, she didn't know a lot of Python.
We were able to give her one developer
from one of our partners, her name was Emma,
and she was a data scientist that also knew
good Python development.
And we were able to hand them a very large cluster from our friends over at Intel, running Gaudi 2s and 3s. And Emma was able to run an agentic app.
And one of these apps that plugs right on top of the Kamiwaza stack is called OpenHands.
And OpenHands is an agent framework where you have a little chat window on the left, and on the right you see what the agent is actually executing and doing. And Emma was able to say, we need this data from these
eight or nine locations, and there were a couple of internal locations and a bunch of external locations that were hosting repositories for climate data at several colleges and universities and such.
She was able to give it the credentials and the location, the URLs.
And this agent went out and downloaded, literally crawled those websites, found the data,
downloaded all that data, downloaded all the data internally that was there.
And it was in multiple file formats, legacy formats that really don't even exist anymore,
and we don't even know the schemas because this was 90 years of
climatology data across all sources in America.
And over the course of about eight hours, it unpacked all of that data.
It put it all into HTML structures in object store,
and it cleansed it all.
And we're talking trillions of points of data,
1.3 billion rows. And then it actually cleansed all the data, removed all the anomalies in the data,
and it then prompted this data scientist on what it was actually seeing in the data and what graphs
it could actually produce totally autonomously. It literally did all of the data transformation,
the graphing, and then brought back to Emma,
look at what we're seeing,
look at the types of graphs we can produce,
look at the hypothesis that you were trying
to sort of figure out.
And it's just amazing, that was just the start of it.
And we were able to combine so many other data sources
with this agent and keep prompting the agent on what it was seeing
And it would actually come back autonomously with variations in new graphs
It was almost scary; the hair on the back of all of our necks was standing up as she was sort of replaying this. But it is just the tip of what the power of a true autonomous agent can do.
Yeah, it's almost like, you know, I've got this backlog of various text files in various text formats over the course of my very lengthy career, most of which I can't read anymore because those text processing engines are all gone and stuff like that. So this sort of thing, it's frigging amazing.
Excuse the French.
It really is.
Another thing that was brought up in the discussions at AI Field Day 6 is that you provide sort of an outcome per cluster. Is that how I understand it? So if a customer, let's say the National Weather Service or another organization, says,
you know, I've got all this data sitting in these various databases around the world.
Can you pull it all together?
And so at the end of that discussion, this national weather service has this one database
with all this information, all cleaned and prepped and all normalized for
everything that they could possibly want, right?
Yeah, it was all in Delta Lake format, it was all cleansed. And it wasn't just that; we were able to apply data from, like, insurance claims that were publicly available by
zip code, all information from all public news sites and websites on the weather impacts per zip code, timestamped to the actual low-barometric-pressure event.
So fast forward: if a low-barometric-pressure event is bearing down on this exact zip code
where there's critical infrastructure, she could say, look, this last time it was there,
it caused this level of damage, so this is what we should prepare for.
There were 40 or 50 other data sources, and growing.
And you know that on the outcome-based support,
they could say, we have another new data source,
or we're trying to achieve a new outcome from all of that data.
Can you help us?
And our AI architects will do one of those per month,
per cluster.
That's built into the cost of the support
for these services.
We really wanna make it so that this isn't shelf-ware
and that the enterprise is consistently growing
with the new technology that's coming up from AI,
the new capabilities in achieving those net new outcomes.
And the best way we thought of being able to provide that
was to bake that idea of an outcome-based support into our enterprise cluster licensing. And think about it. You could start with,
I want an AI purchasing agent to review all of my purchases from Adobe and all of the EULAs that we have and all of the addendums and add-on orders. And on my next renewal, I want all of those cleansed: tell me everything that needs to change, tell me what price I should be able to do that at, what my savings would be, and what terms need to be in there, and let me know when I have to do this by and by what renewal. And that could be the first outcome you get from there. And then the second outcome could be, hey guys, can you help me load that into ServiceNow? We'd say, great, connect the stack to ServiceNow via its API, and once you've established that, and we'll send them the documentation to do that, we'll get on there with them and co-build that outcome.
So now they can push all that into ServiceNow.
It just keeps going and going.
And that's actually the name of the company, Kamiwaza. It means superhuman in the business context over there in Japan. And that's what we're trying to do: not just replace current workflows in the enterprise, but elevate them to that superhuman capability.
It was also mentioned at AI Field Day 6 that robotic process automation solutions are trying to do some of these things, obviously not nearly as sophisticated nor nearly as successful. I mean, do you feel that something like Kamiwaza and agentic AI will alleviate all that or eliminate all that? Is that how you see this?
Yeah, I don't see any path where standard RPA makes sense anymore. And I'm not saying that to be
RPA makes sense anymore. And I'm not saying that to be
bombastic. It's the fact that you can have an agent, you know, literally move a
mouse on a screen now and do the clicking for you. You don't have to with
RPA actually program the mouse to move, you know, an inch to the right and three
inches up and actually suppress the right mouse button or the left mouse
button or whatever. The agentic AI could literally look and understand what's on the screen and
what it's trying to accomplish.
The agentic AI can actually read the API of the systems it's interfacing with and
actually infer from what you're trying to do,
what APIs to pull and what data to send and what to accomplish.
It can take the legacy RPA code that typically gets reduced to C sharp
and immediately rewrite it into Python
or any other application code
and actually host and run that as a service now
on the Kamiwaza stack.
It's absolutely amazing how fast we can rip
and replace current RPA
and actually start to work into the backlog
of what these organizations wished RPA
could have done to begin with.
And that's one of the major outcomes that a lot of our customers are working on: not only replace our current RPA capabilities, but we really wanted RPA to do X or to do Y, or we were really hoping for this larger outcome.
And that's now where we can come in there
and actually help them achieve that.
Like I said, free with a cluster license,
one per month. And of course, our customers can buy additional outcomes, we call it, where we'll sort of increase their support and get in there and get those going even faster.
We've talked about kind of just the enterprise in a generic sense. Have there been any specific
vertical market trends that you've seen, as far as specific verticals this is kind of the killer app for?
God, you mentioned biopharma things, right?
Yeah, I think it's the killer app for everything, if we're just saying it like that. I know people are really used to me saying that.
Luke, my God.
I mean, even McKinsey says that 75% of all knowledge work across all industries could be replaced with the current technology of agents today.
Now, there's a lot that has to go into that, of course, and there's a lot of sort of human
adoption and all that for that, but it is knowledge.
Like we've actually turned knowledge into processing and now it can be reapplied. Now
to give you a less generic answer, where we're seeing rapid
adoption is in the more regulated environments. And this
is very counterintuitive. But financial services, you have
healthcare, yeah, the ones that have big compliance structures,
because they've already built every step and every guardrail into almost everything that they do. And
that is very easy to map right into AI, where a lot of other
organizations don't understand what it takes to do x and y and
z. And they don't have oversight and they don't have checks and balances built into all of those processes. When you already have all that, you can actually move it into AI incredibly quickly and reduce, with AI, even what minimal errors there were from humans.
That's a very task centric and checklist oriented approach, right?
Yep, absolutely.
Absolutely.
And I'll be frank, I think from my personal experience, for 80% of the enterprises out there, and when I say enterprise, I'm talking the big guys, the billion-dollar-and-above companies, 80% of their work is still done in Excel.
And if you could have an agent reach into the SDK of Microsoft Office, understand everything inside of Excel, every frame, every format, every formula, and apply some semantic capabilities to extract it and to change it, right there is 80% of the workflows of these enterprises that could be adopted by AI and enhanced by AI.
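A small sketch of the kind of Excel introspection an agent would need as a first step. This uses the openpyxl library rather than any Microsoft or Kamiwaza SDK, and the file name is a placeholder.

```python
# Walk a workbook and pull out every formula and value so an LLM can reason
# about them. Generic openpyxl usage; the file name is a placeholder.
from openpyxl import load_workbook

def extract_cells(path: str) -> list[dict]:
    wb = load_workbook(path, data_only=False)   # keep formulas as text
    cells = []
    for ws in wb.worksheets:
        for row in ws.iter_rows():
            for cell in row:
                if cell.value is None:
                    continue
                cells.append({
                    "sheet": ws.title,
                    "ref": cell.coordinate,
                    "is_formula": isinstance(cell.value, str) and cell.value.startswith("="),
                    "value": cell.value,
                })
    return cells

# An agent could then feed extract_cells("q3_forecast.xlsx") to a model
# and ask it to explain, audit, or rewrite the spreadsheet's logic.
```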
Yeah, and if you tap into SAP, you'll do even better.
You know the joke on that?
I have a Fortune 500 that is gonna remove SAP
because they are so far down the AI track.
Oh my God.
We went from tapping into SAP to feeding SAP
to pulling data from SAP to why do we need SAP?
I kid you not.
About three months.
And this is not just SAP. I mean, any SaaS app out there, I mean,
they all have potential.
So my team got annoyed with Notion.
Annoyed with Notion?
With Notion, the SaaS app Notion.
And it didn't have enough of an API or extensibility for them.
And they effectively were able to build 80% of the features
that we wanted from Notion in two days.
They did it on their own little hackathon
from Friday evening to Monday morning
so that they could show us that we didn't need Notion
and to sort of get product approval
that they could then do the last 20%
and that we would move off of Notion.
And they did that within the course of the following week.
Oh my God.
I found AI in general to be very much that 80-20 rule
that it gets you about 80% of the way there. And then the rest, that 20%, is the human factor that you put into it to do the customization. I use it for doing, like, code stuff all the time. I'm like, you know, you ask it to do
something, you know, relatively simple. And then there's like
that little 20% of tweaking, but it gets you 80% of the way
there. And it's 80% of the crap work that nobody wants to do.
Right, exactly. Imagine, all of these have their own APIs, and you say, I want to do X, Y, and Z with it. And it will literally spit out that 80-plus percent
of that code for you right there.
Yeah.
Yeah.
Well, yeah.
We talked about Excel spreadsheets before and AI trying to interpolate the information. It's been a challenge in the past for AI to work with. It works great with tokens, which are all numeric, but it doesn't work very well with Excel spreadsheets or, you know, comma-separated values and stuff like that.
You think that's all passed now? It's not an issue anymore? Is that what you're trying to tell me?
Yeah, absolutely not. So some of our initial use cases in the enterprise are actually in the standard data pipeline, where they're receiving CSV files and there's breakage in those CSV files. Now you can pump them right through on a pipeline and have an AI semantically look at them and understand that maybe they have a proper name in there that had an apostrophe, and that's what's actually breaking the file. Because it will semantically know that that's a proper name, it will then reverse and fix that file right then and there, and then spin it out clean on the other side of that pipeline. That is literally one of our first use cases, and one of the most fun use cases we work on with companies, because now it's augmenting the standard ML practices with semantic capabilities and getting these incredibly quick little wins and outcomes.
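A toy version of that CSV-repair idea: detect rows whose field count doesn't match the header and hand only those rows to a model to fix. The repair_with_llm function is a placeholder for whatever model call a real pipeline would make; nothing here is Kamiwaza-specific.

```python
# Detect "broken" CSV rows (e.g. an unescaped character in a proper name)
# by field-count mismatch and route just those rows through an LLM repair
# step. repair_with_llm is a placeholder, not a real API.
import csv

def repair_with_llm(header: list[str], bad_line: str) -> str:
    # Placeholder: a real pipeline would prompt a model with the header and
    # the broken line and ask for a correctly quoted CSV row back.
    raise NotImplementedError

def clean_csv(raw_text: str) -> list[list[str]]:
    lines = raw_text.splitlines()
    header = next(csv.reader([lines[0]]))
    fixed_rows = [header]
    for line in lines[1:]:
        if not line.strip():
            continue
        row = next(csv.reader([line]))
        if len(row) != len(header):                 # semantically "broken" row
            row = next(csv.reader([repair_with_llm(header, line)]))
        fixed_rows.append(row)
    return fixed_rows

# Usage: rows = clean_csv(open("incoming_feed.csv").read())
```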
Yeah. Look, all this is really on the inference side of the equation.
You're not doing anything specifically for training or fine-tuning kinds of things, are you? I mean...
No, so definitely nothing with training. We're going to leave that up to the labs and the people
that are producing these models. We do have reinforcement capabilities
and fine tuning capabilities.
And that's really good for understanding nuance
or lingua franca of an enterprise.
So you can give eight examples,
four good, four bad, maybe two of those are sort of wildly out there but still sort of acceptable, and we can actually reinforcement-train a standard open-weight model or do some fine-tuning around that. Going back to biopharma, there's a lot of lingua franca in biopharma that wasn't part of the training corpus these models were built on, so we can actually inject that in via fine-tuning. Those are all part of the packages that we ship with, and that is part of our outcome-based support; we would help with that. I'd say that reinforcement piece actually happens in about one out of two of the outcomes, because you're going to want to tell the model what's good and what's acceptable.
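As a hedged sketch of that "handful of good and bad examples" idea, this is what such a labeled set might look like written out as JSONL, the shape most open-source fine-tuning and preference-tuning tooling expects. The file name, record schema, and example text are illustrative only, not a Kamiwaza format.

```python
# Collect a few labeled completions in the enterprise's own lingua franca
# and write them out as JSONL for downstream fine-tuning or preference
# tuning. Schema and file name are illustrative.
import json

examples = [
    {"prompt": "Summarize this batch record deviation.",
     "completion": "Deviation DR-114: mixing step exceeded validated hold time...",
     "label": "good"},
    {"prompt": "Summarize this batch record deviation.",
     "completion": "Something went wrong in manufacturing.",
     "label": "bad"},
    # ...four good, four bad, including a couple of edge cases
]

with open("lingua_franca_examples.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```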
Yeah, yeah. So what you're effectively doing is fine-tuning the general AI solution to
understand the corporate language and technology and terminology and that sort of thing, above
and beyond the rag, which provides some of that as well, obviously.
Correct. Because sometimes you even need a base inside of the model to understand the data from the RAG.
Yeah, yeah, yeah.
You know, taxonomy is a major thing that we work with in the government.
I didn't understand why until somebody just pointed out,
target can mean something different to all organizations in the government.
Yeah, exactly.
Very true.
Exactly.
And that's the easy one.
I was looking at some SBIRs and yeah, the terminology used, it boggles the
imagination, you have to spend some time just trying to get your hands around that.
And that's what that fine-tuning does for you.
Hey, Luke, this has been great.
I really appreciate your time here.
Jason, any last questions for Luke before we close?
Well, one last question. If I'm an enterprise and I'm interested in exploring this further, how do we get ahold of you to schedule a demo slash, you know, POC-to-production run kind of thing?
POC to production, yeah. You can email hello@kamiwaza.ai. You can also just go to our website, of course, kamiwaza.ai. There's plenty of examples on there and contact forms.
And then, upcoming, we've been doing a lot of fun events with AI Tinkerers, a lot of these events. Typically we're in two cities a week, either doing demos at their developer demo nights, or doing hackathons.
There's a whole list of events that we have going on for
about the next three to four months up there on their website.
You're still looking for people, right?
We're hiring like crazy. I onboarded five people today
before I picked up the mic to actually talk to you.
Oh my god.
We have an insatiable amount of incoming opportunity, and Matt, my CTO and co-founder, just has a massive backlog of product development that he wants to accomplish to keep up with all the change and all the new innovation in AI. So we're hiring like crazy to stay above that wave.
Great. Great. Well, this has been great.
Luke, thanks again for being on our show today.
Yeah, thank you.
And that's it for now. Bye.
Bye, Luke and bye, Jason.
Bye, Ray.
Bye, people.
Until next time.
Next time we will talk to another system storage technology person. Any questions you want us to ask, please let us know. And if you enjoy our podcast, tell your friends about it, and please review us on Apple Podcasts, Google Play, and Spotify, as this will help get the word out.