Grey Beards on Systems - 169: GreyBeards talk Agentic AI with Luke Norris, CEO & Co-founder, Kamiwaza AI
Episode Date: February 14, 2025
GreyBeards talk Agentic AI with Kamiwaza's CEO, Luke Norris. The world of large enterprise IT is about to undergo another revolution and Kamiwaza AI seems to be leading the charge...
Transcript
Hey everybody, Ray Lucchesi here.
Jason Collier here.
Welcome to another sponsored episode of the GreyBeards on Storage podcast, a show where we get GreyBeards bloggers together with storage and system vendors to discuss upcoming products, technologies, and trends affecting the data center today. With us here today is Luke Norris, CEO and co-founder of Kamiwaza.
So Luke, why don't you tell us a little bit about yourself and what's new with Kamiwaza?
Yeah, thanks for having me.
So I'm the cliché I guess, serial entrepreneur.
This is my third venture-backed company, probably my sixth
company that I've built to date.
At Kamiwaza, really, we took about a year of just thinking: what impact was generative AI going to have on the enterprise? And what were the major hurdles the enterprise was going to have to clear to adopt generative AI?
And in interviewing a lot of my CTO friends at Fortune 500,
et cetera, common themes kept coming up around data gravity,
the fact that they have data in these enterprises
across their cloud presence, their on premises,
co-location sites and edge locations.
They have lots of data, petabytes and exabytes of data.
And how are they going to get AI, which at that time they would see only running in, you know, SaaS services or with OpenAI, et cetera?
How are they going to get that access to the data
so they can start to have this transformative moment?
That transformative moment at Kamiwaza,
we call that the fifth industrial revolution,
where an enterprise can get 20% to 30% automated through agentic AI.
These are autonomous agents that are running via AI,
actually accomplishing workflows.
And we really set off to solve those pain points of,
how do you get AI to access private data,
and how do you get AI to manage massive amounts of data
so you can really automate these workflows
across the enterprise.
Yeah, so we were just together at AI Field Day 6 in San Jose, and there was a lot of talk at the session there
about your data catalog and things of that nature.
You wanna talk a little bit about
what the data catalog looks like?
Or what are the labels, maybe? I don't know.
Yeah, no.
Either of those two.
So there are two key technologies in that. First, at the onset, to solve getting AI access to the data, we realized you had to bring AI to the data, because with data gravity, just the size of the data, and security, bringing massive amounts of data to AI wasn't going to be practical.
In doing that, we built this concept of an inference mesh where you can
actually install our orchestration,
our engine right next to the data via Docker containers,
right onto GPU enabled servers.
Now you can install that in your Cloud instances,
in your data center or at the edge. Second was how do you actually process and manage this data
when it's spread out there. So right now when our stack does the initial prepping
of data for AI ingestion, that means you have to sort of scan all the data, create embeddings of that data, and then store those embeddings in, say, a vector database or a graph database.
We also add a secondary feature where
we start to catalog all the metadata that we're
building via those embeddings.
That metadata gets put into a local data catalog.
That local data catalog ties the affinity of that data
to the local engine that actually processed it.
And secondarily, that data also then gets put
into a global catalog that each one of our stacks
at each location can reference.
That then allows say a request that comes in
at a data center A, but the data that it needs
is partly in data center A and partly in data center B.
It will first process and run the RAG process in data center A,
while simultaneously kicking off an inference request to data center B, where it also runs
the RAG process locally, gets the result, sends only the resulting tokens back to data center A
stack, where it's concatenated and you get a result. In doing this, we didn't have to move data,
we didn't have to bring the data from Data Center B to Data Center A,
we're able to process the vast amounts of that data locally,
and then only combine the results.
That's really opened up the enterprise capability for inferencing,
for running agents, and getting massive outcomes.
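To make the catalog idea concrete, here is a minimal sketch of the local/global catalog concept Luke describes. The class and field names are invented for illustration and are not Kamiwaza's actual data model; the point is that metadata about embeddings carries an affinity to the site that processed the data, and a global view lets any site discover where relevant data lives.

```python
# Hypothetical sketch of the local/global catalog idea described above.
# None of these class or field names come from the Kamiwaza product; they
# just illustrate how metadata with site affinity could be looked up.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    dataset: str        # e.g. "product-datasheets"
    site: str           # which stack processed/holds the data ("dc-a", "dc-b")
    vector_db: str      # where the embeddings for that slice live

@dataclass
class GlobalCatalog:
    entries: list[CatalogEntry] = field(default_factory=list)

    def register(self, entry: CatalogEntry) -> None:
        # Each local stack pushes its metadata up so every site can see it.
        self.entries.append(entry)

    def sites_for(self, dataset: str) -> set[str]:
        # A prompt arriving at any site asks: where does this data live?
        return {e.site for e in self.entries if e.dataset == dataset}

catalog = GlobalCatalog()
catalog.register(CatalogEntry("product-datasheets", "dc-a", "qdrant://dc-a/products"))
catalog.register(CatalogEntry("product-datasheets", "dc-b", "qdrant://dc-b/products"))
print(catalog.sites_for("product-datasheets"))   # {'dc-a', 'dc-b'} -> fan out to both
```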
Let me try to unpack all that.
Yeah.
So an inference comes along that says, you know, I want to look at, tell me
about, you know, corporations, I don't know, your product X, Y, or Z, and that information
happens to span multiple data centers.
So at that point, you spin up a process across your inference mesh that effectively does
a RAG activity on data center A and interprets the data, creates a tokenized embedded stream
and loads that in the vector database on data center A. Then we have a portion of this,
I'll call it the part data set, data sheet or something
like that, sitting on data center B. And you fire up a similar application activity, I
should say, on data center B to do the RAG, to go look at the data set, the data sheet,
and embed and tokenize that data and put it in a vector database on data center B. Now you've got these tokens, this tokenized data, for both data centers' data.
And you push, and you send that all back to data center A
to answer the request, the prompt?
Is that how it works?
No, so what you described there is the prepping of the data.
Yeah, yeah.
You typically have to have a model first scan all the data and tokenize it, and put those tokens, which are now in the form of embeddings, into the vector database.
So yeah.
So those vector databases are effectively local to whatever the data affinity is, right?
That's correct,
there'd be a vector database for site A and for site B.
And that vector database is going
to hold all of those embeddings.
Now, when a new inference request comes in to, say,
data center A, the model looks up
what data it needs to resolve that inference by querying
the vector database.
And once it knows, say, the 10 points of data it needs,
it then actually has to go grab that data itself
in its raw form from data center A.
And it then processes, it reads all that,
it finds the information from within there,
and then it comes up with a result for that inference. But there are many
inferences that, as you said, when you want product information and there's data that's required from
both data center A and data center B, it does everything I talked about there in data center A.
It looks up the vector database. It realizes it needs data that's in data center A. First,
I'm sorry, it looks up the global catalog service. The global catalog service lets it know that there's data in data center A and there's
data in data center B.
For this prompt.
Yeah, yeah.
For the prompt.
That's correct.
And it then grabs the vectors from data center A, it processes those, realizes it needs,
let's call it 10 points of data, and it goes and grabs those 10 raw points of data
in data center A, and it pulls it into the model
and begins processing it.
At the same time, it sends that prompt request
over to data center B, does that whole thing
in data center B, pulls the data that it needs,
once again, raw from data center B, processes it.
It gets the resulting tokens of all of that process
and sends only the resulting tokens back to data center A.
Where data center A finishes its data processing, combines it with the results from data center B, and gives one overall
response back to the prompt.
Now, as I understand it, in those sorts of things the RAG is providing context to further elaborate the prompt in a typical Gen AI LLM type of call.
So, I mean, this sort of stuff that we're talking about that happens to reside in data center A and data center B
would be part of that context?
Is that how I would read that?
Or is the inferencing being done in both places?
The inferencing is actually being done in both places, with a third inference, in simplistic terms, that combines the two results.
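A hedged sketch of that fan-out-and-combine flow, assuming each site exposes some HTTP endpoint that runs RAG plus inference locally and returns only text. The URLs and payload shape are hypothetical, not Kamiwaza's API.

```python
# A hedged sketch of the fan-out-and-combine flow described above. The
# endpoint URLs and payload shape are hypothetical, not Kamiwaza's API;
# the point is that each site runs RAG + inference locally and only the
# resulting text comes back to be combined by a final inference.
import concurrent.futures
import requests

SITES = {"dc-a": "https://dc-a.example.internal/v1/rag-answer",
         "dc-b": "https://dc-b.example.internal/v1/rag-answer"}

def answer_at_site(url: str, prompt: str) -> str:
    # Each site does its own vector lookup, pulls raw data locally,
    # runs the model, and returns only tokens (text), never the raw data.
    resp = requests.post(url, json={"prompt": prompt}, timeout=300)
    resp.raise_for_status()
    return resp.json()["answer"]

def distributed_answer(prompt: str) -> str:
    with concurrent.futures.ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda u: answer_at_site(u, prompt), SITES.values()))
    # A third inference (here just another call to the originating site)
    # concatenates/synthesizes the partial answers into one response.
    combine_prompt = "Combine these partial answers into one:\n\n" + "\n---\n".join(partials)
    return answer_at_site(SITES["dc-a"], combine_prompt)
```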
I think there's a light going off in my head.
Now, once again, that's the simplistic version; we're talking 10,000 inferences that might actually run. But there's that overall concatenation that happens at the originating inference point.
And so the user experience on
this is like, let's say, you know, as Ray was saying, you're looking for product specifications
for this. So I'm assuming an enterprise customer goes to some type of internal portal, types in what they're looking for, and then this kicks off this entire process where you've got basically all of these connectors into these various data sources, right? At the data centers, where you've got, say, anything from unstructured data to being able to pull into databases and CRM systems and things like that, where you can actually extract that data. Is that how it kind of ties into that back-end piece?
Yeah, I'd say at the most parochial standpoint, you're correct. It would be a user talking to a sort of chat interface. Typically at this scale in the enterprise, though, it's literally an autonomous agent that is running in some sort of application, or is called via some sort of pipeline that kicks it off and then starts the whole process across the multiple sites and the multiple datasets. But yes, you can absolutely tune it, test it, and play with it through a chat interface if you need.
Yeah, so there are a couple aspects of this that are magic sauce here. Obviously, having the inference understand where the data might reside is something I have never seen in any of these discussions that I've been involved in. But this is part of this distributed data aspect that you guys kind of have a lock on, I guess?
That's our focus for sure. And granted, we filed patents around this, et cetera, but it's more because it's just our focus. Now, if you think about it, I think people aren't dealing with this because they haven't moved from POC to production. And that's everything we do. We don't even do POCs; we literally call them POC-to-production. We even have our customers prepay if they want to test something out, and they can apply that to the production cost. Because at production scale, this is almost the first problem you have. The second problem is security, and we can do a dive on that.
But if you think about it, you know, I love this example that came to us early on. Imagine you
wanted to understand a very unique genomic sequence across tens of thousands or hundreds of thousands
of people that you've sequenced genes from. And those sequences reside in multiple geographies that are under the control of HIPAA and GDPR and, say, APAC regulations. You can't combine all of that data because of those regulations.
So large companies in biopharma
typically have to run very isolated tests
and then hopefully get somebody to help them combine that data.
But now imagine you can unleash the power of AI,
say out of all of the genomic sequence,
we're looking for this particular pattern match.
You can then
push that to the local Kamiwaza installation running in each one of those regions, process
petabytes or exabytes of data, get the ones that actually match, get their unique characteristics, have, say, the PHI data munged a little bit, and have all of that data come back
to one AI agent that then recombines all of that and gives you a summary output and other deep analysis off of what each one
of those found.
It really is a game changer at doing AI at scale.
Go ahead.
It is.
And you mentioned that there's a preparation-of-this-data aspect that is also facilitated by Kamiwaza.
Yeah, it's a whole RAG process, right? I mean, that tokenizes and embeds all this data into vector databases, and these vector databases are sitting all over the place as well, right?
Correct. We're really the
middleware that stitches all that together. You can build your own
embeddings and tokenization process to prep that data to be put into the vector
and graph databases or you can use our pre-canned ones that are already just
built. All the notebooks and all the SDK that we have really can make this so
it's fairly low code to actually accomplish those. But if the enterprise
already has their own vector database solution they want to use, they want to
fine-tune just for them the way that they're building embeddings,
dense embeddings and sparse embeddings.
They absolutely can.
They just have to do it through our SDK and API so that
our system knows to capture the metadata and put those into
the various global catalogs and local catalogs so that we can
automate this capability for them in the future.
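As a rough illustration of that "bring your own embeddings, but go through our SDK" idea, the pseudo-SDK below is invented for this writeup; none of the names are Kamiwaza's. The shape of the idea is that the ingest path stores the vectors wherever the enterprise wants while also landing the metadata in a local and a global catalog.

```python
# Purely illustrative pseudo-SDK, not Kamiwaza's actual SDK: ingest through
# one entry point so metadata lands in the local and global catalogs,
# whatever the real method names are.
from dataclasses import dataclass

@dataclass
class IngestRecord:
    doc_id: str
    dense_vector: list[float]        # customer-built dense embedding
    sparse_vector: dict[int, float]  # optional sparse embedding
    metadata: dict                   # source path, site, schema hints, ...

def ingest(record: IngestRecord, vector_db, local_catalog: list, global_catalog: list) -> None:
    # 1. Store the embeddings wherever the enterprise already keeps them
    #    (vector_db is any client object with an upsert-style method).
    vector_db.upsert(record.doc_id, record.dense_vector, record.sparse_vector)
    # 2. Capture the metadata so the platform can route future requests.
    entry = {"doc_id": record.doc_id, **record.metadata}
    local_catalog.append(entry)    # affinity to the engine that processed it
    global_catalog.append(entry)   # visible to every other site's stack
```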
You called your stack an opinionated stack.
There's more to it obviously than RAG, vector databases, the RAG aspects. It's this whole agentic stack, almost, I guess, as I'd call it. I'm not sure what it's called, really.
Yeah. We're not either.
So we've settled on this idea of an orchestration engine.
It's about 155 packages.
We're actually going to expand that in the next release
to about 176 packages.
And it's really the middleware that ties all of those together
so that you have a simple API and SDK
that you're interfacing with versus working with each one of those packages uniquely.
That's also what allows us to automate
and tie all this together.
And then because there's a single API that's presented out,
you can now write agents,
you can write third party applications right into our stack
via that API, via that SDK.
You can integrate current applications
and agents right into it.
And it now has the power of the data catalog and the inference mesh to span the entire
enterprise.
And that really does make it so that you can move from POC to production extremely quick.
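For a feel of what writing against a single endpoint might look like, here is a minimal agent step. It assumes the stack exposes an OpenAI-compatible chat API, which is a common convention for local model servers but an assumption here, not a documented Kamiwaza interface; the model name and URL are placeholders.

```python
# A minimal agent-ish step against a single LLM endpoint. Assumes an
# OpenAI-compatible API served locally; base_url and model are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

def run_step(task: str, context: str) -> str:
    resp = client.chat.completions.create(
        model="llama-3.1-70b-instruct",   # whichever open-weight model is loaded
        messages=[
            {"role": "system", "content": "You are an enterprise workflow agent."},
            {"role": "user", "content": f"Task: {task}\n\nContext:\n{context}"},
        ],
    )
    return resp.choices[0].message.content

print(run_step("Summarize open purchase orders", "PO-1001: ..., PO-1002: ..."))
```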
Huh.
So what's the Gen AI that you're using here? It could be anything. I mean, obviously, Llama 3.1 and all that stuff. But I mean, some customers might be more Gemini focused. Some might be more OpenAI focused. I mean, can you use any Gen AI as the, I don't know, the bottom-level processing engine for this?
Well, yes and kind of.
So any open weight model,
you can load right into our stack.
We have Hugging Face connectors that make it very simple. You can literally just start typing the model, like Qwen 2.5,
and you'll see 60 or so permutations of it pop right up, and you just click on it and
download all the files right from the interface. It goes right into there, and you literally click
run, and it kicks it off, and you start having that model run on that particular stack. You can
then push it east-west to the other engines in the other locations as well. Therefore, you're assured
you're not downloading
anything different or any versioning differences
across them.
Now, that means you can run literally any model
that you have the open weights on.
So any open model or model that's licensed
and put into Hugging Face or any of the other repositories.
That also means you can go to third parties
and buy custom models that you get the weights from.
You can fine tune a model off the internet.
So literally almost anything you can
run in our infrastructure on our engines.
On top of that, our inference mesh
does have an inference router.
That's where the logic gets built in, and that's how it distributes those inferences.
And one of the things you can do is also load an API key
from OpenAI, from Gemini, Anthropic,
so that you can actually mix and match locally processing data
plus actually using third-party SaaS-hosted models.
Also, you could even get into hosting another model,
say, at Fireworks or one of the other providers.
And maybe they will have a Llama 3.1 70B model. And maybe you're also using the same Llama 3.1 70B model,
but you're able to offload non-data processing, just
big inference requests, to the larger third-party host,
because maybe it's cheaper or maybe you have overflow.
So this is a really fungible developer environment to mix and match the best of what the enterprise wants.
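A toy version of the routing decision described above, purely to illustrate the idea of mixing local open-weight serving with a third-party host of the same model. The thresholds and rules are invented, not Kamiwaza's router logic.

```python
# A hedged sketch of what an inference-routing policy might look like:
# keep anything that touches private data on the local engines, and send
# plain "big" requests to a cheaper third-party host running the same
# open-weight model. The routing rules here are invented for illustration.
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    touches_private_data: bool
    est_tokens: int

LOCAL = "http://localhost:8000/v1"                       # local Llama 3.1 70B
THIRD_PARTY = "https://api.fireworks.ai/inference/v1"    # same model, hosted

def route(req: Request, local_queue_depth: int) -> str:
    if req.touches_private_data:
        return LOCAL                        # data never leaves the site
    if local_queue_depth > 32 or req.est_tokens > 8000:
        return THIRD_PARTY                  # overflow / cost-based offload
    return LOCAL

print(route(Request("Summarize this contract...", True, 2000), local_queue_depth=5))
```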
We haven't even mentioned the cloud. I mean, you could run these things in the cloud as well. Obviously the stack could be deployed in a cloud solution as well.
So oddly enough, our four publicly referenceable customers have all either tested or are currently running production in Azure.
So you can actually go to the Azure Marketplace and buy the Kamiwaza Stack Enterprise Edition, and it installs within about five minutes because it's prepackaged, right from the marketplace into your VPC. We also have the free version up there, the Community Edition, that will deploy right into your VPC on a pre-canned server of your choice.
So yeah, I mean, we're talking incredibly quick.
And then of course, in Amazon and GCP,
it's just installing the Docker containers that you get from us when you purchase the software
and you're off to the races.
And we'll be in those marketplaces as well.
You mentioned the Community Edition here, and maybe we can talk a little bit about that.
That's effectively a free download that anybody can use and install. It's really a desktop solution, right?
Yeah. Our Community Edition is
really more of the focus that we want for the developer ecosystem.
And it is the same as our enterprise edition,
actually, minus the ability to cluster via the underlying Ray components, and the ability to join the security of the inference mesh, our OAuth and SAML capabilities.
And other than that, we wanted it to have the same API and SDK so that you could literally develop locally on any Mac laptop, let's say an M1 and above, or any desktop with an RTX card.
The limitation is really going to be
the model size that you're able to download.
But you can then develop all the apps, you can develop against the API and the SDK locally, and just push that into production. And that is something that we make free.
Yeah, I was wondering, so on the Mac, it would actually use the Metal GPU and all that stuff that's internal to the M1-and-above chips, right?
Yeah, it recognizes it's on an M1. It downloads llama.cpp. It executes llama.cpp using the Mac Metal shared memory.
So all of our developers are lucky.
They get their M4s with 192 gigs of RAM.
And they're running Qwen 2.5 models for all of their local development code work. And it's a very powerful solution.
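For anyone curious what that Mac-local path roughly looks like, here is a sketch using llama-cpp-python, the Python bindings for llama.cpp. The GGUF file path is a placeholder, and this is generic llama.cpp usage rather than anything specific to the Community Edition.

```python
# Rough idea of running a local model on Apple silicon with llama-cpp-python.
# The model path is a placeholder; Metal offload happens when the wheel is
# built with Metal support and layers are pushed to the GPU.
from llama_cpp import Llama

llm = Llama(
    model_path="models/qwen2.5-14b-instruct-q4_k_m.gguf",  # placeholder GGUF file
    n_gpu_layers=-1,   # offload all layers to the M-series GPU via Metal
    n_ctx=8192,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a unit test for a FIFO queue."}]
)
print(out["choices"][0]["message"]["content"])
```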
And as you mentioned, you can basically run any model that is on Hugging Face. Like if you want to do Llama, Phi, Gemini, DeepSeek, of course, you know, you can't talk about AI without talking about DeepSeek. And you've got the capability of running those.
Can you also actually mix models within an environment?
Yeah. So the Kamiwaza engine will actually even auto-spin-up models if you preload them, all based on RAM availability and people hitting the actual endpoint.
You can spin up multiple versions of the same model,
multiple unique models,
and then you can get into multiple agents
running on multiple unique models,
even talking to each other,
all from the same stack or the same cluster.
It really just comes down to the amount of memory
you have available,
which is insane these days. If you think about it,
we do all of our own large-scale enterprise testing on AMD MI300s. You load eight of those bad boys in a box and you have nearly a terabyte and a half of memory. You could load DeepSeek V3. You could load several Llama 3.1 models. You could have several Qwen 2.5 models, all running on that box, all together able to achieve about 15,000 tokens a second.
To give you guys an idea, a human reading extremely fast is about 20 tokens a second.
So you have hundreds, if not thousands of PhD grads
just on that one server interacting.
And then imagine you have three of those servers in our base enterprise cluster.
So it's got high availability, et cetera.
You're at 45,000 tokens a second of all of those models being able to interact.
The outcomes you can drive within an enterprise are literally limitless.
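The arithmetic behind those numbers, spelled out:

```python
# Back-of-the-envelope math for the throughput figures quoted above.
TOKENS_PER_SERVER = 15_000   # aggregate tokens/sec across models on one MI300 box
SERVERS = 3                  # base enterprise cluster
HUMAN_READ_RATE = 20         # tokens/sec, a very fast human reader

cluster_rate = TOKENS_PER_SERVER * SERVERS
print(cluster_rate)                        # 45,000 tokens/sec
print(cluster_rate // HUMAN_READ_RATE)     # ~2,250 fast human readers' worth of output
```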
I know that stack well.
I'm going to have to talk to my friend Jason about getting one of those puppies in the basement.
It will heat your house.
I was gonna say, you know, like 10,000 watts of power. Maybe not for the house.
All right, all right. So, gosh, I mean, talk to me a little bit about the agentic aspect of this.
I mean, you mentioned that it could be an agent talking to multiple
Gen AI kinds of configurations, doing different things and all that stuff. I mean, the whole agentics discussion is kind of the newest thing coming out of AI, besides DeepSeek, of course.
So publicly, I love talking about this one use case with one of our customers, because
they talk about it publicly.
And for DHS CISA, which is the Cybersecurity and Infrastructure Security Agency within the Department of Homeland Security, their chief meteorologist is tasked with helping predict and understand the impact to America's critical infrastructure from weather events that are coming.
And we were able to work with somebody that has an extremely limited Python skill set. She's very smart, she's a climatologist, but Sunny over there, she didn't know a lot of Python.
We were able to give her one developer
from one of our partners, her name was Emma,
and she was a data scientist that also knew
good Python development.
And we were able to hand them a very large cluster from our friends over at Intel, running Gaudi 2s and 3s. And Emma was able to run an agentic app.
And one of these apps that plugs right on top of the Kamiwaza stack is called OpenHands.
And OpenHands is an agent framework where you have a little chat window on the left, and on the right you see what the agent is actually executing and doing. And Emma was able to say, we need this data from these
eight or nine locations, and there were a couple of internal locations and a bunch of external locations that were hosting repositories for climate data at several colleges and universities and such.
She was able to give it the credentials and the location, the URLs.
And this agent went out and downloaded, literally crawled those websites, found the data,
downloaded all that data, downloaded all the data internally that was there.
And it was in multiple file formats, legacy formats that really don't even exist anymore,
and we don't even know the schemas because this was 90 years of
climatology data across all sources in America.
And over the course of about eight hours, it unpacked all of that data.
It put it all into HTML structures in object store,
and it cleansed it all.
And we're talking trillions of points of data,
1.3 billion rows. And then it actually cleansed all the data, removed all the anomalies in the data,
and it then prompted this data scientist on what it was actually seeing in the data and what graphs
it could actually produce totally autonomously. It literally did all of the data transformation,
the graphing, and then brought back to Emma,
look at what we're seeing,
look at the types of graphs we can produce,
look at the hypothesis that you were trying
to sort of figure out.
And it's just amazing, that was just the start of it.
And we were able to combine so many other data sources
with this agent and keep prompting the agent on what it was seeing
And it would actually come back autonomously with variations in new graphs
It was almost scary; the hair on the back of all of our necks was standing up as she was sort of replaying this. But it is just the tip of what the power of a true autonomous agent can do.
Yeah, it's almost like, you know, I've got this backlog of various text files in various text formats over the course of my very lengthy career, most of which I can't read anymore because those text processing engines are all gone and stuff like that. So this sort of thing, it's frigging amazing.
Excuse the French.
It really is.
Another thing that was brought up in the discussions at AI Field Day 6 is that you provide sort of an outcome per cluster. Is that how I understand it? So if a customer, let's say the National Weather Service or another organization, says,
you know, I've got all this data sitting in these various databases around the world.
Can you pull it all together?
And so at the end of that discussion, this national weather service has this one database
with all this information, all cleaned and prepped and all normalized for
everything that they could possibly want, right?
Yeah, it was all in Delta Lake format, it was all cleansed. And it wasn't just that; we were able to apply data from, like, insurance claims that were publicly available by
zip code, all information from all public news sites and websites on the weather impacts per zip code, timestamped to the actual low-barometric-pressure event.
So fast forward: if a low-barometric-pressure event is bearing down on this exact zip code
where there's critical infrastructure, she could say, look, this last time it was there,
it caused this level of damage, so this is what we should prepare for.
There were 40 or 50 other data sources, and growing.
And you know that on the outcome-based support,
they could say, we have another new data source,
or we're trying to achieve a new outcome from all of that data.
Can you help us?
And our AI architects will do one of those per month,
per cluster.
That's built into the cost of the support
for these services.
We really wanna make it so that this isn't shelf-ware
and that the enterprise is consistently growing
with the new technology that's coming up from AI,
the new capabilities in achieving those net new outcomes.
And the best way we thought of being able to provide that
was to bake that idea of an outcome-based support into our enterprise cluster licensing. And think about it. You could start with,
I want an AI purchasing agent to review all of my purchases from Adobe and all of the EULAs that we have and all of the addendums and add-on orders. And on my next renewal, I want all of those cleansed: tell me everything that needs to change, tell me what price I should be able to do that at, what my savings would be, and what terms need to be in there, and let me know when I have to do this by and by what renewal. And that could be the first outcome you get from there. And then the second outcome could be, hey guys, can you help me load that into ServiceNow? We'd say, great, connect the stack to ServiceNow via its API, and once you've established that, and we'll send them the documentation to do that, we'll get on there with them and co-build that outcome.
So now they can push all that into ServiceNow.
It just keeps going and going.
And that's actually the name of the company, Kamiwaza. It means superhuman in the business context over there in Japan. And that's what we're trying to do: not just replace current workflows in the enterprise, but elevate them to that superhuman capability.
It was also mentioned at AI Field Day 6 that robotic process automation solutions are trying to do some of these things, obviously not nearly as sophisticated nor nearly as successful. I mean, do you feel that something like Kamiwaza and agentic AI will alleviate all that or eliminate all that? Is that how you see this?
Yeah, I don't see any path where standard RPA makes sense anymore. And I'm not saying that to be
RPA makes sense anymore. And I'm not saying that to be
bombastic. It's the fact that you can have an agent, you know, literally move a
mouse on a screen now and do the clicking for you. You don't have to with
RPA actually program the mouse to move, you know, an inch to the right and three
inches up and actually suppress the right mouse button or the left mouse
button or whatever. The agentic AI could literally look and understand what's on the screen and
what it's trying to accomplish.
The agentic AI can actually read the API of the systems it's interfacing with and
actually infer from what you're trying to do,
what APIs to pull and what data to send and what to accomplish.
It can take the legacy RPA code that typically gets reduced to C sharp
and immediately rewrite it into Python
or any other application code
and actually host and run that as a service now
on the Kamiwaza stack.
It's absolutely amazing how fast we can rip
and replace current RPA
and actually start to work into the backlog
of what these organizations wished RPA
could have done to begin with.
And that's one of the major outcomes that a lot of our customers are working on: not only replace our current RPA capabilities, but we really wanted RPA to do X or to do Y, or we were really hoping for this larger outcome.
And that's now where we can come in there
and actually help them achieve that.
Like I said, free with a cluster license,
one per month. And of course, our customers can buy additional outcomes, we call it, where we'll sort of increase their support and get in there and get those going even faster.
We've talked about kind of just the enterprise in a generic sense. Have there been any specific
vertical market trends that you've seen, as far as specific verticals this is kind of the killer app for?
God, you mentioned biopharma things, right?
Yeah, I think it's the killer app for everything, if we're just saying it like that. I know people are really used to me saying that.
Luke, my God.
I mean, even McKinsey says that 75% of all knowledge work across all industries could be replaced with the current technology of agents today.
Now, there's a lot that has to go into that, of course, and there's a lot of sort of human
adoption and all that for that, but it is knowledge.
Like we've actually turned knowledge into processing and now it can be reapplied. Now
to give you a less generic answer, where we're seeing rapid
adoption is in the more regulated environments. And this
is very counterintuitive. But financial services, you have
healthcare, yeah, the ones that have big compliance structures,
because they've already built every step and every guardrail into almost everything that they do. And
that is very easy to map right into AI, where a lot of other
organizations don't understand what it takes to do x and y and
z. And they don't have oversight and they don't have checks and balances built into all of those processes. When you already have all that, you can actually move it into AI incredibly quickly and reduce, with AI, even what minimal errors there were from humans.
That's a very task centric and checklist oriented approach, right?
Yep, absolutely.
Absolutely.
And I'll be frank, I think from my personal experience, for 80% of the enterprises out there, and when I say enterprise, I'm talking the big guys, the billion-dollar-and-above companies, 80% of their work is still done in Excel.
And if you could have an agent reach into the SDK of Microsoft Office, understand everything inside of Excel, every frame, every format, every formula, and apply some semantic capabilities to extract it and to change it, right there is 80% of the workflows of these enterprises that could be adopted by AI and enhanced by AI.
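A small sketch of the kind of Excel introspection an agent would need as a first step. This uses the openpyxl library rather than any Microsoft or Kamiwaza SDK, and the file name is a placeholder.

```python
# Walk a workbook and pull out every formula and value so an LLM can reason
# about them. Generic openpyxl usage; the file name is a placeholder.
from openpyxl import load_workbook

def extract_cells(path: str) -> list[dict]:
    wb = load_workbook(path, data_only=False)   # keep formulas as text
    cells = []
    for ws in wb.worksheets:
        for row in ws.iter_rows():
            for cell in row:
                if cell.value is None:
                    continue
                cells.append({
                    "sheet": ws.title,
                    "ref": cell.coordinate,
                    "is_formula": isinstance(cell.value, str) and cell.value.startswith("="),
                    "value": cell.value,
                })
    return cells

# An agent could then feed extract_cells("q3_forecast.xlsx") to a model
# and ask it to explain, audit, or rewrite the spreadsheet's logic.
```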
Yeah, and if you tap into SAP, you'll do even better.
You know the joke on that?
I have a Fortune 500 that is gonna remove SAP
because they are so far down the AI track.
Oh my God.
We went from tapping into SAP to feeding SAP
to pulling data from SAP to why do we need SAP?
I kid you not.
About three months.
And this is not just SAP. I mean, any SaaS app out there, I mean,
they all have potential.
So my team got annoyed with Notion.
Annoyed with Notion?
With Notion, the SaaS app Notion.
And it didn't have enough of an API or extensibility for them.
And they effectively were able to build 80% of the features
that we wanted from Notion in two days.
They did it on their own little hackathon
from Friday evening to Monday morning
so that they could show us that we didn't need Notion
and to sort of get product approval
that they could then do the last 20%
and that we would move off of Notion.
And they did that within the course of the following week.
Oh my God.
I found AI in general to be very much that 80-20 rule
that it gets you about 80% of the way there. And then the rest, that 20%, is the human factor that you put into it to do the customization. I use it for doing, like, code stuff all the time. I'm like, you know, you ask it to do
something, you know, relatively simple. And then there's like
that little 20% of tweaking, but it gets you 80% of the way
there. And it's 80% of the crap work that nobody wants to do.
Right, exactly. Imagine, all of these have their own APIs, and you say, I want to do X, Y, and Z with it. And it will literally spit out that 80-plus percent
of that code for you right there.
Yeah.
Yeah.
Well, yeah.
We talked about Excel spreadsheets before and AI trying to interpolate the information. It's been a challenge in the past for AI to work with. It works great with tokens, which are all numeric, but it doesn't work very well with Excel spreadsheets or, you know, comma-separated values and stuff like that.
You think that's all passed now? It's not an issue anymore? Is that what you're trying to tell me?
Yeah, absolutely not. So some of our initial use cases in the enterprise are actually in the standard data pipeline, where they're receiving CSV files and there's breakage in those CSV files. Now you can pump them right through on a pipeline and have an AI semantically look at them and understand that maybe they have a proper name in there that had an apostrophe, and that's what's actually breaking the file. Because it will semantically know that that's a proper name, it will then reverse and fix that file right then and there, and then spin it out clean on the other side of that pipeline. That is literally one of our first use cases, and one of the most fun use cases we work on with companies, because now it's augmenting the standard ML practices with semantic capabilities and getting these incredibly quick little wins and outcomes.
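A toy version of that CSV-repair idea: detect rows whose field count doesn't match the header and hand only those rows to a model to fix. The repair_with_llm function is a placeholder for whatever model call a real pipeline would make; nothing here is Kamiwaza-specific.

```python
# Detect "broken" CSV rows (e.g. an unescaped character in a proper name)
# by field-count mismatch and route just those rows through an LLM repair
# step. repair_with_llm is a placeholder, not a real API.
import csv

def repair_with_llm(header: list[str], bad_line: str) -> str:
    # Placeholder: a real pipeline would prompt a model with the header and
    # the broken line and ask for a correctly quoted CSV row back.
    raise NotImplementedError

def clean_csv(raw_text: str) -> list[list[str]]:
    lines = raw_text.splitlines()
    header = next(csv.reader([lines[0]]))
    fixed_rows = [header]
    for line in lines[1:]:
        if not line.strip():
            continue
        row = next(csv.reader([line]))
        if len(row) != len(header):                 # semantically "broken" row
            row = next(csv.reader([repair_with_llm(header, line)]))
        fixed_rows.append(row)
    return fixed_rows

# Usage: rows = clean_csv(open("incoming_feed.csv").read())
```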
Yeah. Look, all this is really on the inference side of the equation.
You're not doing anything specifically for training or fine-tuning kinds of things, are you? I mean...
No, so definitely nothing with training. We're going to leave that up to the labs and the people
that are producing these models. We do have reinforcement capabilities
and fine tuning capabilities.
And that's really good for understanding nuance
or lingua franca of an enterprise.
So you can give eight examples,
four good, four bad, maybe two of those are sort of wildly out there but still sort of acceptable, and we can actually reinforcement-train a standard open-weight model or do some fine-tuning around that. Going back to biopharma, there's a lot of lingua franca in biopharma that wasn't part of the training corpus these models were built on, so we can actually inject that in via fine-tuning. Those are all part of the packages that we ship with, and that is part of our outcome-based support; we would help with that. I'd say that reinforcement piece actually happens in about one out of two of the outcomes, because you're going to want to tell the model what's good and what's acceptable.
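As a hedged sketch of that "handful of good and bad examples" idea, this is what such a labeled set might look like written out as JSONL, the shape most open-source fine-tuning and preference-tuning tooling expects. The file name, record schema, and example text are illustrative only, not a Kamiwaza format.

```python
# Collect a few labeled completions in the enterprise's own lingua franca
# and write them out as JSONL for downstream fine-tuning or preference
# tuning. Schema and file name are illustrative.
import json

examples = [
    {"prompt": "Summarize this batch record deviation.",
     "completion": "Deviation DR-114: mixing step exceeded validated hold time...",
     "label": "good"},
    {"prompt": "Summarize this batch record deviation.",
     "completion": "Something went wrong in manufacturing.",
     "label": "bad"},
    # ...four good, four bad, including a couple of edge cases
]

with open("lingua_franca_examples.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```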
Yeah, yeah. So what you're effectively doing is fine-tuning the general AI solution to
understand the corporate language and technology and terminology and that sort of thing, above
and beyond the rag, which provides some of that as well, obviously.
Correct. Because sometimes you even need a base inside of the model to understand the data from the RAG.
Yeah, yeah, yeah.
You know, taxonomy is a major thing that we work with in the government.
I didn't understand why until somebody just pointed out,
target can mean something different to all organizations in the government.
Yeah, exactly.
Very true.
Exactly.
And that's the easy one.
I was looking at some SBIRs and yeah, the terminology used, it boggles the
imagination, you have to spend some time just trying to get your hands around that.
And that's what that fine-tuning does for you.
Hey, Luke, this has been great.
I really appreciate your time here.
Jason, any last questions for Luke before we close?
Well, one last question. If I'm an enterprise and I'm interested in exploring this further, how do we get ahold of you to schedule a demo slash, you know, POC-to-production run kind of thing?
POC to production, yeah. You can email hello@kamiwaza.ai. You can also just go to our website, of course, kamiwaza.ai. There's plenty of examples on there and contact forms.
And then, upcoming, we've been doing a lot of fun events with AI Tinkerers, a lot of these events. Typically we're in two cities a week, either doing demos at their developer demo nights, or doing hackathons.
There's a whole list of events that we have going on for
about the next three to four months up there on their website.
You're still looking for people, right?
We're hiring like crazy. I onboarded five people today
before I picked up the mic to actually talk to you.
Oh my god.
We have an insatiable amount of incoming opportunity, and Matt, my CTO and co-founder, just has a massive backlog of product development that he wants to accomplish to keep up with all the change and all the new innovation in AI. So we're hiring like crazy to stay above that wave.
Great. Great. Well, this has been great.
Luke, thanks again for being on our show today.
Yeah, thank you.
And that's it for now. Bye.
Bye, Luke and bye, Jason.
Bye, Ray.
Bye, people.
Until next time.
Next time we will talk to another system storage technology person. Any questions you want us to ask, please let us know. And if you enjoy our podcast, tell your friends about it, and please review us on Apple Podcasts, Google Play, and Spotify, as this will help get the word out.